Bringing "Dark Data" to Light
Caroline Chaboo regularly fields phone calls and emails from homeowners, gardeners and even U.S. customs officials who ask her to help identify bugs. The University of Kansas entomologist is a leading expert on beetles and performs research around the world, including in Kansas.
And Chaboo takes the time to help people with their insect-related curiosities and concerns.
“I ask them questions, and they send me pictures,” she said.
But now, a new grant from the National Science Foundation’s Advancing Digitization of Biological Collections program will enable Chaboo to put photos, data and maps relating to thousands of insects such as aphids, hoppers and cicadas (collectively known to scientists as Hemiptera) onto the Internet. Information about their host plants and parasites will also be digitized and put on the Web. Anybody will be able to access the information with a few keystrokes.
“In the course of human evolution, we’ve asked these questions from the beginning,” said Chaboo. “We’ve always wanted to know what was around us, what things were useful to us, what was edible and what was poisonous. It’s a pretty fundamental part of the human experience. It’s probably part of our genetic code that we’re all taxonomists — we all want to know the names of things.”
Indeed, generations of scientists have collected specimens of plants and animals in the field and stored them in institutions around the world. For instance, the KU Insect Collection has one of the preeminent university assemblages of Hemiptera, Coleoptera (beetles) and Hymenoptera (ants, bees and wasps), amassed through the efforts of curators and students since it was established in 1870.
At all institutions, examples of biodiversity are labeled, and then preserved in boxes and drawers within climate-controlled, fireproof steel cabinets. Usually, new species are described and named in academic journals.
The problem is that much of the biological information is “dark data.” It hasn’t been made straightforwardly accessible to non-scholars, and is at times unavailable even for experts.
“If you know of a specialist working in an area, you would write to them — if they were still alive — and ask what have you gotten from Peru or South Africa of this particular group?” Chaboo said. “Or you would write to a collection and ask what they have of a certain species. But it’s skewed toward systematic and evolutionary biology and museum work.”
As an entomologist and curator at the KU Museum of Natural History and Biodiversity Institute, Chaboo herself has had difficulty hunting down information about insects in her field that were described by anthropologists, for instance, instead of evolutionary biologists.
“I’m unable to access certain kinds of literature — not because I’m not searching, but because I’m unaware that it exists,” she said.
The grant to KU is a subcontract of a larger $1.5 million NSF effort involving 15 botanical and 19 entomological collections around the nation. It is titled “Plants, Herbivores and Parasitoids: A Model System for the Study of Tri-Trophic Associations.” The effort will create online information and images for about 4 million specimens.
Chaboo will oversee specialists and undergraduate student workers as they verify species information in the KU Insect Collection and convert it into digital data and images. In the meantime, her colleague Craig Freeman, botany curator at KU’s MacGregor Herbarium, will lead a team digitizing information about the plants that are hosts of the insects.
The results will be made available online in an easy-to-use format, making the collection’s implications for genetics, ecology and biological diversity publicly available, as well as quenching people’s thirst for a better understanding of nature.
“For any end-user — for example you’re an amateur or farmer who just wants to know what bugs are in your garden or greenhouse — this will help you identify insects through photographs and also map where those things are,” Chaboo said.
Due to the strength of its insect collections, KU is involved in two of four NSF grants relating to digitization of biological records. A second KU entomologist, Andrew Short, is leading a separate effort funded by the same umbrella program called “InvertNet — An Integrative Platform for Research on Environmental Change, Species Discovery and Identification.” – Brendan Lynch