
CalBug: Unlocking the data in Natural History Museums

Natural history collections house over a billion insect specimens worldwide, collected over several centuries. Specimen labels record the species, location, and date of capture, and these data are used to study biogeographic patterns, the spread of invasive species, and responses to land use, climate, and other environmental changes. However, access to these data is impractical for most of the research community. Because of enormous collection sizes, entomology has lagged behind other disciplines in digitizing collections. In 2010 the Essig Museum began CalBug, a collaborative project among nine California museums* with the goal of digitizing and geographically referencing over one million specimens from target groups and localities.

Watch the video on CalBug and Environmental Change.

How you can help: Start transcribing label data at our Notes from Nature project for Citizen Scientists

What's new with CalBug?



Notes from Nature - A Citizen Science Project: In 2013 we partnered with Zooniverse to build “Notes from Nature” (NfN). The goals of this effort are to accelerate data entry through crowdsourcing and to enhance public participation. Help us unlock data stored in our museums by joining our Notes from Nature team.



Recent iDigBio workshops in which we have been involved include Public Participation in Digitization, Augmenting OCR for Specimen Labels, and Digitizing Dry Insect Collections.



We are collaborating with the Computer Vision group at UC San Diego to explore the use of OCR for automating data extraction from specimen labels. Serge Belongie directly oversees the progress and technical design of the project. Phuc Nguyen is implementing and managing the text extraction algorithm; read his blog.




The BiGCB initiative at the UC Berkeley campus is working to understand how biodiversity has changed in the past, with the ultimate goal of providing insights into the future. To this end, the project integrates multiple data types, including specimen data from museum collections, ecological data from field stations, and geospatial layers. This effort is funded by the Keck and Moore foundations.

* CalBug collaborating institutions: California Academy of Sciences (CAS), California State Arthropod Collection (CSCA), Los Angeles County Museum (LACM), San Diego Natural History Museum (SDNHM), Santa Barbara Museum of Natural History (SBMNH), University of California, Berkeley - Essig Museum (UCB), University of California, Davis - Bohart Museum (UCD), University of California, Riverside - Entomology Research Museum (UCR), University of California, Santa Cruz - Museum of Natural History

Updated May 2013. For questions or comments please contact Peter Oboyski

credits: moth photo (Joyce Gross), honey bee photo (Peter Oboyski), CalBug logo design (Deanna Jackson)