Amanda Devine
Data Wrangler, Global Genome Initiative
27 July 2018
Girls Who Code Summer Immersion Program (Washington, DC): field trip to the Smithsonian National Museum of Natural History
Slides and Jupyter notebook available at https://github.com/amdevine/gwc-endangered-species
Winston Churchill High School (go Bulldogs!)
Dartmouth College (BA in Biology (Ecology) and Neuroscience)
Lab technician (dermatology, infectious disease, coral reefs)
Data wrangler for the Global Genome Initiative
Smithsonian initiative
Collect all of life on Earth
Preserve in cryorepositories for genomic research
Sample data recorded in the Global Genome Biodiversity Network (GGBN) Data Portal
GGI Data Tools website (Django; https://www.globalgeno.me)
GGI Gap Analysis app (Shiny; https://ggidata.shinyapps.io/gapanalysis)
genetic_collections (Python library; https://github.com/MikeTrizna/genetic_collections)
Per Wikipedia: Transforming and mapping data from one raw data form into another format with the intent of making it [useful] for a variety of downstream purposes, such as analytics.
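As a rough illustration of that definition, here is a minimal Python sketch that turns a few messy text rows into consistent records. The rows, the `|` separator, and the names `RAW_ROWS`, `parse_date`, and `wrangle` are all made up for this example; they are not taken from any real dataset or from the notebook.

```python
from datetime import datetime

# Hypothetical raw rows, invented for illustration (not real ECOS data).
RAW_ROWS = [
    "  Example frog | ENDANGERED | 03/11/1967 ",
    "Example orchid|threatened|1988-09-02",
    "Example beetle | Endangered | ",
]


def parse_date(text):
    """Return a date from MM/DD/YYYY or YYYY-MM-DD text, or None if blank or unrecognized."""
    text = text.strip()
    for fmt in ("%m/%d/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def wrangle(row):
    """Split one raw row into a dict with tidy name, status, and listing date."""
    name, status, listed = (part.strip() for part in row.split("|"))
    return {
        "name": name,
        "status": status.capitalize(),  # normalize inconsistent capitalization
        "listed": parse_date(listed),   # None marks a missing date
    }


CLEAN_ROWS = [wrangle(row) for row in RAW_ROWS]
for record in CLEAN_ROWS:
    print(record)
```

The cleaned records share one shape, so later steps such as counting or plotting do not need to handle each formatting quirk again.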
My favorite tools:
Powerful tool for cleaning messy data
Complex filtering, sorting, and grouping
Mass editing records
Special language (GREL) to filter and edit data with formulas
Administered by the U.S. Fish & Wildlife Service
Established in 1973 “to conserve and protect endangered and threatened species and their habitats”
Species are listed under the ESA in two ways:
What question are we trying to answer?
How have rates of listing species under the Endangered Species Act changed over time?
What summary or visualization do we want to produce at the end?
A bar graph showing the number of species listed by year
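One way to sketch that summary step in plain Python is to count listings per year and print a simple text bar chart. The years below are invented for illustration, and the names `LISTING_YEARS`, `count_by_year`, and `text_bar_chart` are hypothetical; the notebook linked in these slides may take a different approach with different tools.

```python
from collections import Counter

# Hypothetical listing years, invented for illustration (not real ECOS data).
LISTING_YEARS = [1967, 1967, 1970, 1973, 1973, 1973, 1988]


def count_by_year(years):
    """Count how many listings fall in each year, ordered by year."""
    counts = Counter(years)
    return dict(sorted(counts.items()))


def text_bar_chart(counts):
    """Draw one row per year, with one '#' per species listed that year."""
    return "\n".join(f"{year} {'#' * n} ({n})" for year, n in counts.items())


COUNTS = count_by_year(LISTING_YEARS)
print(text_bar_chart(COUNTS))
```

The same per-year counts could be handed to a plotting library to draw a graphical bar chart instead of a text one.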
Database that serves reports on threatened and endangered species
Pre-generated reports available online here: https://ecos.fws.gov/ecp/species-reports
Let’s look at these data in a Jupyter notebook: https://github.com/amdevine/gwc-endangered-species/blob/master/US%20Endangered%20Species%20Data.ipynb
Automate the Boring Stuff with Python. https://automatetheboringstuff.com/
Python Data Science Handbook. https://jakevdp.github.io/PythonDataScienceHandbook/
Coursera: Using Python to Access Web Data. https://www.coursera.org/learn/python-network-data (Can choose to audit the course for free.)
Coursera: Using Databases with Python. https://www.coursera.org/learn/python-databases (Can choose to audit the course for free.)
Women Who Code DC. Meetup group for female-identifying coders in the Washington, DC area. Covers many different tech-related topics, frequent meetups. https://www.meetup.com/Women-Who-Code-DC/
Hear Me Code. Organization that offers beginner coding lessons for women in the Washington, DC area. Also has an excellent Google group that emails out about a lot of professional opportunities. https://hearmecode.com/
Data Carpentry. National organization that offers workshops on data wrangling. The website contains workshop materials if you can’t attend a workshop in person. https://datacarpentry.org/
Title Slide: Grey Crowned Cranes. Image from Pexels, CC0 License. https://www.pexels.com/photo/nature-bird-love-heart-45853/
About Me: Giant Panda. Photo by Cesar Aguilar from Pexels, Pexels License. https://www.pexels.com/photo/panda-1123765/
Bio: Personal photo.
Global Genome Initiative: Tissue samples in the NMNH Biorepository. Photo by Adrian Van Allen, 2015.
Data Wrangling: Whale shark at the Georgia Aquarium. Photo by Zac Wolf; CC BY-SA 2.5, https://commons.wikimedia.org/w/index.php?curid=3511009
Data Wrangling: Wonder Woman: Wonder Woman with Lasso. Image from AllPosters. https://www.allposters.ca/-sp/Wonder-Woman-Wonder-Woman-with-Lasso-posters_i13190262_.htm
Endangered Species Data: Bufo periglenes (Golden toad). Photo by Charles H. Smith. Retrieved from Wikipedia: https://commons.wikimedia.org/wiki/File:Bufo_periglenes1.jpg
Thanks: Rafflesia arnoldii. Image from lazypenguins.com, blog post “15 strangely beautiful flowers”. https://lazypenguins.com/15-strangely-beautiful-flowers/
Any Questions: Joe's Apartment Cockroach GIF. Image from GIPHY. https://giphy.com/gifs/scarface-when-mtv-was-worth-watching-joes-apartment-CbY83hpLkcrZe