Making your data more accessible to you with Open Humans and the Genevieve Genome Report

by Ginger Tsueng

One of our points of pride in pooling, standardizing, and sharing gene, variant, and other “BioThings” annotation data as a service is that our service is fast! We built MyGene.info and MyVariant.info with speed in mind because we want them to be useful to bioinformaticians and tool/resource developers alike! How can we tell if we’ve successfully provided a useful service?

One measure we LOVE is when a user builds something useful or amazing with our service--especially when the product has the potential to revolutionize the way science is conducted. With over 6000 members, the Open Humans Project spans DIY bio, data science, digital health, data sharing services, and bioinformatics, and has a strong community of enthusiasts. If you have never heard of the project, you should definitely head over to https://www.openhumans.org/ and explore the different types of data you can collect and share to learn about yourself! Although our BioThings APIs may play only a tiny role in this amazing project, we nonetheless would like to spread the word about it and the amazing community that has been built around it.

Mad Price Ball (@madprime), the executive director and visionary at the helm of the Open Humans Foundation and the primary developer of the Genevieve Genome Report, was kind enough to answer our questions about this incredible endeavor.

In one tweet or less, introduce us to the Open Humans Project:

Open Humans helps people get and explore their health and personal data. We also enable projects that work with members and their data -- anyone can set one up, academics and citizen scientists alike! Data is private by default: you choose which projects to work with.

What was the original intent behind Open Humans Project? (How did Open Humans Project come about? How did the Genevieve Genome Report become part of the Open Humans Project? What was the motivation behind that?)

Open Humans has its origins in the Personal Genome Project (and similar, like openSNP), which had a "one size fits all" approach to donating genomes: make them public. It's a great start, but what we really wanted was data to be re-usable -- making data sharing easy while enabling people to decide when they want to share.

Genevieve is a personal project of mine -- it's not part of Open Humans, not any more than anyone else's would be! It uses all the same APIs that members have access to.

I built Genevieve based on my work with the PGP, applying my experience with helping participants understand the contents of their genome data. Genevieve reports are specialized to detect rare variants that have been reported to cause "Mendelian" diseases -- the sort of thing you hope to find in genome/exome data (although 23andMe still has interesting stuff!).

It looks like the errors and warnings section of Genevieve’s About page also strives to correct some common misperceptions about genome sequence data analysis. Do you expect the Open Humans Project to be able to help address some of these issues?

Open Humans helps us open up our data -- for ourselves and others. My hope is that tools like Genevieve help people see the complexity in stuff like genetic data, that the meaning of data isn't always easy or obvious!

How has the Open Humans Project improved since then? (key improvements, not just GitHub commits)

One of my favorite expansions has been Jupyter Notebooks! Members can create little analyses -- and other members can run that same code on their own data. Building something like Genevieve has a lot of overhead: I had to build a web app, not just an analysis. Notebooks are easier.

There's a gallery of notebooks at https://exploratory.openhumans.org/ -- you'll see it goes well beyond genetic data. I made one for guessing eye color here: https://exploratory.openhumans.org/notebook/2/ and Kevin (mentioned below) has a notebook exploring ancestry with imputed data from his project: https://exploratory.openhumans.org/notebook/21/

And there's a lot of new data sources, from Fitbit to GPS data to Spotify. I need to credit Bastian Greshake Tzovaras with a lot of this. He's been working with Open Humans & got the notebooks running, and many of the data sources. (You might be familiar with Bastian from his own project -- openSNP!)

What improvements have since been made on Genevieve, and what technologies enabled these improvements?

I haven't done much to improve Genevieve, but it benefits from what others add!

There's a new project rolling out -- Kevin Arvai's Imputer project (if you don't know what "imputation" is, the "about" page has a great explanation): https://openimpute.com/

It was super easy to update Genevieve to accept this data. So now you can use Genevieve on imputed data -- just use Kevin's project first. That's the sort of thing Open Humans enables.

Who is currently the intended audience for Open Humans Project?

For Open Humans, it's diverse -- we love personal data explorers of all types, from genomes to Fitbit users and more. We also love data donors and patient communities that want to see their data used for something more.

And data geeks! Open Humans is a way you can play with data -- and help others learn. We'd love to have you be part of it.

Folks are welcome to chat with us at slackin.openhumans.org

How does Open Humans Project use MyGene.info or MyVariant.info services?

It doesn't, but Genevieve does. ;-)

Genevieve checks the MyVariant database to get data about variants in a personal genome. This saves me the burden of maintaining all that information locally! Mainly it gets ClinVar information (reported diseases for a variant) and frequency information (usually gnomAD).
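
Editor's sketch: to make the lookup described above concrete, here is a minimal Python example of pulling ClinVar conditions and a gnomAD allele frequency out of a MyVariant.info-style record. This is not Genevieve's actual code. The field layout (`clinvar.rcv`, `gnomad_genome.af.af`) is our assumption about the response shape, the helper names and the 1% rarity cutoff are hypothetical, and the sample record is made up for illustration.

```python
from urllib.parse import quote

MYVARIANT_BASE = "https://myvariant.info/v1/variant/"


def build_variant_url(hgvs_id, fields=("clinvar", "gnomad_genome")):
    """Build a GET URL for one variant, limited to the requested fields."""
    return MYVARIANT_BASE + quote(hgvs_id, safe="") + "?fields=" + ",".join(fields)


def as_list(value):
    """Treat a missing value as empty and a single item as a one-item list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def summarize_annotation(doc):
    """Collect reported condition names and one allele frequency from a record.

    Assumes ClinVar records sit under clinvar.rcv and the frequency under
    gnomad_genome.af.af; adjust the paths if the real response differs.
    """
    conditions = []
    for rcv in as_list(doc.get("clinvar", {}).get("rcv")):
        for cond in as_list(rcv.get("conditions")):
            name = cond.get("name")
            if name and name not in conditions:
                conditions.append(name)
    frequency = doc.get("gnomad_genome", {}).get("af", {}).get("af")
    return {"conditions": conditions, "allele_frequency": frequency}


def is_rare_and_reported(summary, max_frequency=0.01):
    """Flag variants with a reported condition that are rare or unobserved.

    The 1% default is an illustrative cutoff, not a value from Genevieve.
    """
    frequency = summary["allele_frequency"]
    rare = frequency is None or frequency <= max_frequency
    return bool(summary["conditions"]) and rare


# Made-up record shaped like the assumed response; not real variant data.
sample = {
    "_id": "chr1:g.100A>G",
    "clinvar": {
        "rcv": {
            "clinical_significance": "Pathogenic",
            "conditions": {"name": "Example condition"},
        }
    },
    "gnomad_genome": {"af": {"af": 0.0001}},
}

summary = summarize_annotation(sample)
print(build_variant_url(sample["_id"]))
print(summary, is_rare_and_reported(summary))
```

The parsing is kept separate from the URL building so the sketch runs without network access. To try it against the live service, you could fetch the URL from `build_variant_url` with any HTTP client and pass the decoded JSON to `summarize_annotation`.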

The combination helps you discover very rare variants that may have dramatic consequences. My own report (https://genevieve.herokuapp.com/genome_report/660/) illustrates one such finding: a frameshift in ASPM that causes microcephaly and intellectual disability in a recessive manner. I inherited this from my mom; she has two affected brothers.

What are some of Open Humans Project’s successes (news releases, papers published)?

We submitted a manuscript -- Bastian led this effort! You can find a copy here: https://www.biorxiv.org/content/early/2018/11/14/469189

What improvements are planned for Open Humans Project?

We want to improve communications -- especially to help people interact with each other! Open Humans members are amazing, and currently the site itself does very little to help members get to know each other and be a community.

Are there any plans to join/link the Open Humans Project with NIH’s All of Us effort? Why or why not?

What we hope happens is that All of Us gives its participants data portability -- the ability to share their data with others. Open Humans would be well-positioned to receive that data and connect it to our ecosystem, enabling people to explore it & use it in new ways!

Additional information/links:

Genevieve: genevieve.herokuapp.com
Open Humans: www.openhumans.org
Open Humans community chat: slackin.openhumans.org

Editor's note: It didn't come up in the Q/A, but the Open Humans Foundation, which operates and manages www.openhumans.org, is a registered 501(c)(3) nonprofit corporation. You can donate to help support and expand their programs, furthering their mission of empowering individuals to explore and share their personal data for the purposes of education, health, and research.