New Data Release for MyVariant.info 201708


Another fresh data release for MyVariant.info is out! In this release, we updated the data from ClinVar and UniProtKB to their latest versions and added variant annotations from CIViC and Cancer Genome Interpreter. Here are the details.

Data Sources Updated

ClinVar was updated to its latest version (the same version for both the hg19 and hg38 assemblies). The variant annotations from UniProtKB were also updated to the latest version (hg38 only).

Some numbers for GRCh37/hg19 variants:

Source     Last release   New release   # of variants in last release   # of variants in new release
ClinVar    2017-06        2017-08       307,101                         310,280

Similarly, some numbers for GRCh38/hg38 variants:

Source     Last release   New release   # of variants in last release   # of variants in new release
UniProtKB  2017-03        2017-07       477,711                         527,607
ClinVar    2017-06        2017-08       307,286                         310,577

ClinVar annotations are available under the "clinvar" subfields of each annotated variant, and UniProtKB annotations under the "uniprot" subfields. MyVariant.info aggregates annotations from ClinVar, dbSNP, dbNSFP and 17 other sources for each variant, so you can access them all in one request.
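As a rough sketch of what "all in one request" can look like, the helper below builds a variant URL that asks for several sources at once. The function name myvariant_variant_url is hypothetical, the "dbsnp" field name is an assumption based on the source list above, and the variant ID is simply borrowed from the CIViC examples in this post (whether it carries ClinVar or dbSNP annotations is not stated here). The helper only escapes ">" and ",", which is all these IDs need; it is not a general URL encoder.

```shell
# Hypothetical helper: build a MyVariant.info variant URL from an HGVS ID
# and a comma-separated list of fields. No network access is needed.
myvariant_variant_url() {
  hgvs_id=$1
  fields=$2
  # Escape '>' in the HGVS ID and ',' in the field list, as the curl
  # examples in this post do.
  encoded_id=$(printf '%s' "$hgvs_id" | sed 's/>/%3E/g')
  encoded_fields=$(printf '%s' "$fields" | sed 's/,/%2C/g')
  printf 'http://myvariant.info/v1/variant/%s?fields=%s\n' "$encoded_id" "$encoded_fields"
}

# Print the URL for an hg19 variant, requesting two sources together.
# "dbsnp" is an assumed field name.
myvariant_variant_url 'chr11:g.534285C>T' 'clinvar,dbsnp'
```

To actually fetch the annotations, the printed URL can be passed to curl, for example: curl "$(myvariant_variant_url 'chr11:g.534285C>T' 'clinvar,dbsnp')"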

The total number of unique variants is now over 424M (424,524,227), slightly higher than in our previous release in June 2017, which had 424,519,520. More details about the variant data we provide through MyVariant.info are always available in our documentation. The same information is also available programmatically from our metadata endpoint (and the hg38 metadata endpoint).
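For readers who want to check these numbers from a script, here is a minimal sketch that builds the metadata URLs. The paths used (/v1/metadata, and the assembly=hg38 parameter for hg38) are assumptions, since the original links are not reproduced in this text, and the function name myvariant_metadata_url is hypothetical.

```shell
# Hypothetical helper: print the metadata URL for a genome assembly.
# The /v1/metadata path and the assembly=hg38 parameter are assumptions.
myvariant_metadata_url() {
  assembly=${1:-hg19}   # hg19 is treated as the default assembly
  if [ "$assembly" = "hg38" ]; then
    printf '%s\n' 'http://myvariant.info/v1/metadata?assembly=hg38'
  else
    printf '%s\n' 'http://myvariant.info/v1/metadata'
  fi
}

# Print both URLs without making any network request.
myvariant_metadata_url
myvariant_metadata_url hg38
```

Fetching the metadata is then a matter of running curl "$(myvariant_metadata_url hg38)".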


New Data Sources Added

In this data release, we added variant annotations from CIViC and Cancer Genome Interpreter (CGI) through our collaboration with the GA4GH VICC working group. Both provide extensive annotations of cancer-associated genetic variants. More specifically:

CIViC is an open access, open source, community-driven web resource for Clinical Interpretation of Variants in Cancer. The goal of CIViC is to enable precision medicine by providing an educational forum for the dissemination of knowledge and active discussion of the clinical significance of cancer genome alterations.

Cancer Genome Interpreter is designed to support the identification of tumor alterations that drive the disease and detect those that may be therapeutically actionable. CGI relies on existing knowledge collected from several resources and on computational methods that annotate the alterations in a tumor according to distinct levels of evidence.

You can access the data from CIViC under the "civic" field. Note that the "civic" field is only available for hg19 variants. Here are a few query examples:

curl 'http://myvariant.info/v1/variant/chr11:g.534285C%3ET?fields=civic'
curl 'http://myvariant.info/v1/variant/chr1:g.11187094G%3ET?fields=civic'
curl 'http://myvariant.info/v1/variant/chr17:g.7578455C%3EA?fields=civic'

You can access the data from Cancer Genome Interpreter under the "cgi" field. Note that the "cgi" field is only available for hg19 variants. Here are a few query examples:

curl 'http://myvariant.info/v1/variant/chr3:g.178936091G%3ET?fields=cgi'
curl 'http://myvariant.info/v1/variant/chr3:g.41266109C%3ET?fields=cgi'
curl 'http://myvariant.info/v1/variant/chr3:g.41266113C%3EG?fields=cgi'

You can also run combined queries, just as with our other data sources:


curl 'http://myvariant.info/v1/variant/chr2:g.29443600G%3ET?fields=civic%2Ccgi'
curl 'http://myvariant.info/v1/query?q=_exists_:civic%20AND%20_exists_:cgi&fields=civic%2Ccgi'
curl 'http://myvariant.info/v1/query?q=civic.evidence_items.drugs.name:crizotinib&fields=civic'
curl 'http://myvariant.info/v1/query?q=cgi.gene:ALK%20AND%20cgi.association:resistant&fields=cgi'
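If you build such query URLs in a script, the escaping of spaces and commas is easy to get wrong by hand. The sketch below wraps it in a small helper. The name myvariant_query_url is hypothetical, and the helper only escapes spaces in the query and commas in the field list, which covers the examples above but is not a general URL encoder.

```shell
# Hypothetical helper: build a /v1/query URL from a query string and a
# comma-separated list of fields. No network access is needed.
myvariant_query_url() {
  query=$1
  fields=$2
  # Escape spaces in the query and commas in the field list only.
  encoded_query=$(printf '%s' "$query" | sed 's/ /%20/g')
  encoded_fields=$(printf '%s' "$fields" | sed 's/,/%2C/g')
  printf 'http://myvariant.info/v1/query?q=%s&fields=%s\n' "$encoded_query" "$encoded_fields"
}

# Rebuild two of the combined-query URLs shown above.
myvariant_query_url '_exists_:civic AND _exists_:cgi' 'civic,cgi'
myvariant_query_url 'cgi.gene:ALK AND cgi.association:resistant' 'cgi'
```

The output can be handed straight to curl, for example: curl "$(myvariant_query_url 'cgi.gene:ALK AND cgi.association:resistant' 'cgi')"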

That's all! And as always, feel free to reach us at help@myvariant.info or @myvariantinfo if you have any questions or feedback.