COSMIC (Catalogue of Somatic Mutations in Cancer) is a data resource that stores and displays somatic mutation information and related details for human cancers. Data in COSMIC are curated from the literature on known cancer genes and from systematic screens. COSMIC data can be downloaded freely in several formats from the project's FTP site: ftp://ftp.sanger.ac.uk/pub/CGP/cosmic.
If you use COSMIC annotations, please credit the project with the following acknowledgement:
The mutation data was obtained from the Sanger Institute Catalogue Of Somatic Mutations In Cancer web site, http://www.sanger.ac.uk/cosmic. Bamford et al (2004). The COSMIC (Catalogue of Somatic Mutations in Cancer) database and website. Br J Cancer, 91,355-358.
Three annotation databases are available for annotating vtools project variants with data from the COSMIC project. Two of them are higher-level, variant-based databases that annotate variants with information such as the number of cancer samples documented to contain the variant: CosmicCodingMuts annotates coding mutations, and CosmicNonCodingVariants annotates noncoding variants.
There is also a more detailed database, CosmicMutantExport, that can be linked to either of the higher-level databases through the COSMIC_ID field, for example:
% vtools use CosmicMutantExport --linked_by CosmicCodingMuts.COSMIC_ID
% vtools use CosmicMutantExport --linked_by CosmicNonCodingVariants.COSMIC_ID
Linking it lets you extract detailed information about a COSMIC variant, such as variant details and details of the samples in which the variant was detected. The annotation fields available from these databases are listed below.
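To make the idea of linking by COSMIC_ID concrete, here is a minimal conceptual sketch in Python. It is not how vtools implements --linked_by internally. It only illustrates, with an in-memory SQLite join, that one variant in a higher-level database can match several sample-level records in CosmicMutantExport. The function name link_by_cosmic_id, the table names, and the toy rows (IDs such as TOY1) are hypothetical and are not COSMIC data.

```python
import sqlite3


def link_by_cosmic_id(variant_rows, detail_rows):
    """Join variant-level rows to sample-level rows on COSMIC_ID.

    variant_rows: (chr, pos, ref, alt, COSMIC_ID, CNT) tuples, mimicking
                  a few CosmicCodingMuts fields.
    detail_rows:  (COSMIC_ID, Primary_site) tuples, mimicking two
                  CosmicMutantExport fields.
    Both inputs are illustrative; this is not the vtools implementation.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE coding "
        "(chr TEXT, pos INTEGER, ref TEXT, alt TEXT, COSMIC_ID TEXT, CNT INTEGER)"
    )
    conn.execute("CREATE TABLE export (COSMIC_ID TEXT, Primary_site TEXT)")
    conn.executemany("INSERT INTO coding VALUES (?, ?, ?, ?, ?, ?)", variant_rows)
    conn.executemany("INSERT INTO export VALUES (?, ?)", detail_rows)
    # LEFT JOIN keeps variants even when no detailed record exists for them.
    rows = conn.execute(
        "SELECT c.chr, c.pos, c.ref, c.alt, c.COSMIC_ID, c.CNT, e.Primary_site "
        "FROM coding c LEFT JOIN export e ON e.COSMIC_ID = c.COSMIC_ID "
        "ORDER BY c.pos, e.Primary_site"
    ).fetchall()
    conn.close()
    return rows


# Toy input: TOY1 was seen in two samples, TOY2 in one.
toy_variants = [("1", 100, "A", "T", "TOY1", 2), ("1", 200, "G", "-", "TOY2", 1)]
toy_details = [("TOY1", "lung"), ("TOY1", "skin"), ("TOY2", "liver")]
linked = link_by_cosmic_id(toy_variants, toy_details)
```

In this sketch each variant is repeated once per matching detailed record, which is consistent with CosmicMutantExport holding more records than distinct COSMIC IDs.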
% vtools use CosmicCodingMuts
% vtools show annotation CosmicCodingMuts -v2
Annotation database CosmicCodingMuts (version v61_260912)
Description: Cosmic coding mutation database. This data contains mutations
affecting 10 or less nucleotides in REF. The mutation data was obtained
from the Sanger Institute Catalogue Of Somatic Mutations In Cancer web site,
http://www.sanger.ac.uk/cosmic. Bamford et al (2004). The COSMIC (Catalogue
of Somatic Mutations in Cancer) database and website. Br J Cancer, 91,355-358.
Database type: variant
Number of records: 216,900
Number of distinct variants: 198,331
Reference genome hg19: ['chr', 'pos', 'ref', 'alt']
Field: chr
Type: string
Comment: Chromosome
Missing entries: 0
Unique Entries: 25
Field: pos
Type: integer
Comment: 1-based position
Missing entries: 0
Unique Entries: 193,076
Range: 8115 - 249212084
Field: COSMIC_ID
Type: string
Comment: cosmic id of mutation
Missing entries: 0
Unique Entries: 216,900
Field: ref
Type: string
Comment: Reference allele, '-' for insertion.
Missing entries: 0
Unique Entries: 1,241
Field: alt
Type: string
Comment: Alternative allele, '-' for deletion.
Missing entries: 0
Unique Entries: 1,138
Field: gene
Type: string
Comment: genename
Missing entries: 0
Unique Entries: 20,405
Field: strand
Type: string
Comment: strand
Missing entries: 0
Unique Entries: 2
Field: CDS
Type: string
Comment: CDS annotation
Missing entries: 0
Unique Entries: 65,794
Field: AA
Type: string
Comment: Peptide annotation
Missing entries: 0
Unique Entries: 111,311
Field: CNT
Type: integer
Comment: Number of samples with this mutation
Missing entries: 0
Unique Entries: 157
Range: 1 - 29906
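The ref and alt field comments above state that '-' marks an insertion (in ref) or a deletion (in alt). As a small illustration of that convention, the following Python sketch classifies a variant from its two alleles. The function name classify_variant and the category labels "SNV" and "substitution" are assumptions chosen for this example and do not come from vtools or COSMIC.

```python
def classify_variant(ref, alt):
    """Classify a variant using the '-' allele convention described above.

    ref == '-' means nothing is present in the reference (insertion);
    alt == '-' means the reference bases are removed (deletion).
    The returned labels are illustrative only.
    """
    if ref == "-" and alt == "-":
        raise ValueError("ref and alt cannot both be '-'")
    if ref == "-":
        return "insertion"
    if alt == "-":
        return "deletion"
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    # Anything else replaces one stretch of bases with another.
    return "substitution"


# Toy alleles, not taken from COSMIC.
examples = [classify_variant(r, a) for r, a in [("-", "AT"), ("GC", "-"), ("A", "G")]]
```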
% vtools use CosmicNonCodingVariants
% vtools show annotation CosmicNonCodingVariants -v2
Annotation database CosmicNonCodingVariants (version v61_260912)
Description: Cosmic non-coding mutation database. This data contains
mutations affecting 10 or less nucleotides in REF. The mutation data
was obtained from the Sanger Institute Catalogue Of Somatic Mutations
In Cancer web site, http://www.sanger.ac.uk/cosmic. Bamford et al
(2004). The COSMIC (Catalogue of Somatic Mutations in Cancer) database
and website. Br J Cancer, 91,355-358.
Database type: variant
Number of records: 108,713
Number of distinct variants: 104,410
Reference genome hg19: ['chr', 'pos', 'ref', 'alt']
Field: chr
Type: string
Comment: Chromosome
Missing entries: 0
Unique Entries: 24
Field: pos
Type: integer
Comment: 1-based position
Missing entries: 0
Unique Entries: 104,370
Range: 13663 - 249204167
Field: COSMIC_ID
Type: string
Comment: cosmic id of mutation
Missing entries: 0
Unique Entries: 108,713
Field: ref
Type: string
Comment: Reference allele, '-' for insertion.
Missing entries: 0
Unique Entries: 1,251
Field: alt
Type: string
Comment: Alternative allele, '-' for deletion.
Missing entries: 0
Unique Entries: 152
Field: gene
Type: string
Comment: genename
Missing entries: 88,900 (81.8% of 108,713 records)
Unique Entries: 7,501
Field: strand
Type: string
Comment: strand
Missing entries: 88,900 (81.8% of 108,713 records)
Unique Entries: 2
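The "Missing entries" lines for gene and strand above report a count together with a percentage of all records. The short Python sketch below shows the arithmetic behind such a line, using the two numbers printed in the output (88,900 missing out of 108,713 records). The helper names percent_missing and format_missing are hypothetical, and the output format is simply modelled on the listing above rather than taken from vtools source code.

```python
def percent_missing(missing, total):
    """Return the percentage of records with a missing value."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * missing / total


def format_missing(missing, total):
    """Format a count and percentage in the style of the listing above."""
    pct = percent_missing(missing, total)
    return f"{missing:,} ({pct:.1f}% of {total:,} records)"


# Numbers taken from the CosmicNonCodingVariants gene field above.
summary = format_missing(88900, 108713)
```

Most non-coding variants in this release therefore carry no gene or strand annotation, so queries on those fields will only see roughly one record in five.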
% vtools use CosmicMutantExport --linked_by CosmicCodingMuts.COSMIC_ID
% vtools show annotation CosmicMutantExport -v2
Annotation database CosmicMutantExport (version v61_260912)
Description: Cosmic mutant export. This data contains all coding
point mutations. The mutation data was obtained from the Sanger
Institute Catalogue Of Somatic Mutations In Cancer web site,
http://www.sanger.ac.uk/cosmic. Bamford et al (2004). The
COSMIC (Catalogue of Somatic Mutations in Cancer) database and
website. Br J Cancer, 91,355-358.
Database type: field
Number of records: 404,865
Number of distinct entries: 224,650
Reference genome *: ['COSMIC_ID']
Field: COSMIC_ID
Type: string
Missing entries: 0
Unique Entries: 224,650
Field: Gene_name
Type: string
Missing entries: 0
Unique Entries: 20,451
Field: Accession_Number
Type: string
Missing entries: 0
Unique Entries: 20,403
Field: Gene_CDS_length
Type: string
Missing entries: 0
Unique Entries: 2,220
Field: HGNC_ID
Type: string
Missing entries: 0
Unique Entries: 16,990
Field: Sample_name
Type: string
Missing entries: 0
Unique Entries: 179,301
Field: ID_sample
Type: string
Missing entries: 0
Unique Entries: 183,630
Field: ID_tumour
Type: string
Missing entries: 0
Unique Entries: 181,851
Field: Primary_site
Type: string
Missing entries: 0
Unique Entries: 44
Field: Site_subtype
Type: string
Missing entries: 0
Unique Entries: 185
Field: Primary_histology
Type: string
Missing entries: 0
Unique Entries: 91
Field: Histology_subtype
Type: string
Missing entries: 0
Unique Entries: 417
Field: Genomewide_screen
Type: string
Missing entries: 0
Unique Entries: 3
Field: Mutation_ID
Type: string
Missing entries: 0
Unique Entries: 224,650
Field: Mutation_CDS
Type: string
Missing entries: 0
Unique Entries: 69,434
Field: Mutation_AA
Type: string
Missing entries: 0
Unique Entries: 115,530
Field: Mutation_Description
Type: string
Missing entries: 0
Unique Entries: 17
Field: Mutation_zygosity
Type: string
Missing entries: 0
Unique Entries: 4
Field: Mutation_NCBI36_genome_position
Type: string
Missing entries: 0
Unique Entries: 35,240
Field: Mutation_NCBI36_strand
Type: string
Missing entries: 0
Unique Entries: 4
Field: Mutation_GRCh37_genome_position
Type: string
Missing entries: 0
Unique Entries: 198,031
Field: Mutation_GRCh37_strand
Type: string
Missing entries: 0
Unique Entries: 4
Field: Mutation_somatic_status
Type: string
Missing entries: 0
Unique Entries: 7
Field: Pubmed_PMID
Type: string
Missing entries: 0
Unique Entries: 7,690
Field: Sample_source
Type: string
Missing entries: 0
Unique Entries: 30
Field: Tumour_origin
Type: string
Missing entries: 0
Unique Entries: 9
Field: Comments
Type: string
Missing entries: 0
Unique Entries: 3,202