MapSNPs annotation summary report explained

The following is a description of the MapSNPs annotation summary report. MapSNPs, a genomic SNP annotation tool, is part of the PolyPhen-2 Batch query web service. Whenever you submit genomic SNPs in the form of chromosome coordinates and alleles, a report formatted as described below appears under the SNPs link on the Batch query results web page. It is a plain-text, tab-separated file in which each line annotates the corresponding protein sequence variant (amino acid residue substitution) for each missense allelic variant found in the input.

Columns 18-32 will contain "?" placeholders for SNPs annotated as non-coding; columns 33-37 will have values only for SNPs annotated in dbSNP build 132.
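The placeholder convention above can be handled when reading the report. The following is a minimal Python sketch, not part of MapSNPs itself. The function name `clean_fields` is hypothetical, and it assumes that fields may be padded with spaces and that an empty field also means "no value".

```python
def clean_fields(line):
    """Split one report line on tabs and map placeholders to None.

    Assumptions (not stated in the report description): fields may carry
    padding spaces, and an empty field is treated like a "?" placeholder.
    """
    fields = [f.strip() for f in line.rstrip("\r\n").split("\t")]
    return [None if f in ("?", "") else f for f in fields]


# Hypothetical, shortened line used only to illustrate the behaviour.
example = clean_fields("chr1:1267483\t+\tGENE1\t?\t?\n")
print(example)
```

Mapping placeholders to `None` early keeps later numeric conversions (for example of column 28, aapos) from failing on "?" values.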

Note: MapSNPs, as part of the PolyPhen-2 Batch query service, filters SNP annotations in the output according to the SNP functional categories the user selects via the Annotations menu under the Advanced Options section of the input form. Selecting All disables filtering, so annotations for all SNP categories are reported in the pph2-snps.txt file. However, PolyPhen-2 predictions (reported in the pph2-short.txt and pph2-full.txt files) are produced for missense SNPs only, regardless of the Annotations option selected.
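If a report was produced with All selected, a similar category filter can be applied afterwards on the downloaded file. The sketch below is an illustration only and does not reproduce the service's own filtering code. It relies on column 9 (type) from the column list below; the function name `keep_categories` and the sample rows are hypothetical.

```python
TYPE_COLUMN = 9  # 1-based number of the "type" column in the report


def keep_categories(rows, wanted):
    """Yield rows (lists of fields) whose functional category is in `wanted`."""
    for fields in rows:
        # Skip lines too short to contain a type field at all.
        if len(fields) >= TYPE_COLUMN and fields[TYPE_COLUMN - 1].strip() in wanted:
            yield fields


# Two made-up rows with only the first nine columns filled in.
rows = [
    ["chr1:100", "+", "GENE1", "tx1", "1", "CCDS1", "1", "G/A", "missense"],
    ["chr1:200", "+", "GENE1", "tx1", "1", "CCDS1", "1", "C/T", "intron"],
]
missense_only = list(keep_categories(rows, {"missense"}))
```

Restricting the set to `{"missense"}` selects the rows for which PolyPhen-2 predictions are produced, as stated in the note above.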

1 snp_pos input SNP chromosome:position (chromosome coordinates are 1-based)
2 str transcript strand ("+" or "-")
3 gene gene symbol
4 transcript UCSC transcript name
5 ccid UCSC canonical cluster ID (number)
6 ccds NCBI CCDS cluster ID
7 cciden NCBI CCDS CDS similarity level by genomic overlap with the corresponding UCSC known gene transcript
8 refa reference allele / variant allele ("+" strand)
9 type SNP functional category (“coding-synon”, “intron”, “nonsense”, “missense”, “utr-3”, “utr-5”)
10 ntpos mutation position in the full transcript nucleotide sequence (in the direction of transcription)
11 nt1 reference nucleotide (transcript strand)
12 nt2 variant nucleotide (transcript strand)
13 flanks nucleotides flanking mutation position in the transcript sequence, enumerated in the direction of transcription (5'→3')
14 trv transversion mutation (0 - transition, 1 - transversion)
15 cpg CpG context: 0 - non-CpG context retained, 1 - mutation removes CpG site, 2 - mutation creates new CpG site, 3 - CpG context retained: C(C/G)G
16 jxdon distance from mutation position to the nearest donor exon / intron junction ("-" for upstream, "+" for downstream)
17 jxacc distance from mutation position to the nearest acceptor intron / exon junction ("-" for upstream, "+" for downstream)
18 exon mutation in exon # / of total exons (exons are enumerated in the direction of transcription)
19 cexon same as above but for coding exons only
20 jxc mutation in a codon that is split across two exons (? - no, 1 - yes)
21 dgn degeneracy index for mutated codon position, by Nei & Kumar (2000) “Molecular Evolution and Phylogenetics”, page 64 (0 - non-degenerate, 2 - simple 2-fold degenerate, 3 - complex 2-fold degenerate, 4 - 4-fold degenerate)
22 cdnpos number of the mutated codon within transcript's CDS (1-based)
23 frame mutation position offset within the codon (0..2)
24 cdn1 reference codon nucleotides
25 cdn2 mutated codon nucleotides
26 aa1 wild type (reference) amino acid residue
27 aa2 mutant (substitution) amino acid residue
28 aapos position of amino acid substitution in protein sequence (1-based)
29 spmap CDS protein sequence similarity to known UniProtKB protein (? - no match)
30 spacc UniProtKB protein accession
31 spname UniProtKB protein entry name
32 refs_acc RefSeq protein accession
33 dbrsid dbSNP SNP rsID
34 dbobsrvd dbSNP observed alleles (transcript strand)
35 dbavHet dbSNP average heterozygosity from all observations
36 dbavHetSE dbSNP standard error for the average heterozygosity
37 dbRmPaPt dbSNP reference orthologous alleles in macaque (Rm), orangutan (Pa) and chimp (Pt)
38 Comments optional user comments, copied from input
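
The column list above can be turned into a small reader that returns each report line as a dictionary keyed by column name. This is a hedged Python sketch rather than an official parser. The function names are hypothetical, and it assumes that lines starting with "#" are header or comment lines, that fields may be padded with spaces, and that trailing empty fields may be absent. The `is_transversion` helper only illustrates how column 14 (trv) relates to columns 11 and 12.

```python
# Column names in report order (columns 1-38), taken from the list above.
COLUMNS = [
    "snp_pos", "str", "gene", "transcript", "ccid", "ccds", "cciden",
    "refa", "type", "ntpos", "nt1", "nt2", "flanks", "trv", "cpg",
    "jxdon", "jxacc", "exon", "cexon", "jxc", "dgn", "cdnpos", "frame",
    "cdn1", "cdn2", "aa1", "aa2", "aapos", "spmap", "spacc", "spname",
    "refs_acc", "dbrsid", "dbobsrvd", "dbavHet", "dbavHetSE", "dbRmPaPt",
    "Comments",
]

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def parse_line(line):
    """Return a dict mapping each column name to its stripped field value."""
    fields = [f.strip() for f in line.rstrip("\r\n").split("\t")]
    # Assumption: short lines are padded so every column name has a value.
    fields += [""] * (len(COLUMNS) - len(fields))
    return dict(zip(COLUMNS, fields))


def split_snp_pos(snp_pos):
    """Split "chromosome:position" into (chromosome, 1-based position)."""
    chrom, _, pos = snp_pos.rpartition(":")
    return chrom, int(pos)


def is_transversion(nt1, nt2):
    """True for a purine <-> pyrimidine change, False for a transition."""
    a, b = nt1.upper(), nt2.upper()
    return (a in PURINES and b in PYRIMIDINES) or (
        a in PYRIMIDINES and b in PURINES
    )


def read_report(path):
    """Yield one dict per data line of a report file at `path`."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            # Assumption: "#" marks header/comment lines; blank lines are skipped.
            if line.strip() and not line.startswith("#"):
                yield parse_line(line)
```

For example, `split_snp_pos(record["snp_pos"])` gives the chromosome name and position of a parsed record, and `is_transversion(record["nt1"], record["nt2"])` can be compared with the reported trv value as a consistency check.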
Last modified: 2012/02/13 11:31
