====== MapSNPs annotation summary report explained ======

This page describes the **MapSNPs** annotation summary report. The **MapSNPs** genomic SNP annotation tool is part of the PolyPhen-2 **Batch query** web service. Whenever you submit genomic SNPs as chromosome coordinates and alleles, a report in the format described below appears under the **SNPs** link on the **Batch query** results web page. It is a plain-text, tab-separated file in which each line annotates the protein sequence variant (amino acid residue substitution) corresponding to a missense allelic variant found in the input. Columns 27-40 contain "?" placeholders for SNPs annotated as non-coding; columns 41-45 have values only for SNPs annotated in dbSNP build 138.

__Note__: **MapSNPs**, when run as part of the PolyPhen-2 **Batch query** web service, filters SNP annotations in the output according to the SNP functional categories selected in the ''Annotations'' menu, under the ''Advanced Options'' section of the input form. Selecting ''All'' disables filtering, so annotations for all SNP categories are reported in the ''pph2-snps.txt'' file. However, PolyPhen-2 predictions (reported in the ''pph2-short.txt'' and ''pph2-full.txt'' files) are produced for missense SNPs only, regardless of the ''Annotations'' option selected.

^ Column\\ No. ^ Column\\ Name ^ Description ^
| 1 | query_no | input query ordinal |
| 2 | snp_pos | input SNP chromosome:position (chromosome coordinates are 1-based) |
| 3 | str | transcript strand ("+" or "-") |
| 4 | gene | gene symbol |
| 5 | transcript | UCSC transcript name (unique identifier) |
| 6 | canon | UCSC knownCanonical representative transcript flag: 1 - canonical, 0 - alternative |
| 7 | cid | UCSC knownCanonical cluster identifier (number) |
| 8 | txcov | transcript coverage: the number of transcripts in the UCSC cluster overlapping the mutation position / total number of transcripts in the cluster |
| 9 | ccds | CCDS cluster identifier |
| 10 | cciden | CCDS CDS similarity level by genomic overlap with the corresponding UCSC knownGene transcript |
| 11 | refa | reference allele / variant allele ("+" strand) |
| 12 | type | SNP functional category ("coding-synon", "intron", "stop-loss", "nonsense", "missense", "splice-5", "splice-3", "utr-5", "utr-3") |
| 13 | ntlen | full transcript length (number of nucleotides) |
| 14 | ntpos | mutation position in the full transcript nucleotide sequence (in the direction of transcription) |
| 15 | nt1 | reference nucleotide (transcript strand) |
| 16 | nt2 | variant nucleotide (transcript strand) |
| 17 | PtGgPaNl | orthologous alleles in chimp (Pt), gorilla (Gg), orangutan (Pa) and gibbon (Nl) if different from the human reference allele, ? - otherwise; . - data not available |
| 18 | dref | putative derived allele found in human reference, score: 0 - no evidence, 1 - variant allele matches orthologous ancestral allele, 2 - dbSNP minor allele matches reference allele, 3 - both dbSNP and orthologous evidence present, ? - not enough evidence to score |
| 19 | gerprs | Genomic Evolutionary Rate Profiling (GERP++) position-specific conservation score, RS; 0 - when alignment coverage is insufficient |
| 20 | phylop | conservation scoring by phyloP (phylogenetic p-values) from the [[http://compgen.bscb.cornell.edu/phast/|PHAST package]] for multiple alignments of 99 vertebrate genomes to the human genome |
| 21 | flanks | nucleotides flanking the mutation position in the transcript sequence, enumerated in the direction of transcription (5'->3') |
| 22 | trv | transversion mutation flag: 0 - transition, 1 - transversion |
| 23 | CpG | CpG context: 0 - non-CpG context retained, 1 - mutation removes CpG site, 2 - mutation creates new CpG site, 3 - CpG context retained: C(C/G)G substitution |
| 24 | JXdon | distance from the mutation position to the nearest donor exon / intron junction ("-" for upstream, "+" for downstream) |
| 25 | JXacc | distance from the mutation position to the nearest acceptor intron / exon junction ("-" for upstream, "+" for downstream) |
| 26 | JXc | mutation in a codon that is split across two exons: ? - no, 1 - yes |
| 27 | exon | number of the exon containing the mutation / total number of exons (exons are enumerated in the direction of transcription) |
| 28 | cexon | same as above, but only coding (CDS) exons are enumerated |
| 29 | cdnpos | number of the mutated codon within the transcript's CDS (1-based) |
| 30 | frame | mutation position offset within the codon (0..2) |
| 31 | dgn | degeneracy index for the mutated codon position, by Nei & Kumar (2000) "Molecular Evolution and Phylogenetics", page 64: 0 - non-degenerate, 2 - simple 2-fold degenerate, 3 - complex 2-fold degenerate, 4 - 4-fold degenerate |
| 32 | cdn1 | reference codon |
| 33 | cdn2 | mutated codon |
| 34 | aapos | position of the amino acid substitution in the protein sequence (1-based) |
| 35 | aa1 | wild type (reference) amino acid residue |
| 36 | aa2 | mutant (substitution) amino acid residue |
| 37 | spmap | CDS protein sequence similarity to a known UniProtKB protein (? - no match) |
| 38 | spacc | UniProtKB protein accession |
| 39 | spname | UniProtKB protein entry name |
| 40 | refs_acc | RefSeq protein accession |
| 41 | dbrsid | dbSNP SNP rsID |
| 42 | dbobsrvd | dbSNP observed alleles (transcript strand) |
| 43 | dbminor | dbSNP minor allele nucleotide (transcript strand) |
| 44 | dbmaf | dbSNP minor allele frequency |
| 45 | dbPtPaRm | dbSNP orthologous alleles in chimp (Pt), orangutan (Pa) and macaque (Rm) |
| 46 | Comments | optional user comments, copied from input |
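
Because the report is a plain tab-separated file with a fixed column order, it can be read with a few lines of Python. The following is only a minimal sketch: the column names come from the table above, but the function names ''parse_snps_report'' and ''make_row'' are hypothetical, the sample rows are synthetic (not real MapSNPs output), and the handling of "#" lines and space-padded values is an assumption about the file layout rather than something stated on this page.

```python
import csv
import io

# Column names in report order, taken from the table above (46 columns).
COLUMNS = [
    "query_no", "snp_pos", "str", "gene", "transcript", "canon", "cid",
    "txcov", "ccds", "cciden", "refa", "type", "ntlen", "ntpos", "nt1",
    "nt2", "PtGgPaNl", "dref", "gerprs", "phylop", "flanks", "trv", "CpG",
    "JXdon", "JXacc", "JXc", "exon", "cexon", "cdnpos", "frame", "dgn",
    "cdn1", "cdn2", "aapos", "aa1", "aa2", "spmap", "spacc", "spname",
    "refs_acc", "dbrsid", "dbobsrvd", "dbminor", "dbmaf", "dbPtPaRm",
    "Comments",
]


def parse_snps_report(handle):
    """Yield one dict per annotation line, keyed by column name."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        # Skip blank lines.
        if not any(f.strip() for f in fields):
            continue
        # Assumption: lines starting with "#" are headers or comments.
        if fields[0].lstrip().startswith("#"):
            continue
        # Assumption: values may be space-padded, so strip them.
        values = [f.strip() for f in fields]
        # Pad short rows (e.g. a missing trailing Comments field).
        values += [""] * (len(COLUMNS) - len(values))
        yield dict(zip(COLUMNS, values))


def make_row(**known):
    """Build a synthetic report line: every column is "?" unless given."""
    return "\t".join(str(known.get(name, "?")) for name in COLUMNS)


# Synthetic demo input; the values are made up for illustration only.
sample = "\n".join([
    "# hypothetical header line",
    make_row(query_no=1, snp_pos="chr1:1000", type="missense",
             aa1="R", aapos=12, aa2="H"),
    make_row(query_no=2, snp_pos="chr2:2000", type="intron"),
]) + "\n"

rows = list(parse_snps_report(io.StringIO(sample)))

# Keep only missense annotations, using the "type" column (column 12).
missense = [r for r in rows if r["type"] == "missense"]
for r in missense:
    print(r["snp_pos"], r["aa1"] + r["aapos"] + r["aa2"])  # chr1:1000 R12H
```

To read an actual report, pass an open file instead of the ''io.StringIO'' object, for example ''open("pph2-snps.txt", newline="")''. All values are kept as strings, so the "?" placeholders in columns 27-40 survive unchanged and can be tested for explicitly before any numeric conversion.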