appendix_b
appendix_b [2012/02/13 11:31] (current) – created - external edit 127.0.0.1
 +====== MapSNPs annotation summary report explained ======
 +
 +The following describes the **MapSNPs** annotation summary report. The **MapSNPs** genomic SNP annotation tool is part of the PolyPhen-2 **Batch query** web service. Whenever you submit genomic SNPs as chromosome coordinates and alleles, a report in the format described below appears under the **SNPs** link on the **Batch query** results web page. It is a plain-text, tab-separated file in which each line annotates the protein sequence variant (amino acid residue substitution) corresponding to a missense allelic variant found in the input.
 +
 +Columns 18-32 will contain "?" placeholders for SNPs annotated as non-coding; columns 33-37 will have values only for SNPs annotated in dbSNP build 132.
 +
 +__Note__: **MapSNPs**, as part of the PolyPhen-2 **Batch query** service, filters the SNP annotations in its output according to the SNP functional categories selected via the ''Annotations'' menu under the ''Advanced Options'' section of the input form. Selecting ''All'' disables filtering, so annotations for all SNP categories are reported in the ''pph2-snps.txt'' file. However, PolyPhen-2 predictions (reported in the ''pph2-short.txt'' and ''pph2-full.txt'' files) are produced for missense SNPs only, regardless of the ''Annotations'' option selected.
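
To check which functional categories actually ended up in a ''pph2-snps.txt'' report, one option is to tally the values of column 9 (''type'', see the table below). The following is a minimal sketch only: the function name ''count_snp_types'' is hypothetical, and skipping blank lines and lines starting with "#" is an assumption about the file, not documented behaviour.

```python
from collections import Counter

TYPE_COLUMN = 9  # 1-based number of the "type" column in the report


def count_snp_types(lines):
    """Count annotation lines per SNP functional category (column 9)."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Assumption: blank lines and "#"-prefixed lines are not annotations.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Ignore lines too short to contain the "type" column.
        if len(fields) >= TYPE_COLUMN:
            counts[fields[TYPE_COLUMN - 1].strip()] += 1
    return counts
```

Passing an open file handle for the report (for example ''open("pph2-snps.txt")'') yields a mapping from category name to line count; only the ''missense'' lines are expected to have matching PolyPhen-2 predictions.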
 +
 +^  Column\\ No.  ^  Column\\ Name  ^ Description  ^
 +|   1 |  snp_pos  | input SNP chromosome:position (chromosome coordinates are 1-based)  |
 +|   2 |  str      | transcript strand ("+" or "-") |
 +|   3 |  gene     | gene symbol |
 +|   4 |  transcript  | UCSC transcript name |
 +|   5 |  ccid     | UCSC canonical cluster ID (number) |
 +|   6 |  ccds     | NCBI CCDS cluster ID |
 +|   7 |  cciden   | NCBI CCDS CDS similarity level by genomic overlap with the corresponding UCSC known gene transcript |
 +|   8 |  refa     | reference allele / variant allele ("+" strand) |
 +|   9 |  type     | SNP functional category ("coding-synon", "intron", "nonsense", "missense", "utr-3", "utr-5") |
 +|  10 |  ntpos    | mutation position in the full transcript nucleotide sequence (in the direction of transcription) |
 +|  11 |  nt1      | reference nucleotide (transcript strand) |
 +|  12 |  nt2      | variant nucleotide (transcript strand) |
 +|  13 |  flanks   | nucleotides flanking mutation position in the transcript sequence, enumerated in the direction of transcription (5' to 3') |
 +|  14 |  trv      | transversion mutation (0 - transition, 1 - transversion) |
 +|  15 |  cpg      | CpG context: 0\ -\ non-CpG context retained, 1\ -\ mutation removes CpG site, 2\ -\ mutation creates new CpG site, 3\ -\ CpG context retained: C(C/G)G |
 +|  16 |  jxdon    | distance from mutation position to the nearest donor exon / intron junction ("-" for upstream, "+" for downstream) |
 +|  17 |  jxacc    | distance from mutation position to the nearest acceptor intron / exon junction ("-" for upstream, "+" for downstream) |
 +|  18 |  exon     | mutation in exon # / of total exons (exons are enumerated in the direction of transcription) |
 +|  19 |  cexon    | same as above but for coding exons only |
 +|  20 |  jxc      | mutation in a codon that is split across two exons (? - no, 1 - yes) |
 +|  21 |  dgn      | degeneracy index for mutated codon position, by Nei & Kumar (2000) "Molecular Evolution and Phylogenetics", page 64 (0 - non-degenerate, 2 - simple 2-fold degenerate, 3 - complex 2-fold degenerate, 4 - 4-fold degenerate) |
 +|  22 |  cdnpos   | number of the mutated codon within transcript's CDS (1-based) |
 +|  23 |  frame    | mutation position offset within the codon (0..2) |
 +|  24 |  cdn1     | reference codon nucleotides |
 +|  25 |  cdn2     | mutated codon nucleotides |
 +|  26 |  aa1      | wild type (reference) amino acid residue |
 +|  27 |  aa2      | mutant (substitution) amino acid residue |
 +|  28 |  aapos    | position of amino acid substitution in protein sequence (1-based) |
 +|  29 |  spmap    | CDS protein sequence similarity to known UniProtKB protein (? - no match) |
 +|  30 |  spacc    | UniProtKB protein accession                |
 +|  31 |  spname   | UniProtKB protein entry name               |
 +|  32 |  refs_acc  | RefSeq protein accession                   |
 +|  33 |  dbrsid    | dbSNP SNP rsID                             |
 +|  34 |  dbobsrvd  | dbSNP observed alleles (transcript strand) |
 +|  35 |  dbavHet   | dbSNP average heterozygosity from all observations |
 +|  36 |  dbavHetSE | dbSNP standard error for the average heterozygosity |
 +|  37 |  dbRmPaPt  | dbSNP reference orthologous alleles in macaque\ (Rm), orangutan\ (Pa) and chimp\ (Pt) |
 +|  38 |  Comments  | optional user comments, copied from input |
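
The column layout above can be used to read the report into records keyed by column name. The sketch below is illustrative rather than an official parser: ''read_snps_report'' is a hypothetical helper, the column names are copied from the table, and the handling of blank lines, "#"-prefixed lines, padded fields and short rows is assumed rather than specified by the report format.

```python
import csv

# Column names in the order given by the table above (columns 1-38).
COLUMNS = [
    "snp_pos", "str", "gene", "transcript", "ccid", "ccds", "cciden",
    "refa", "type", "ntpos", "nt1", "nt2", "flanks", "trv", "cpg",
    "jxdon", "jxacc", "exon", "cexon", "jxc", "dgn", "cdnpos", "frame",
    "cdn1", "cdn2", "aa1", "aa2", "aapos", "spmap", "spacc", "spname",
    "refs_acc", "dbrsid", "dbobsrvd", "dbavHet", "dbavHetSE", "dbRmPaPt",
    "Comments",
]


def read_snps_report(handle):
    """Yield one dict per annotation line, keyed by column name."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        # Assumption: blank lines and lines starting with "#" carry no data.
        if not fields or fields[0].startswith("#"):
            continue
        # Assumption: fields may be padded with spaces for alignment.
        values = [field.strip() for field in fields]
        # Pad short rows so missing trailing columns become empty strings.
        values += [""] * (len(COLUMNS) - len(values))
        row = dict(zip(COLUMNS, values))
        # "?" is the report's placeholder; expose it as None.
        yield {name: (None if value == "?" else value)
               for name, value in row.items()}
```

Note that ''jxc'' uses "?" to mean "no", so in this sketch a ''None'' in that field should be read as "codon not split across exons" rather than as missing data.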
  
appendix_b.txt · Last modified: 2012/02/13 11:31 by 127.0.0.1
