Differences

This shows you the differences between two versions of the page.

@@ Line 1: / Line 1: @@
+====== PolyPhen-2 annotation summary report explained ======
+The following is a description of the **PolyPhen-2** annotation summary report. Reports in this format are produced by both the PolyPhen-2 **Batch query** web service and the **standalone** PolyPhen-2 software. The report is a plain-text, tab-separated file in which each line annotates a single protein variant (amino acid residue substitution).
+The eleven columns highlighted below (1, 5-9, 12, 16-18, 56) are the ones included in the **Short** version of the report available via the **Batch query** web page. They are sufficient if you are interested only in the PolyPhen-2 prediction outcome and prediction confidence scores. The remaining columns in the **Full** report version are mostly useful if you want to investigate in detail all the features supporting the prediction.
+^  Column\\ No.  ^  Column\\ Name  ^ Description  ^
+| **Original query** (as copied from user input):   |||
+^   1 |  o_acc       | original protein identifier |
+|   2 |  o_pos       | original substitution position in the protein sequence |
+|   3 |  o_aa1       | original wild type (reference) amino acid residue |
+|   4 |  o_aa2       | original mutant (substitution) amino acid residue |
+| **Annotated query**:   |||
+^   5 |  rsid        | dbSNP reference SNP identifier (rsID) if available |
+^   6 |  acc         | UniProtKB accession if known protein, otherwise same as o_acc |
+^   7 |  pos         | substitution position in UniProtKB protein sequence, otherwise same as o_pos |
+^   8 |  aa1         | wild type amino acid residue in relation to UniProtKB sequence  |
+^   9 |  aa2         | mutant amino acid residue in relation to UniProtKB sequence |
+|  10 |  nt1         | wild type (reference) allele nucleotide |
+|  11 |  nt2         | mutant allele nucleotide |
+| **PolyPhen-2 prediction outcome**:   |||
+^  12 |  prediction  | qualitative ternary classification appraised at 5%/10% (HumDiv) or 10%/20% (HumVar) FPR thresholds ("benign", "possibly damaging", "probably damaging")|
+| **PolyPhen-1 prediction description** (obsolete, please ignore):   |||
+|  13 |  based_on    | prediction basis |
+|  14 |  effect      | predicted substitution effect on the protein structure or function |
+| **PolyPhen-2 classifier outcome and scores**:   |||
+|  15 |  pph2_class  | probabilistic binary classifier outcome ("damaging" or "neutral") |
+^  16 |  pph2_prob   | classifier probability of the variation being damaging |
+^  17 |  pph2_FPR    | classifier model False Positive Rate (1 - specificity) at the above probability |
+^  18 |  pph2_TPR    | classifier model True Positive Rate (sensitivity) at the above probability |
+|  19 |  pph2_FDR    | classifier model False Discovery Rate at the above probability |
+| **UniProtKB/Swiss-Prot derived protein sequence annotations**:   |||
+|  20 |  site        | substitution SITE annotation |
+|  21 |  region      | substitution REGION annotation |
+|  22 |  PHAT        | PHAT matrix element for substitutions in the TRANSMEM region |
+| **Multiple sequence alignment scores**:   |||
+|  23 |  dScore      | difference of PSIC scores for two amino acid residue variants (Score1-Score2) |
+|  24 |  Score1      | PSIC score for wild type amino acid residue (aa1) |
+|  25 |  Score2      | PSIC score for mutant amino acid residue (aa2) |
+|  26 |  MSAv        | version of the multiple sequence alignment used in conservation scores calculations: 1 - pairwise BLAST HSP (obsolete), 2 - MAFFT-Leon-Cluspack (default), 3 - MultiZ CDS |
+|  27 |  Nobs        | number of residues observed at the substitution position in multiple alignment (without gaps) |
+| **Protein 3D structure features**:   |||
+|  28 |  Nstruct     | initial number of BLAST hits to similar proteins with 3D structures in PDB |
+|  29 |  Nfilt       | number of 3D BLAST hits after identity threshold filtering |
+|  30 |  PDB_id      | PDB protein structure identifier |
+|  31 |  PDB_ch      | PDB polypeptide chain identifier |
+|  32 |  length      | PDB sequence alignment length |
+|  33 |  PDB_pos     | position of substitution in PDB protein sequence |
+|  34 |  ident       | sequence identity between query sequence and aligned PDB sequence |
+|  35 |  dVol        | change in residue side chain volume |
+|  36 |  dProp       | change in solvent accessible surface propensity resulting from the substitution |
+|  37 |  SecStr      | DSSP secondary structure assignment |
+|  38 |  MapReg      | region of the phi-psi map (Ramachandran map) derived from the residue dihedral angles |
+|  39 |  NormASA     | normalized accessible surface area |
+|  40 |  B-fact      | normalized B-factor (temperature factor) for the residue |
+|  41 |  H-bonds     | number of sidechain-sidechain and sidechain-mainchain hydrogen bonds formed by the residue |
+|  42 |  AveNHet     | number of residue contacts with heteroatoms, average per homologous PDB chain |
+|  43 |  MinDHet     | closest residue contact with a heteroatom, Å |
+|  44 |  AveNInt     | number of residue contacts with other chains, average per homologous PDB chain |
+|  45 |  MinDInt     | closest residue contact with other chain, Å |
+|  46 |  AveNSit     | number of residue contacts with critical sites, average per homologous PDB chain |
+|  47 |  MinDSit     | closest residue contact with a critical site, Å |
+| **Nucleotide sequence context features**:   |||
+|  48 |  Transv      | whether substitution is a transversion |
+|  49 |  CodPos      | position of the substitution within a codon |
+|  50 |  CpG         | whether substitution changes CpG context: 0 - non-CpG context retained, 1 - removes CpG site, 2 - creates new CpG site, 3 - CpG context retained |
+|  51 |  MinDJnc     | substitution distance from closest exon / intron junction |
+| **Pfam protein family**:   |||
+|  52 |  PfamHit     | Pfam identifier of the query protein |
+| **Substitution scores**:   |||
+|  53 |  IdPmax      | maximum congruency of the mutant amino acid residue to all sequences in multiple alignment |
+|  54 |  IdPSNP      | maximum congruency of the mutant amino acid residue to the sequences in multiple alignment with the mutant residue |
+|  55 |  IdQmin      | query sequence identity with the closest homologue deviating from the wild type amino acid residue |
+| **Comments**:   |||
+^  56 |  Comments    | optional user comments, copied from input |
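
The column layout above can be used to pull the Short-report columns out of a Full report. The following Python sketch is illustrative only and is not part of PolyPhen-2. The function name `read_short_columns` is hypothetical. It also assumes that header or comment lines start with `#`, that fields may be padded with spaces, and that trailing empty columns may be missing; check these assumptions against your own report files.

```python
import csv
import io

# 1-based numbers and names of the Short-report columns, taken from the table above.
SHORT_COLUMNS = [1, 5, 6, 7, 8, 9, 12, 16, 17, 18, 56]
SHORT_NAMES = ["o_acc", "rsid", "acc", "pos", "aa1", "aa2",
               "prediction", "pph2_prob", "pph2_FPR", "pph2_TPR", "Comments"]
TOTAL_COLUMNS = 56


def read_short_columns(handle):
    """Yield one dict per variant line, keeping only the Short-report columns."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Assumption: blank lines and lines starting with '#' carry no variant data.
        if not row or row[0].lstrip().startswith("#") or not "".join(row).strip():
            continue
        # Assumption: trailing empty columns may be omitted, so pad to 56 fields.
        row = row + [""] * (TOTAL_COLUMNS - len(row))
        # Assumption: values may be space-padded, so strip each one.
        yield {name: row[number - 1].strip()
               for name, number in zip(SHORT_NAMES, SHORT_COLUMNS)}


def demo():
    """Run the reader on one made-up line (not real PolyPhen-2 output)."""
    fields = [""] * TOTAL_COLUMNS
    fields[0] = "EXAMPLE_ACC"    # column 1, o_acc (invented identifier)
    fields[11] = " benign "      # column 12, prediction
    fields[15] = " 0.012"        # column 16, pph2_prob (invented value)
    fields[55] = "my comment"    # column 56, Comments
    text = "#header line\n" + "\t".join(fields) + "\n"
    return list(read_short_columns(io.StringIO(text)))


if __name__ == "__main__":
    for record in demo():
        print(record["o_acc"], record["prediction"], record["pph2_prob"])
```

Selecting columns by their documented number rather than by header text keeps the sketch independent of how the header line is formatted. All values are returned as strings; convert `pph2_prob`, `pph2_FPR` and `pph2_TPR` to numbers as needed.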