appendix_a

appendix_a [2021/12/03 23:06] (current) – created - external edit 127.0.0.1
Line 1: Line 1:
 +====== PolyPhen-2 annotation summary report explained ======
 +
 +The following describes the **PolyPhen-2** annotation summary report. Reports in this format are produced by both the PolyPhen-2 **Batch query** web service and the **standalone** PolyPhen-2 software. The report is a plain-text, tab-separated file in which each line annotates a single protein variant (amino acid residue substitution).
 +
 +The eleven columns highlighted below (1, 5-9, 12, 16-18, 56) are the ones included in the **Short** version of the report, available via the **Batch query** web page. These are sufficient if you are interested only in the PolyPhen-2 prediction outcome and prediction confidence scores. The remaining columns of the **Full** report version are mostly useful if you want to investigate in detail all the features supporting the prediction.
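As an illustration of the column selection described above, the following minimal Python sketch reduces one line of a Full report to the eleven Short-report fields. The function name `to_short` is hypothetical and not part of PolyPhen-2. The sketch assumes that fields may be padded with whitespace and that a missing trailing field should be treated as empty; neither assumption is stated in this description.

```python
# 1-based numbers of the columns included in the Short report,
# taken from the list above: 1, 5-9, 12, 16-18, 56.
SHORT_COLUMNS = [1, 5, 6, 7, 8, 9, 12, 16, 17, 18, 56]

# Total number of columns in the Full report.
FULL_WIDTH = 56


def to_short(line):
    """Return the eleven Short-report fields of one Full-report line.

    Hypothetical helper: assumes tab-separated fields, possibly padded
    with whitespace, and treats missing trailing fields as empty.
    """
    fields = [field.strip() for field in line.rstrip("\n").split("\t")]
    # Pad so that indexing column 56 is safe on truncated lines.
    fields += [""] * (FULL_WIDTH - len(fields))
    # Convert 1-based column numbers to 0-based list indices.
    return [fields[number - 1] for number in SHORT_COLUMNS]
```

Column numbers are kept 1-based in `SHORT_COLUMNS` so that they can be compared directly with the table that follows.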
 +
 +^  Column\\ No.  ^  Column\\ Name  ^ Description  ^
 +| **Original query** (as copied from user input):   |||
 +^   1 |  o_acc       | original protein identifier |
 +|   2 |  o_pos       | original substitution position in the protein sequence |
 +|   3 |  o_aa1       | original wild type (reference) amino acid residue |
 +|   4 |  o_aa2       | original mutant (substitution) amino acid residue |
 +| **Annotated query**:   |||
 +^   5 |  rsid        | dbSNP reference SNP identifier (rsID) if available |
 +^   6 |  acc         | UniProtKB accession if known protein, otherwise same as o_acc |
 +^   7 |  pos         | substitution position in UniProtKB protein sequence, otherwise same as o_pos |
 +^   8 |  aa1         | wild type amino acid residue in relation to UniProtKB sequence  |
 +^   9 |  aa2         | mutant amino acid residue in relation to UniProtKB sequence |
 +|  10 |  nt1         | wild type (reference) allele nucleotide |
 +|  11 |  nt2         | mutant allele nucleotide |
 +| **PolyPhen-2 prediction outcome**:   |||
 +^  12 |  prediction  | qualitative ternary classification appraised at 5%/10% (HumDiv) or 10%/20% (HumVar) FPR thresholds ("benign", "possibly damaging", "probably damaging")|
 +| **PolyPhen-1 prediction description** (obsolete, please ignore):   |||
 +|  13 |  based_on    | prediction basis |
 +|  14 |  effect      | predicted substitution effect on the protein structure or function |
 +| **PolyPhen-2 classifier outcome and scores**:   |||
 +|  15 |  pph2_class  | probabilistic binary classifier outcome ("damaging" or "neutral") |
 +^  16 |  pph2_prob   | classifier probability of the variation being damaging |
 +^  17 |  pph2_FPR    | classifier model False Positive Rate (1 - specificity) at the above probability |
 +^  18 |  pph2_TPR    | classifier model True Positive Rate (sensitivity) at the above probability |
 +|  19 |  pph2_FDR    | classifier model False Discovery Rate at the above probability |
 +| **UniProtKB/Swiss-Prot derived protein sequence annotations**:   |||
 +|  20 |  site        | substitution SITE annotation |
 +|  21 |  region      | substitution REGION annotation |
 +|  22 |  PHAT        | PHAT matrix element for substitutions in the TRANSMEM region |
 +| **Multiple sequence alignment scores**:   |||
 +|  23 |  dScore      | difference of PSIC scores for two amino acid residue variants (Score1-Score2) |
 +|  24 |  Score1      | PSIC score for wild type amino acid residue (aa1) |
 +|  25 |  Score2      | PSIC score for mutant amino acid residue (aa2) |
 +|  26 |  MSAv        | version of the multiple sequence alignment used in conservation scores calculations: 1 - pairwise BLAST HSP (obsolete), 2 - MAFFT-Leon-Cluspack (default), 3 - MultiZ CDS |
 +|  27 |  Nobs        | number of residues observed at the substitution position in multiple alignment (without gaps) |
 +| **Protein 3D structure features**:   |||
 +|  28 |  Nstruct     | initial number of BLAST hits to similar proteins with 3D structures in PDB |
 +|  29 |  Nfilt       | number of 3D BLAST hits after identity threshold filtering |
 +|  30 |  PDB_id      | PDB protein structure identifier |
 +|  31 |  PDB_ch      | PDB polypeptide chain identifier |
 +|  32 |  length      | PDB sequence alignment length |
 +|  33 |  PDB_pos     | position of substitution in PDB protein sequence |
 +|  34 |  ident       | sequence identity between query sequence and aligned PDB sequence |
 +|  35 |  dVol        | change in residue side chain volume |
 +|  36 |  dProp       | change in solvent accessible surface propensity resulting from the substitution |
 +|  37 |  SecStr      | DSSP secondary structure assignment |
 +|  38 |  MapReg      | region of the phi-psi map (Ramachandran map) derived from the residue dihedral angles |
 +|  39 |  NormASA     | normalized accessible surface area |
 +|  40 |  B-fact      | normalized B-factor (temperature factor) for the residue |
 +|  41 |  H-bonds     | number of sidechain-sidechain and sidechain-mainchain hydrogen bonds formed by the residue |
 +|  42 |  AveNHet     | number of residue contacts with heteroatoms, average per homologous PDB chain |
 +|  43 |  MinDHet     | closest residue contact with a heteroatom, Å |
 +|  44 |  AveNInt     | number of residue contacts with other chains, average per homologous PDB chain |
 +|  45 |  MinDInt     | closest residue contact with other chain, Å |
 +|  46 |  AveNSit     | number of residue contacts with critical sites, average per homologous PDB chain |
 +|  47 |  MinDSit     | closest residue contact with a critical site, Å |
 +| **Nucleotide sequence context features**:   |||
 +|  48 |  Transv      | whether substitution is a transversion |
 +|  49 |  CodPos      | position of the substitution within a codon |
 +|  50 |  CpG         | whether substitution changes CpG context: 0 - non-CpG context retained, 1 - removes CpG site, 2 - creates new CpG site, 3 - CpG context retained |
 +|  51 |  MinDJnc     | substitution distance from closest exon / intron junction |
 +| **Pfam protein family**:   |||
 +|  52 |  PfamHit     | Pfam identifier of the query protein |
 +| **Substitution scores**:   |||
 +|  53 |  IdPmax      | maximum congruency of the mutant amino acid residue to all sequences in multiple alignment |
 +|  54 |  IdPSNP      | maximum congruency of the mutant amino acid residue to the sequences in multiple alignment with the mutant residue |
 +|  55 |  IdQmin      | query sequence identity with the closest homologue deviating from the wild type amino acid residue |
 +| **Comments**:   |||
 +^  56 |  Comments    | optional user comments, copied from input |
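
The table above can be turned into a small reader for Full reports. The Python sketch below is one possible approach, not part of PolyPhen-2: it maps the 56 fields of a line to the column names listed above and decodes the numeric `CpG` and `MSAv` codes using the meanings given in the table. The names `parse_line` and `describe` are hypothetical. Skipping lines that start with `#` (treated here as header or comment lines) and stripping whitespace padding from fields are assumptions not stated in this description.

```python
# Column names in report order, copied from the table above (columns 1-56).
COLUMN_NAMES = [
    "o_acc", "o_pos", "o_aa1", "o_aa2",
    "rsid", "acc", "pos", "aa1", "aa2", "nt1", "nt2",
    "prediction", "based_on", "effect",
    "pph2_class", "pph2_prob", "pph2_FPR", "pph2_TPR", "pph2_FDR",
    "site", "region", "PHAT",
    "dScore", "Score1", "Score2", "MSAv", "Nobs",
    "Nstruct", "Nfilt", "PDB_id", "PDB_ch", "length", "PDB_pos", "ident",
    "dVol", "dProp", "SecStr", "MapReg", "NormASA", "B-fact", "H-bonds",
    "AveNHet", "MinDHet", "AveNInt", "MinDInt", "AveNSit", "MinDSit",
    "Transv", "CodPos", "CpG", "MinDJnc",
    "PfamHit",
    "IdPmax", "IdPSNP", "IdQmin",
    "Comments",
]

# Meanings of the CpG codes (column 50), as given in the table.
CPG_CODES = {
    "0": "non-CpG context retained",
    "1": "removes CpG site",
    "2": "creates new CpG site",
    "3": "CpG context retained",
}

# Meanings of the MSAv codes (column 26), as given in the table.
MSA_VERSIONS = {
    "1": "pairwise BLAST HSP (obsolete)",
    "2": "MAFFT-Leon-Cluspack (default)",
    "3": "MultiZ CDS",
}


def parse_line(line):
    """Map one report line to a dict keyed by column name.

    Returns None for blank lines and for lines starting with '#'
    (assumed here to be header or comment lines).
    """
    if not line.strip() or line.startswith("#"):
        return None
    fields = [field.strip() for field in line.rstrip("\n").split("\t")]
    # Treat missing trailing fields (e.g. an absent comment) as empty.
    fields += [""] * (len(COLUMN_NAMES) - len(fields))
    return dict(zip(COLUMN_NAMES, fields))


def describe(record):
    """Summarize a parsed record, decoding the CpG and MSAv codes."""
    return {
        "variant": "{} {}{}{}".format(
            record["acc"], record["aa1"], record["pos"], record["aa2"]
        ),
        "prediction": record["prediction"],
        "CpG": CPG_CODES.get(record["CpG"], "unknown"),
        "MSAv": MSA_VERSIONS.get(record["MSAv"], "unknown"),
    }
```

All values are kept as strings, because fields may be empty when a feature is unavailable; convert numeric columns such as `pph2_prob` only after checking that they are non-empty.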
  
