SNP2RFLP: Mouse SNPs Between Strains that Create RFLPs
Genetics Division, Brigham and Women's Hospital and Harvard Medical School
The SNP2RFLP database is temporarily out of service. We apologize for the inconvenience; it will be back online soon.
What is SNP2RFLP?
Single nucleotide polymorphism (SNP) markers give high resolution in mouse genetic mapping because they are abundant and easily typed. Initial localization with a genome-wide SNP panel often defines a large chromosomal interval but leaves too few informative markers within it for fine-mapping. To refine an interval containing a mutation that causes a phenotype of interest, SNP2RFLP extracts from the NCBI mouse SNP database the SNPs in that region that are informative between the mouse strains used in the cross. SNP2RFLP then identifies the SNPs that create restriction fragment length polymorphisms (RFLPs), which can be assayed at the benchtop by restriction enzyme digestion of SNP-containing PCR products.
How does SNP2RFLP work?
The required input to SNP2RFLP is the following:
The two mouse strains in the cross.
The chromosomal region.
A set of restriction enzymes.
Options to control output.
SNP2RFLP uses a default list of commonly used enzymes:
Additional enzymes obtained from REBASE can be selected, or the "select all" option can be checked to include every enzyme in the list.
The program searches a local copy of the NCBI mouse SNP database (dbSNP) for SNPs in the specified genomic region that are polymorphic between the two strains, and extracts the flanking sequences for each SNP. Each SNP, together with its flanking sequences, is scanned to determine whether digestion with a selected enzyme will produce an RFLP that can be assayed at the benchtop (i.e., the enzyme cuts in one strain but not in the other because the SNP alters the recognition site). The SNP-containing sequences are passed to the Primer3 program, using default values, to design PCR primers flanking each SNP. The output of SNP2RFLP is the informative SNPs and their flanking sequences, ordered by position, with the enzyme cut sites and the left and right primers highlighted.
The number of informative SNPs returned is often either too small or too large. Several options control the amount of output from SNP2RFLP:
Display validated SNPs only.
NCBI's dbSNP has many different types of validation information. If this option is selected then those SNPs in the database that have no validation information at all are excluded.
Set the desired density of SNPs returned.
The program can keep all SNPs, 1 in every 2 SNPs, 1 in every 10 SNPs, and so on. If an interval returns more informative SNPs than can be conveniently reviewed, tell SNP2RFLP to keep, for example, 1 in every 10 SNPs.
Display SNPs recorded in only one strain.
SNPs whose genotype is known in one strain but not recorded in the other will be included. Select this option only if informative SNPs are scarce for the chosen strains and another method is available to determine the genotype in the other strain.
Disregard SNPs that are found in repeat regions.
Amplifying the region around a SNP by PCR is often difficult if the SNP lies in a repeat region of the genome. If this option is chosen, the program will not display these SNPs. SNP2RFLP tests whether a SNP falls in a repeat region using the mouse genome premasked by RepeatMasker.
Produce a tab-delimited text file of the results.
Once the appropriate SNPs are found, it may be useful to produce a tab-delimited text file containing the results. This file can be opened in any spreadsheet program. If this option is selected, a new window containing the file will open after the results have loaded. Use your browser's File menu to save this file to your computer. Make sure your browser does not block pop-ups from this site, or the window will not appear.
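To make the RFLP test from the workflow described earlier more concrete, the following is a minimal Python sketch of how one might check whether a SNP creates or destroys a restriction site. It is not SNP2RFLP's actual code: the function names, the three-enzyme table, and the example sequence are illustrative assumptions. The sketch builds the SNP-containing sequence once per allele, searches both strands for each recognition site, and reports an enzyme only when one allele has a site that the other lacks.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Illustrative enzyme table (an assumption, not SNP2RFLP's default list).
ENZYMES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "HpaII": "CCGG"}


def site_starts(seq, site):
    """Return the start positions of a recognition site on either strand."""
    starts = set()
    # Search the site and its reverse complement; for a palindromic site
    # these are the same pattern, so the set holds a single entry.
    for pattern in {site, site.translate(COMPLEMENT)[::-1]}:
        # A lookahead lets overlapping occurrences all be reported.
        regex = re.compile("(?=" + "".join(IUPAC[b] for b in pattern) + ")")
        starts.update(m.start() for m in regex.finditer(seq))
    return starts


def rflp_enzymes(left, allele_a, allele_b, right, enzymes=ENZYMES):
    """Map each informative enzyme to the allele whose sequence it cuts.

    The two sequences differ only at the SNP, so any site present in one
    and absent from the other must overlap the SNP position.
    """
    seq_a = (left + allele_a + right).upper()
    seq_b = (left + allele_b + right).upper()
    informative = {}
    for name, site in enzymes.items():
        only_a = site_starts(seq_a, site) - site_starts(seq_b, site)
        only_b = site_starts(seq_b, site) - site_starts(seq_a, site)
        # Informative only if exactly one allele gains a site at the SNP.
        if bool(only_a) != bool(only_b):
            informative[name] = allele_a if only_a else allele_b
    return informative


# Made-up flanking sequence: the T allele completes GAATTC (EcoRI),
# while the C allele completes CCGG (HpaII).
print(rflp_enzymes("ACGTGAAT", "T", "C", "CGGATA"))
```

In this made-up example, EcoRI cuts only the T allele and HpaII cuts only the C allele, so either enzyme would distinguish the two strains after PCR amplification. A real implementation would also need to confirm that the resulting fragments differ enough in size to resolve on a gel, which this sketch does not attempt.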
There are 7,946,578 unique mouse SNPs in the database and 67,936,519 known strain genotypes for those SNPs, covering 99 different strains. The figure on the right shows the strains with the highest number of known genotypes for SNPs in the database. Each of these strains has millions of known genotypes, whereas the remaining strains have only a few thousand or a few hundred. When any of the strains shown in the figure is used with a large chromosomal interval, the output is likely to take some time to load because of the large number of informative SNPs in the region. It is wise to restrict the output for these strains, for example by telling SNP2RFLP to keep only 1 in every 5 SNPs. When one of the strains shown in the figure is crossed with a strain that has few known genotypes, the "Display SNPs recorded in only one strain" option can be used, and the genotypes of the less well characterized strain can be determined by other methods.
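As a rough illustration of how the output options interact, the Python sketch below applies them to a small list of SNP records. It is only a sketch under stated assumptions: the `Snp` record, its field names, the `select_snps` and `to_tsv` helpers, the column layout, and the choice to thin by density after the other filters are all hypothetical and are not taken from SNP2RFLP itself.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snp:
    """Hypothetical SNP record; None means the genotype is not recorded."""
    snp_id: str
    position: int
    allele_1: Optional[str]  # genotype in strain 1
    allele_2: Optional[str]  # genotype in strain 2
    validated: bool
    in_repeat: bool


def select_snps(snps, validated_only=False, allow_one_strain=False,
                skip_repeats=False, keep_every=1):
    """Apply the output options, then keep 1 in every `keep_every` SNPs."""
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    kept = []
    for snp in sorted(snps, key=lambda s: s.position):
        if validated_only and not snp.validated:
            continue
        if skip_repeats and snp.in_repeat:
            continue
        known_1 = snp.allele_1 is not None
        known_2 = snp.allele_2 is not None
        if known_1 and known_2:
            if snp.allele_1 == snp.allele_2:
                continue  # same genotype in both strains: not informative
        elif not (allow_one_strain and (known_1 or known_2)):
            continue  # genotype missing in a strain and option not selected
        kept.append(snp)
    # Assumption: density thinning is applied after the other filters.
    return kept[::keep_every]


def to_tsv(snps):
    """Render the selected SNPs as tab-delimited text ('?' = not recorded)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["snp_id", "position", "allele_1", "allele_2"])
    for snp in snps:
        writer.writerow([snp.snp_id, snp.position,
                         snp.allele_1 or "?", snp.allele_2 or "?"])
    return buffer.getvalue()


# Made-up records for illustration only.
example = [
    Snp("snp_1", 100, "A", "G", True, False),
    Snp("snp_2", 200, "A", "A", True, False),   # not polymorphic
    Snp("snp_3", 300, "C", None, True, False),  # recorded in one strain
    Snp("snp_4", 400, "C", "T", False, False),  # no validation information
    Snp("snp_5", 500, "G", "T", True, True),    # lies in a repeat region
    Snp("snp_6", 600, "G", "A", True, False),
]
print(to_tsv(select_snps(example, validated_only=True, skip_repeats=True)))
```

With the validated-only and repeat filters both selected, only `snp_1` and `snp_6` remain in this made-up set. Selecting `allow_one_strain=True` instead would add `snp_3`, whose genotype in the second strain would then have to be determined by other means.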