** How to make your own mutation rate matrix for SCONE

So you want to use SCONE to score sequence alignments, but you have an irrational hatred of mammals? Never fear! This how-to shows you how to generate the four files you need to use SCONE with whatever group of species you wish to work with.

The four files are:

a) A mutation rate matrix
b,c) Two vectors of insertion and deletion rates
d) A vector of trinucleotide frequencies

What you will need:

1. A phylogenetic tree for your set of species. This is not necessary to generate the matrices, of course, but if you don't have it, you won't be going very far with SCONE. So get it first!

2. A close sister species and a slightly more distant outgroup species for your reference. The mutation rate matrix is built for a particular reference species, and your phylogeny must contain both of these relatives in order to build it. For example, the SCONE default mammalian matrix was built for the human lineage using chimp as the sister and baboon as the outgroup. It is vital that the sister species be close enough that double-hit mutations are negligible; otherwise a parsimonious estimate of mutation rates will be inaccurate (you may, of course, correct via ML estimates using your own methodology if you prefer).

3. A multi-species alignment in MAF format containing the three indicated species: the reference, the sister, and the outgroup. Since SCONE models mutation rates in the absence of selection, the mutation rate matrix should be built using regions that don't show selection. Identifying such regions is extremely difficult, of course. We leave this as an exercise for you. A useful note is that approximation is okay: as long as the regions are *mostly* non-functional, the computed mutation rates will be close to the actual raw mutation rate.

Then:

* Run indel.pl on the MAF file to determine the number of indel events at various sizes. Create one vector file for the rate of insertions (number of events per nucleotide) and one for the rate of deletions. In each vector file, the first line specifies the length of the vector (5) and the subsequent lines contain the values.

* Run subst.pl on the MAF file to determine the mutation fluxes between trinucleotides. Run compute-matrix.pl on the output of subst.pl to produce the mutation rate matrix. Run compute-freq.pl on the output of subst.pl to produce the trinucleotide frequency file.

You are now ready to cook.
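
The vector file layout described above (a first line giving the length, 5, followed by the values) can be sketched in Python as follows. This is only an illustration under stated assumptions: the event counts and the nucleotide total are made up, the idea that the five entries correspond to indel sizes 1 through 5 is a guess, one value per line is assumed, and the helper names (rates_from_counts, format_vector, write_vector_file) are hypothetical rather than part of SCONE. Check the output of indel.pl and the SCONE default files for the exact layout before relying on it.

```python
def rates_from_counts(event_counts, total_nucleotides):
    """Convert raw event counts into rates (events per nucleotide)."""
    if total_nucleotides <= 0:
        raise ValueError("total_nucleotides must be positive")
    return [count / total_nucleotides for count in event_counts]


def format_vector(values):
    """Render a vector: first line is the length, then one value per line.

    One value per line is an assumption about the file layout.
    """
    lines = [str(len(values))]
    lines.extend(f"{value:.6e}" for value in values)
    return "\n".join(lines) + "\n"


def write_vector_file(path, values):
    """Write the formatted vector to a file at the given path."""
    with open(path, "w") as handle:
        handle.write(format_vector(values))


# Hypothetical numbers, NOT real indel.pl output: counts of insertion
# events in five size classes, and the number of nucleotides surveyed.
insertion_counts = [120, 45, 20, 9, 6]
total_nucleotides = 1_000_000

insertion_rates = rates_from_counts(insertion_counts, total_nucleotides)
vector_text = format_vector(insertion_rates)
print(vector_text, end="")
```

To produce the actual files, call write_vector_file once with the insertion rates and once with the deletion rates, using whatever file names your SCONE setup expects.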
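
To see why both a sister and an outgroup are required, here is a minimal Python sketch of the parsimony reasoning: when the reference and the sister differ at a site, the outgroup decides which lineage changed. This is not the code in subst.pl and may not match its details. It uses single bases rather than trinucleotides for brevity, and the function name lineage_of_change and the toy alignment columns are hypothetical.

```python
from collections import Counter


def lineage_of_change(reference, sister, outgroup):
    """Assign a reference/sister difference to a lineage by parsimony."""
    if reference == sister:
        return "none"        # no difference between reference and sister
    if sister == outgroup:
        return "reference"   # sister kept the ancestral state
    if reference == outgroup:
        return "sister"      # reference kept the ancestral state
    return "ambiguous"       # all three differ; parsimony cannot decide


# Toy alignment columns as (reference, sister, outgroup); made-up data.
columns = [
    ("A", "A", "A"),
    ("G", "A", "A"),
    ("A", "G", "A"),
    ("C", "T", "G"),
]

tally = Counter(lineage_of_change(*column) for column in columns)
print(dict(tally))
```

Only sites in the "reference" class count as mutations on the reference lineage. The "ambiguous" class is where multiple hits show up, which is why a sister species that is too distant makes the parsimonious estimate unreliable.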