Download

CBaSE v1.0

Update: The pentanucleotide model option previously caused an error message; this has been fixed, along with two other rare exceptions. Note that in earlier versions, individual highly positively selected genes may in rare cases have been missing from the final output file (although they were shown in the log).


Running CBaSE

python CBaSE_v1.0.py input_filename path_aux_folder context model

Command line arguments:

(1) input_filename – File containing somatic mutation data (see below).
(2) path_aux_folder – Path to folder containing auxiliary files (default "Input").
(3) context – Context used to compute cancer-type-specific mutation matrix; 0=trinucleotides, 1=pentanucleotides.
(4) model – Model assumption for the distribution of expected synonymous mutation counts; one of [1,2,3,4,5,6].
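The argument list above can be sketched as a small Python helper that checks the documented value ranges before assembling the command line. This is only an illustration: the helper name build_cbase_command and the file name mutations.txt are hypothetical and not part of CBaSE, and the helper only builds the command, it does not run it.

```python
def build_cbase_command(input_filename, path_aux_folder="Input", context=0, model=1):
    """Assemble the CBaSE command line as a list of arguments.

    Hypothetical helper, not part of CBaSE; it only mirrors the
    documented argument order and allowed values.
    """
    if context not in (0, 1):
        raise ValueError("context must be 0 (trinucleotides) or 1 (pentanucleotides)")
    if model not in (1, 2, 3, 4, 5, 6):
        raise ValueError("model must be one of 1, 2, 3, 4, 5, 6")
    return [
        "python", "CBaSE_v1.0.py",
        input_filename, path_aux_folder,
        str(context), str(model),
    ]


# "mutations.txt" is a placeholder input file name.
command = build_cbase_command("mutations.txt", "Input", context=0, model=3)
print(" ".join(command))  # python CBaSE_v1.0.py mutations.txt Input 0 3
```

The resulting list could be passed to subprocess.run if the script is to be launched from Python rather than from the shell.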

Input file format:

Gene symbol   Mutation effect   Mutated nucleotide   Context index
ECE1          coding-synon      C                    26
SAMD11        missense          A                    53
TNFRSF4       nonsense          A                    52

(1) Gene symbol – corresponds to the official gene symbol as used in the UCSC knownGene track.
(2) Mutation effect – one of [“missense”, “nonsense”, “coding-synon”, “utr-3”, “utr-5”], denoting missense, nonsense (stop-gain and stop-loss), synonymous, 3'-UTR and 5'-UTR mutations, respectively.
(3) Mutated nucleotide – one of [A, C, G, T].
(4) Context index – 0-based indices of tri- or pentanucleotide contexts can be found here.
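The four fields described above can be checked before running CBaSE with a short validation sketch. Several points are assumptions rather than documented behavior: that data lines are whitespace-separated, that any header line is skipped by the caller, and that context indices cover all 4^3 = 64 trinucleotides or 4^5 = 1024 pentanucleotides. The function name parse_mutation_line is hypothetical.

```python
VALID_EFFECTS = {"missense", "nonsense", "coding-synon", "utr-3", "utr-5"}
VALID_NUCLEOTIDES = {"A", "C", "G", "T"}


def parse_mutation_line(line, context=0):
    """Validate one data line and return (gene, effect, nucleotide, index).

    Hypothetical helper, not part of CBaSE. Assumes whitespace-separated
    fields and that the caller has already skipped any header line.
    """
    fields = line.split()
    if len(fields) != 4:
        raise ValueError("expected 4 fields, got %d" % len(fields))
    gene, effect, nucleotide, index_text = fields
    if effect not in VALID_EFFECTS:
        raise ValueError("unknown mutation effect: %s" % effect)
    if nucleotide not in VALID_NUCLEOTIDES:
        raise ValueError("unknown nucleotide: %s" % nucleotide)
    index = int(index_text)  # raises ValueError if not an integer
    # Assumption: 0-based indices over all 4^3 or 4^5 possible contexts.
    n_contexts = 4 ** (3 if context == 0 else 5)
    if not 0 <= index < n_contexts:
        raise ValueError("context index out of range: %d" % index)
    return gene, effect, nucleotide, index


print(parse_mutation_line("ECE1 coding-synon C 26"))
```

Running such a check over every data line before the analysis would flag malformed records early instead of partway through a run.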

Output:

(1) Fitted model distribution of per-gene expected synonymous mutation counts.
(2) Gene-specific q-values of negative and positive selection.


CBaSE web tool

The CBaSE web tool can be found here.


How to cite

Please cite Weghorn & Sunyaev, Nature Genetics (2017).