funcSTAR

Non-synonymous SNPs that modify conserved residues

A non-synonymous SNP is a SNP in which the variation changes the encoded protein sequence.

The omega value, the ratio of non-synonymous to synonymous substitution rates (dN/dS), is an estimate of the selective pressure acting on amino acid replacement mutations in protein-coding genes. A low omega indicates purifying selection, so SNPs at codons with omega values below 0.1 are selected as putatively pathological SNPs that are most likely to affect protein function.
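The selection step described above can be sketched as a simple threshold filter. This is a minimal illustration only: the function name, the dictionary layout, and the SNP identifiers are hypothetical, and the omega values are assumed to have been computed beforehand.

```python
def select_low_omega(snp_omega, threshold=0.1):
    """Return the ids of SNPs whose codon omega (dN/dS) is below the threshold."""
    return sorted(snp for snp, omega in snp_omega.items() if omega < threshold)


# Hypothetical SNP ids mapped to precomputed omega values.
example = {"snp_a": 0.05, "snp_b": 0.40, "snp_c": 0.09}
selected = select_low_omega(example)
print(selected)
```

The comparison is strict, so a SNP with an omega of exactly 0.1 is not selected.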

Estimates of selective pressures at a codon level are obtained through two different methods:

Omega values were calculated using PupaSuite.

SNPs in rat genes orthologous to human disease or cancer genes

Rat and human orthologous genes were obtained from the Ensembl Compara database. These orthologous genes were matched against human disease and cancer genes. Finally, all non-intergenic SNPs discovered under the STAR project were mapped onto the rat orthologs of human disease and cancer genes.
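The three steps above (orthology, matching against disease genes, SNP mapping) amount to a pair of lookups. The sketch below assumes the ortholog pairs and the SNP-to-gene assignments are already available as plain Python structures; all names and identifiers are hypothetical, and no Ensembl API is called.

```python
def snps_in_disease_orthologs(orthologs, human_disease_genes, snp_to_rat_gene):
    """Map SNPs to rat genes whose human ortholog is a disease or cancer gene.

    orthologs: iterable of (rat_gene, human_gene) pairs
    human_disease_genes: set of human gene ids
    snp_to_rat_gene: dict of snp_id -> rat_gene for non-intergenic SNPs
    """
    rat_to_human = {}
    for rat, human in orthologs:
        if human in human_disease_genes:
            # A rat gene may have more than one human ortholog.
            rat_to_human.setdefault(rat, []).append(human)
    return {
        snp: (rat, rat_to_human[rat])
        for snp, rat in snp_to_rat_gene.items()
        if rat in rat_to_human
    }


# Hypothetical identifiers, for illustration only.
orthologs = [("ratA", "humA"), ("ratB", "humB"), ("ratC", "humC")]
disease = {"humA", "humC"}
snps = {"snp1": "ratA", "snp2": "ratB", "snp3": "ratC"}
hits = snps_in_disease_orthologs(orthologs, disease, snps)
print(hits)
```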

SNPs that create new splice sites

All non-intergenic SNPs were scored with GeneID, searching for the creation of new donor (dinucleotide GT) or acceptor (dinucleotide AG) splice sites. GeneID is a program, designed with a hierarchical structure, that predicts genes in anonymous genomic sequences.
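As a rough illustration of the first condition, the sketch below checks whether a single-nucleotide change introduces a GT or AG dinucleotide that was absent from the reference. It is not GeneID: GeneID scores the sequence context of each site, whereas this hypothetical helper only tests the dinucleotide itself. Positions are assumed to be 0-based.

```python
def new_splice_dinucleotides(ref_seq, pos, alt):
    """List GT/AG dinucleotides created by substituting `alt` at 0-based `pos`."""
    alt_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
    created = []
    # Only the two dinucleotides overlapping the SNP can change.
    for start in (pos - 1, pos):
        if start < 0 or start + 2 > len(ref_seq):
            continue
        ref_di = ref_seq[start:start + 2]
        alt_di = alt_seq[start:start + 2]
        if alt_di in ("GT", "AG") and alt_di != ref_di:
            kind = "donor" if alt_di == "GT" else "acceptor"
            created.append((start, alt_di, kind))
    return created


print(new_splice_dinucleotides("CCGACC", 3, "T"))  # A>T turns GA into GT
print(new_splice_dinucleotides("CCACCC", 3, "G"))  # C>G turns AC into AG
```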

In the first step, Position Weight Arrays (PWAs) are used to score splice sites and start and stop codons. In the second step, exons are built from the sites. Each exon is scored as the sum of the scores of its defining sites plus the log-likelihood ratio of a Markov Model for coding DNA. Finally, the gene structure is assembled from the set of predicted exons, maximizing the sum of the scores of the assembled exons.
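The exon scoring and assembly steps can be sketched as follows. This is a simplified illustration, not GeneID's implementation: it assumes exons are (start, end, score) tuples with inclusive coordinates, requires only that chained exons do not overlap, and ignores reading frame compatibility and strand. The function names are hypothetical.

```python
def exon_score(site_a_score, site_b_score, coding_llr):
    """Exon score: sum of the two defining site scores plus the coding log-likelihood ratio."""
    return site_a_score + site_b_score + coding_llr


def best_exon_chain(exons):
    """Return (total_score, chain) for the highest-scoring set of non-overlapping exons."""
    ordered = sorted(exons, key=lambda e: e[1])
    best = []  # best[i]: best (score, chain) among chains ending with ordered[i]
    for i, (start, end, score) in enumerate(ordered):
        top_score, top_chain = 0.0, []
        for j in range(i):
            # An earlier exon can precede this one only if it ends before it starts.
            if ordered[j][1] < start and best[j][0] > top_score:
                top_score, top_chain = best[j]
        best.append((top_score + score, top_chain + [ordered[i]]))
    return max(best, key=lambda b: b[0]) if best else (0.0, [])


# Hypothetical candidate exons: (start, end, score).
candidates = [(0, 10, 2.0), (5, 20, 3.5), (15, 30, 2.5), (35, 50, 1.0)]
total, chain = best_exon_chain(candidates)
print(total, chain)
```

The quadratic dynamic programme is enough for a small example; it keeps, for every exon, the best chain that ends with that exon.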

SNPs that affect microRNA targets

The microRNA target prediction program miRanda was used to scan rat microRNAs (miRBase) against all the SNPs in 3'UTR regions. miRanda is an algorithm that predicts microRNA targets using dynamic programming and thermodynamics.
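To illustrate how a 3'UTR SNP can remove a target site, the sketch below uses a much simpler criterion than miRanda: an exact match to the reverse complement of the microRNA seed (nucleotides 2 to 8). This seed-match rule is swapped in for illustration and is not the miRanda scoring scheme; the sequences and function names are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_target(mirna):
    """Reverse complement of the seed (nucleotides 2-8) of an RNA-alphabet microRNA."""
    seed = mirna[1:8]
    return seed.translate(COMPLEMENT)[::-1]


def seed_sites(utr, mirna):
    """0-based start positions of exact seed matches in a 3'UTR (RNA alphabet)."""
    site = seed_target(mirna)
    return [i for i in range(len(utr) - len(site) + 1) if utr[i:i + len(site)] == site]


mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # illustrative microRNA sequence
ref_utr = "AAACUACCUCAAA"         # illustrative reference 3'UTR fragment
alt_utr = "AAACUAGCUCAAA"         # same fragment with a C>G SNP at position 6
print(seed_sites(ref_utr, mirna), seed_sites(alt_utr, mirna))
```

Comparing the site lists for the reference and alternative alleles shows whether the SNP creates or removes a seed match.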

SNPs that affect conserved TFBS

Promoter sequences are functional regions, 200-2000 nt long, located upstream of the transcription start site (TSS) of a gene. Transcription factors (TFs) bind transcription factor binding sites (TFBSs), which are specific DNA motifs (usually 5-15 nt). A TF can bind more than one TFBS and recruit RNA polymerase II. The promoter regions of orthologous human and rat genes are obtained by extracting 1000 bp upstream of the TSS (Ensembl). The JASPAR 1.0 collection of matrices (PWMs for TFBSs) is used to detect TFBSs and obtain the corresponding TF-map for each gene (human and its rat ortholog). Cross-species promoter meta-alignments are then produced between the maps of each pair of orthologous human-rat genes. SNPs that overlap conserved TFBSs are considered to have a putative effect on the expression of the gene.
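The TFBS detection step relies on scanning a sequence with a PWM. The sketch below shows the general idea with a log-odds matrix built from a hypothetical count matrix; the pseudocount, the uniform background, the motif, and the function names are all assumptions, and the meta-alignment step is not covered.

```python
import math


def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into a log-odds position weight matrix."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def best_hit(seq, pwm):
    """Return (score, start) of the highest-scoring window of the sequence."""
    width = len(pwm)
    hits = [
        (sum(pwm[k][seq[i + k]] for k in range(width)), i)
        for i in range(len(seq) - width + 1)
    ]
    return max(hits)


# Hypothetical 4-nt motif with consensus TATA, observed in 8 sites.
counts = [{"T": 8}, {"A": 8}, {"T": 8}, {"A": 8}]
pwm = pwm_from_counts(counts)
ref_score, ref_pos = best_hit("GGTATAGG", pwm)
alt_score, alt_pos = best_hit("GGTCTAGG", pwm)  # A>C SNP inside the site
print(ref_pos, ref_score, alt_score)
```

A drop in the best score between the reference and alternative alleles suggests that the SNP weakens the predicted binding site.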

Related article: Blanco et al. 2006

SNPs in promoter DNA Triplexes

DNA triplexes are formed when a polypurine-rich DNA duplex binds a single-stranded polynucleotide. Tracts of more than 10 consecutive purines (A, G) or pyrimidines (T, C) are considered potential Triplex Target Sequences. SNPs located in DNA triplexes are believed to affect triplex formation and disrupt gene regulation.
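The tract criterion above can be sketched with a regular expression. The example assumes that "longer than 10" means at least 11 consecutive purines or pyrimidines on the given strand and that SNP positions are 0-based; the function names and the sequence are hypothetical.

```python
import re

# Runs of at least 11 purines (A, G) or at least 11 pyrimidines (C, T).
TRACT = re.compile(r"[AG]{11,}|[CT]{11,}")


def triplex_targets(seq):
    """Return (start, end) half-open intervals of potential Triplex Target Sequences."""
    return [(m.start(), m.end()) for m in TRACT.finditer(seq.upper())]


def snps_in_targets(seq, snp_positions):
    """Return the SNP positions that fall inside a potential Triplex Target Sequence."""
    tracts = triplex_targets(seq)
    return [p for p in snp_positions if any(s <= p < e for s, e in tracts)]


promoter = "TCAGAGGAGAAGGATC"  # 12 purines flanked by pyrimidines
print(triplex_targets(promoter))
print(snps_in_targets(promoter, [1, 5, 14]))
```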

Related article: Goni et al. 2004