FannsDB 2.0-dev documentation

Performing queries

A public web server, hosted on our servers, lets you run queries in a simple way. Some limitations apply to avoid overuse.

Input format

To run the analysis you need to provide a list of mutations in the following format. Each line represents one mutation, and the fields of each mutation are separated by tabs.

Note

In the web text area you can use spaces or commas instead of tabs.

The parser is flexible enough to support different sets of fields per mutation. The allowed sets of fields for specifying a mutation are listed below (fields between brackets ‘[]’ are optional):

  • CHR START [END] CHANGE [STRAND] [IDENTIFIER]
  • CHR START [END] [REF] ALT [STRAND] [IDENTIFIER]
  • PROTEIN AA_POS [AA_REF] AA_ALT [IDENTIFIER]
  • PROTEIN AA_CHANGE [IDENTIFIER]

where each field means:

  • CHR: the chromosome, with or without the ‘chr’ prefix.
  • START: Start position in the chromosome.
  • END: End position in the chromosome.
  • CHANGE: The nucleotide change as ‘R>A’ or ‘R/A’, where R refers to the reference genome nucleotide and A to the mutated nucleotide.
  • REF: The reference genome nucleotide.
  • ALT: The mutated nucleotide.
  • STRAND: The strand, given as any of the following values: ‘+’, ‘+1’ or ‘1’ for the positive strand, and ‘-’ or ‘-1’ for the negative strand.
  • PROTEIN: The protein identifier where the mutation takes place. The allowed identifiers for proteins are: Ensembl Protein Id, SwissProt Accession and SwissProt ID.
  • AA_POS: The amino acid position in the protein.
  • AA_REF: The wild-type amino acid.
  • AA_ALT: The changed amino acid.
  • AA_CHANGE: The amino acid position together with the wild-type and mutated amino acids, written as ‘RPA’, where ‘R’ refers to the wild-type amino acid, ‘P’ to the amino acid position of the mutation and ‘A’ to the changed amino acid.
  • IDENTIFIER: An identifier associated with the mutation. The results will contain the same identifier.
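The CHANGE, AA_CHANGE and STRAND fields described above can be split into their parts with a few lines of Python. This is only a minimal sketch for preparing or checking input on the client side, not the parser used by the server. The function names are hypothetical, and the sketch assumes single-nucleotide changes and one-letter amino acid codes, which the format description does not state explicitly.

```python
import re

# Hypothetical helpers for illustration; they are not part of FannsDB.
# Assumption: CHANGE holds single nucleotides, AA_CHANGE uses one-letter codes.
_CHANGE_RE = re.compile(r"^([ACGT])[>/]([ACGT])$", re.IGNORECASE)
_AA_CHANGE_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$", re.IGNORECASE)


def parse_change(change):
    """Split a CHANGE field such as 'C>A' or 'C/A' into (REF, ALT)."""
    match = _CHANGE_RE.match(change.strip())
    if match is None:
        raise ValueError(f"unrecognised CHANGE field: {change!r}")
    return match.group(1).upper(), match.group(2).upper()


def parse_aa_change(aa_change):
    """Split an AA_CHANGE field in 'RPA' form into (AA_REF, AA_POS, AA_ALT)."""
    match = _AA_CHANGE_RE.match(aa_change.strip())
    if match is None:
        raise ValueError(f"unrecognised AA_CHANGE field: {aa_change!r}")
    return match.group(1).upper(), int(match.group(2)), match.group(3).upper()


def normalize_strand(strand):
    """Map the accepted STRAND spellings to '+' or '-'."""
    value = strand.strip()
    if value in ("+", "+1", "1"):
        return "+"
    if value in ("-", "-1"):
        return "-"
    raise ValueError(f"unrecognised STRAND field: {strand!r}")
```

For example, `parse_change("C/A")` yields the reference and mutated nucleotides as a pair, and `normalize_strand("+1")` returns `"+"`.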

Example (CHR START ALT IDENTIFIER):

9       32473058        A       S1
7       43918688        C       S2
7       38471790        A       S2
6       88372821        A       S3
5       41934236        G       S3
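If the mutations live in a script rather than a file, the tab-separated input can be produced as sketched below. The function name `format_mutations` is hypothetical; the rows are the ones from the example above.

```python
def format_mutations(rows):
    """Join the fields of each row with tabs and put one mutation per line."""
    lines = ("\t".join(str(field) for field in row) for row in rows)
    return "\n".join(lines) + "\n"


# CHR, START, ALT, IDENTIFIER, as in the example above.
rows = [
    ("9", 32473058, "A", "S1"),
    ("7", 43918688, "C", "S2"),
    ("7", 38471790, "A", "S2"),
    ("6", 88372821, "A", "S3"),
    ("5", 41934236, "G", "S3"),
]

text = format_mutations(rows)
```

The resulting `text` can be written to a file or pasted into the web text area.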

Warning

Only GRCh37 (hg19) coordinates are allowed right now.

Results format

The results of the query can be downloaded as a tab-separated file with the following columns:

  • IDENTIFIER: The identifier associated with the mutation in the input file.
  • CHR: The mutation’s chromosome.
  • STRAND: The mutation’s strand.
  • START: The mutation’s start position.
  • REF: The reference nucleotide.
  • ALT: The alternate nucleotide.
  • TRANSCRIPT: The Ensembl identifier of the transcript affected by the mutation.
  • PROTEIN: The Ensembl identifier of the protein.
  • AA_POS: The position of the mutation in protein coordinates.
  • AA_REF: The reference amino acid.
  • AA_ALT: The alternate amino acid.
  • GENE: The Ensembl identifier of the gene that the transcript belongs to.
  • SYMBOL: The HUGO symbol of the gene.
  • SWISSPROT_ID: The UniProt identifier of the protein.
  • SIFT: SIFT score of the mutation as obtained from VEP (empty for mutations whose consequence type is not expected to affect the sequence of the protein product).
  • PPH2: PolyPhen-2 score of the mutation as obtained from VEP (empty for mutations whose consequence type is not expected to affect the sequence of the protein product).
  • MA: Mutation Assessor score of the mutation as obtained from the Mutation Assessor database (empty for mutations whose consequence type is not expected to affect the sequence of the protein product).
  • Extra scores depending on the type of query
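A downloaded results file can be read with Python's standard `csv` module, as in the minimal sketch below. It assumes that the file starts with a header row using the column names listed above, which this page does not confirm. The function name `read_results` is hypothetical, and the sample row contains made-up values whose only purpose is to exercise the code.

```python
import csv
import io

# Score columns that may be empty, according to the column descriptions.
SCORE_COLUMNS = ("SIFT", "PPH2", "MA")


def read_results(handle):
    """Read tab-separated results; empty scores become None, others floats.

    Assumption: the first line is a header row with the column names.
    """
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        for column in SCORE_COLUMNS:
            value = row.get(column)
            row[column] = float(value) if value else None
        rows.append(row)
    return rows


# Made-up sample with a subset of the columns, for illustration only.
sample = io.StringIO(
    "IDENTIFIER\tCHR\tSTART\tSIFT\tPPH2\tMA\n"
    "demo\t1\t100\t0.5\t0.25\t\n"
)
results = read_results(sample)
```

With a real download, pass an open file instead of the `io.StringIO` object, for example `open(path, newline="")`.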

Condel

  • CONDEL: Calculated Condel score.
  • CONDEL_LABEL: The Condel label encoded as: 0.0 = Neutral, 1.0 = Deleterious.

TransFIC

  • TFIC_SIFT: The TransFIC score calculated for SIFT.
  • TFIC_PPH2: The TransFIC score calculated for PolyPhen-2.
  • TFIC_MA: The TransFIC score calculated for Mutation Assessor.
  • TFIC_SIFT_LABEL: The TransFIC label for SIFT encoded as: 0.0 = Low, 1.0 = Medium, 2.0 = High.
  • TFIC_PPH2_LABEL: The TransFIC label for PolyPhen-2 encoded as: 0.0 = Low, 1.0 = Medium, 2.0 = High.
  • TFIC_MA_LABEL: The TransFIC label for Mutation Assessor encoded as: 0.0 = Low, 1.0 = Medium, 2.0 = High.
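The numeric label encodings for Condel and TransFIC can be turned back into their names with a small lookup, sketched here. The names `CONDEL_LABELS`, `TRANSFIC_LABELS` and `decode_label` are hypothetical, and treating an empty field as "no label" is an assumption.

```python
# Encodings as documented for CONDEL_LABEL and the TFIC_*_LABEL columns.
CONDEL_LABELS = {0.0: "Neutral", 1.0: "Deleterious"}
TRANSFIC_LABELS = {0.0: "Low", 1.0: "Medium", 2.0: "High"}


def decode_label(raw, mapping):
    """Translate a raw label field such as '1.0' into its name.

    Assumption: an empty or missing field means that no label is available.
    """
    if raw is None or raw.strip() == "":
        return None
    return mapping[float(raw)]
```

For instance, `decode_label("2.0", TRANSFIC_LABELS)` gives `"High"`, and a value outside the documented encoding raises a `KeyError` rather than being silently accepted.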
