The making of Condel (CONsensus DELeteriousness Score)

Our article on a Consensus deleteriousness score of missense Single Nucleotide Variants was published online yesterday and will be included in the April issue of the American Journal of Human Genetics. We invite you to read it, and we want to give you a glimpse of how we got the idea.

A few months back we started searching for computational methods to assess the effect of missense Single Nucleotide Variants (SNVs) on proteins. Our primary aim was to identify an array of methods that projects within the International Cancer Genome Consortium could use to help prioritize deleterious SNVs from the large collections that usually appear in cancer samples. We were looking for computational tools that fulfilled two main requirements: they had to be easy to download and install locally, and they had to perform well at separating likely deleterious from likely neutral SNVs. While the first condition was easy to assess, we realized that the second was much harder to evaluate. There were no common benchmarking studies comparing the performance of the different methods, and most of them had been tested on different datasets of experimentally known deleterious and neutral SNVs.

Therefore, our first idea was to benchmark all downloadable tools on a common dataset of SNVs. We chose two datasets, each containing several thousand disease-related and non-damaging SNVs, that had been compiled by the authors of one of the methods we were interested in benchmarking. We then used five tools, each of which assesses the probability that an amino acid change is tolerated in evolution, to calculate the deleteriousness of every SNV in the two datasets.

At some point in the process, we realized that since practically all SNVs were classified by at least three tools, we could implement a method that combined their classifications to obtain a more accurate assessment of the deleteriousness of each SNV. We tried several ways to integrate the outputs of the five tools, and finally found that a weighted average of the scores of the individual tools increased the accuracy of the classification of both datasets of SNVs to values around 90%, as the ROC curves below show. This weighted average of the scores of different tools may be regarded as a measure of how strongly the individual methods agree on the likelihood that an SNV is deleterious. We have therefore named it Consensus deleteriousness score of SNVs, or Condel.

In the figure, the ROC curves that correspond to the five original tools (namely PolyPhen-2, SIFT, LogR Pfam E-value, MutationAssessor and MAPP) are dotted lines; the integrated scores are continuous lines. PPH2, PolyPhen-2; logre, LogR Pfam E-value; massess, MutationAssessor; SVS, Simple Vote Score; WVS, Weighted Vote Score; SAS, Simple Average Score; WAS, Weighted Average Score.
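To make the idea of a weighted average of tool scores more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the implementation used in the article: it assumes each tool's score has already been rescaled to the range 0 to 1 with higher values meaning more deleterious, it skips tools that returned no prediction for an SNV, and the function name, the weights and the example scores are all hypothetical placeholders. How the real weights are derived is described in the paper, not here.

```python
from typing import Mapping, Optional


def weighted_average_score(
    scores: Mapping[str, Optional[float]],
    weights: Mapping[str, float],
) -> Optional[float]:
    """Combine per-tool scores for one SNV into a single weighted average.

    Assumptions (not taken from the article): every score is already
    scaled to [0, 1] with higher meaning more deleterious, and a tool
    that could not classify the SNV reports None.
    """
    total = 0.0
    weight_sum = 0.0
    for tool, score in scores.items():
        if score is None:
            continue  # this tool gave no prediction for the SNV
        weight = weights.get(tool, 0.0)  # unknown tools contribute nothing
        total += weight * score
        weight_sum += weight
    if weight_sum == 0:
        return None  # no usable prediction at all
    return total / weight_sum


# Hypothetical weights and scores, for illustration only.
example_weights = {"SIFT": 0.8, "PPH2": 0.9, "MAPP": 0.6}
example_snv = {"SIFT": 0.95, "PPH2": 0.80, "MAPP": None}

print(weighted_average_score(example_snv, example_weights))
```

With all weights set to the same value, the same function gives a simple average, which corresponds to the difference between the SAS and WAS curves in the figure.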

By the end of the project, we realized that the rationale behind Condel could in principle be applied to any array of methods that assess the likelihood that an SNV is deleterious. In fact, different arrays of methods may work better on different datasets.

If this story interested you, please don’t forget to visit Condel at