Last week I attended the Oncogenomics Workshop in Hinxton, a really interesting workshop organized by the Industry Programme of the EBI. It addressed which data are becoming available and how to access them, how best to analyze and interpret oncogenomics data, and what challenges lie ahead in translating these data and knowledge into therapeutic opportunities. I was invited to present IntOGen, and I thought some followers of our blog might be interested in the slides of the talk.
I explained the IntOGen project, focusing on the analysis of somatic mutations detected by tumour genome re-sequencing projects. IntOGen aims to integrate data across projects and cancer sites to identify genes and pathways involved in cancer. I briefly described how we obtain the data, the methods we have developed to perform the analyses (namely OncodriveFM, OncodriveCLUST and transFIC) and how we present the results in the IntOGen web discovery tool (currently at http://beta.intogen.org).
A key requirement for a project like IntOGen is that its methods scale to the analysis of a large number of tumours. To that end, we have developed novel methods that can identify cancer driver genes solely from the list of tumour somatic mutations. This eliminates the need to download and process large, access-protected files (e.g. BAM files), which would be impractical for IntOGen and would not scale to much larger cohorts of tumours.
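To give a feel for what working "solely from the list of somatic mutations" can look like, here is a minimal toy sketch in Python. It is not OncodriveFM, OncodriveCLUST or any part of the IntOGen pipeline: the function name `summarize_genes`, the tuple layout and the gene and sample names are all invented for illustration, and the impact scores are made-up numbers. It only shows that a flat mutation list is enough to compute simple per-gene summaries, such as how many tumours carry a mutation in a gene and whether its mutations score higher than the cohort-wide average.

```python
from collections import defaultdict
from statistics import mean


def summarize_genes(mutations):
    """Summarize a flat mutation list per gene.

    mutations: iterable of (sample_id, gene, impact_score) tuples.
    This layout is a hypothetical example, not an IntOGen format.
    """
    scores = defaultdict(list)   # gene -> impact scores of its mutations
    samples = defaultdict(set)   # gene -> tumours carrying a mutation
    for sample_id, gene, score in mutations:
        scores[gene].append(score)
        samples[gene].add(sample_id)

    # Background: mean impact score over every mutation in the cohort.
    background = mean(s for gene_scores in scores.values() for s in gene_scores)

    return {
        gene: {
            "n_samples": len(samples[gene]),
            "mean_impact": mean(scores[gene]),
            # Positive values mean the gene's mutations score above background.
            "impact_bias": mean(scores[gene]) - background,
        }
        for gene in scores
    }


# Made-up data: three tumours, two hypothetical genes.
toy_mutations = [
    ("T1", "GENE_A", 0.9),
    ("T2", "GENE_A", 0.8),
    ("T3", "GENE_A", 1.0),
    ("T1", "GENE_B", 0.1),
    ("T2", "GENE_B", 0.2),
]
summary = summarize_genes(toy_mutations)
for gene, stats in sorted(summary.items()):
    print(gene, stats)
```

A real method would replace the naive difference from background with a proper statistical test and multiple-testing correction; the point here is only that nothing beyond the mutation list is needed as input.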
I also noted that the complete pipeline we use for the analysis of somatic mutations in IntOGen is available for other researchers to process their own data here.