Bioinformatics & Biostatistics

While developing transcriptomic approaches during my PhD, I quickly felt the need to learn more tools from biostatistics in order to perform thorough analyses of my data.

I soon began a sustained collaboration with mathematicians from the Toulouse Math Institute.

We first made an inventory of methods relevant to the analysis of transcriptomic data (Baccini et al., 2005).
To integrate transcriptomic and lipidomic data acquired during my PhD, Ignacio Gonzalez developed the regularized version of Canonical Correlation Analysis (Gonzalez et al., 2008).
This initial work was then taken in multiple directions by Kim-Anh Lê Cao, who developed the first version of the mixOmics package during her PhD (see for example Lê Cao et al., 2009) and still actively leads the development of this package and of data integration methods.
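
To make the idea of regularization concrete, here is a minimal Python sketch. It is not the code of the CCA or mixOmics packages. It assumes that regularizing Canonical Correlation Analysis means adding ridge terms to the diagonals of the two within-set covariance matrices, which keeps them invertible when there are more variables than samples, as is typical of transcriptomic data. The function name and the parameter names lambda1 and lambda2 are chosen for illustration.

```python
import numpy as np


def regularized_cca(X, Y, lambda1, lambda2):
    """Illustrative regularized CCA (hypothetical helper, not a package API).

    X: (n, p) array, Y: (n, q) array, same n samples in rows.
    Returns the canonical correlations and the canonical weights for X and Y.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Sample covariances, with ridge terms added to the within-set blocks.
    Cxx = X.T @ X / (n - 1) + lambda1 * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + lambda2 * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)

    # Whiten both sets with Cholesky factors: K = Lx^{-1} Cxy Ly^{-T}.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    K = np.linalg.solve(Lx, Cxy)
    K = np.linalg.solve(Ly, K.T).T

    # Singular values of K are the (regularized) canonical correlations.
    U, s, Vt = np.linalg.svd(K, full_matrices=False)

    # Map the singular vectors back to weights on the original variables.
    a = np.linalg.solve(Lx.T, U)
    b = np.linalg.solve(Ly.T, Vt.T)
    return s, a, b


if __name__ == "__main__":
    # Simulated data with more variables than samples (assumed sizes).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(10, 30))
    Y = rng.normal(size=(10, 20))
    s, a, b = regularized_cca(X, Y, lambda1=0.1, lambda2=0.1)
    print(s[:3])
```

With both penalties set to zero and more samples than variables, the sketch reduces to classical CCA. With 10 samples and 30 variables, as in the usage example, the unpenalized covariance matrices would be singular and the Cholesky step would fail.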

With Sébastien Déjean, we also developed methods to analyze transcriptomic data acquired during time-series experiments (Déjean et al., 2010).

This experience in biostatistics has been used extensively in my research projects. Notably, I developed a pipeline for the analysis of microarray data that is still the foundation of the statistical analyses performed by the GeT-TRiX transcriptomic facility.

Since 2015, I have gained more experience in bioinformatics for the analysis of next-generation sequencing (NGS) data, in particular using R and Bioconductor.
Based on my recent research work, I am developing my first R package, GeneNeighborhood. It allows users to explore the orientation and proximity of the direct upstream and downstream neighbors of a predefined set of genes.
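
The kind of question involved can be sketched in a few lines of Python. This is an illustration of the concept only, not the GeneNeighborhood API: the `Gene` class, the `direct_neighbors` function, and the toy coordinates are all hypothetical. The sketch assumes that "upstream" is defined relative to the strand of the focus gene, and that neighbors are simply the previous and next genes when sorted by start position, which ignores complications such as nested genes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def direct_neighbors(genes, focus_name):
    """Describe the direct upstream and downstream neighbors of one gene."""
    focus = next(g for g in genes if g.name == focus_name)
    same_chrom = sorted(
        (g for g in genes if g.chrom == focus.chrom),
        key=lambda g: (g.start, g.end),
    )
    i = same_chrom.index(focus)
    before = same_chrom[i - 1] if i > 0 else None
    after = same_chrom[i + 1] if i + 1 < len(same_chrom) else None

    # For a gene on the minus strand, upstream lies at higher coordinates.
    if focus.strand == "+":
        upstream, downstream = before, after
    else:
        upstream, downstream = after, before

    def describe(nb):
        if nb is None:
            return None
        # Number of bases between the two genes; 0 if they overlap.
        gap = max(nb.start, focus.start) - min(nb.end, focus.end) - 1
        return {
            "name": nb.name,
            "orientation": "same" if nb.strand == focus.strand else "opposite",
            "distance": max(0, gap),
        }

    return {"upstream": describe(upstream), "downstream": describe(downstream)}


# Toy annotation (made-up coordinates).
genes = [
    Gene("A", "chr1", 100, 200, "+"),
    Gene("B", "chr1", 300, 400, "-"),
    Gene("C", "chr1", 1000, 1100, "+"),
]
print(direct_neighbors(genes, "B"))
```

Repeating this for every gene in a set of interest gives distributions of neighbor distances and orientations that can then be compared with those of the other genes in the genome.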


(2010). Clustering time-series gene expression data using smoothing spline derivatives. EURASIP J Bioinform Syst Biol. 2007:70561.

(2009). Sparse canonical methods for biological data integration: application to a cross-platform study. BMC Bioinformatics. 2009 Jan 26;10:34.

(2008). CCA: An R Package to Extend Canonical Correlation Analysis. J Stat Soft. 2008 Jan;23(12):1-14.

(2008). Stratégies pour l'analyse statistique de données transcriptomiques. J-SFdS. 2008 Jan;146(12):5-44.


Core Bioconductor packages for NGS data analysis
Thu, June 23, 2016