Public page for the WEHI Research Computing Platform (RCP)
mixOmics is a large R package that provides statistical methods to integrate omics data sets (e.g. transcriptomics, proteomics, metabolomics, metagenomics) that simultaneously measure the activity of thousands of biological features (e.g. transcripts, proteins, metabolites, bacteria). Data integration enables the identification of specific biological relationships between these features (e.g. genes and proteins), giving new insights into the molecular processes involved in health and disease. mixOmics includes 19 data integration methods, 13 of which were developed in our lab. These methods are all based on dimension reduction using Projection to Latent Structures (PLS).
Our users include computational biologists, molecular biologists and bioinformaticians who wish to integrate their data and identify signatures of genes, proteins, etc. to explain or predict a disease outcome. The package (ranked in the top 5% of Bioconductor packages) is easy to use because all methods share the same underlying PLS principles and produce numerous graphics for interpretation (Fig. 1). We continuously improve the mixOmics package based on community feedback.
As this is a large project, the internship requires complementary skillsets to:
After the (steep) learning phase, there will be opportunities for students to propose new features and functionalities in the package if they wish.
Figure 1. Overview of the methods in mixOmics for data exploration and integration of multiple omics data sets (courtesy of Prof. Lê Cao)
Skills and Pre-requisites:
Benefits for students whilst undertaking the internship include: