On the methodological difficulties of analyzing the association between compositional microbiome data and outcomes: an example from an HIV study
Conference
65th ISI World Statistics Congress
Format: IPS Abstract - WSC 2025
Keywords: bioinformatics, circulating_microbiota, metagenomics
Session: IPS 927 - Statistical Tools for Microbiome-Based Biomarker Identification and Disease Prediction
Thursday 9 October 2 p.m. - 3:40 p.m. (Europe/Amsterdam)
Abstract
Microbial translocation occurs when bacterial products enter the bloodstream from the gut because of gut barrier dysfunction, potentially triggering persistent immune activation and affecting treatment efficacy. Translocation can be studied using RNA-sequencing techniques to analyze the “metatranscriptome”: all the RNA from bacteria, viruses and fungi in whole blood. Building on previous work that repurposed existing human sequencing datasets to extract microbiota-related information, the analyses were conducted on the plasma of patients with a treated HIV infection. A key challenge with these new data is to determine whether the metatranscriptome accounts for part of the variability observed in the patients’ response to HIV treatment, that is, to identify meaningful associations between microbiome-derived explanatory variables and our outcome variable, the CD4+ cell count. However, analyzing microbiome data is difficult because the data are compositional and the data matrix is sparse, which can lead to spurious correlations among explanatory variables. To address these issues and identify the most appropriate analytical approach, we propose a comparative study that evaluates the impact of different transformations applied to the explanatory variables, each followed by variable selection methods. The effects of these transformations will be assessed by examining how they alter the raw data, which should facilitate interpretation. Finally, the results will be interpreted carefully to extract biologically meaningful insights from this complex dataset.
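As an illustration of the kind of transformation such a comparison might include, the sketch below applies a centered log-ratio (CLR) transform to one sample's counts. The abstract does not name the transformations under study, so the choice of CLR, the function name `clr`, the zero-replacement value and the example counts are all assumptions made for illustration, not the authors' method or data.

```python
import math

def clr(counts, zero_replacement=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    Zeros are replaced by a small positive value (an assumed, tunable
    choice) so that logarithms are defined for sparse count vectors.
    """
    positive = [c if c > 0 else zero_replacement for c in counts]
    logs = [math.log(x) for x in positive]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log equals dividing by the geometric mean.
    return [v - mean_log for v in logs]

# Hypothetical counts for three taxa (not data from the study).
sample_a = [10, 20, 70]
sample_b = [100, 200, 700]   # same composition, ten times the depth

clr_a = clr(sample_a)
clr_b = clr(sample_b)

# A sparse hypothetical sample with a zero count.
clr_sparse = clr([120, 0, 30, 850])
```

Two properties are worth noting. Without zeros, CLR values depend only on the relative abundances, so `clr_a` and `clr_b` agree up to floating-point error despite the difference in sequencing depth. The transformed values of a sample also always sum to zero, which means the transformed variables remain linearly dependent, a point to keep in mind before applying variable selection.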