16S metagenomic data analysis

Module 20

Christelle Hennequet-Antier

MaIAGE

Cédric Midoux

PROSE & MaIAGE

June 8, 2026

Introduction

Practical information

  • 9h00 - 17h00
  • Two breaks, one in the morning and one in the afternoon
  • Lunch at the INRAE restaurant (optional)
  • Questions are strongly encouraged
  • Everyone has something to learn from each other

Getting to know you

Who are you?

  • Institution / Laboratory / position

What is your scientific topic?

  • Studied ecosystem
  • Scientific question
  • Experimental design

What is your background?

  • Already analyzed shotgun data?
  • Background in bioinformatics?
  • Background in biostatistics?

Getting to know us

  • Open infrastructure dedicated to life sciences
    • Computing resources, tools, databanks…
  • Dissemination of expertise in bioinformatics
  • Design and development of applications
  • Data analysis

Data analysis service

https://documents.migale.inrae.fr/data-analysis.html

  • We specialize in genomics and metagenomics
  • 5 bioinformaticians and 2 statisticians
  • More than 160 projects since 2016
  • LRQA certified process
  • 2 types of services
    • Classical collaboration (we perform the analyses)
    • Support (we help you do the analysis yourself)

Aim of this training

After this 4-day training, you will:

  • Know the outlines, advantages and limits of amplicon sequencing data analysis
  • Be able to use the FROGS (through Galaxy) and phyloseq (through Easy16S) tools on the training data set
  • Be able to identify tools and parameters adapted to your own analyses

Program

DAY 1

  • Introduction
  • Introduction to amplicon analysis
  • Introduction to Galaxy
  • Quality control
  • FROGS (1)

DAY 2

  • FROGS (2)
  • FROGSfunc

DAY 3

  • Introduction
  • Easy16S
  • Composition
  • \(\alpha\) and \(\beta\) diversities
  • Ordination

DAY 4

  • PERMANOVA and hypothesis tests
  • Differential abundance
  • Train on your own dataset or on another provided dataset

Training with Easy16S (Days 3 and 4)

Microbiome tools

Aims

Become familiar with the {phyloseq} R package [2] and the {Easy16S} Shiny web application [3] for analyzing microbiome datasets.

Exploratory Data Analysis

  • \(\alpha\)-diversity: how diverse is my community?

  • \(\beta\)-diversity: how different are two communities?

  • Use a distance matrix to study structures:

    • Hierarchical clustering: how do the communities cluster?
    • Permutational ANOVA (PERMANOVA): are the communities structured by an environmental factor?
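
To make these three ideas concrete, here is a minimal sketch in plain Python, not the phyloseq or Easy16S implementation. It assumes the Shannon index as the \(\alpha\)-diversity measure, the Bray-Curtis dissimilarity as the \(\beta\)-diversity measure, and a one-factor PERMANOVA pseudo-F computed from the distance matrix. The count table, the group labels and all function names are hypothetical and chosen only for illustration.

```python
import math
import random

def shannon(counts):
    """Alpha diversity: Shannon index H = -sum(p_i * ln p_i) over observed taxa."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(u, v):
    """Beta diversity: Bray-Curtis dissimilarity between two count vectors."""
    return sum(abs(a - b) for a, b in zip(u, v)) / (sum(u) + sum(v))

def distance_matrix(samples):
    """Square matrix of pairwise Bray-Curtis dissimilarities."""
    return [[bray_curtis(u, v) for v in samples] for u in samples]

def pseudo_f(dist, groups):
    """One-factor PERMANOVA pseudo-F from squared pairwise distances.

    Assumes at least two groups and a non-zero within-group sum of squares.
    """
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        pairs = sum(dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:])
        ss_within += pairs / len(idx)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))

def permanova(dist, groups, n_perm=999, seed=0):
    """Permutation p-value: shuffle the group labels and recompute pseudo-F."""
    rng = random.Random(seed)
    f_obs = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= f_obs:
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)

# Hypothetical toy table: 6 samples (rows) x 4 taxa (columns), two environments.
samples = [
    [50, 30, 15, 5],
    [45, 35, 10, 10],
    [55, 25, 15, 5],
    [5, 10, 40, 45],
    [10, 5, 45, 40],
    [5, 15, 35, 45],
]
groups = ["A", "A", "A", "B", "B", "B"]

dist = distance_matrix(samples)
f_obs, p_value = permanova(dist, groups)
print("Shannon per sample:", [round(shannon(s), 3) for s in samples])
print("Bray-Curtis S1 vs S2:", dist[0][1], "S1 vs S4:", dist[0][3])
print("pseudo-F:", round(f_obs, 2), "p-value:", p_value)
```

With only three samples per group, very few distinct label assignments exist, so the permutation p-value cannot become small however strong the separation is. This is one reason why the experimental design matters before any test is run.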

Visual assessment of the data

  • Bar plots: what is the composition of each community?
  • Multidimensional Scaling: how are communities related?
  • Heatmaps: are there interactions between species and (groups of) communities?
  • Differential Abundances: which taxa are differentially abundant?
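
As an illustration of what a composition bar plot displays, the sketch below sums counts at a chosen taxonomic rank and converts them to relative abundances per sample. It is a simplified stand-in written in plain Python, not the Easy16S code. The OTU names, sample names, counts and the `composition` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: counts[taxon][sample] and a taxon -> phylum lookup.
counts = {
    "OTU_1": {"S1": 60, "S2": 10},
    "OTU_2": {"S1": 20, "S2": 30},
    "OTU_3": {"S1": 20, "S2": 60},
}
phylum = {"OTU_1": "Firmicutes", "OTU_2": "Firmicutes", "OTU_3": "Bacteroidota"}

def composition(counts, rank):
    """Sum counts per rank label, then scale each sample to proportions."""
    totals = defaultdict(lambda: defaultdict(int))
    for taxon, per_sample in counts.items():
        for sample, n in per_sample.items():
            totals[sample][rank[taxon]] += n
    result = {}
    for sample, by_label in totals.items():
        depth = sum(by_label.values())  # sequencing depth of the sample
        result[sample] = {label: n / depth for label, n in by_label.items()}
    return result

comp = composition(counts, phylum)
for sample, shares in comp.items():
    print(sample, {label: round(share, 2) for label, share in shares.items()})
```

Each bar in the plot corresponds to one sample, and the height of each colored segment is the proportion computed here, so samples with different sequencing depths remain comparable.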

Easy16S

Shiny Web Application [3]

References

1. Liu Y-X, Qin Y, Chen T, Lu M, Qian X, Guo X, et al. A practical guide to amplicon and metagenomic analysis of microbiome data. Protein & Cell. 2020;12:315–30. doi:10.1007/s13238-020-00724-8.
2. McMurdie PJ, Holmes S. phyloseq: An R package for reproducible interactive analysis and graphics of microbiome census data. PLoS ONE. 2013;8:e61217.
3. Midoux C, Rué O, Chapleur O, Bize A, Loux V, Mariadassou M. Easy16S: A user-friendly Shiny web-service for exploration and visualization of microbiome data. Journal of Open Source Software. 2024;9:6704. doi:10.21105/joss.06704.
4. Callahan BJ, Sankaran K, Fukuyama JA, McMurdie PJ, Holmes SP. Bioconductor workflow for microbiome data analysis: From raw reads to community analyses. F1000Research. 2016;5.
5. Mariadassou M. Phyloseq-extended: Various customs functions written to enhance the base functions of phyloseq. 2018. https://github.com/mahendra-mariadassou/phyloseq-extended.