MetaPDOCheese

Meta-omics study of French PDO cheeses

Céline Delbès

UMRF

Françoise Irlinger

SayFood

Many people

Many places

July 17, 2024

Cow

Ewes / Goat

Project summary and objectives

In a nutshell

  • Establish an atlas of the microbial diversity…
    • … of 44 ripened Protected Designation of Origin (PDO) cheeses
  • Investigate the structuring factors shaping their microbiota…
    • …using multi-omics approaches
  • Identify a few keystone species…
    • …to study their domestication
    • …by comparing modern 🐕 to ancient or wild 🐺 strains
  • Identify shared ASVs
    • …to understand fluxes during the cheese making process

Experimental design

In theory

  • For each of the 44 PDOs, up to 10 production units (ateliers) were sampled
  • For each unit:
    • 1 milk sample
    • 3 cheese samples (cheese and/or cheese wheel location) divided into
      • rind
      • curd
  • For each sample:
    • 16S metabarcoding (V3-V4 region)
    • ITS2 metabarcoding
  • Spare DNA kept in the freezer for further analyses

In practice

Big figures 🧮

  • 200 farmers 🧑‍🌾️
  • 386 farmhouse or corporate PDO producers
  • 44 PDO
  • 2702 cheese 🧀 rind and curd samples
  • 1.2 billion sequencing reads

Metadata collection

For each production unit, cheese makers recorded 150 items of information, later cleaned and combined into 100 covariates.

They cover:

  • animal feeding patterns
  • milk conservation
  • bacterial seeding
  • cheese making process
  • ripening

Data curation

My state of mind during this part

Metadata curation

A lot of effort was spent collecting high-quality metadata.

But we suffered from many missing values, and data validation was a pain:

  • data were stored in Excel in an unstructured format
  • not the same template for all files 😡
  • temperature: 10°C, 17°C but also “cold” and “hot” 😭
  • dates coded as integers by Excel (💩-ware)
  • production units coded 1_CA in some files and 1.CA in others
  • ID swapping between samples

Metadata curation (II)

We decomposed the metadata into

  • PDO-level information (Species, Technology, geographical region)
  • Production-level information (almost everything)
  • Sample-level information (pH, microbial quantification)

And a lot of time was spent curating mistakes

Metabarcoding data

Sequencing strategy: Illumina
  • targeting 100 000 reads / sample for the 16S
  • resequencing of samples with fewer than 20 000 reads
  • 16S: \(\sim\) 600 million reads across all (cheese + milk) samples
  • ITS: \(\sim\) 600 million reads across all (cheese + milk) samples
Analysis strategy: DADA2 🐴 + FROGS 🐸
  • ASVs are “better” than OTUs ✨
  • ASVs can differ by a single nucleotide
  • ASVs can work in cohort mode
  • Features are stable over time

A cautionary tale 🧙

🚨 DADA2 can confuse noise / systematic errors for signal

16S data

  • FI identified a few pairs of ASVs that differed only by the deletion of an A at the 77th-to-last position.
  • OR identified one faulty spot on the machine.
  • A systematic search found 1487 such pairs (3974 ASVs) among 6500 ASVs with overall abundance > 500
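
A systematic search of this kind could look like the Python sketch below. It is an illustration under assumptions, not the script that was actually run: the function name and the input format (a dict of unique ASV sequences keyed by identifier) are hypothetical.

```python
def deletion_pairs(seqs: dict[str, str], pos_from_end: int) -> list[tuple[str, str]]:
    """Find (long_id, short_id) pairs of sequences that differ by one deletion.

    A pair is reported when removing the base located `pos_from_end` positions
    before the end of the long sequence (1 = last base) yields the short one.
    Sequences are assumed to be unique.
    """
    by_seq = {s: name for name, s in seqs.items()}
    pairs = []
    for name, s in seqs.items():
        if len(s) < pos_from_end:
            continue  # sequence too short to contain that position
        i = len(s) - pos_from_end  # 0-based index of the candidate base
        shortened = s[:i] + s[i + 1:]
        other = by_seq.get(shortened)
        if other is not None and other != name:
            pairs.append((name, other))
    return pairs
```

Looking sequences up in a dictionary keeps the search linear in the number of ASVs instead of comparing all pairs.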

ITS2 data

A naïve search found 300 000 ASVs, of which only 30 000 satisfied

  • abundance > 1
  • prevalence > 1
  • annotated as "Fungi"

Filters

Cheese samples

Prevalence-based

Keep ASVs if prevalence at least

  • 100% in a production unit
  • 70% across productions
  • 40% in rind samples
  • 40% in core samples
  • 50% across all samples
Abundance-based

Keep ASVs if abundance at least

  • 5e-5 in a production unit
Control-based

Remove ASVs if specificity at least

  • 70% in blank controls
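
The Python sketch below illustrates how such filters can be computed. Several points are assumptions, since the slides do not spell them out: prevalence is taken as the fraction of samples with at least one read, the prevalence criteria are combined with a logical OR, and specificity to blanks is taken as the share of an ASV's reads found in blank controls. All function names are hypothetical.

```python
def prevalence(counts: dict[str, int], samples: list[str]) -> float:
    """Fraction of `samples` in which the ASV has at least one read."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if counts.get(s, 0) > 0) / len(samples)


def keep_by_prevalence(counts: dict[str, int],
                       groups: dict[str, list[str]],
                       thresholds: dict[str, float]) -> bool:
    """Keep the ASV if it reaches the threshold in at least one group (assumed OR rule)."""
    return any(prevalence(counts, groups[g]) >= t for g, t in thresholds.items())


def blank_specificity(counts: dict[str, int], blanks: list[str]) -> float:
    """Share of the ASV's reads that come from blank controls (assumed definition)."""
    total = sum(counts.values())
    return sum(counts.get(b, 0) for b in blanks) / total if total else 0.0
```

Here `counts` maps sample identifiers to read counts for a single ASV, and `groups` maps a scope name (for example rind, core, or all samples) to its list of samples.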

Milk samples

Similar filters

Taxonomy curation

Extra steps are required to reach the species level

  • 16S: Affiliations against DairyDB and Silva 132
  • ITS: Affiliations against Unite
  • Manual curation for the most abundant taxa
  • Transfer curation to other concerned taxa

But no miracle 🌟

Many ASVs remain ambiguous at the species level:

Lacticaseibacillus group casei / paracasei / zeae / chiayiensis

At the end of the day 🎵

16S data: 2679 samples

  • median depth: 110 000 reads
  • 95.22% samples > 20 000

ITS2 data: 2680 samples

  • median depth: 206 000 reads
  • 99.78% samples > 20 000

16S + ITS: 2675 samples in common

library(MetaPDOCheese)
metapdocheese
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 8066 taxa and 2679 samples ]
sample_data() Sample Data:       [ 2679 samples by 126 sample variables ]
tax_table()   Taxonomy Table:    [ 8066 taxa by 7 taxonomic ranks ]
phy_tree()    Phylogenetic Tree: [ 8066 tips and 8064 internal nodes ]
refseq()      DNAStringSet:      [ 8066 reference sequences ]
metapdocheese_its
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 5590 taxa and 2680 samples ]
sample_data() Sample Data:       [ 2680 samples by 126 sample variables ]
tax_table()   Taxonomy Table:    [ 5590 taxa by 7 taxonomic ranks ]
refseq()      DNAStringSet:      [ 5590 reference sequences ]

My state of mind at that point

Diversity

A few key figures

16S data

8066 ASVs

  • 🥛 3220
  • 🧀 4956

1702 species

  • 🥛 1215
  • 🧀 810

661 genera

  • 🥛 543
  • 🧀 285
ITS2 data

5590 ASVs

  • 🥛 4197
  • 🧀 1662

1156 species

  • 🥛 1097
  • 🧀 273

544 genera

  • 🥛 528
  • 🧀 136

Abundant taxa

  • found 3 times
  • with rel. ab. > 0.1%

16S data:

  • 🥛 2235
  • 🧀 1304

ITS2 data:

  • 🥛 1630
  • 🧀 254

A primer on cheese making

Bacteria

Dominant species 🧀

  • well-known cheese colonizers

Together they represent

  • < 50% of the rind
  • < 75% of the core (except for PLCF)

Dominant species 🥛

Much higher diversity than in 🧀

  • < 40% of the milk

Fungi

Dominant species 🧀

  • well-known cheese colonizers

Together they represent

  • < 75% of the rind
  • < 75% of the core (except for PLCF)

Dominant species 🥛

Much higher diversity than in 🧀

  • < 40% of the milk
Richness varies by dairy species and technology

Less diversity
  • core than rind
  • fungi than prokaryotes
  • PLC[L|F] than other techno
Techno more contrasted
  • rind than core
  • prokaryotes than fungi
So does the balance between bacteria and fungi

Balance prok / fungi
  • \(\simeq\) 1 in core
  • \(>\) 1 in rind
  • differences between techno

PMCL and PPC quite different from the rest.

Core Microbiota: milk but not cheese

Milk

Core species

  • abundance > 0.01%
  • prevalence = 100%

12 core species

  • 6 bacteria (/1367): 18% of reads
  • 6 fungi (/1230): 29% of reads
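
The core definition above can be sketched in Python as follows. The interpretation is an assumption: a species is called core when it is detected in every sample (prevalence = 100%) and its overall relative abundance exceeds 0.01%. The function name and the nested-dict input format are hypothetical.

```python
def core_species(table: dict[str, dict[str, int]], min_abundance: float = 1e-4) -> list[str]:
    """Return core species from table[sample][species] = read count.

    Assumed rule: present in all samples AND overall relative abundance
    strictly above `min_abundance` (1e-4 = 0.01%).
    """
    total = sum(sum(c.values()) for c in table.values())
    species = set().union(*(c.keys() for c in table.values()))
    core = []
    for sp in sorted(species):
        present_everywhere = all(c.get(sp, 0) > 0 for c in table.values())
        overall = sum(c.get(sp, 0) for c in table.values()) / total
        if present_everywhere and overall > min_abundance:
            core.append(sp)
    return core
```

With a strict 100% prevalence requirement, a single sample lacking a species is enough to exclude it, which helps explain why so few species qualify.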

Cheese (rind)

Core species

  • abundance > 0.01%
  • prevalence = 100%

1 core species 😞

  • 0 bacteria
  • 1 fungus: 44-50% of reads (rind/core)

Techno-level core

  • 8 bacteria, 3 fungi, mostly cheese starters and ripening cultures
  • 22 - 43% of reads

Cheese (core)

Core species

  • abundance > 0.01%
  • prevalence = 100%

1 core species 😞

  • 0 bacteria
  • 1 fungus: 44-50% of reads (rind/core)

Techno-level core

  • 8 bacteria, 3 fungi, mostly cheese starters and ripening cultures
  • 31 - 76% of reads

Network analyses

Network reconstruction

Ecological niches

  • aggregate cheese replicate samples at the production level (386 batches)
  • average Bray-Curtis distance on 16S and ITS2
  • build 5 habitats using hierarchical clustering
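
For reference, a short Python sketch of the Bray-Curtis dissimilarity and of its average over the two markers. The equal weighting of 16S and ITS2 is an assumption, the function names are hypothetical, and the hierarchical clustering step itself is not shown.

```python
def bray_curtis(x: dict[str, float], y: dict[str, float]) -> float:
    """Bray-Curtis dissimilarity: sum |x_i - y_i| / sum (x_i + y_i)."""
    taxa = set(x) | set(y)
    num = sum(abs(x.get(t, 0) - y.get(t, 0)) for t in taxa)
    den = sum(x.values()) + sum(y.values())
    return num / den if den else 0.0


def averaged_distance(a_16s, b_16s, a_its, b_its) -> float:
    """Mean of the 16S and ITS2 Bray-Curtis dissimilarities (assumed equal weights)."""
    return 0.5 * (bray_curtis(a_16s, b_16s) + bray_curtis(a_its, b_its))
```

The value ranges from 0 (identical profiles) to 1 (no taxon in common), so the averaged distance stays in the same range.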

Network reconstruction

  • aggregate ASV at the species level
  • keep species with prevalence > 10% (global) or > 20% (habitat)
  • infer a network (with PLNnetwork) with rind/core and cluster covariates

Module reconstruction

  • Identify modules using blockmodels

Microbial Network

  • 132 Nodes: 75 bacteria and 57 fungi
  • 5 modules of size 11 to 35
  • Modules made up of similar taxa

Guild content

Structuring factors

Large differences between 🥛 and 🧀

NMDS projections, PERMANOVA analyses

Bacteria (\(R^2=12.9\%\))

Fungi (\(R^2=8.8\%\))


Separate analyses for 🥛 and 🧀

PDO-dependent factors shape the milk and cheese microbiome

Cheese structuring factors (I)

Cheese structuring factors (II)

Main drivers of diversity

PDO (\(R^2 \simeq 60\%\))

PDO prescribed variables:

  • region
  • ripening time
  • topography, dairy species

Cheese production practices:

  • rind care practices
  • use of wooden boards for ripening
  • salting method
  • humidity
  • pH at the end of ripening

For some techno only

  • season
  • type of production (farmhouse vs corporate dairy)
  • milk treatment

Focus on PPS PDO

  • Cheeses within the PPS techno split by PDO
  • Within PDO, large effects of production unit and season.

Focus on PPNC PDO

  • Cheeses within the PPNC techno split by PDO and rind treatment
  • For PDO38, rind treatment explains 65% of the observed dispersion

Milk structuring factors

Taxa flux

Identifying flux

Cheese taxa also found in the milk

  • 42.2% of the bacterial species (346/820) are also found in the milk
  • 63.6% of the fungal species are also found in the milk

At the production level: 740 pairs 🥛 - 🧀

  • Over all productions
    • 147 shared bacterial ASVs
    • 178 shared fungal ASVs
  • Per production, on average,
    • 6.58 shared bacterial ASVs, 15% of the richness, 44% (rind) / 64% (core) of the reads
    • 16.8 shared fungal ASVs, 41% of the richness, 84% (rind) / 90% (core) of the reads
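
The per-production statistics above can be sketched in Python for a single milk - cheese pair. This is an illustration only: the function name is hypothetical, and the choice of the cheese sample as denominator for both the richness and the read fractions is an assumption.

```python
def sharing_stats(milk: dict[str, int], cheese: dict[str, int]) -> dict[str, float]:
    """Summarise the ASVs shared between one milk sample and one cheese sample.

    Inputs map ASV identifiers to read counts. Fractions are expressed
    relative to the cheese sample (assumed convention).
    """
    milk_asvs = {a for a, n in milk.items() if n > 0}
    cheese_asvs = {a for a, n in cheese.items() if n > 0}
    shared = milk_asvs & cheese_asvs
    cheese_reads = sum(cheese.values())
    return {
        "n_shared": len(shared),
        "richness_fraction": len(shared) / len(cheese_asvs) if cheese_asvs else 0.0,
        "read_fraction": sum(cheese[a] for a in shared) / cheese_reads if cheese_reads else 0.0,
    }
```

A few shared ASVs can carry most of the reads, which is why the richness and read fractions can differ so much.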

Identifying flux

Huge differences between techno

Bacteria
  • 15% of the richness
Fungi
  • 41% of the richness

About the shared ASVs

Some frequently shared ASVs not known as starters or ripeners

Sharedness modulated by technological factors

  • ASV128 detected exclusively in goat’s and cow’s milk and the corresponding cheeses
  • Probability of being shared varies widely across PDO

Conclusions

Summary

  • Diversity
    • Many more species (\(\sim 1700\) for bacteria, \(\sim 1150\) for fungi) than introduced into dairy products (\(95\) for bacteria, \(40\) for fungi)
    • Indigenous species contribute to the typicality of PDO cheese
    • No overall difference in diversity between bacteria and fungi, except in the rind
  • Core microbiota
    • A core milk microbiome with 12 species: 6 fungal and 6 bacterial
      • but marked differences in composition
    • No French cheese core microbiota (reduced to Geotrichum candidum)
      • but techno-specific core microbiota

Summary (II)

  • A Terroir effect on cheese
    • PDO is a strong structuring factor
      • If only by being confounded with other factors
    • Many factors (region, topography, season, practices, PDO know-how) are significant
  • Observed from the milk onwards
    • PDO second most structuring for milk after dairy species
    • Many factors (region, topography, animal breed and feed) are significant
    • Minor effect of season compared to udder hygiene and animal housing conditions

Summary (III)

  • Importance of the milk - cheese continuum
    • Milk is an important reservoir of fungal diversity (>40%)
    • The outcome of the transfer is strongly modulated by the techno
    • Most shared ASVs are not from starter cultures
  • With a few caveats
    • Low detection power in milk (low microbial load)
    • Limited resolution of ASVs
    • No information on cell viability

Thanks

SayFood
  • Françoise Irlinger
  • Eric Dugat-Bony
UMRF
  • Etienne Rifa
  • Sebastien Theil
  • Céline Delbès
MaIAGE
  • Olivier Rué
  • Valentin Loux
SPO
  • Cécile Neuvéglise
MICALIS
  • Pierre Renault
Genoscope
  • Corinne Cruaud
  • Valérie Barbe
  • Frederick Gavory
CNAOL
  • Ronan Lasbleiz
  • Céline Spelle
CNIEL
  • Frédéric Gaucheron


A work by Migale Bioinformatics Facility
Université Paris-Saclay, INRAE, MaIAGE, 78350, Jouy-en-Josas, France
Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, 78350, Jouy-en-Josas, France