Module 20
September 11, 2023
After this 4-day training, you will:
DAY 1
DAY 2
DAY 3
DAY 4
The perfect / ideal gene marker:
The genes that have been proposed for this task include those encoding:
Bacterial lineages vary in their genomic contents, which suggests that different genes might be needed to resolve the diversity within certain taxonomic groups.
Median number of 16S rRNA gene copies in 3,070 bacterial species, according to data reported in the rrnDB database (2018)
[B] The positions of sequence variation within 16S and 23S rRNA are shown along the gene organization of rrn operons. A total of 33 and 77 differences were identified in 16S rRNA and 23S rRNA, respectively.
[C] The number of bases that differ from the conserved sequence is shown for 16S and 23S rRNA for each rrn operon.
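The per-operon difference counts described in panel [C] can be sketched as follows. This is a minimal illustration, not the method used to produce the figure: it assumes the rrn copies are already aligned to the same length, takes the majority base at each column as the "conserved sequence", and uses made-up toy sequences and operon names (`rrnA` to `rrnD`).

```python
from collections import Counter


def consensus(aligned):
    """Majority base at each alignment column (ties go to the first base seen)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def differences_per_copy(copies):
    """Count, for each gene copy, the positions that differ from the consensus.

    copies: dict mapping a copy name to its aligned sequence (hypothetical input).
    """
    seqs = list(copies.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    cons = consensus(seqs)
    return {
        name: sum(a != b for a, b in zip(seq, cons))
        for name, seq in copies.items()
    }


# Toy data (assumption): four 16S copies from one genome, already aligned.
toy = {
    "rrnA": "ACGTACGTAC",
    "rrnB": "ACGTACGTAC",
    "rrnC": "ACGAACGTAC",
    "rrnD": "ACGTACGTTC",
}
diffs = differences_per_copy(toy)
print(diffs)
```

With real data, intragenomic differences like these mean that a single genome can yield several distinct 16S sequences.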
ITS: Internal Transcribed Spacer
Size polymorphism of ITS (from 361 to 1475 bases in UNITE 7.1)
Highly conserved regions neighboring ITS1 and ITS2
Lack of a generalist and abundant ITS databank (several small specialized databanks)
Multiple copies (14 to 1,400 copies; mean 113, median 80)
FROGS handles ITS very well [8]
Here, we showed that contaminant OTUs from extraction and amplification steps can represent more than half the total sequence yield in sequencing runs, and lead to unreliable results when characterizing tick microbial communities. We thus strongly advise the routine use of negative controls in tick microbiota studies, and more generally in studies involving low biomass samples.
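One simple way to use negative controls is sketched below. It is an illustrative rule only, not the procedure of the quoted study: an OTU is flagged when its read count in any negative control is at least `ratio` times its highest count in the real samples. The function names, the `ratio` parameter, the sample names and the counts are all assumptions made for the example.

```python
def flag_contaminants(counts, negative_controls, ratio=1.0):
    """Flag OTUs that are at least as abundant in negative controls as in samples.

    counts: {otu: {sample: reads}}; negative_controls: set of control sample names.
    """
    flagged = set()
    for otu, per_sample in counts.items():
        neg = max((n for s, n in per_sample.items() if s in negative_controls), default=0)
        real = max((n for s, n in per_sample.items() if s not in negative_controls), default=0)
        if neg > 0 and neg >= ratio * real:
            flagged.add(otu)
    return flagged


def contaminant_fraction(counts, flagged, negative_controls):
    """Fraction of reads in the real samples that belong to flagged OTUs."""
    total = sum(
        n for per in counts.values() for s, n in per.items() if s not in negative_controls
    )
    contam = sum(
        n for otu in flagged for s, n in counts[otu].items() if s not in negative_controls
    )
    return contam / total if total else 0.0


# Toy abundance table (assumption): two tick samples and one extraction blank.
counts = {
    "OTU_1": {"tick_1": 900, "tick_2": 700, "neg_extraction": 2},
    "OTU_2": {"tick_1": 50, "tick_2": 80, "neg_extraction": 400},
    "OTU_3": {"tick_1": 50, "tick_2": 20, "neg_extraction": 0},
}
negatives = {"neg_extraction"}

flagged = flag_contaminants(counts, negatives)
fraction = contaminant_fraction(counts, flagged, negatives)
print(flagged, round(fraction, 3))
```

A fixed threshold like this is crude; it only illustrates why sequencing the negative controls alongside the samples makes contaminant OTUs detectable at all.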
Compositions at the phylum level for Human gut and, using a range of different methods (separate subpanels within each group).
Quality parameters obtained with the seven bioinformatics pipelines. A) Recall rate (TP/(TP+FN)) reflects the capacity of the tools to detect expected species. B) Precision (TP/(TP+FP)) shows the fraction of relevant species among the retrieved species. C) Divergence rate is the Bray-Curtis distance between expected and observed species abundance. D) Percentage of perfectly reconstructed sequences is the fraction of predicted sequences with 100% identity with the expected ones.
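The four quality parameters defined in this caption can be computed as in the sketch below. It follows the formulas given above, with two simplifying assumptions: abundances are supplied as relative abundances per species, and "100% identity" is approximated by exact string equality with an expected sequence. The toy community and sequences are invented for illustration and are not results from the benchmark.

```python
def recall(expected, observed):
    """TP / (TP + FN): share of expected species that were detected."""
    tp = len(expected & observed)
    fn = len(expected - observed)
    return tp / (tp + fn) if tp + fn else 0.0


def precision(expected, observed):
    """TP / (TP + FP): share of detected species that were expected."""
    tp = len(expected & observed)
    fp = len(observed - expected)
    return tp / (tp + fp) if tp + fp else 0.0


def bray_curtis(expected_abund, observed_abund):
    """Bray-Curtis distance between two {species: abundance} profiles."""
    species = set(expected_abund) | set(observed_abund)
    num = sum(abs(expected_abund.get(s, 0) - observed_abund.get(s, 0)) for s in species)
    den = sum(expected_abund.get(s, 0) + observed_abund.get(s, 0) for s in species)
    return num / den if den else 0.0


def percent_perfect(predicted_seqs, expected_seqs):
    """Percentage of predicted sequences identical to an expected sequence."""
    if not predicted_seqs:
        return 0.0
    hits = sum(seq in expected_seqs for seq in predicted_seqs)
    return 100.0 * hits / len(predicted_seqs)


# Toy mock community (assumption): relative abundances per species.
expected = {"A": 0.5, "B": 0.3, "C": 0.2}
observed = {"A": 0.6, "B": 0.3, "D": 0.1}

rec = recall(set(expected), set(observed))
prec = precision(set(expected), set(observed))
div = bray_curtis(expected, observed)
perfect = percent_perfect(["ACGT", "ACGA", "TTGA"], {"ACGT", "TTGA", "GGCC"})
print(rec, prec, div, perfect)
```

In this toy case species C is missed (one false negative) and species D is spurious (one false positive), so recall and precision are both 2/3.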
Module 20 - Metabarcoding