bioinformatics, genomes, biology etc. "I don't mean to sound angry and cynical, but I am, so that's how it comes across"

The unbearable madness of microbiome

This is my attempt to collate the literature on how easy it is to introduce bias into microbiome studies.  I hope to crowd-source papers and add them below under each category.  PLEASE GET INVOLVED.  If this works well we can turn it into a comprehensive review and publish 🙂 Add new papers in the comments or Tweet me 🙂

Special mention to these blogs:

  1. Microbiome Digest’s page on Sample Storage
  2. Microbe.net’s Best practices for sample processing and storage prior to microbiome DNA analysis: freeze? buffer? process?
  3. The-Scientist.com’s Spoiler Alert


UPDATE 7th August 2016


For posterity, the original blog post is below, but I am now trying to manage this through a ZOTERO GROUP LIBRARY.  Please continue to contribute – join the group, get involved!  I think you can join the group here.  BTW, I am trying to avoid just listing software papers; I would prefer to focus on papers that specifically demonstrate sources of bias.

I won’t be updating the text below.


  • “We propose that best practice should include the use of a proofreading polymerase and a highly processive polymerase, that sequencing primers that overlap with amplification primers should not be used, that template concentration should be optimized, and that the number of PCR cycles should be minimized.” (doi:10.1038/nbt.3601)

Sample collection and sample storage

  • “Samples frozen with and without glycerol as cryoprotectant indicated a major loss of Bacteroidetes in unprotected samples” (24161897)
  • “No significant differences occurred in the culture-based analysis between the fresh, snap or -80°C frozen samples.” (25748176)
  • “In seven of nine cases, the Firmicutes to Bacteroidetes 16S rRNA gene ratio was significantly higher in fecal samples that had been frozen compared to identical samples that had not … The results demonstrate that storage conditions of fecal samples may adversely affect the determined Firmicutes to Bacteroidetes ratio, which is a frequently used biomarker in gut microbiology.” (22325006)
  • “Our results indicate that environmental factors and biases in molecular techniques likely confer greater amounts of variation to microbial communities than do differences in short-term storage conditions, including storage for up to 2 weeks at room temperature” (10.1111/j.1574-6968.2010.01965.x)
  • “The trichloroacetic acid preserved sample showed significant loss of protein band integrity on the SDS-PAGE gel. The RNAlater-preserved sample showed the highest number of protein identifications (103% relative to the control; 520 ± 31 identifications in RNAlater versus 504 ± 4 in the control), equivalent to the frozen control. Relative abundances of individual proteins in the RNAlater treatment were quite similar to that of the frozen control (average ratio of 1.01 ± 0.27 for the 50 most abundant proteins), while the SDS-extraction buffer, ethanol, and B-PER all showed significant decreases in both number of identifications and relative abundances of individual proteins.” (10.3389/fmicb.2011.00215)
  • “Optimized Cryopreservation of Mixed Microbial Communities for Conserved Functionality and Diversity” (10.1371/journal.pone.0099517)
  • “A previously known bias in FTA® cards that results in lower recovery of pure cultures of Gram-positive bacteria was also detected in mixed community samples. There appears to be a uniform bias across all five preservation methods against microorganisms with high G + C DNA. Overall, the liquid-based preservatives (DNAgard™, RNAlater®, and DESS) outperformed the card-based methods.” (22974342)
  • “Microbial composition of frozen and ethanol samples were most similar to fresh samples. FTA card and RNAlater-preserved samples had the least similar microbial composition and abundance compared to fresh samples.” (10.1016/j.mimet.2015.03.021)
  • “Bray-Curtis dissimilarity and (un)weighted UniFrac showed a significant higher distance between fecal swabs and -80°C versus the other methods and -80°C samples (p<0.009). The relative abundance of Ruminococcus and Enterobacteriaceae did not differ between the storage methods versus -80°C, but was higher in fecal swabs (p<0.05)” (10.1371/journal.pone.0126685)
  • “We experimentally determined that the bacterial taxa varied with room temperature storage beyond 15 minutes and beyond three days storage in a domestic frost-free freezer. While freeze thawing only had an effect on bacterial taxa abundance beyond four cycles, the use of samples stored in RNAlater should be avoided as overall DNA yields were reduced as well as the detection of bacterial taxa.” (10.1371/journal.pone.0134802)
  • “A key assumption in many studies is the stability of samples stored long term at −80 °C prior to extraction. After 2 years, we see relatively few changes: increased abundances of lactobacilli and bacilli and a reduction in the overall OTU count. Where samples cannot be frozen, we find that storing samples at room temperature does lead to significant changes in the microbial community after 2 days.” (10.1186/s40168-016-0186-x)
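
To make the Firmicutes to Bacteroidetes point concrete, here is a minimal Python sketch of how that ratio is computed from phylum-level read counts, and how losing Bacteroidetes during storage moves it even when nothing else changes. The counts are invented for illustration and do not come from any of the papers above; `fb_ratio` is a hypothetical helper, not part of any package.

```python
def fb_ratio(counts):
    """Return the Firmicutes:Bacteroidetes ratio from a dict of phylum -> read count."""
    firmicutes = counts.get("Firmicutes", 0)
    bacteroidetes = counts.get("Bacteroidetes", 0)
    if bacteroidetes == 0:
        raise ValueError("no Bacteroidetes reads; ratio is undefined")
    return firmicutes / bacteroidetes


# Invented counts: the same stool sample processed fresh, and after a
# storage step that (hypothetically) loses half of the Bacteroidetes reads.
fresh = {"Firmicutes": 5000, "Bacteroidetes": 4000, "Actinobacteria": 1000}
stored = {"Firmicutes": 5000, "Bacteroidetes": 2000, "Actinobacteria": 1000}

print(fb_ratio(fresh))   # 1.25
print(fb_ratio(stored))  # 2.5
```

In this toy case the "biomarker" doubles purely because of how the sample was handled, which is the kind of effect the storage papers above are warning about.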

DNA extraction

  • “Caution should be paid when the intention is to pool and analyse samples or data from studies which have used different DNA extraction methods.” (27456340)
  • “Samples clustered according to the type of extracted DNA due to considerable differences between iDNA and eDNA bacterial profiles, while storage temperature and cryoprotectants additives had little effect on sample clustering” (24125910)
  • “Bifidobacteria were only well represented among amplified 16S rRNA gene sequences when mechanical disruption (bead-beating) procedures for DNA extraction were employed together with optimised “universal” PCR primers” (26120470)
  • The Qiagen DNA stool kit is biased (misses bifidobacteria) (26120470)
  • “Bead-beating has a major impact on the determined composition of the human stool microbiota.  Different bead-beating instruments from the same producer gave a 3-fold difference in the Bacteroidetes to Firmicutes ratio” (10.1016/j.mimet.2016.08.005)
  • “We observed that using different DNA extraction kits can produce dramatically different results but bias is introduced regardless of the choice of kit.” (10.1186/s12866-015-0351-6)

Sequencing strategy

  • “Bifidobacteria were only well represented among amplified 16S rRNA gene sequences … with optimised “universal” PCR primers. These primers incorporate degenerate bases at positions where mismatches to bifidobacteria and other bacterial taxa occur” (26120470)
  • Anything other than 2x250bp sequencing of the V4 region (approx. 250bp in length) inflates the number of OTUs (23793624)
  • “This study demonstrates the potential for differential bias in bacterial community profiles resulting from the choice of sequencing platform alone.” (10.1128/AEM.02206-14)
  • “The effects of DNA extraction and PCR amplification for our protocols were much larger than those due to sequencing and classification” (10.1186/s12866-015-0351-6)
  • “Nested PCR introduced bias in estimated diversity and community structure. The bias was more significant for communities with relatively higher diversity and when more cycles were applied in the first round of PCR” (10.1371/journal.pone.0132253)
  • “pyrosequencing errors can lead to artificial inflation of diversity estimates” (19725865)
  • “Our findings suggest that when alternative sequencing approaches are used for microbial molecular profiling they can perform with good reproducibility, but care should be taken when comparing small differences between distinct methods” (25421243)
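
One way to think about the 2x250bp V4 point above is in terms of how much of the amplicon is read twice. The sketch below is only back-of-the-envelope arithmetic under stated assumptions: it ignores primers, quality trimming and length variation, and `overlap_bp` is a hypothetical helper. The link between more overlap and fewer spurious OTUs is my reading of the argument, not something the code demonstrates.

```python
def overlap_bp(read_len, amplicon_len):
    """Bases covered by both reads of a pair (simplified: no primers, no trimming)."""
    return max(0, min(amplicon_len, 2 * read_len - amplicon_len))


# V4 taken as roughly 250bp, as in the bullet above.
for read_len in (250, 150, 100):
    print(read_len, overlap_bp(read_len, 250))
```

With 2x250bp reads every base of a 250bp amplicon is covered by both reads, with 2x150bp only 50 bases are, and with 2x100bp the reads do not meet at all, so there is progressively less opportunity to use one read to correct errors in the other.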

Data analysis strategy


Bioinformatics / database issues


Contamination in kits

  • “Reagent and laboratory contamination can critically impact sequence-based microbiome analyses” (25387460)
  • “Due to contamination of DNA extraction reagents, false-positive results can occur when applying broad-range real-time PCR based on bacterial 16S rDNA” (15722157)
  • “Relatively high initial densities of planktonic bacteria (10^2 to 10^3 bacteria per ml) were seen within [operating ultrapure water treatment systems intended for laboratory use]” (8517737)
  • “Sensitive, real-time PCR detects low-levels of contamination by Legionella pneumophila in commercial reagents” (16632318)
  • “Taq polymerase contains bacterial DNA of unknown origin” (2087233)
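
As a rough illustration of why sequencing a negative control ("blank") alongside real samples is useful, here is a minimal Python sketch that flags taxa which are relatively more abundant in the blank than in the sample. This is an assumption-laden toy, not the method used in any of the papers above: the counts are invented, and `flag_possible_contaminants` and its `min_ratio` parameter are hypothetical.

```python
def flag_possible_contaminants(sample, blank, min_ratio=1.0):
    """Return taxa whose relative abundance in the blank is at least
    min_ratio times their relative abundance in the sample."""
    sample_total = sum(sample.values())
    blank_total = sum(blank.values())
    if sample_total == 0 or blank_total == 0:
        raise ValueError("both sample and blank need at least one read")
    flagged = []
    for taxon, blank_count in blank.items():
        blank_frac = blank_count / blank_total
        sample_frac = sample.get(taxon, 0) / sample_total
        if blank_frac >= min_ratio * sample_frac:
            flagged.append(taxon)
    return sorted(flagged)


# Invented genus-level read counts for one stool sample and one extraction blank.
sample = {"Bacteroides": 6000, "Faecalibacterium": 3500, "Ralstonia": 500}
blank = {"Ralstonia": 80, "Bradyrhizobium": 15, "Bacteroides": 5}

print(flag_possible_contaminants(sample, blank))
# ['Bradyrhizobium', 'Ralstonia']
```

Bacteroides is not flagged here because the few reads in the blank are tiny relative to its share of the sample. A real analysis would need replicate blanks and some thought about low-biomass samples, where contaminant reads can dominate.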


Commercial stuff








  1. You may have some of these already, but the following may be useful (PubMed IDs given).

    It may also be worth considering statistical analysis as a source of bias, e.g. the number of feeding and clinical studies with no investigation of confounding/modifier effects. I have seen in one study that, if you consider all the evidence, diet drives the difference between subjects, not disease state. Can’t discuss in further detail here as it is unpublished.

  2. Hi Mick – I’ve kept a list for a while now. Not comprehensive (especially missing newer stuff, I don’t get to read as much these days) but here it is anyway:

    – Jonathan (@Klassenlab)

    Acinas, S.G., Sarma-Rupavtarm, R., Klepac-Ceraj, V. & Polz, M.F. (2005). PCR-Induced Sequence Artifacts and Bias: Insights from Comparison of Two 16S rRNA Clone Libraries Constructed from the Same Sample. Applied and Environmental Microbiology. 71 (12): 8966–8969.
    Adams, R.I., Amend, A.S., Taylor, J.W. & Bruns, T.D. (2013). A Unique Signal Distorts the Perception of Species Richness and Composition in High-Throughput Sequencing Surveys of Microbial Communities: A Case Study of Fungi in Indoor Dust. Microbial Ecology. 66 (4): 735–741.
    Ahn, J.-H., Kim, B.-Y., Song, J. & Weon, H.-Y. (2012). Effects of PCR cycle number and DNA polymerase type on the 16S rRNA gene pyrosequencing analysis of bacterial communities. Journal of microbiology (Seoul, Korea). 50 (6): 1071–4.
    Al-soud, W.A. & Radstrom, P. (1998). Capacity of Nine Thermostable DNA Polymerases To Mediate DNA Amplification in the Presence of PCR-Inhibiting Samples. Applied and Environmental Microbiology. 64 (10): 3748–3753.
    Arbeli, Z. & Fuentes, C.L. (2007). Improved purification and PCR amplification of DNA from environmental samples. FEMS microbiology letters. 272 (2): 269–75.
    Arezi, B., Xing, W., Sorge, J. a & Hogrefe, H.H. (2003). Amplification efficiency of thermostable DNA polymerases. Analytical Biochemistry. 321 (2): 226–235.
    Ashelford, K.E., Chuzhanova, N. a., Fry, J.C., Jones, A.J. & Weightman, A.J. (2006). New screening software shows that most recent large 16S rRNA gene clone libraries contain chimeras. Applied and Environmental Microbiology. 72 (9): 5734–5741.
    Ashelford, K.E., Chuzhanova, N. a., Fry, J.C., Jones, A.J. & Weightman, A.J. (2005). At least 1 in 20 16S rRNA sequence records currently held in public repositories is estimated to contain substantial anomalies. Applied and Environmental Microbiology. 71 (12): 7724–7736.
    Ben-Dov, E., Shapiro, O.H. & Kushmaro, A. (2012). ‘Next-base’ effect on PCR amplification. Environmental microbiology reports. 4 (2): 183–8.
    Berry, D., Ben Mahfoudh, K., Wagner, M. & Loy, A. (2011). Barcoded primers used in multiplex amplicon pyrosequencing bias amplification. Applied and Environmental Microbiology. 77 (21): 7846–9.
    Blazewicz, S.J., Barnard, R.L., Daly, R. a & Firestone, M.K. (2013). Evaluating rRNA as an indicator of microbial activity in environmental communities: limitations and uses. The ISME Journal. 7 (11): 2061–8.
    Bonnet, R., Suau, A., Dore, J., Gibson, G.R. & Collins, M.D. (2002). Differences in rDNA libraries of faecal bacteria derived from 10- and 25-cycle PCRs. International Journal of Systematic and Evolutionary Microbiology. 52: 757–763.
    Brakenhoff, R.H., Schoenmakers, J.G.G. & Lubsen, N.H. (1991). Chimeric cDNA clones: a novel PCR artifact. Nucleic acids research. 19 (8): 1949.
    Bru, D., Martin-Laurent, F. & Philippot, L. (2008). Quantification of the detrimental effect of a single primer-template mismatch by real-time PCR using the 16S rRNA gene as an example. Applied and Environmental Microbiology. 74 (5): 1660–3.
    Champlot, S., Berthelot, C., Pruvost, M., Bennett, E.A., Grange, T. & Geigl, E.-M. (2010). An efficient multistrategy DNA decontamination procedure of PCR reagents for hypersensitive PCR applications. PLOS One. 5 (9).
    Chandler, D.P., Fredrickson, J.K. & Brockman, F.J. (1997). Effect of PCR template concentration on the composition and distribution of total community 16S rDNA clone libraries. Molecular Ecology. 6: 475–482.
    Chang, S.-S., Hsu, H.-L., Cheng, J.-C. & Tseng, C.-P. (2011). An efficient strategy for broad-range detection of low abundance bacteria without DNA decontamination of PCR reagents. PLOS One. 6 (5): e20303.
    Chou, Q. (1992). Minimizing deletion mutagenesis artifact during Taq DNA polymerase PCR by E. coli SSB. Nucleic acids research. 20 (16): 4371.
    Cline, J., Braman, J.C. & Hogrefe, H.H. (1996). PCR fidelity of Pfu DNA polymerase and other thermostable DNA polymerases. Nucleic acids research. 24 (18): 3546–3551.
    Don, R.H., Cox, P.T., Wainwright, B.J., Baker, K. & Mattick, J.S. (1991). ‘Touchdown’ PCR to circumvent spurious priming during gene amplification. Nucleic acids research. 19 (14): 4008.
    Gál, J., Schnell, R. & Kálmán, M. (2000). Polymerase dependence of autosticky polymerase chain reaction. Analytical biochemistry. 282 (1): 156–8.
    Gaspar, J.M. & Thomas, W.K. (2013). Assessing the Consequences of Denoising Marker-Based Metagenomic Data. PLOS One. 8 (3).
    Gonzalez, J.M., Portillo, M.C., Belda-Ferre, P. & Mira, A. (2012). Amplification by PCR artificially reduces the proportion of the rare biosphere in microbial communities. PLOS One. 7 (1): e29973.
    Hansen, M.C., Tolker-Nielsen, T., Givskov, M. & Molin, S. (1998). Biased 16S rDNA PCR amplification caused by interference from DNA flanking the template region. FEMS Microbiology Ecology. 26: 141–149.
    Hoshino, T. & Inagaki, F. (2012). Molecular quantification of environmental DNA using microfluidics and digital PCR. Systematic and Applied Microbiology. 35 (6): 390–395.
    Huse, S.M., Huber, J. a, Morrison, H.G., Sogin, M.L. & Welch, D.M. (2007). Accuracy and quality of massively parallel DNA pyrosequencing. Genome biology. 8 (7): R143.
    Ishii, K. & Fukui, M. (2001). Optimization of Annealing Temperature To Reduce Bias Caused by a Primer Mismatch in Multitemplate PCR. Applied and Environmental Microbiology. 67 (8): 3753–3755.
    Judo, M.S.B., Wedel, A.B. & Wilson, C. (1998). Stimulation and suppression of PCR-mediated recombination. Nucleic Acids Research. 26 (7): 1819–1825.
    Kanagawa, T. (2003). Bias and Artifacts in Multitemplate Polymerase Chain Reactions (PCR). Journal of Bioscience and Bioengineering. 96 (4): 317–323.
    Kim, Y.H., Yang, I., Bae, Y.-S. & Park, S.-R. (2008). Performance evaluation of thermal cyclers for PCR in a rapid cycling condition. BioTechniques. 44 (4): 495–6, 498, 500 passim.
    Kitchin, P.A., Szotyori, Z., Fromholc, C. & Almond, N. (1990). Avoidance of false positives. Nature.
    Kopczynski, E.D., Bateson, M.M. & Ward, D.M. (1994). Recognition of chimeric small-subunit ribosomal DNAs composed of genes from uncultivated microorganisms. Applied and Environmental Microbiology. 60 (2): 746–748.
    Kreader, C.A. (1996). Relief of Amplification Inhibition in PCR with Bovine Serum Albumin or T4 Gene 32 Protein. Applied and Environmental Microbiology. 62 (3): 1102–1106.
    Kurata, S., Kanagawa, T., Magariyama, Y., Takatsu, K., Yamada, K., Yokomaku, T. & Kamagata, Y. (2004). Reevaluation and Reduction of a PCR Bias Caused by Reannealing of Templates. Applied and Environmental Microbiology. 70 (12): 7545–7549.
    Kwok, S. & Higuchi, R. (1989). Avoiding false positives with PCR. Nature.
    Liesack, W., Weyland, H. & Stackebrandt, E. (1991). Potential Risks of Gene Amplification by PCR as Determined by 16S rDNA Analysis of a Mixed-Culture of Strict Barophilic Bacteria. Microbial Ecology. 21: 191–198.
    Liu, Y., Döring, J. & Hurek, T. (2012). Bias in topoisomerase (TOPO)-cloning of multitemplate PCR products using locked nucleic acid (LNA)-substituted primers. Journal of microbiological methods. 91 (3): 483–6.
    Mathieu-Daudé, F., Welsh, J., Vogt, T. & Mcclelland, M. (1996). DNA rehybridization during PCR : the ‘Cot effect’ and its consequences. Nucleic Acids Research. 24 (11): 2080–2086.
    Mennerat, A. & Sheldon, B.C. (2014). How to Deal with PCR Contamination in Molecular Microbial Ecology. Microbial Ecology.
    Meyerhans, A., Vartanian, J.-P. & Wain-Hobson, S. (1990). DNA recombination during PCR. Nucleic Acids Research. 18 (7): 1687–1691.
    Osborne, C. a, Galic, M., Sangwan, P. & Janssen, P.H. (2005). PCR-generated artefact from 16S rRNA gene-specific primers. FEMS microbiology letters. 248 (2): 183–7.
    Pan, Y., Bodrossy, L., Frenzel, P., Hestnes, A.-G., Krause, S., Lüke, C., Meima-Franke, M., Siljanen, H., Svenning, M.M. & Bodelier, P.L.E. (2010). Impacts of inter- and intralaboratory variations on the reproducibility of microbial community analyses. Applied and Environmental Microbiology. 76 (22): 7451–8.
    Pinto, A.J. & Raskin, L. (2012). PCR biases distort bacterial and archaeal community structure in pyrosequencing datasets. PLOS One. 7 (8): e43093.
    Polz, M.F. & Cavanaugh, C.M. (1998). Bias in Template-to-Product Ratios in Multitemplate PCR. Applied and Environmental Microbiology. 64 (10): 3724–3730.
    Porter, T.M. & Golding, G.B. (2012). Factors that affect large subunit ribosomal DNA amplicon sequencing studies of fungal communities: classification method, primer choice, and error. PLOS One. 7 (4): e35749.
    Qiu, X., Wu, L., Huang, H., McDonel, P.E., Palumbo, A. V, Tiedje, J.M. & Zhou, J. (2001). Evaluation of PCR-Generated Chimeras, Mutations, and Heteroduplexes with 16S rRNA Gene-Based Cloning. Applied and Environmental Microbiology. 67 (2): 880–887.
    Rådström, P., Knutsson, R., Wolffs, P., Lövenklev, M. & Löfström, C. (2004). Pre-PCR Processing. Molecular Biotechnology. 26: 133–146.
    Ralser, M., Querfurth, R., Warnatz, H.-J., Lehrach, H., Yaspo, M.-L. & Krobitsch, S. (2006). An efficient and economic enhancer mix for PCR. Biochemical and biophysical research communications. 347 (3): 747–51.
    Reysenbach, A., Giver, L.J., Wickham, G.S. & Pace, N.R. (1992). Differential Amplification of rRNA Genes by Polymerase Chain Reaction. Applied and Environmental Microbiology. 58 (10): 3417–3418.
    Rochelle, P.A., Cragg, B.A., Fry, J.C., Parkes, R.J. & Weightman, A.J. (1994). Effect of sample handling on estimation of bacterial diversity in marine sediments by 16S rRNA gene sequence analysis. FEMS Microbiology Ecology. 15: 215–225.
    Rock, C., Alum, A. & Abbaszadegan, M. (2010). PCR inhibitor levels in concentrates of biosolid samples predicted by a new method based on excitation-emission matrix spectroscopy. Applied and Environmental Microbiology. 76 (24): 8102–9.
    Salipante, S.J., Kawashima, T., Rosenthal, C., Hoogestraat, D.R., Cummings, L. a., Sengupta, D.J., Harkins, T.T., Cookson, B.T. & Hoffman, N.G. (2014). Performance Comparison of Illumina and Ion Torrent Next-Generation Sequencing Platforms for 16S rRNA-Based Bacterial Community Profiling. Applied and Environmental Microbiology. 80 (24): 7583–7591.
    Salter, S.J., Cox, M.J., Turek, E.M., Calus, S.T., Cookson, W.O., Moffatt, M.F., Turner, P., Parkhill, J., Loman, N.J. & Walker, A.W. (2014). Reagent contamination can critically impact sequence-based microbiome analyses. BMC Biology. 12: 87.
    Sarkar, G. & Sommer, S.S. (1990). Shedding light on PCR contamination. Nature.
    Schneider, S., Enkerli, J. & Widmer, F. (2009). A generally applicable assay for the quantification of inhibitory effects on PCR. Journal of microbiological methods. 78 (3): 351–3.
    Schrader, C., Schielke, a, Ellerbroek, L. & Johne, R. (2012). PCR inhibitors – occurrence, properties and removal. Journal of applied microbiology. 113 (5): 1014–26.
    Schwarz, K., Hansen-Hagge, T. & Bartram, C. (1990). Improved yields of long PCR products using gene 32 protein. Nucleic Acids Research. 18 (4): 1079.
    Sergeant, M.J., Constantinidou, C., Cogan, T., Penn, C.W. & Pallen, M.J. (2012). High-throughput sequencing of 16S rRNA gene amplicons: effects of extraction procedure, primer length and annealing temperature. PLOS One. 7 (5): e38094.
    Shuldiner, A.R., Nirula, A. & Roth, J. (1989). Hybrid DNA artifact from PCR of closely related target sequences. Nucleic Acids Research. 17 (11): 4409.
    Silkie, S.S., Tolcher, M.P. & Nelson, K.L. (2008). Reagent decontamination to eliminate false-positives in Escherichia coli qPCR. Journal of microbiological methods. 72 (3): 275–82.
    Sipos, R., Székely, A.J., Palatinszky, M., Révész, S., Márialigeti, K. & Nikolausz, M. (2007). Effect of primer mismatch, annealing temperature and PCR cycle number on 16S rRNA gene-targetting bacterial community analysis. FEMS Microbiology Ecology. 60 (2): 341–50.
    Sipos, R., Szekely, A., Revesz, S. & Marialigeti, K. (2010). Addressing PCR biases in environmental microbiology studies S. P. Cummings (ed.). Methods in Molecular Biology. 599: 37–58.
    Speksnijder, A.G.C.L., Kowalchuk, G.A., de Jong, S., Kline, E., Stephen, J.R. & Laanbroek, H.J. (2001). Microvariation Artifacts Introduced by PCR and Cloning of Closely Related 16S rRNA Gene Sequences. Applied and Environmental Microbiology. 67 (1): 469–472.
    Stevens, J.L., Jackson, R.L. & Olson, J.B. (2013). Slowing PCR ramp speed reduces chimera formation from environmental samples. Journal of Microbiological Methods. 93 (3): 203–205.
    Süss, B., Flekna, G., Wagner, M. & Hein, I. (2009). Studying the effect of single mismatches in primer and probe binding regions on amplification curves and quantification in real-time PCR. Journal of microbiological methods. 76 (3): 316–9.
    Suzuki, M.T. & Giovannoni, S.J. (1996). Bias Caused by Template Annealing in the Amplification of Mixtures of 16S rRNA Genes by PCR. Applied and Environmental Microbiology. 62 (2): 625–630.
    Suzuki, M., Rappé, M.S. & Giovannoni, S.J. (1998). Kinetic bias in estimates of coastal picoplankton community structure obtained by measurements of small-subunit rRNA gene PCR amplicon length heterogeneity. Applied and Environmental Microbiology. 64 (11): 4522–4529.
    Taylor, D.L., Herriott, I.C., Long, J. & O’Neill, K. (2007). TOPO TA is A-OK: a test of phylogenetic bias in fungal environmental clone library construction. Environmental microbiology. 9 (5): 1329–34.
    Thompson, J.R., Marcelino, L.A. & Polz, M.F. (2002). Heteroduplexes in mixed-template amplifications: formation, consequence and elimination by ‘reconditioning PCR’. Nucleic Acids Research. 30 (9): 2083–2088.
    v. Wintzingerode, F., Go, U.B. & Stackebrandt, E. (1997). Determination of microbial diversity in environmental samples: pitfalls of PCR-based rRNA analysis. FEMS Microbiology Reviews. 21: 213–229.
    van Doorn, R., Klerks, M.M., van Gent-Pelzer, M.P.E., Speksnijder, a G.C.L., Kowalchuk, G. a & Schoen, C.D. (2009). Accurate quantification of microorganisms in PCR-inhibiting environmental DNA extracts by a novel internal amplification control approach using Biotrove OpenArrays. Applied and Environmental Microbiology. 75 (22): 7253–60.
    Wang, G.C. & Wang, Y. (1997). Frequency of Formation of Chimeric Molecules as a Consequence of PCR Coamplification of 16S rRNA Genes from Mixed Bacterial Genomes. Applied and Environmental Microbiology. 63 (12): 4645–4650.
    Weiss, S., Amir, A., Hyde, E.R., Metcalf, J.L., Song, S. & Knight, R. (2014). Tracking down the sources of experimental contamination in microbiome studies. Genome Biology. 15 (12): 564.
    Wilson, I.G. (1997). Inhibition and Facilitation of Nucleic Acid Amplification. Applied and Environmental Microbiology. 63 (10): 3741–3751.
    Wright, E.S., Yilmaz, L.S., Ram, S., Gasser, J.M., Harrington, G.W. & Noguera, D.R. (2014). Exploiting extension bias in polymerase chain reaction to improve primer specificity in ensembles of nearly identical DNA templates. Environmental Microbiology. 16 (5): 1354–1365.
    Wu, J.-Y., Jiang, X.-T., Jiang, Y.-X., Lu, S.-Y., Zou, F. & Zhou, H.-W. (2010). Effects of polymerase, template dilution and cycle number on PCR based 16 S rRNA diversity analysis using the deep sequencing method. BMC microbiology. 10 (1): 255.
    Zhou, J., Wu, L., Deng, Y., Zhi, X., Jiang, Y.-H., Tu, Q., Xie, J., Van Nostrand, J.D., He, Z. & Yang, Y. (2011). Reproducibility and quantitation of amplicon sequencing-based detection. The ISME Journal. 5 (8): 1303–13.

  3. Hi Mick,

    This is a great idea! One of my many bug-bears is the lack of understanding about the influence of batch effects in microbiome studies. Another one is the number of studies I see that fail to perform replication! Totally agree with Lesley that diet is an overlooked component in many studies of disease state.

    Here are a couple of recent papers I have found.

    Goodrich et al (2014) Conducting a microbiome study. Cell 158(2):250-62
    This paper describes many of the factors that can confound human/animal microbiome studies. No mention of batch effects though.

    Song et al (2016) Preservation Methods Differ in Fecal Microbiome Stability, Affecting Suitability for Field Studies. mSystems 10.1128/mSystems.00021-16
    This study tests FTA cards, RNAlater, OMNIgene Gut, 70% ethanol and 95% ethanol. 70% ethanol and RNAlater performed poorly. FTA cards seem to show a slightly increased taxonomic diversity. They recommend 95% ethanol, OMNIgene Gut or FTA cards for storage out to 8 weeks. Interesting that Rob Knight, the senior author on this paper, is a co-author on the Hale et al (2015) paper (www.sciencedirect.com/science/article/pii/S0167701215001104), which concluded that FTA cards have the least similar microbial community profiles when compared to fresh samples, performing similarly to RNAlater. I’m not sure what to make of this! I’ll be testing a few things over the coming weeks.

    On keeping the number of PCR cycles down – this is generally accepted by most microbiome folks. I don’t know why the Earth Microbiome Project protocol recommends the use of 35 cycles! http://www.earthmicrobiome.org/emp-standard-protocols/16s/

    I’m part of a team that is going to apply for funding for a couple of large studies (>1000 people). I would like to know what others recommend in terms of low-cost and effective solutions for sample collection, transport and storage for DNA-based studies. Keep in mind that I’m in Australia – our cities are spread out and things heat up in summer, so anything shipped at ambient temperature is likely to take time to arrive and experience 35+ degrees!

    Finally, I’m keen to help out with this in any way I can. My background is in molecular biology so I consider the wet lab side of microbiome science to be vitally important in the progression of this fast moving field. I’m on the organising committee for an Australian microbiome symposium for EMCRs in December. This discussion would make a great topic for my talk – it’s an effective way to get the message out there and make people who are starting out in their careers consider all of the steps and objectively determine how what they are doing will influence the results they observe.

    Carly (@MicrobialMe)

  4. Mention of batch effect in https://www.ncbi.nlm.nih.gov/m/pubmed/22797518/ buried in the supplementary material, but it doesn’t look like they took it into account for analyses.

  5. Hey Mick–

    Here are a few more references from my list, many of which are pretty new. Happy to help with writing if you spin this into a review.

    pubmed ID – title

    27454739 – Systematic improvement of amplicon marker gene methods for increased accuracy in microbiome studies.
    27391011 – GUTSS: An Alignment-Free Sequence Comparison Method for Use in Human Intestinal Microbiome and Fecal Microbiota Transplantation Analysis.
    27342980 – 16S rRNA gene sequencing of mock microbial populations- impact of DNA extraction method, primer choice and sequencing platform.
    27255739 – Challenges for case-control studies with microbiome data.
    27255738 – Compositional data analysis of the microbiome: fundamentals, tools, and challenges.
    27242673 – Sample Processing Impacts the Viability and Cultivability of the Sponge Microbiome.
    27184874 – Suddenly everyone is a microbiota specialist!
    27121074 – Mechanistic and Technical Challenges in Studying the Human Microbiome and Cancer Epidemiology.
    27113916 – Biological causal links on physiological and evolutionary time scales.
    26984526 – RiboFR-Seq: a novel approach to linking 16S rRNA amplicon profiles to metagenomes.
    26866392 – The Effects of Bowel Preparation on Microbiota-Related Metrics Differ in Health and in Inflammatory Bowel Disease and for the Mucosal and Luminal Microbiota Compartments.
    26819854 – The impact of freeze-drying infant fecal samples on measures of their bacterial community profiles and milk-derived oligosaccharide content.
    26572876 – Sample storage conditions significantly influence faecal microbiome profiles.
    26563586 – Intrinsic challenges in ancient microbiome reconstruction using 16S rRNA gene amplification.
    26148172 – Assessment and Selection of Competing Models for Zero-Inflated Microbiome Data.
    26437933 – An accurate and efficient experimental approach for characterization of the complex oral microbiota.
    26512100 – Library preparation methodology can influence genomic and functional predictions in human microbiome research.
    26140923 – Stool metatranscriptomics: A technical guideline for mRNA stabilisation and isolation.
    26056565 – Strong spurious transcription likely contributes to DNA insert bias in typical metagenomic clone libraries.
    26024217 – The effect of sampling and storage on the fecal microbiota composition in healthy and diseased subjects.
    25880246 – The truth about metagenomics: quantifying and counteracting bias in 16S rRNA studies.
    25853934 – Average genome size estimation improves comparative metagenomics and sheds light on the functional ecology of the human microbiome.
    25741335 – Evaluating variation in human gut microbiota profiles due to DNA extraction method and inter-subject differences.
    25680374 – The most widespread problems in the function-based microbial metagenomics.
    25525895 – Inconsistent Denoising and Clustering Algorithms for Amplicon Sequence Data.
    24923665 – The bias associated with amplicon sequencing does not affect the quantitative assessment of bacterial community dynamics.
    24884524 – Processing faecal samples: a step forward for standards in microbial community analysis.
    24778776 – Influence of DNA extraction on oral microbial profiles obtained via 16S rRNA gene sequencing.
    24722376 – Development of a novel long-range 16S rRNA universal primer set for metagenomic analysis of gastrointestinal microbiota in newborn infants.
    24708850 – CopyRighter: a rapid tool for improving the accuracy of microbial community profiles through lineage-specific gene copy number correction.
    24708091 – Exome capture from saliva produces high quality genomic and metagenomic data.
    24475755 – Caught in the middle with multiple displacement amplification: the myth of pooling for avoiding multiple displacement amplification bias in a metagenome.
    24260553 – The approach to sample acquisition and its impact on the derived human fecal microbiome and VOC metabolome.


  6. biomickwatson

    7th August 2016 at 8:49 pm

    Thanks this is awesome!

    Any chance at all you can send me this as BibTex? Or list of pubmed IDs?

  7. biomickwatson

    7th August 2016 at 9:08 pm

    Thank you!

  8. biomickwatson

    7th August 2016 at 9:09 pm

    Thanks Carly 🙂

  9. biomickwatson

    7th August 2016 at 9:09 pm

    Awesome Marc, thank you!

  10. Emailed you .bib file.


  11. biomickwatson

    7th August 2016 at 9:24 pm

    Awesome thank you!

  12. Mick, I’m not sure if you have this one yet. It’s marine, but it shows a source of bias if you’re going for more general microbial community studies.


  13. 19759898 – High prevalence of Methanobrevibacter smithii and Methanosphaera stadtmanae detected in the human gut using an improved DNA detection protocol. (underestimation of Archaea)

  14. Hi Mick,

    We’ve just published a review that covers one critical aspect that has been overlooked by many researchers, namely, how to properly analyze 16S rRNA microbial data obtained through next-generation sequencing (NGS). We discuss the general pipeline used in all marker gene analyses, going from DNA extraction through to clean, ready-to-analyze data, including issues concerning multitemplate PCR, the selection of hypervariable 16S rRNA regions and primers, amplicon sequencing on NGS platforms, the removal of dubious and chimeric sequences, the clustering and classification of operational taxonomic units, and correction for 16S copy number.

    de la Cuesta-Zuluaga J and Escobar JS (2016) Considerations For Optimizing Microbiome Analysis Using a Marker Gene. Front. Nutr. 3:26. doi: 10.3389/fnut.2016.00026


© 2018 Opiniomics