bioinformatics, genomes, biology etc. "I don't mean to sound angry and cynical, but I am, so that's how it comes across"

On Mars, Venus, Monkeys and Microbiomes

You would have to be dead not to realise that the major story today is the contamination paper, Salter et al, available here. Read more about it here and here, including quotes from me. I’ve been banging on about this paper for a while now, and arguably I think it’s more important than the authors themselves do!

As I said in Ed Yong’s Nat Geo piece, Salter et al isn’t just about contamination in microbiome studies, it is about how we behave as scientists.  Did we learn nothing from the XMRV fiasco?  Apparently not, because we are still finding probable contaminants and insisting they’re implicated in varied diseases such as HIV, cancer, sleep apnoea and stupidity.

My point is this: in any experiment (not just microbiome), and especially experiments involving deep sequencing, if you find something incredible, then it probably is exactly that (here is the definition of incredible, by the way): “unconvincing; far-fetched; hard to believe; scarcely credible”, etc.

Yesterday, I presented at NGS Sheffield, and this is what I said about contamination:

If you travel to Mars, and upon your return, feel ill; you subsequently undergo a sequencing test that suggests you are infected with a virus from Venus; the conclusion should not be “Venusian virus found on Mars!”; rather, the question should be “have any of my lab been to Venus?” and  “Could any of my reagents include anything Venusian?”

In other words, when you find something that flies in the face of all previous evidence, or which defies belief, then the onus is on you to prove it is not a false positive.  Sometimes I think researchers are too quick to pick up the phone to Nature and Science – because incredible results get high impact papers, and to hell with the idea that it might not actually be true.

To torture another metaphor, the infinite monkey theorem states that an infinite number of monkeys hitting keys at random on a keyboard will eventually type the complete works of Shakespeare.  Well, with current high-throughput sequencing technologies, we are approaching the equivalent of infinite monkeys, which means occasionally you’re going to find something that looks like Shakespeare.  It isn’t though, and we all need to be big enough to admit that our perfect results might turn out to be complete crap.
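To put a rough number on the monkeys: if each read independently has some tiny chance p of being a spurious hit (a contaminant, a misassignment, whatever), then the chance of seeing at least one such hit in n reads is 1 - (1 - p)^n. Below is a minimal sketch of that arithmetic in Python. The function name, the per-read rate and the independence assumption are all mine, purely for illustration; real reads are not independent and real contamination rates are not known in advance.

```python
import math


def prob_at_least_one_false_hit(per_read_rate: float, n_reads: int) -> float:
    """Chance of at least one spurious hit among n_reads reads.

    Assumes (hypothetically) that every read is an independent trial
    with the same false-hit probability per_read_rate.
    """
    if not 0.0 <= per_read_rate <= 1.0:
        raise ValueError("per_read_rate must be between 0 and 1")
    if n_reads < 0:
        raise ValueError("n_reads must be non-negative")
    if per_read_rate == 1.0:
        # log1p(-1) is undefined, so handle certainty separately
        return 1.0 if n_reads > 0 else 0.0
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p
    return -math.expm1(n_reads * math.log1p(-per_read_rate))


if __name__ == "__main__":
    # Illustrative numbers only: a one-in-a-million false-hit rate per read
    for n in (1_000, 1_000_000, 100_000_000):
        print(n, round(prob_at_least_one_false_hit(1e-6, n), 4))
```

With a made-up one-in-a-million rate, a thousand reads will almost never show a false hit, a million reads will show one more often than not, and a hundred million reads will show one essentially every time. Finding "something" at that depth is the expected outcome, not a discovery.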



  1. Reblogged this on Kurui's blog and commented:
    Is publish or perish the cause of these errors? Research is so focused on getting a paper out that validations and checks are ignored. The recent ongoing controversy over contamination of cell lines, and how thousands of peer-reviewed papers continue to be published based on these cell lines, casts doubt on the effectiveness of peer review.

    The worst thing is when such cell lines have been used to test cancer drugs, only for the drugs to fail at clinical trials because they were being tested on the wrong cell line.

    This is just sad.

  2. The goal of most scientists is to leave their mark on humanity. Sometimes, they’re willing to disregard all logical thought to do just that.

  3. I learned about the infinite monkey theorem from this post, and it resonates with my experience over the past 5 years with a segment of bioinformatics studies.
    Should we restate the monkey theorem for those and say something like:

    The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type a given text, such as the loci associated with T2D.


© 2018 Opiniomics
