Proposal: identifying contaminated cancer cell lines

This morning, Dan Graur tweeted this explosive article:

I recommend everyone reads it.  tl;dr – lots of cancer cell lines are not what they’re supposed to be, having been contaminated and overtaken by other, perhaps more aggressive cell lines.

With the advent of next-generation sequencing (NGS), this seems like something we could tackle relatively easily.  For example, each cell line will have (i) a signature gene expression profile; (ii) a signature SNP profile; and (iii) a signature CNV profile.

It shouldn’t be too difficult to set up a service, linked to the public databases, that checks all submitted data against known (contaminant) cell lines and flags datasets that appear to come from a different cell line from the one reported.
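To make the idea concrete, here is a minimal sketch of the SNP-profile check, assuming each cell line can be summarised as a mapping from SNP identifier to genotype. The function names, the 0.9 concordance threshold, and the example profiles (`line_A`, `line_B`, `snp1`, etc.) are all made up for illustration; a real service would need far more markers and a proper statistical model.

```python
from typing import Dict, Optional, Tuple

# A profile maps a SNP identifier to an unphased genotype string, e.g. "AG".
Profile = Dict[str, str]


def concordance(a: Profile, b: Profile) -> Optional[float]:
    """Fraction of shared SNPs with identical genotypes (allele order ignored)."""
    shared = [snp for snp in a if snp in b]
    if not shared:
        return None  # no overlapping markers, so nothing can be said
    matches = sum(1 for snp in shared if sorted(a[snp]) == sorted(b[snp]))
    return matches / len(shared)


def check_declared_line(
    sample: Profile,
    declared: str,
    references: Dict[str, Profile],
    min_concordance: float = 0.9,  # hypothetical threshold, not a validated value
) -> Tuple[Optional[str], Optional[float], bool]:
    """Return (best matching line, its concordance, whether to flag the dataset)."""
    scores = {}
    for name, ref in references.items():
        score = concordance(sample, ref)
        if score is not None:
            scores[name] = score
    if not scores:
        return None, None, False
    best = max(scores, key=scores.get)
    # Flag only when a *different* line matches convincingly.
    flagged = best != declared and scores[best] >= min_concordance
    return best, scores[best], flagged


# Hypothetical reference panel and submitted sample (invented genotypes).
references = {
    "line_A": {"snp1": "AA", "snp2": "AG", "snp3": "CC", "snp4": "TT"},
    "line_B": {"snp1": "AG", "snp2": "GG", "snp3": "CT", "snp4": "TT"},
}
sample = {"snp1": "AA", "snp2": "GA", "snp3": "CC", "snp4": "TT"}

best, score, flagged = check_declared_line(sample, "line_B", references)
print(best, score, flagged)
```

In this toy example the submission is declared as `line_B` but matches `line_A` at every shared marker, so it would be flagged for follow-up. The same pattern could in principle be applied to expression or CNV signatures with a different similarity measure.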

I propose that the funding agencies immediately fund EBI/NCBI to set up such a service, attached to the major sequence repositories, that can identify possible cell line contamination.


  1. Contamination of human cell lines is anything but news. Rebecca Skloot covers the initial revelation in 1966 of widespread contamination of human cell lines by HeLa cells in Chapter 20, “The HeLa Bomb”, of her 2010 book “The Immortal Life of Henrietta Lacks” (http://books.google.co.uk/books?id=CE1xL3bxsCAC&printsec=frontcover&dq=hela+bomb&hl=en&sa=X&ei=Ve08VLPjKvKV7AaHmYH4BQ&ved=0CCIQ6AEwAA#v=onepage&q=%2220%20the%20hela%20bomb%20in%20september%22&f=false). Our institute already provides a cell line screening service, and I suspect other institutes do as well.

    • Thanks Casey! Glad to hear the screening service is in place. However, my point is that we should tie this to the large sequence and variant databases in the public domain and discover problems automatically when data are submitted.

  2. Mycoplasma contamination in sequencing data is a problem too: http://blog.genohub.com/mycoplasma-contamination-in-your-sequencing-data/. You can already scan various databases for this.

  3. Sebastian Boegel

    16th October 2014 at 8:02 am

    Thank you for the link and your ideas. Indeed, with NGS and the existing bioinformatics tools, such services and analyses could easily be done. One example is using the HLA type as a barcode:

    – “An ID card for tumour cell lines: HLA typing can help” (http://www.ncbi.nlm.nih.gov/pubmed/11902535)
    – “A catalog of HLA type, HLA expression, and neo-epitope candidates in human cancer cell lines” (https://www.landesbioscience.com/journals/oncoimmunology/article/954893/)


  4. Reblogged this on uonphptlabtech and commented:
    Technology to revolutionize Health

  5. I am shocked to read this, particularly the story of Walter Nelson-Rees. These scientists are not even doing proper science and appear more concerned with their reputations than anything else!

    If this lack of rigour is the norm in cancer research, I am amazed that the field has made much progress at all.

  6. Hi, I’m the author of the article. Although I’m pleased Dan Graur thought it worth sharing, and you as well, I want to explain something. The article appeared on October 2 and is paywalled until December 2. During that time, the magazine hopes readers will be moved to either subscribe for a year, or at the least, pony up on iTunes or Google Play for an issue ($5.99). Piracy cuts into the income stream of a struggling magazine. I can’t do these investigative pieces anymore at the word count/rate that is now available for this magazine. Therefore, as a writer, my livelihood and my ability to uncover real stories in science, whether achievement or malfeasance, are hampered. Of course, I can write for many publications, but I want to point out that waiting until the paywall lifts would be respectful, and you’d still be able to discuss and then distribute the article. Generally, magazines paywall an article while an issue is on the real or virtual news stand, and then post it for free when the next issue hits the stands.

    Thanks for discussing it.
    Jill Neimark

