The internet is awash with completely unfounded speculation (read: utter crap) about the $1000 genome, and recently two things have made me a little “upset”.

The first, this paper’s assertion that “with high-throughput DNA sequencing costs dropping <$1000 for human genomes, data storage, retrieval and analysis are the major bottlenecks in biological studies”.  I’m sorry for being a traditionalist, but I thought statements in papers were supposed to be based on facts, referenced and backed up by evidence?  I know, what a dinosaur!

Update 22/06/2013: apologies to PeerJ, who apparently never used the first claim officially; it seems to be a mutated meme that appeared on Twitter.

The second, possibly 10 times worse, is PeerJ’s marketing, which used to begin “If we can sequence a human genome for $100….”.  Possibly under pressure, PeerJ now state “If we can set a goal to sequence the Human Genome for $99…”, but quite frankly the damage has already been done.

The standard that is generally accepted is that the human genome should be sequenced to 30X coverage, so that is what I will talk about below.
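For anyone unsure what 30X actually means in raw numbers, here is a rough sketch of the arithmetic. Coverage is just total bases sequenced divided by genome size. The genome size of roughly 3.1 billion bases is my assumption for illustration (it is the commonly quoted approximate figure), and the function names are made up for this example.

```python
# Approximate haploid human genome size in bases.
# ASSUMPTION: ~3.1 Gb is a commonly quoted round figure, used here only for illustration.
GENOME_SIZE_BP = 3_100_000_000


def bases_required(coverage: float, genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Total bases of sequence needed to reach a given average coverage."""
    return coverage * genome_size_bp


def coverage_achieved(total_bases: float, genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Average coverage obtained from a given amount of sequence."""
    return total_bases / genome_size_bp


if __name__ == "__main__":
    needed = bases_required(30)
    print(f"30X needs about {needed / 1e9:.0f} Gb of sequence")
```

So whatever the technology, the question is what it costs to produce something like 93 Gb of usable sequence per person, not what it costs to run the machine once.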

I’m going to try and lay this out in a completely technology neutral way, though I will have to mention different sequencing technologies at some point.  However, I am pretty convinced of this one fact: there is not a single sequencing technology out today that can deliver 30X of a human genome for anywhere near $1000.

Quite frankly, they struggle to get near twice that.  Feel free to disagree with me in the comments, but provide evidence please.

How sequencing is costed

This is pretty simple, but there are five facets to how much sequencing a human genome costs:

  1. Reagents and consumables.  We need to buy chips, flowcells, reagents etc to actually put on the machine.  These are bought directly from the relevant sequencing company.
  2. Staff time.  There is no magic machine where you put DNA in and get sequence out.  It takes time to prepare DNA and make sequencing libraries and run the machine.  This costs money, in salary, pensions etc.
  3. Equipment depreciation.  Basically, if I run a sequencing machine for 3 years, at the end of that 3 years I will probably need to buy a new one.  So the cost of the purchase of the new machine gets spread over the projects I run on the old one.  This is the only sustainable business model, unless you assume an investor will continually give you money, or that you have a rich benefactor who subsidises your business model.
  4. Bioinformatics/data storage.  The data need to be QC-ed and at the very least aligned.  The raw and aligned data need to be stored somewhere.
  5. Overheads.  We need to pay the rent, pay electricity and water bills etc.  I know, but they cost money and the money has to come from somewhere.
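The five facets above can be sketched as a simple sum. Every number in the example is a placeholder I have invented purely to make the arithmetic easy to follow; none of them are real prices from any company or facility, and the class and function names are hypothetical. The depreciation line assumes the simplest possible model: the purchase price of the machine spread evenly over every genome run during its lifetime.

```python
from dataclasses import dataclass


@dataclass
class GenomeCost:
    """Per-genome cost split into the five facets listed above."""
    reagents: float       # 1. reagents and consumables
    staff: float          # 2. staff time
    depreciation: float   # 3. equipment depreciation
    informatics: float    # 4. bioinformatics / data storage
    overheads: float      # 5. rent, electricity, water etc.

    def total(self) -> float:
        return (self.reagents + self.staff + self.depreciation
                + self.informatics + self.overheads)


def depreciation_per_genome(machine_price: float,
                            lifetime_years: float,
                            genomes_per_year: float) -> float:
    """Spread the machine's purchase price evenly over every genome it runs."""
    return machine_price / (lifetime_years * genomes_per_year)


# PLACEHOLDER VALUES ONLY: invented for illustration, not real prices.
example = GenomeCost(
    reagents=1500.0,
    staff=300.0,
    depreciation=depreciation_per_genome(600_000, lifetime_years=3, genomes_per_year=200),
    informatics=200.0,
    overheads=250.0,
)

if __name__ == "__main__":
    print(f"Reagents only: ${example.reagents:,.0f}")
    print(f"Full cost:     ${example.total():,.0f}")
```

Plug in your own numbers. The point of the sketch is only that the reagent line is one term out of five, so quoting it alone understates what a genome actually costs to deliver.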

What things actually cost

I’m not going to list each company and give you costs, but what I am going to say is this:

None of the current sequencing companies can deliver 30x of a human genome for less than $1000 reagent costs (using list prices)

Yes, that’s right – even ignoring points 2-5, even just buying reagents, the cost is greater than $1000 for a 30x human genome.

Now, it’s possible Broad, BGI, Sanger etc can get below $1000 for the reagents due to sheer economies of scale and special deals they have with sequencing companies – but then remember they have to add in those extra charges (2-5) above.

Obviously, Illumina don’t charge themselves list price for reagents, and nor do LifeTech, so it’s possible that they themselves can sequence 30x human genomes and just pay whatever it costs to make the reagents and build the machines; but this is not reality and it’s not really how sequencing is done today.  These guys want to sell machines and reagents, they don’t want to be sequencing facilities, plus they still have to pay the staff, pay the bills, make a profit and return money to investors.

Myths in the press

You may come across articles like this, which have blithe statements such as “Complete Genomics now routinely sequenced human genomes at 30x coverage for less than $1,000 in reagent costs”.  

Well, let’s not forget that Complete Genomics’ business model completely failed: they never made money and had to be bought by BGI in order to survive.

This kind of article/statement is basically marketing for the company involved, because they want to be the one to reach the $1000 genome first.  Scratch beneath the surface though, and it’s all smoke and mirrors.

…., the $1,000,000 analysis

Utter crap.  Utter, utter crap.  EDIT 14:10 18/06/2013. There is a real question about why we compare detailed research data analysis costs to sequencing costs – we’ve always had to analyse the data and write papers, sequencing data is no different. Do we compare analysis costs to qPCR costs? Microarray costs? Why all of a sudden are we comparing the very expensive activity of “doing research” with sequencing costs?

Obviously I recognise that in some circumstances, the analysis can cost way more than the sequencing, but it’s really not as common as it’s made out to be.

Economies of scale

When I mentioned reagent costs above, I said “list price”.  Of course, you can achieve huge discounts if you buy lots of reagents, and so if you are sequencing say 10,000 human genomes then you will get a massive reduction on those reagent prices.  Huge projects such as this probably include the sequencing company as a partner, and in such arrangements of course it may be possible to do 30x human genomes at less than $1000.  But this would represent a completely unique scenario, a one-off, and wouldn’t affect the price the rest of us have to pay for human genomes.
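To put a number on "massive reduction", here is a small sketch of the discount you would need on list price to hit a target reagent cost. The $2000 list price in the example is a placeholder based loosely on my "twice that" remark earlier, not a quoted price, and the function name is hypothetical.

```python
def required_discount(list_price: float, target_price: float) -> float:
    """Fraction off list price needed to bring reagents down to target_price."""
    if list_price <= 0:
        raise ValueError("list_price must be positive")
    return 1 - target_price / list_price


if __name__ == "__main__":
    # PLACEHOLDER: a $2000 reagent list price, for illustration only.
    discount = required_discount(list_price=2000, target_price=1000)
    print(f"Discount needed on reagents alone: {discount:.0%}")
```

And remember that this covers reagents only; the other four cost facets still have to be added on top.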

Can we have some truth please?

My problem is, every time an article is published saying that it’s possible to do $1000 human genomes, we get collaborators who expect that price.  Your bullshit affects my life, and I get upset by that.  Why doesn’t everyone just tell the truth?  We know what it is.  We all know which company comes closest to delivering the $1000 genome, and we know which companies simply aspire to it.  We know that none of the companies have yet achieved it.  We know that they all want to, and hell, I want to too – I would love to deliver $1000 genomes, $100 genomes etc.  But we’re not there yet, and if you say we are, then you’re going to get struck off my Christmas list!

Update: 22/06/2013

This rather amusing piece turned up on the 19th: $1000 genome a mirage, in which Craig Venter and Eric Topol both completely agree with me 🙂  Well, they say I am wrong, but when you read what they have to say, they actually agree with me.  I commented on the article, and I reproduce that comment below:

The first thing worth noting is that the cost of sequencing is actually starting to go up:

http://genomebiology.com/2013/14/5/115

and the rate of change of the price reduction has been following an upwards trend for some time:

A pedantic look at the cost of sequencing

And my point about the Illumina genomes, which may very well cost $2500 to you, is that they are SUBSIDIZED to make them that cheap. I can do you a $1 genome if I subsidize the costs, and we’re not talking about the $1 genome, are we?