Adam Phillippy and others have published a really cool paper on this at arXiv, and I have to say I am incredibly impressed by PacBio and by all the advances bioinformaticians are making in this area.
However, in the hype, it’s possible to lose sight of the advantages of the Illumina system, and there is certainly some uncertainty around cost. In the arXiv paper, we see the phrase:
“While the cost of multiplexed Illumina can be as low $300 per genome, the resulting assemblies are typically in hundreds of contigs”
Whilst I don’t take issue with the latter part of that sentence, the first part is perhaps worth questioning!
The rapid-run mode of the HiSeq 2500 is perfectly capable of producing 150 million 150bp read pairs (300 million individual reads). This equates to 45Gb of sequence data.
If we are sequencing 5Mb genomes at 40X, we need 200Mb of sequence per genome. Ninety-six of those will therefore need 19.2Gb (call it ~20Gb) of sequence, so, as you can see, a single lane of HiSeq 2500 easily copes.
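As a quick sanity check, the yield and coverage arithmetic above can be written out as a short Python sketch. The numbers are the ones quoted in this post; the variable names are just illustrative.

```python
# Figures quoted in the post; variable names are illustrative only.
READ_PAIRS = 150_000_000     # read pairs from the rapid run
READS_PER_PAIR = 2           # paired-end sequencing
READ_LENGTH_BP = 150         # bases per read

GENOME_SIZE_BP = 5_000_000   # a 5Mb bacterial genome
COVERAGE = 40                # target fold coverage
N_GENOMES = 96               # samples multiplexed together

# Total bases produced by the run
yield_bp = READ_PAIRS * READS_PER_PAIR * READ_LENGTH_BP

# Bases needed for one genome, and for all 96
per_genome_bp = GENOME_SIZE_BP * COVERAGE
required_bp = per_genome_bp * N_GENOMES

print(f"Run yield:       {yield_bp / 1e9:.1f} Gb")
print(f"Per genome:      {per_genome_bp / 1e6:.0f} Mb")
print(f"For {N_GENOMES} genomes:  {required_bp / 1e9:.1f} Gb")
print(f"Headroom factor: {yield_bp / required_bp:.2f}x")
```

The headroom factor shows how much spare capacity is left once all 96 genomes are covered at 40X.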
ARK-Genomics runs a non-profit, full-cost-recovery business model, which means we charge for reagents, staff time and equipment. So for that lane of sequencing, we would probably charge in the region of £2500.
We need to factor in the cost of libraries. In reality, we could make this cheaper via automation, but for the sake of ease, let’s say the library prep is £100 per sample. That’s £9600 on library prep.
That’s a total of £12100, or £126 per genome.
In reality, I think we could get library prep down to £50 per sample. This would bring the total down to £7300, or £76 per genome.
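For anyone who wants to plug in their own prices, here is a minimal sketch of the per-genome costing used above. The function name `cost_per_genome` and its parameters are made up for illustration; the inputs are the estimates from this post, not a formal price list.

```python
def cost_per_genome(lane_cost_gbp, library_cost_gbp, n_genomes):
    """Return (total cost, cost per genome) for one multiplexed lane.

    lane_cost_gbp:    charge for the sequencing lane
    library_cost_gbp: library prep charge per sample
    n_genomes:        number of samples sharing the lane
    """
    total = lane_cost_gbp + library_cost_gbp * n_genomes
    return total, total / n_genomes


# The two library prep scenarios discussed in the post
for library_cost in (100, 50):
    total, per_genome = cost_per_genome(2500, library_cost, 96)
    print(f"Library prep £{library_cost}: total £{total}, "
          f"£{per_genome:.0f} per genome")
```

Note that library prep dominates the total in both scenarios, so that is where savings from automation would matter most.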
At present exchange rates, $300 is about £200, so you can see our costs are significantly lower than the Illumina costs in Adam’s paper.
I have less of an idea about PacBio costs. Adam’s paper suggests between $900 and $1200, but admits a different recipe costs as much as $2200.
We have commissioned some PacBio work and the cost was about £1100 for a single sample.
Perhaps others can comment on this?
My conservative estimate is that PacBio is about 10 times more expensive per sample for bacterial genomes than Illumina, and in reality it is probably higher. Even taking my conservative estimate, the figure of “10 times” is significantly higher than the comparison implied in Adam’s paper, where $300 against $900 to $1200 works out at three to four times. My worry is that Adam’s paper compares an expensive Illumina quote with a cheap PacBio quote.
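To make the comparison concrete, here is a rough sketch of the ratios using only the figures quoted in this post. Bear in mind that the PacBio number is a single quote for a single sample, so this is indicative at best; the variable names are illustrative.

```python
# All figures come from the post; the PacBio figure is one quote only.
PACBIO_QUOTE_GBP = 1100

# Illumina per-genome costs for the two library prep scenarios
illumina_gbp = {
    "£100 library prep": 12100 / 96,
    "£50 library prep": 7300 / 96,
}

# Ratios implied by the dollar figures quoted from the paper
PAPER_ILLUMINA_USD = 300
paper_pacbio_usd = (900, 1200)

our_ratios = {
    label: PACBIO_QUOTE_GBP / cost for label, cost in illumina_gbp.items()
}
paper_ratios = [usd / PAPER_ILLUMINA_USD for usd in paper_pacbio_usd]

for label, ratio in our_ratios.items():
    print(f"PacBio vs Illumina ({label}): {ratio:.1f}x")
print(f"Ratio implied by the paper: {paper_ratios[0]:.0f}x to {paper_ratios[1]:.0f}x")
```

On these numbers the ratio falls either side of the “10 times” figure depending on the library prep cost, and well above the ratio implied by the paper.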
Pros and Cons
The main pro of PacBio is that you get a finished genome.
Pros of Illumina are:
- Cost per sample is far cheaper
- Population level statistics – I’m not sure of the fold coverage one achieves with PacBio, but 40x Illumina coverage certainly lets you begin to see low-level variants in the population of cells being sequenced
- Scale – if you want to sequence 96 genomes, the only real option is Illumina – more people have consumables budgets of around £10k than have budgets around £100k
Horses for courses
I love what PacBio are doing, and I love what Adam and others are doing on the informatics side. At the end of the day, we must choose the right technology for the right question. PacBio is great if you want complete genomes; Illumina is still the only viable option if you want to sequence hundreds of bacterial genomes at once.