Illumina have announced NovaSeq, an entirely new sequencing system that completely disrupts their existing HiSeq user-base. In my opinion, if you have a HiSeq and you are NOT currently planning to migrate to NovaSeq, then you will be out of business in 1-2 years’ time. It’s not quite the death knell for HiSeqs, but it’s pretty close, and moving to NovaSeq over the next couple of years is now the only viable option if you see Illumina as an important part of your offering.
I’ve taken the stats from the spec sheet linked above and produced the following. If there are any mistakes let me know.
There are two machines – the NovaSeq 5000 and 6000 – and 4 flowcell types – S1, S2, S3 and S4. The 6000 will run all four flowcell types and the 5000 will only run the first two. Not all flowcell types are immediately available, with S4 scheduled for 2018 (see below).
| | NovaSeq S1 | NovaSeq S2 | NovaSeq S3 | NovaSeq S4 | HiSeq 2500 | HiSeq 4000 | HiSeq X |
|---|---|---|---|---|---|---|---|
| Reads per flowcell (billion) | 1.6 | 3.3 | 6.6 | 10 | 2 | 2.8 | 3.44 |
| Lanes per flowcell | 2 | 2 | 4 | 4 | 8 | 8 | 8 |
| Reads per lane (million) | 800 | 1650 | 1650 | 2500 | 250 | 350 | 430 |
| Throughput per lane (Gb) | 240 | 495 | 495 | 750 | 62.5 | 105 | 129 |
| Throughput per flowcell (Gb) | 480 | 990 | 1980 | 3000 | 500 | 840 | 1032 |
| Run throughput (Gb) | 960 | 1980 | 3960 | 6000 | 1000 | 1680 | 2064 |
| Run time (days) | 2-2.5 | 2-2.5 | 2-2.5 | 2-2.5 | 6 | 3.5 | 3 |
For X Ten, simply multiply X figures by 10. These are maximum figures, and assume maximum read lengths.
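For anyone who wants to check my arithmetic, here is a rough sketch of how the per-lane and per-run rows follow from reads per flowcell. It rests on assumptions that I have inferred from the numbers rather than taken from the spec sheet: each "read" is a cluster sequenced from both ends, every run uses two flowcells, and the maximum read length is 2×150bp everywhere except the 2-billion-read column, which only works out at 2×125bp. The function name `flowcell_stats` is just for illustration.

```python
def flowcell_stats(reads_per_flowcell_million, lanes, read_length_bp,
                   flowcells_per_run=2):
    """Derive per-lane, per-flowcell and per-run figures for one table column."""
    reads_per_lane_million = reads_per_flowcell_million / lanes
    # Assumption: each "read" is a cluster sequenced from both ends (paired-end).
    bases_per_cluster = 2 * read_length_bp
    # millions of clusters * bases per cluster / 1000 = gigabases
    gb_per_lane = reads_per_lane_million * bases_per_cluster / 1000
    gb_per_flowcell = gb_per_lane * lanes
    return {
        "reads_per_lane_million": reads_per_lane_million,
        "gb_per_lane": gb_per_lane,
        "gb_per_flowcell": gb_per_flowcell,
        # Assumption: two flowcells per run, inferred from the table.
        "gb_per_run": gb_per_flowcell * flowcells_per_run,
    }


# One tuple per table column, left to right:
# (reads per flowcell in millions, lanes, assumed read length per end)
columns = [
    (1600, 2, 150),
    (3300, 2, 150),
    (6600, 4, 150),
    (10000, 4, 150),
    (2000, 8, 125),
    (2800, 8, 150),
    (3440, 8, 150),
]

for reads, lanes, length in columns:
    print(reads, lanes, flowcell_stats(reads, lanes, length))
```

Shorter read lengths scale the Gb figures down proportionally, so a 2×100bp run gives two thirds of the 2×150bp throughput for the same number of reads.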
Read lengths available on NovaSeq are 2×50, 2×100 and 2×150bp. This is unfortunate as the sweet spot for RNA-Seq and exomes is 2×75bp.
As you can see from the stats, the massive innovation here is the cluster density, which has hugely increased. We also have shorter run times.
So what does this all mean?
Well let’s put this to bed straight away – HiSeq X installations are still viable. This from an Illumina tech on Twitter:
@biomickwatson HiSeqX will still be cheaper per genome until the S4 flow cell is launched. S4 currently scheduled for 2018
— Neil Ward (@GenomicsUK) January 9, 2017
We learn two things from this – first, that HiSeq X is still going to be cheaper for human genomes until S4 comes out; and second, that S4 won’t be out until 2018.
So Illumina won’t sell any more HiSeq X, but current installations are still viable and still the cheapest way to sequence genomes.
I also have this from an un-named source:
speculation from Illumina rep “X’s will be king for awhile. Cost per GB on those will likely be adjusted to keep them competitive for a long time.”
So X is OK, for a while.
What about HiSeq 4000? Well to understand this, you need to understand 4000 and X.
The HiSeq 4000 and HiSeq X
First off, the HiSeq X IS NOT a human genome only machine. It is a genome-only machine. You have been able to do non-human genomes for about a year now. Anything you like as long as it’s a whole genome and it’s 30X or above. The 4000 is reserved for everything else because you cannot do exomes, RNA-Seq, ChIP-Seq etc on the HiSeq X. HiSeq 4000 reagents are more expensive, which means that per-Gb every assay is more expensive than genome sequencing on Illumina.
However, no such restrictions exist on the NovaSeq – which means that every assay will now cost the same on NovaSeq. This is what led me to say this on Twitter:
NovaSeq kills 4000 not X
— Mick Watson (@BioMickWatson) January 9, 2017
At Edinburgh Genomics, roughly speaking, we charge approx. 2x as much for a 4000 lane as we do for an X lane. Therefore, per Gb, RNA-Seq is approx. twice as expensive as genome sequencing. NovaSeq promises to make this per-Gb cost the same, so does that mean RNA-Seq will be half price? Not quite. Of course no-one does a whole lane of RNA-Seq; we multiplex multiple samples in one lane. When you do this, library prep costs begin to dominate, and for most of my own RNA-Seq samples, library prep is about 50% of the per-sample cost, and 50% is sequencing. NovaSeq promises to halve the sequencing costs, which means the per-sample cost will come down by 25%.
These are really rough numbers, but they will do for now. To be honest, I think this will make a huge difference to some facilities, but not for others. Larger centers will absolutely need to grab that 25% reduction to remain competitive, but smaller, boutique facilities may be able to ignore it for a while.
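To make the 25% figure concrete, here is a minimal sketch of the sum. The function name and parameters are mine, purely for illustration; the only inputs from above are the 50/50 split between library prep and sequencing, and the assumption that NovaSeq halves the sequencing part while library prep stays the same.

```python
def per_sample_saving(sequencing_share, sequencing_price_factor):
    """Fractional drop in per-sample cost when only sequencing gets cheaper.

    sequencing_share: fraction of today's per-sample cost spent on sequencing.
    sequencing_price_factor: new sequencing price as a fraction of the old one.
    """
    library_prep_share = 1.0 - sequencing_share  # assumed unchanged
    new_cost = library_prep_share + sequencing_share * sequencing_price_factor
    return 1.0 - new_cost


# 50% of the cost is sequencing, and sequencing halves in price.
print(per_sample_saving(0.5, 0.5))  # 0.25
```

If sequencing is a bigger slice of your per-sample cost (fewer samples per lane, say), the saving moves closer to the full 50%; if library prep dominates even more, it shrinks.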
Expect to pay $985k for a NovaSeq 6000 and $850k for a 5000.
One supposedly big advantage is that NovaSeq takes 40 hours to run, compared to the existing 3 days for a HiSeq X. Comparing like with like that’s 40 hours vs 72 hours. This might be important in the clinical space, but not for much else.
Putting this in context, when you send your samples to a facility, they will be QC-ed first, then put in a library prep queue, then put in a sequencing queue, then QC-ed bioinformatically before finally being delivered. Let’s be generous and say this takes 2 weeks. Out of that, sequencing time is 3 days. So instead of waiting 14 days, you’re waiting 13 days. Who cares?
Clinically, having the answer 1 day earlier may be important, but let’s not forget, even on our £1M cluster, at scale the BWA+GATK pipeline itself takes 3 days. So again you’re looking at 5 days vs 6 days. Is that a massive advantage? I’m not sure. Of course you could buy one of the super-fast bioinformatics solutions, and maybe then the 40 hour run time will count.
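The turnaround sums above are back-of-the-envelope, so here is the same calculation as a small sketch. The two scenarios use only the rough figures already quoted (a 2-week facility turnaround that includes a 3-day run, and a 3-day BWA+GATK pipeline); the function name is hypothetical.

```python
HOURS_PER_DAY = 24


def turnaround_days(other_steps_days, run_hours):
    """Total turnaround: everything that is not sequencing, plus the run itself."""
    return other_steps_days + run_hours / HOURS_PER_DAY


scenarios = {
    # 14-day facility turnaround minus the 3-day HiSeq X run
    "facility": 14 - 3,
    # sequencing followed by a 3-day BWA+GATK pipeline
    "clinical": 3,
}

for label, other_days in scenarios.items():
    hiseq_x = turnaround_days(other_days, 72)
    novaseq = turnaround_days(other_days, 40)
    print(f"{label}: {hiseq_x:.1f} days on HiSeq X vs {novaseq:.1f} days on NovaSeq")
```

The difference is 32 hours in both cases, which is where the "roughly one day" comes from.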
Colours and quality
NovaSeq marks a switch from the traditional HiSeq 4 colour chemistry to the quicker NextSeq 2 colour chemistry. As Brian Bushnell has noted on this blog, NextSeq data quality is quite a lot worse than HiSeq 2500, so we may see a dip in data quality, though Illumina claim 85% above Q30.