Illumina have done it again: disrupted their own market despite having no competition, and produced some wonderful new machines with higher throughput and shorter run times. Below is a brief summary of what I have learned so far.
Pretty basic: the X5 is half of an X Ten, but the reagents etc. are going to be more expensive. Capital cost is $6 million for an X5, and the headline figure appears to be $1400 per 30X human genome. The headline figure for the X10 is $1000 per genome, so the X5 may be 40% more expensive per genome.
The 3000 is to the 4000 as the 1000 was to the 2000 and the 1500 to the 2500 – it’s a 4000 that can only run one flowcell instead of two. I expect it to be as popular as the 1000/1500s were – i.e. not very. No-one goes to a funder for capital investment and says “Give me millions of dollars so I can buy the second best machine”.
Details are scarce, but the 4000 (judging by the stats) will have 2 flowcells with 8 lanes each and do 2x150bp sequencing, at what seems to be around 312 million clusters per lane in 3.5 days.
Here is how it stacks up against the other HiSeq systems:
System | Clusters per lane | Read length | Lanes | Days | Gb per lane | Gb total | Gb per day |
V1 rapid | 150000000 | 2×150 | 4 | 2 | 45 | 180 | 90 |
V2 rapid | 150000000 | 2×250 | 4 | 2.5 | 75 | 300 | 120 |
V3 high output | 180000000 | 2×100 | 16 | 11 | 36 | 576 | 52 |
V4 high output | 250000000 | 2×125 | 16 | 6 | 62.5 | 1000 | 167 |
HiSeq 4000 | 312000000 | 2×150 | 16 | 3.5 | 93.6 | 1500 | 428 |
HiSeq X | 375000000 | 2×150 | 16 | 3 | 112.5 | 1800 | 600 |
These are headline figures and contain some guesses. How the machines behave in reality might differ.
If any of my figures are wrong, please leave a comment!
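For anyone who wants to check the arithmetic, here is a minimal sketch of how the last three columns of the table follow from the first four. It assumes every cluster yields two reads of the stated length and that 1 Gb is 10^9 bases; the function and variable names are mine, and the inputs are the same headline figures (guesses included) as in the table.

```python
# Headline figures from the table: (clusters per lane, read length in bp, lanes, days).
# Read length is per read; all runs here are paired-end, hence the factor of 2 below.
SYSTEMS = {
    "V1 rapid": (150_000_000, 150, 4, 2),
    "V2 rapid": (150_000_000, 250, 4, 2.5),
    "V3 high output": (180_000_000, 100, 16, 11),
    "V4 high output": (250_000_000, 125, 16, 6),
    "HiSeq 4000": (312_000_000, 150, 16, 3.5),
    "HiSeq X": (375_000_000, 150, 16, 3),
}


def yields(clusters_per_lane, read_length, lanes, days):
    """Return (Gb per lane, Gb total, Gb per day) for a paired-end run."""
    gb_per_lane = clusters_per_lane * 2 * read_length / 1e9
    gb_total = gb_per_lane * lanes
    gb_per_day = gb_total / days
    return gb_per_lane, gb_total, gb_per_day


if __name__ == "__main__":
    for name, spec in SYSTEMS.items():
        per_lane, total, per_day = yields(*spec)
        print(f"{name:<15} {per_lane:7.1f} {total:8.1f} {per_day:7.1f}")
```

For the HiSeq 4000 this gives 1497.6 Gb per run, which I have rounded to 1500 in the table; either way it works out at roughly 428 Gb per day.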
UPDATE: there appears to be some confusion over the exact config of the HiSeq 4000. The spec sheet says that 5 billion reads per run pass filter. The RNA-Seq dataset has 378 million reads “from one lane”. 5 billion / 378 million is ~13 (lanes). My contact at Illumina says there are 8 lanes per flowcell. 5 billion clusters / 16 lanes would give us 312 million reads per lane. Possibly the RNA-Seq dataset is overclustered!
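The lane arithmetic in the update can be sketched as below. The numbers are the ones quoted above (spec sheet, RNA-Seq dataset, and my contact's lane count); the variable names are purely illustrative, and the "excess" figure is only what the arithmetic implies if the 8-lane configuration is right, not a measured value.

```python
# Figures quoted in the update above.
reads_per_run = 5_000_000_000     # spec sheet: reads passing filter per run
reads_in_dataset = 378_000_000    # RNA-Seq dataset, said to be "from one lane"
flowcells = 2
lanes_per_flowcell = 8            # per my contact at Illumina

# If every lane yielded as much as the RNA-Seq lane, how many lanes would a run have?
implied_lanes = reads_per_run / reads_in_dataset

# Expected reads per lane if there really are 2 x 8 = 16 lanes.
total_lanes = flowcells * lanes_per_flowcell
reads_per_lane = reads_per_run / total_lanes

# How far the RNA-Seq lane sits above that expectation.
excess = reads_in_dataset / reads_per_lane - 1

print(f"Implied lanes per run: {implied_lanes:.1f}")
print(f"Expected reads per lane with {total_lanes} lanes: {reads_per_lane / 1e6:.1f} million")
print(f"RNA-Seq lane vs expectation: {excess:+.0%}")
```

So on a 16-lane configuration, the RNA-Seq lane would be carrying roughly a fifth more reads than the spec sheet average.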
A 387 million paired RNA-Seq data set is here.