bioinformatics, genomes, biology etc. "I don't mean to sound angry and cynical, but I am, so that's how it comes across"

Putting the HiSeq 4000 in context

Illumina have done it again: with no competition forcing their hand, they have disrupted their own market and produced some wonderful new machines with higher throughput and shorter run times. Below is a brief summary of what I have learned so far.

HiSeq X 5

Pretty basic: this is half of an X Ten, but the reagents etc. are going to be more expensive. It is $6 million capital for an X5, and the headline figure appears to be $1400 per 30X human genome. The headline figure for the X10 is $1000 per genome, so the X5 may be 40% more expensive per genome.

HiSeq 3000/4000

The 3000 is to the 4000 as the 1000 was to the 2000 and the 1500 to the 2500 – it’s a 4000 that can only run one flowcell instead of two.  I expect it to be as popular as the 1000/1500s were – i.e. not very.  No-one goes to a funder for capital investment and says “Give me millions of dollars so I can buy the second best machine”.

Details are scarce, but the 4000 (judging by the stats) will have 2 flowcells with 8 lanes each and do 2x150bp sequencing, with what seems to be around 312 million clusters per lane, in 3.5 days.

Here is how it stacks up against the other HiSeq systems:

System          Clusters per lane  Read length  Lanes  Days  Gb per lane  Gb total  Gb per day
V1 rapid        150000000          2×150        4      2     45           180       90
V2 rapid        150000000          2×250        4      2.5   75           300       120
V3 high output  180000000          2×100        16     11    36           576       52
V4 high output  250000000          2×125        16     6     62.5         1000      167
HiSeq 4000      312000000          2×150        16     3.5   93.6         1500      428
HiSeq X         375000000          2×150        16     3     112.5        1800      600

These are headline figures and contain some guesses. How the machines behave in reality might differ.
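For anyone who wants to check the derived columns, here is a minimal sketch of the arithmetic, assuming Gb per lane is simply clusters per lane × 2 reads × read length, with no allowance for filtering or quality. The input numbers are the headline figures from the table; the names `SYSTEMS` and `throughput` are just illustrative.

```python
# Headline inputs from the table:
# (clusters per lane, read length in bp, lanes per run, run time in days)
SYSTEMS = {
    "V1 rapid": (150_000_000, 150, 4, 2),
    "V2 rapid": (150_000_000, 250, 4, 2.5),
    "V3 high output": (180_000_000, 100, 16, 11),
    "V4 high output": (250_000_000, 125, 16, 6),
    "HiSeq 4000": (312_000_000, 150, 16, 3.5),
    "HiSeq X": (375_000_000, 150, 16, 3),
}


def throughput(clusters_per_lane, read_length, lanes, days):
    """Return (Gb per lane, Gb total, Gb per day) for a paired-end run."""
    # Each cluster yields two reads of read_length bases (paired-end).
    gb_per_lane = clusters_per_lane * 2 * read_length / 1e9
    gb_total = gb_per_lane * lanes
    return gb_per_lane, gb_total, gb_total / days


for name, spec in SYSTEMS.items():
    per_lane, total, per_day = throughput(*spec)
    print(f"{name:15s} {per_lane:7.1f} {total:8.1f} {per_day:7.1f}")
```

Run on the HiSeq 4000 row, this gives 1497.6 Gb total and about 427.9 Gb per day, which the table rounds to 1500 and 428.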

If any of my figures are wrong, please leave a comment!

UPDATE: there appears to be some confusion over the exact configuration of the HiSeq 4000. The spec sheet says that 5 billion reads per run pass filter. The RNA-Seq dataset has 378 million reads “from one lane”. 5 billion / 378 million is ~13 (lanes). My contact at Illumina says there are 8 lanes per flowcell. 5 billion clusters / 16 lanes would give us 312 million reads per lane. Possibly the RNA-Seq dataset is overclustered!
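The lane arithmetic in that update can be sketched as follows. It uses only the numbers quoted above; the variable names are illustrative, and it assumes the 5 billion figure is spread evenly across all lanes of a two-flowcell run.

```python
reads_per_run = 5_000_000_000     # spec sheet: reads passing filter per run
rna_seq_lane_reads = 378_000_000  # RNA-Seq dataset, "from one lane"
lanes_per_run = 2 * 8             # 2 flowcells x 8 lanes, per the Illumina contact

# How many lanes the run would have if every lane looked like the RNA-Seq one.
implied_lanes = reads_per_run / rna_seq_lane_reads

# Reads per lane if the run really has 16 lanes.
reads_per_lane = reads_per_run / lanes_per_run

# How far the RNA-Seq lane sits above that per-lane figure.
excess = rna_seq_lane_reads / reads_per_lane - 1

print(f"implied lanes: {implied_lanes:.1f}")
print(f"reads per lane at 16 lanes: {reads_per_lane / 1e6:.1f} million")
print(f"RNA-Seq lane vs expected: {excess:+.0%}")
```

On these numbers the RNA-Seq lane holds roughly 21% more reads than an even 16-lane split would predict, which is what makes overclustering a plausible explanation.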

A 387 million paired RNA-Seq dataset is here.


© 2018 Opiniomics
