My aim for this post is a quick, pithy review of available sequencers, what they’re good for, what they’re not and under which circumstances you should buy one. However, your first thought should be: do I need to buy a sequencer? I know it’s great to have your own box (I love all of ours), but they are expensive, difficult to run, temperamental and time-consuming. So think on that before you spend millions of pounds of your institute’s money – running the sequencers may cost millions more. My blog post on choosing an NGS supplier is still valid today.
Illumina HiSeq X Ten
To paraphrase Quentin Tarantino, you buy one of these “When you absolutely, positively got to sequence every person in the room, accept no substitutes”. The HiSeq X Ten is actually ten HiSeq X instruments; each instrument can run two flowcells and each flowcell has 8 lanes. Each lane will produce 380-450million 150PE reads (120Gbase of data, or 40X of a human genome). Runs take 3 days. Expect to pay an extra £1M on computing to cope with the data streams. Ordered flowcells are quite difficult to deal with and can result in up to 30% “optical duplicates” (actually spill-over from one well to adjacent wells). You can produce 160 genomes every 3 days. Essentially now used as a very cheap, whole-genome genotyping system; cost per genome is currently £850+VAT at Edinburgh Genomics. Limited to 30X (or greater) genome sequencing. I have checked with Illumina and this is definitely still true.
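The yield figures above are simple arithmetic, and a minimal sketch makes it easy to rerun them for your own genome size. It assumes that "reads" means read pairs (that is the reading that reproduces the quoted 120Gbase) and that a human genome is roughly 3Gbase; the function names are mine, purely for illustration.

```python
# Back-of-envelope yield and coverage calculator.
# Assumptions (not from any vendor spec): "reads" are read pairs,
# and a human genome is ~3 Gbase.
HUMAN_GENOME_GBASE = 3.0


def lane_yield_gbase(read_pairs_millions, read_length):
    """Gbase per lane: read pairs x 2 reads per pair x bases per read."""
    return read_pairs_millions * 1e6 * 2 * read_length / 1e9


def coverage(yield_gbase, genome_gbase=HUMAN_GENOME_GBASE):
    """Fold coverage of a genome for a given yield."""
    return yield_gbase / genome_gbase


def lanes_per_run(instruments, flowcells_per_instrument, lanes_per_flowcell):
    """Total lanes available when every instrument runs at once."""
    return instruments * flowcells_per_instrument * lanes_per_flowcell


if __name__ == "__main__":
    x_lane = lane_yield_gbase(400, 150)  # mid-range of 380-450 million
    print(f"HiSeq X lane: {x_lane:.0f} Gbase, {coverage(x_lane):.0f}X human")
    print(f"X Ten lanes per 3-day run: {lanes_per_run(10, 2, 8)}")
```

With one lane per genome, the 160 lanes of a full X Ten run give the 160 genomes every 3 days quoted above.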
Illumina HiSeq X Five
Do not buy one of these. The reagents are 40% more expensive for no good reason. Simply outsource to someone with an X Ten.
Illumina HiSeq 4000
The baby brother of the HiSeq X. I actually think it’s the same machine, except with smaller flowcells (possibly a different camera or laser). Expect 300million 150PE reads per lane (same setup: two flowcells, each with 8 lanes, 3.5 day run time). That’s 90Gbase per lane. Same caveats apply – ordered flowcells are tricky and it’s easy to generate lots of “optical duplicates”. There are no application limitations, so you can run anything you like on this. The new workhorse of the Illumina stable. Buy one of these if you have lots and lots of things to sequence, and you want to run a variety of different protocols.
Illumina HiSeq 2500
One of the most reliable machines Illumina has ever produced. Two modes: high output has the classic 2 flowcell, 8 lane set-up and takes 6 days; rapid is 2 flowcell, 2 lanes and takes less time (all run times depend on read length). High output V4 is capable of 250million 125PE reads, and rapid is capable of 120million 250PE reads. The increased throughput of the 4000 makes the 2500 more expensive per Gb, so only buy a 2500 if you can get a good one cheap second-hand, or get offered a really great deal on a new one. Even then, outsourced 4000 data is likely to be cheaper than generating your own data on a 2500.
Illumina NextSeq 500
I’ve never really seen the point – small projects can go on MiSeq, and medium to large projects fit better and are cheaper as a subset of lanes on the 2500/4000. The machine only measures 3 bases, with the 4th base being an absence of signal. This means the runs are ~25% quicker. I am told V1 data was terrible, but V2 data is much improved. NextSeq flowcells are massive, the size of an iPad mini, and have four lanes, each capable of 100million 150PE reads. Illumina claim these are good for “Everyday exome, transcriptome, and targeted resequencing”, but realistically these would all be better and cheaper run multiplexed on a 4000.
Illumina MiSeq
A great little machine, one lane per run. V2 is capable of 12million 250PE reads per run; V3 claims 25million 300PE reads, but good luck getting those. There has been a problem with V3 300PE for as long as I can remember – it just doesn’t work well. Great for small genomes and 16S.
Illumina MiniSeq
I suspect Illumina will sell tons of these as they are so cheap (< $50k), but no-one yet knows how well it will run. Supposedly capable of 25million 150PE reads per run; that’s 7.5Gbase of data. You could just about run a single RNA-Seq sample on there, but why would you? A possible replacement for MiSeq if they get the read length up. Could be good for small genomes and possibly 16S. Illumina claim it’s for targeted DNA and RNA samples, so it could work well with e.g. gene panels for rapid diagnostics.
One interesting downside of Illumina machines is that you have to fill the whole flowcell before you can run the machine. What this means is that, despite Illumina’s lower cost-per-Gb, small projects can be cheaper and faster on other platforms.
Ion Torrent and Ion Proton
The people I meet who are happy with their Ion* platforms are generally diagnostic labs, where run time is really important (they are faster than Illumina machines) and where absolute base-pair accuracy is not important. No-one I know who works in genomics research uses Ion* data – it’s just not good enough. There is a major indel problem, and Illumina data is cheaper and better.
PacBio Sequel
No-one has seen any data but this looks like an impressive machine. There are 1 million ZMWs per SMRT cell and about 30-40% will return useable data. Useable data will be 10-20Kb reads at 85% raw accuracy, but correctable to 99% accuracy. Output at launch is 5-10Gbase per SMRT cell, and PacBio expect to produce 20Kb and 30Kb library protocols in 2016. Great for genome assembly and structural variation, not quite quantitative for RNA-Seq but fantastic for gene discovery. Link this up to NimbleGen’s long fragment capture kits and you can target difficult areas of the genome with long reads. Machine cost is £300k, so good value compared to the RSII. These will fly off the shelf.
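As a rough sanity check on the SMRT cell numbers above, here is a minimal sketch that multiplies them out. It assumes each useable ZMW yields one read of the stated length, which is my simplification and not a PacBio specification; the function name is hypothetical.

```python
# Back-of-envelope SMRT cell yield.
# Assumption (mine): one read of the given mean length per useable ZMW.
def smrt_cell_yield_gbase(zmws, useable_fraction, mean_read_length_bp):
    """Gbase per SMRT cell = ZMWs x useable fraction x read length."""
    return zmws * useable_fraction * mean_read_length_bp / 1e9


if __name__ == "__main__":
    low = smrt_cell_yield_gbase(1_000_000, 0.30, 10_000)
    high = smrt_cell_yield_gbase(1_000_000, 0.40, 20_000)
    print(f"Estimated yield per SMRT cell: {low:.1f}-{high:.1f} Gbase")
```

Under that assumption the estimate comes out at about 3-8Gbase per cell, the same ballpark as the quoted 5-10Gbase launch output.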
PacBio RSII
The previous workhorse of PacBio, capable of 2Gbase of long reads per SMRT cell. A cool machine, but over-shadowed by Sequel, so I wouldn’t recommend buying one.
Oxford Nanopore MinION
The coolest sequencer on the planet: a $1000 access fee gets you a USB sequencer the size of an office stapler. Each run on the mark I MinION can produce several hundred Mb of 2D data, and fast mode (in limited early access) promises to push this into the Gbases. Read lengths are a mean of 10Kb, with raw 2D accuracy at 85% and a range of options for correction to 98-99% accuracy. We use it for scaffolding bacterial genomes, and also for pathogen detection. Should you buy one? You should have one already!
Oxford Nanopore PromethION
The big brother of the MinION, this is best imagined as 48 bigger, fast-mode MinIONs run in parallel. If fast mode MinION can produce 1Gbase per run, the PromethION will produce 300Gbase per run. This machine is in limited early access, but offers the possibility of long-read, population scale sequencing. Access fee is $75,000 but expect to spend ten times that on compute to deal with the data. Get one if you can deal with the data.
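The PromethION figures imply how much "bigger" each of its 48 flowcells is than a fast-mode MinION. A minimal sketch of that arithmetic follows; it only rearranges the numbers quoted above and assumes a 3Gbase human genome, so treat the outputs as implied estimates and not as Oxford Nanopore specifications. The function names are hypothetical.

```python
# What the quoted PromethION numbers imply, per flowcell and per genome.
# Assumption (mine): a human genome is ~3 Gbase.
def implied_flowcell_scale(prom_run_gbase, n_flowcells, minion_run_gbase):
    """How many fast-mode MinIONs each PromethION flowcell is worth."""
    return prom_run_gbase / (n_flowcells * minion_run_gbase)


def genomes_per_run(run_gbase, target_coverage, genome_gbase=3.0):
    """Genomes per run at a target fold coverage."""
    return run_gbase / (genome_gbase * target_coverage)


if __name__ == "__main__":
    scale = implied_flowcell_scale(300, 48, 1)
    print(f"Each flowcell is worth about {scale:.2f} fast-mode MinIONs")
    print(f"Human genomes at 30X per run: {genomes_per_run(300, 30):.1f}")
```

So the quoted 300Gbase works out at roughly six MinIONs' worth of data per flowcell, or about three human genomes at 30X per run.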