bioinformatics, genomes, biology etc. "I don't mean to sound angry and cynical, but I am, so that's how it comes across"

Is the long read sequencing war already over?

My enthusiasm for nanopore sequencing is well known: we have some awesome software for working with the data; we won a grant to support this work; and we successfully assembled a tricky bacterial genome.  This all led to Nick and me writing an editorial for Nature Methods.

So, clearly some bias towards ONT from me.

Having said all of that, when PacBio announced the Sequel, I was genuinely excited.   Why?  Well, revolutionary and wonderful as the MinION was at the time, we were getting ~100Mb runs.  Amazing technology, mobile sequencer, tri-corder, just incredible engineering – but 100Mb was never going to change the world.  Some uses, yes; but for other uses we need more data.  Enter Sequel.

However, it turns out Sequel isn’t really delivering on promises.  Rather than 10Gb runs, folk are getting between 3 and 5Gb from the Sequel:

At the same time, MinION has been coming along great guns:

Whilst we are right to be skeptical about ONT’s claims about their own sequencer, other people who use the MinION have backed up these claims and say they regularly get figures similar to this. If you don’t believe me, go get some of the World’s first Nanopore human data here.

PacBio also released some data for Sequel here.

So how do they stack up against one another?  I won’t deal with accuracy here, but we can look at #reads, read length and throughput.

To be clear, we are comparing “rel2-nanopore-wgs-216722908-FAB42316.fastq.gz”, a fairly middling run from the NA12878 release, with “m54113_160913_184949.subreads.bam”, one of the Sequel SMRT cell datasets released.

Read length histograms:


As you can see, the longer reads are roughly equivalent in length, but MinION has far more reads at shorter read lengths.  I know the PacBio samples were size selected on Blue Pippin, but I am unsure whether the MinION data were size selected.
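A histogram like this comes down to binning read lengths. The following is a minimal sketch, assuming the read lengths have already been extracted into a list; the function names, the bin width and the example lengths are all hypothetical, and this is not necessarily how the plots in this post were made.

```python
from collections import Counter


def length_histogram(lengths, bin_width=1000):
    """Count reads per fixed-width bin, keyed by each bin's lower edge in bp."""
    counts = Counter((length // bin_width) * bin_width for length in lengths)
    return dict(sorted(counts.items()))


def print_histogram(histogram, max_bar=40):
    """Print a crude text histogram, scaling the fullest bin to max_bar characters."""
    if not histogram:
        return
    peak = max(histogram.values())
    for lower_edge, count in histogram.items():
        bar = "#" * max(1, round(count * max_bar / peak))
        print(f"{lower_edge:>7} bp  {count:>7}  {bar}")


# Made-up read lengths for illustration only; not taken from either dataset.
example_lengths = [450, 800, 1200, 1900, 8000, 8500, 9100, 12000, 25000]
print_histogram(length_histogram(example_lengths, bin_width=5000))
```

Keying each bin by its lower edge keeps the output sortable, and it makes it easy to compare two datasets bin by bin.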

The MinION dataset includes 466,325 reads, over twice as many as the Sequel dataset at 208,573 reads.

In terms of throughput, MinION again came out on top, with 2.4Gbases of data compared to just 2Gbases for the Sequel.

We can limit to reads >1000bp, and see a bit more detail:


  • The MinION data has 326,466 reads greater than 1000bp, summing to 2.37Gb.
  • The Sequel data has 192,718 reads greater than 1000bp, summing to 2Gb.

Finally, for reads over 10,000bp:

  • The MinION data has 84,803 reads greater than 10,000bp, summing to 1.36Gb.
  • The Sequel data has 83,771 reads greater than 10,000bp, summing to 1.48Gb.
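Read counts and base totals above a length cut-off, like those listed above, can be computed in a single pass over a FASTQ file. This is a minimal sketch, assuming a standard four-lines-per-record FASTQ (the Sequel subreads BAM would first need converting to FASTQ); `read_lengths` and `summarise` are hypothetical names, and the example lengths are made up.

```python
import gzip


def read_lengths(fastq_path):
    """Yield the sequence length of each record in a 4-line-per-record FASTQ file."""
    opener = gzip.open if fastq_path.endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # the second line of each record is the sequence
                yield len(line.strip())


def summarise(lengths, thresholds=(0, 1000, 10000)):
    """Return {threshold: (read_count, total_bases)} for reads longer than each threshold."""
    summary = {threshold: [0, 0] for threshold in thresholds}
    for length in lengths:
        for threshold in thresholds:
            if length > threshold:
                summary[threshold][0] += 1
                summary[threshold][1] += length
    return {threshold: tuple(values) for threshold, values in summary.items()}


# Made-up lengths for illustration; on real data pass read_lengths("reads.fastq.gz").
for threshold, (count, bases) in summarise([500, 1500, 12000]).items():
    print(f">{threshold}bp: {count} reads, {bases / 1e9:.6f} Gb")
```

Because `read_lengths` is a generator, a multi-gigabase file is summarised without holding every read in memory.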

These are very interesting stats!

This is pretty bad news for PacBio.  Given the low cost of entry for MinION and the £300k cost of the Sequel, the fact that MinION is performing as well as, if not better than, Sequel is incredible.  Both machines have a long way to go; PacBio will point to their roadmap, with longer reads scheduled and improvements in chemistry and flowcells.  In response, ONT will point to the incredible development path of MinION, increased sequencing speeds and bigger flowcells.  And then there is PromethION.

So is the war already over?   Not quite yet.  But PacBio are fighting for their lives.


  1. How do quality scores btw technologies compare?

  2. Your stated bias at the start seems to be the reason you violated the first rule of evaluating any technology – evaluate it based on what you actually need it to do and not some intermediate metric.

    Just because a ton of dung floats and weighs more than a yacht doesn’t mean scat is the next great thing in ocean travel.

  3. The race is still on. PacBio has to keep its promises though, or start to be more realistic. More competition is always good.

  4. For MinION run, G-tube shearing to approx 8kb and 0.4x ampure bead washes are what determine the size. Note, the DNA extracted fresh from human cells resulted in significantly longer average read length than purchased human DNA.

  5. biomickwatson

    9th December 2016 at 9:18 pm

    Did I choose a GIAB DNA sample?

  6. Consensus sequence quality is obviously the edge PacBio still has. So ONT can substitute when that isn’t a problem, or if you can fix it (e.g. hybrid assembly or polishing with Illumina).

    If the announced Scrappie basecaller can start dealing with ONT’s homopolymer issue, then PacBio won’t have much edge left.

  7. biomickwatson

    9th December 2016 at 10:47 pm

    There’s degrees here right? We had 60X pacbio -> still had indel problem after polishing

  8. This is a poor comparison.

    1) You’re using outdated PacBio chemistry. Most Sequels now are getting 5-7 Gb, or about 3x what the newest nanopore 1D data gives

    2) 1D nanopore quality is not that good. Raw bases don’t matter if the accuracy is low

    3) The nanopore run is ~24hours while the Pacbio run is 6 hrs.

    While the Sequel costs 350k, you are able to get 5-7 Gb with high consensus accuracy, low run times (<6 hrs), and long read lengths for about $1k. A nanopore costs $1k and gives you poorer data quality, fewer reads, and 1-2 Gb of data. There aren't many applications out yet for nanopore either. They have a lot of work to do before it's ready for large scale projects.

  9. It sounds like ONT is running about a year behind PacBio. That is a pretty small lead for PacBio, given that the underlying technology for nanopores is substantially cheaper than anything based on optics.

  10. Thanks for that, that’s very interesting. Throughput improvements for the minion have been significant.
    Looking at price, at least for this human genome study, about 29 flow cells were required, and even at the cheapest flow cell price that’s about $15k.
    That’s more expensive than a 20x on PacBio. Assuming 6Gb per flow cell, this is about $7k on Sequel in 6hrs. Right?
    I think one thing that’s conspicuous by its absence in discussions about the MinION is the accuracy. Why doesn’t someone just publish a comparison between Sanger and MinION data, or for that matter against PacBio for NA12878, to settle the matter? PacBio has 10x on this at DNAnexus, released with a note on 19 Oct about identifying SVs at low coverage. Unfortunately, I don’t have the bioinformatics skills for this but am very curious to know, as must be lots of others!
    Thanks ,

  11. biomickwatson

    10th December 2016 at 8:03 am

    Do you have any evidence for the 5-7Gb figure? I’ve not seen that.

    In terms of quality values I believe 1D nanopore has a mean slightly above PacBio raw but a longer tail of poorer values. Both are correctable to ~99%

    The run time is a good point.

  12. biomickwatson

    10th December 2016 at 8:06 am

    I’ve no doubt this comparison is coming!

    Can you break down those prices? ONT flowcells are $500. How much is a Sequel SMRT cell? And ONT would argue there is no $300k capital cost for their tech.

  13. biomickwatson

    10th December 2016 at 8:08 am

    I believe they are similar but ONT have a longer tail of low values

  14. biomickwatson

    10th December 2016 at 8:09 am

    And what do you think is the best use case for each tech?

  15. biomickwatson

    10th December 2016 at 8:10 am

    Exactly. PacBio need to up their game. We need to see 20kb and 30kb preps on the Sequel.

  16. Reply to John :

    1. 5-7G isn’t 3X what a MinION gives. Indeed 1 MinION flow cell will give that and up to 10G on exceptional runs. Bad samples / preps might be 1G. Most user runs are giving a median of 4.5G/cell. One of the public human genome efforts got over 7G.

    2. 1D nanopore is ~9% error (uncorrected). As I understand it, that’s slightly better than PacBio. 2D is ~4% error. We expect 1D^2 to be better than 2D.

    3. Nanopore does not have a fixed run time. You can run for a few minutes or up to 48 hrs. Yes, the PacBio run is 6 hrs per flow cell, but as I understand it the library prep is much longer. A few MinIONs can easily be run in parallel, in contrast to the Sequel, which runs flow cells in sequence.

    In fact the MinION costs $1k, read lengths are down to fragment lengths and are longer if done right, 1D accuracy is better, and the latest base callers will deal with homopolymers better. Methods using raw data give very high consensus (more news on this soon), and I was easily able to get my own genome sequence just by running a few MinIONs in parallel with about a 30 min blood->reads time. I’d say that was a large scale project.

    Hope that is useful.

  17. If most Sequels give 5-7 Gb of data, why haven’t we seen that yet? In addition, I hear the Sequel quality is currently lower than RSII…. 2D (and likely 1D^2) have higher raw read qualities than PacBio, with the exception of homopolymers.

  18. I should add, of course people might reasonably think I’m biased. But there is quite a lot of reported data now. Also, for just $1k you can test any of these claims for yourself instead of listening to factoids.

  19. Yes I agree. ONT has made great progress and is catching up to Pacbio. But the Sequel platform is also very scalable. The Sequel chip is integrated onto an optical sensor which removes most of the high costs. It also makes it easy to scale up. They can just order higher multiplexed optical sensors to increase the ZMW count substantially. They have discussed the 5-10M ZMW chip they expect in 2017-2018.

    The long read war is far from over. Which is great news for customers.

  20. See Pacbio’s latest chemistry update. Many customers have been able to get 5-7GB of data per run. It just depends on the application you are running.

    Yes, both are correctable to 99%. But ONT errors aren’t random, whereas PacBio’s are, so PacBio can be corrected to very high accuracies (99.999%+)

  21. Thanks, this information has been useful.

    PacBio run times are also not fixed, but I think they max out at 6 hrs. How many MinIONs can run in parallel on a computer?

    In terms of accuracy, if I remember correctly ONT has systematic errors that will prevent it from reaching the high consensus accuracy that Pacbio gets.

    By large scale projects I mean projects that will use thousands of MinIONs. Right now PacBio ships tens of thousands of SMRT cells a month. Their technology is being used for many large scale projects. I don’t think we can say that about ONT (yet). There is a lot of work ONT will need to do to build up a manufacturing line for that type of demand.

  22. biomickwatson

    10th December 2016 at 6:40 pm

    PacBio still have an indel error after correcting. Definitely.

  23. Just answering John’s questions (nothing here not covered elsewhere in our online videos).

    Several MinIONs can be run on a computer. It does depend somewhat on the computer in question. As we roll out the hardware accelerated local base caller it is likely this will increase. However, MinION isn’t really designed for that sort of factory sequencing. For that we have PromethION, with 48 individually addressable flowcells.

    Our consensus accuracy is 99.98%+ on error corrected data. The remaining issue is mostly long homopolymer data, and we recently showed a new base caller that can deal with homopolymer calling. It is more WIP than systematic problem, i.e. the basecalling is catching up with the raw data. On that theme, base analogues are on the feature list to tackle next; they are visible in the raw data.

    ONT ships thousands of flowcells a month, and we have a few thousand MinIONs out with variable utilization. Our production line can deal with 10s of thousands per month. Our CEO, and people he has recruited, were instrumental in setting up blood glucose strip production lines where billions are shipped per year (and made near Oxford).

    Hope that is helpful.

  24. I’ve seen real data from Sequels in Asian countries. They often generate >5 Gb of data per cell, although it varies from sample to sample. If the loading concentration is well optimised, >5 Gb throughput with >10 kb average read length is common. I don’t see that the quality is lower than RSII. Maybe that’s an old chemistry (v.1.2.0) story. The current v.1.2.1 chemistry is good enough. I hope the next chemistry version generates longer reads.

  25. As Justin mentioned, the MinION reads from the WGS consortium were sheared with G-tubes prior to sequencing, which explains the 10kb spike:


    I expect that they will be eventually releasing transposon-fragmented reads as well, which should be considerably longer.

  26. There has also been a “human” run done on PacBio, from the same cell lines as done by the nanopore WGS consortium (NA12878):

    [10-fold coverage of Sequel data of NA12878]

  27. I just had a look at comparing the mitochondrial genome mapping between PacBio and flow cell run ‘FAB45271’ of the nanopore WGS. I took the mapped reads for chrM, and remapped them using Graphmap to the hg38 mitochondrial chromosome, then processed them through a variant proportion script that I have cooked up. Results are here:



    My 1-minute overview (using Tablet statistics):

    Sequel — 2,272 reads, 8.6% mismatch, 28,667 features (SNPs/INDELs)
    MinION — 220 reads, 15.3% mismatch, 700 features

  28. I’ve just retrieved all the reads that BWT thought were likely mtDNA reads from *all* of the downloadable MinION runs from cultured cells. After mapping with GraphMap (unfortunately not in circular mode due to a Segfault bug), the error rate is worse, and the numbers of features are in excess of the Sequel features:

    MinION — 2,386 reads, 25.4% mismatch, 38,312 features (SNPs/INDELs)
    Sequel — 2,272 reads, 8.6% mismatch, 28,667 features

    The absolute mismatch error is less important than the relative error; mileage may vary depending on how reads are mapped to sequence. It’s also possible that a better set of MinION reads could be chosen by using the PacBio mapper, or a worse set of PacBio reads by using BWA mem.

    But what about consensus?

    For the MinION reads, these are all the mitochondrial sequence positions that had a reference read coverage that was not the highest coverage:

    $ zcat proportion_GraphMap_MinION_cells_all_vs_hg38_chrM.csv.gz | awk -F',' '{if(($6 == "pR") || ($6 < $7) || ($6 < $8) || ($6 < $9) || ($6 < $10) || ($6 < $11) || ($6 < $12)){print $0}}'

    And here are the same statistics for the Sequel reads (i.e. positions with reference coverage not max coverage):

    $ zcat proportion_GraphMap_Sequel_vs_hg38_chrM.csv.gz | awk -F',' '{if(($6 == "pR") || ($6 < $7) || ($6 < $8) || ($6 < $9)|| ($6 < $10)|| ($6 < $11)|| ($6 Del
    7337: G -> A/Del (MinION); G -> A (Sequel)
    13326: T -> C
    14831: G -> A
    14872: C -> T
    15326: A -> G/Del (MinION); A -> G (Sequel)

    So Sequel is producing fewer errors both at a single read level, and also in consensus.

    See BAM and proportion files here:


  29. Last bit of my post was munched…

    And here are the same statistics for the Sequel reads (i.e. positions with reference coverage not max coverage):

    $ zcat proportion_GraphMap_Sequel_vs_hg38_chrM.csv.gz | awk -F’,’ ‘{if(($6 == “pR”) || ($6 < $7) || ($6 < $8) || ($6 < $9)|| ($6 < $10)|| ($6 < $11)|| ($6 Del
    7337: G -> A/Del (MinION); G -> A (Sequel)
    13326: T -> C
    14831: G -> A
    14872: C -> T
    15326: A -> G/Del (MinION); A -> G (Sequel)

    So Sequel is producing fewer errors both at a single read level, and also in consensus.

    See BAM and proportion files here:


  30. Interesting analysis. So Sequel still produces better quality data. But the gap is closing.

  31. Yawn. Wake me when Oxford can show something both independent and peer reviewed. …

  32. Hi, thanks for that. There was a neat table on Twitter comparing PacBio vs nanopore for Streptomyces coelicolor.

    A number of commentators pointed out the errors in SNPs and Indels vs the ref.
    SNPs vs refs: 88(Pacb) vs 1095 (ont)
    Indels vs refs: 58(Pacb) vs 6044 (ont)

    Would it be possible to generate a table from your analysis above for this human genome? I guess your numbers don’t account for 10x PacBio vs 20x ONT in this analysis, right? Or do we interpret this as PacBio having fewer errors even at 10x than ONT at 20x? Although one could argue they converge with higher coverage…

  33. At least for mitochondrial DNA, PacBio is showing fewer errors than ONT, even in the consensus sequence at a similar coverage. I’ll have another try with that PacBio table using HTML codes instead of “<” and “>”:

    $ zcat GraphMap_Sequel_vs_hg38_chrM.proportion.csv.gz | awk -F',' '{if(($6 == "pR") || ($6 < $7) || ($6 < $8) || ($6 < $9) || ($6 < $10) || ($6 < $11) || ($6 < $12)){print $0}}'
    Assembly,Position,Coverage,ref,cR,pR,A,C,G,T,d,i,InsMode

    The variants shown were the ones that varied from the hg38 reference in both sequencing technologies:

    7337: G -> A/Del (MinION); G -> A (Sequel)
    13326: T -> C
    14831: G -> A
    14872: C -> T
    15326: A -> G/Del (MinION); A -> G (Sequel)

  34. If Oxford Nanopore Technologies is showing this thing that you want to be woken up for, it won’t be independent research. Most larger research projects are probably going to be discounted by ONT in some fashion because they want to encourage research on their technology.

    What I’m trying to say is that if you want to wait until discoveries around ONT technologies are truly independent, you might as well go into suspended animation.

    If, on the other hand, you are comfortable with only peer review, just do a search:


    The rate of technology improvement by ONT is such that most peer-reviewed articles appear woefully out-of-date even at the time of publication. Consider that the average yield from a good working MinION flow cell at the start of this year (2016) was around 100-200Mb.

    Peer reviewed R9.4 articles are probably going to show up in the next few months, by which time ONT will have brought out a faster chemistry and/or software update to improve yield and accuracy. I’m not quite sure why technology improvement would put anyone to sleep (unless it were for suspended animation), but each to their own, I guess.

  35. Has anyone done amplicon sequencing on the minion? Like, short-read amplicons. Can consensus over many short reads improve accuracy, and by how much? Is there any requirement for the DNA length to be reasonably sequenced on the Minion?

  36. Would the termination of the Roche deal slow down or accelerate developments at PacBio for the research market?

  37. Frankly speaking, I would say that in the end science will win.
    Instead of speculating on which one is better, I would say that this battle is great, and note that we are not discussing Illumina anymore, which suggests that a new era in NGS is coming. Maybe the two will each have their market place, who knows; maybe they will end up merging one day. But it’s great that we are moving to long read sequencing at relatively low cost. I’ve been waiting for this for a long time.

  38. biomickwatson

    20th December 2016 at 9:05 am

    Yes they have but unsure it’s a good idea. E.g. 16S OTU analysis is often done using 3% difference, yet on MinION error is 10%. Seems like a recipe for disaster.

  39. We’ve done a little bit. It really depends on the application whether or not the MinION will be effective. With 20,000 reads split over about 20 gene copies of a ~600bp sequence with 1-5% difference (about 30-90 mins of sequencing, depending on the flow cell), we were able to do rough counting of approximate proportion, but linking a specific read to a specific gene was difficult. With a bit more bayesian chops, I can imagine it could be done.

    For 16S sequencing, the sequenced length should be as long as possible (ideally full length), and even then evidence should be collected from multiple points in the reads to get an idea of the sample composition. Kraken is fairly good at this, but only when the sample is composed of things that have been seen before; it needs a well-described database, and any unknown sequences will probably get tagged as something else.

    When looking for SNPs (or VNTR lengths) in a single sequence, then the MinION should be fine at ~100X coverage (which is very easily and quickly done for amplicons). INDELs of only a few bases don’t really work, except when digging deep into the signal-level data; there’s too much systematic error in the software base calling model to reliably determine the presence or absence of a particular INDEL.

    De-novo sequencing from amplicons needs evidence from other more accurate reads to work properly. The MinION is great for generating initial scaffolds, but they need to be cleaned up a bit before being set in stone (or paper).

  40. Almost nothing from this is useful for a hospital, and that is the sad part IMO. All people talk about is how fast your engine goes, and no one in genomics offers a turn-key detect/read/call system for hospital lab techs. This is techy talk; really no one wins anything in real clinical life…
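The awk filter used in the mitochondrial comparison (comments 28 and 33) keeps positions where the value in the pR column is lower than at least one of the A, C, G, T, d or i columns. Below is a rough Python equivalent, offered as a sketch only. It assumes the columns hold per-position proportions for the reference and for each allele, which is an interpretation of the header shown in comment 33 rather than something the thread confirms. The function name and the two data rows are made up.

```python
import csv
import io

# Columns compared against pR, following the header quoted in comment 33.
ALLELE_COLUMNS = ("A", "C", "G", "T", "d", "i")


def non_reference_positions(csv_text):
    """Return rows where the reference value (pR) is below at least one allele column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = []
    for row in reader:
        ref_value = float(row["pR"])
        if any(ref_value < float(row[column]) for column in ALLELE_COLUMNS):
            kept.append(row)
    return kept


# Hypothetical rows for illustration; these values are not from the real proportion files.
example = (
    "Assembly,Position,Coverage,ref,cR,pR,A,C,G,T,d,i,InsMode\n"
    "chrM,7337,100,G,10,0.10,0.85,0.00,0.10,0.00,0.05,0.00,\n"
    "chrM,100,100,A,98,0.98,0.98,0.01,0.00,0.01,0.00,0.00,\n"
)
for row in non_reference_positions(example):
    print(row["Position"], row["ref"], row["pR"])
```

Unlike the awk one-liner, which also prints the header line through its `$6 == "pR"` test, this version returns only the data rows.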


© 2017 Opiniomics
