Opiniomics

bioinformatics, genomes, biology etc. "I don't mean to sound angry and cynical, but I am, so that's how it comes across"

Assembling B fragilis from MinION and Illumina data

You may have seen our bioRxiv preprint about the sequencing and assembly of B fragilis using Illumina + MinION sequence data.  Well, here is how to do it yourself.

First get the data:

# MinION data (raw fast5 data; needs to be extracted)
wget ftp://ftp.sra.ebi.ac.uk/vol1/ERA463/ERA463589/oxfordnanopore_native/FAA37759_GB2974_MAP005_20150423__2D_basecalling_v1.14_2D.tar.gz
mkdir fragilis_minion
tar -xzf FAA37759_GB2974_MAP005_20150423__2D_basecalling_v1.14_2D.tar.gz -C fragilis_minion
rm fragilis_minion/*.md5

# MiSeq data (fastq data)
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR973/ERR973713/ERR973713_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR973/ERR973713/ERR973713_2.fastq.gz
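These are big downloads over FTP, so it may be worth checking that nothing got truncated before going any further. A minimal sketch using gzip's built-in test mode; the function name check_gzip_files is just something made up for this post, not part of any tool:

```shell
# Hypothetical helper: succeed only if every named file is an intact gzip stream.
check_gzip_files() {
    for f in "$@"; do
        if gzip -t "$f" 2>/dev/null; then
            echo "OK      $f"
        else
            echo "BROKEN  $f"
            return 1
        fi
    done
}

# Example call, assuming the downloads above sit in the current directory:
# check_gzip_files ERR973713_1.fastq.gz ERR973713_2.fastq.gz
```

If a file is reported as BROKEN, the simplest fix is usually to delete it and run wget again.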

Then, within R:

library(poRe)
extract.run.fasta(dir="fragilis_minion")

This will extract all sequences as FASTA into fragilis_minion/extracted.  Let’s put all the 2D reads into one file:

cat fragilis_minion/extracted/*.2D.fasta > fragilis_minion.2D.fasta
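Before assembling, a quick sanity check on the combined file does no harm. This sketch simply counts FASTA header lines, which should equal the number of 2D reads extracted; count_fasta_records is an illustrative name of my own:

```shell
# Hypothetical helper: count FASTA records (header lines start with '>').
count_fasta_records() {
    grep -c '^>' "$1"
}

# Only report if the combined file from the previous step exists.
if [ -f fragilis_minion.2D.fasta ]; then
    count_fasta_records fragilis_minion.2D.fasta
fi
```

A count of zero would suggest the extraction step did not produce any 2D reads, in which case it is worth looking inside fragilis_minion/extracted before continuing.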

And finally we are ready to assemble it:

spades.py -o spades_fragilis -1 ERR973713_1.fastq.gz -2 ERR973713_2.fastq.gz --nanopore fragilis_minion.2D.fasta
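Once SPAdes finishes, the contigs are written to contigs.fasta in the output directory. A rough way to see how the assembly turned out is to count the contigs and add up their lengths. This is only a sketch with awk, and assembly_stats is a name invented here; a dedicated tool will give you far more detail:

```shell
# Hypothetical helper: number of contigs, total bases and longest contig in a FASTA file.
assembly_stats() {
    awk '
        # On each header, close off the previous contig and start a new one.
        /^>/ { if (len > 0) { total += len; if (len > max) max = len }
               n++; len = 0; next }
        # Sequence lines: accumulate the length of the current contig.
        { len += length($0) }
        END  { if (len > 0) { total += len; if (len > max) max = len }
               printf "contigs=%d total_bp=%d longest_bp=%d\n", n, total, max }
    ' "$1"
}

# Example call, assuming the spades.py command above has completed:
# assembly_stats spades_fragilis/contigs.fasta
```

The fewer and longer the contigs, the better the MinION reads have done at bridging the repeats that short reads alone cannot resolve.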

That’s as far as I can go on my Ubuntu laptop; I will update when I get to work!

2 Comments

  1. Nice! In the preprint, you mention trimming the reads with trimmomatic. I am trying to do that but am getting different numbers of trimmed reads, probably because I am using different settings. Can you add the trimmomatic command to your post?

  2. Nice job. I can handle my sequencing data now.

© 2017 Opiniomics