Should be a popular one this 🙂
First of all we need an example. I’m using R23456, which we can download from NCBI. On that page, look towards the top-right, click “Send To”, choose “File”, leave format as “GenBank (full)” and click “Create File”. This should download to your computer.
In Excel, click File -> Open, navigate to the folder you downloaded the GenBank sequence to, make sure “All files (*.*)” is chosen in the File Open dialogue box and choose the recently downloaded file (it is almost certainly called sequence.gb)
This should immediately open up the Excel “Text Import Wizard”. Just hit Finish. You should see something like:
Now, scroll down to the sequence data, which in my version is in cells A55 to A61. Select all of those cells.
Select the Data menu and then click “Text to columns”
This brings up another wizard, but again you can just click finish. The sequence data is now in B55 through E61.
Now click in cell I55 and type “=B55&C55&D55&E55&F55&G55”. Hit enter. Now click on the little green square and drag down to I61 so we have sequence data for all of the cells.
Now we’re getting close!
Click cell A1 and then repeat the “Text to Columns” trick.
Now go back to down to cell I54 and type “=’>’&B1” and hit enter. We now have something that looks a lot like FASTA!
Final step – select all cells in the range I54:I61, hit Ctrl-C, then hit the “+” at the bottom to add a new sheet, and in the new sheet hover over cell A1, right-click your mouse and choose Paste Special.
In the “Paste Special” dialogue click the radio button next to “Values” and hit OK.
Now click File -> Save As, navigate to a suitable folder, make sure “Save as type” is set to “Text (Tab delimited) (*.txt)”, give it a filename and hit Save. Click “OK” and “Yes” to Excel’s next two questions.
Go look at the file, it’s FASTA!
Awesome! Who needs bioinformaticians eh?
(and if you use this, please never, ever speak to me. Thanks 🙂)