bioinformatics, genomes, biology etc. "I don't mean to sound angry and cynical, but I am, so that's how it comes across"

How not to make your papers replicable

Titus Brown has written a somewhat idealistic post on replicable bioinformatics papers, so I thought I would write some of my ideas down too 🙂

1. Create a new folder to hold all of the results/analysis.  Probably the best thing to do is name it after yourself, so try “Dave” or “Kelly”.  If you already have folders with those names, just add an index number, e.g. “Dave378” or “Kelly5142”.

2. Put all of your analysis in a Perl script.  Call this analysis.pl.  If you need to update this script, don’t add a version number, simply call these new scripts “newanalysis.pl”, “latestanalysis.pl”, “newnewanalysis.pl”, “newestanalysis.pl” etc etc

3. Place a README in the directory.  Don’t put anything in the README.  Your PI will simply check that the README is there and be satisfied you are doing reproducible research.  They won’t ever read the README.

4. Write the paper in Word, called “paper.docx”.  Send it round to all co-authors, asking them to turn on track changes.  Watch in horror as 500 different versions come back to you, called things like “paper_edited.docx”, “paper_mw.docx”, “paper_new.docx” etc etc.  Open each one to see that it now looks like Salvador Dalí had an epileptic fit in a paint factory.

5. When reviewer comments come back 6 months later asking for some small detail to be changed, have a massive panic attack as you realise you have no idea how you did any of it.  Start the whole analysis again, in a new folder (“Dave379” or “Kelly5143”) and pray to God that you somehow miraculously come up with the same results and figures.

6. After the paper has been accepted, and the copy editor insists that all figures are 1200 dpi, first look up dpi so you know what it means, and then wrestle with R’s png() and jpeg() functions.  Watch as your PC grinds away for 300 hours to produce a scatterplot that, in area, is roughly the size of Russia and comes in at 30 TB.  Attempts to open it in an image viewer crash your entire network.

7. Weep silently with joy when someone tells you about ImageMagick, or that the journal will accept PDF images.

8. Upon publication, forget any of this ever happened.
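
For anyone currently stuck at point 6, the arithmetic behind the Russia-sized scatterplot is simple enough: pixels are just inches times dpi, and the raw file size follows from that. A minimal sketch in Python rather than R; the function names and the 7 x 5 inch figure size are my own assumptions for illustration, not anything a journal specifies.

```python
def pixel_dimensions(width_in, height_in, dpi):
    """Return (width_px, height_px) for a figure of the given physical size in inches."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_bytes(width_px, height_px, bytes_per_pixel=3):
    """Raw image size in bytes, assuming 8-bit RGB (3 bytes per pixel) by default."""
    return width_px * height_px * bytes_per_pixel


if __name__ == "__main__":
    # Hypothetical 7 x 5 inch figure at the 1200 dpi the copy editor asked for.
    w, h = pixel_dimensions(7, 5, 1200)
    print(w, h)  # 8400 6000
    print(uncompressed_bytes(w, h) / 1e6, "MB before compression")
```

So a sensibly sized figure at 1200 dpi is big but nowhere near terabytes. The usual way to get a monster is to mix up the units: R’s png() takes width and height in pixels by default, with `units` and `res` arguments to change that, so passing a pixel count while asking for inches (or the reverse) goes wrong very quickly.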


  1. Hi Mick,

    Regarding your fourth point, what would be the better alternative? I’ve always marvelled at how there’s no real version control software tailored to written documents. Just like software development, writing is an iterative and often collaborative process. I’ve found that if you want to get general comments from a large group of people, the best thing is to create a public (but unlisted) Google Docs document that anyone can edit. That way every comment and edit is out in the open and the final result is not an incomprehensible, shambolic mess.

  2. One potential alternative is Google Docs, where you can track changes and actually see changes made by others in real time.

    But document management is a mature process. There are many tools out there that allow document management, including code-like tools such as checking out, branching and merging.

  3. IPython notebooks. Done.

  4. Reblogged this on iscb-rsg-switzerland and commented:
    Do you have some really bad habits in your computational research?


© 2018 Opiniomics
