
Call the bioinformatics police!

First of all, I want to state quite clearly that I am not a code Nazi.  I don’t care about your coding practices.  Good architecture, an elegant object model, a stable API, version control, efficient code reuse, efficient code etc etc.  I don’t care.  Write all the unit tests you like, because if they fail, I’ll just force the install anyway.  I don’t care if you used extreme programming, whether or not you involved Tibetan monks and had your github repository blessed by the Dalai Lama.  Maybe you ensured the planets were aligned before you released version 0.1, or made sure all of your code monkeys had perfect Feng Shui in their bedsits.  I don’t care.  That don’t impress me much.

However, I do care that your code goddamn works.

I think, as a scientist, that if I take some published code, it should work.  Not much to ask, is it?  Sure, a readme.txt or a manual.pdf would be nice too, but first and foremost, it has to just do the eff-ing job it’s supposed to.

Very recently, I’ve had the huge displeasure of downloading, installing and trying to run a published bioinformatics tool – and believe me, this one is a doozy.

I’m not going to name names (see below) but here are a few features that make me angry:

  • This thing has been published twice, once in 2010, once in 2012, both times in journals that are pretty much a dream for most bioinformaticians, both with double-digit impact factors
  • There is no manual, no README, no INSTALL.txt – nothing
  • The main code calls another script, which calls a few more scripts, which call other scripts, etc. etc.  This is not easy to debug.
  • It outputs a number of text formats, all of which are completely novel and completely undocumented.
  • There is some help printed to the command-line, but that help contradicts itself
  • There are hard-coded URLs and filenames in the code
  • The code relies on filename and unique identifier conventions that have long since been retired
  • The vast majority of comments in the code are lines of code that have been commented out
  • It doesn’t work on its own example data

*sigh*

The thing is, I really want this code to work.  I have some data, this code would help, and damn it, I don’t have time to fix the thing!

I know what you’re going to say….

If everyone cared about good coding practice, and if everyone adhered to “the rules”, then we wouldn’t be in this position, would we?  Well, yes and no.  I take your point: we should *all* follow rules for writing good code.  However, let’s not forget, we have full-time jobs as scientists.  Being a good coder is also a full-time job.  It’s fairly difficult to hold down two full-time jobs, so often something has to give, and my attitude is that as long as the code does the job it’s supposed to do, that is the higher priority.  Some of these thoughts came out in a discussion with C Titus Brown on his blog.  What I was trying to say is that, given a choice between spending two weeks streamlining code and making it more efficient, or spending two weeks interpreting the information that your code has produced in a biological context, I’d choose the latter.  Why?  Because that’s why we are here – to do science.  I’ve met far too many bioinformaticians who’ve forgotten they’re also supposed to be biologists, not just coders.

I know many of you will disagree, but it would be boring if we all had the same opinion, right?

So why won’t I name and shame?

Well, it seems like a personal attack, and I don’t want that.  This group is by no means alone, and why would I single one group out when many are bad at it?  We’re all in the same boat.  We’re all being pressured to publish, REF2014 is on the horizon, and I feel a lot of kinship with my fellow PIs; I want their papers to get published, I want them to have high-impact papers, I want them to feel secure and happy.  Who knows why the code is so bad?  We’ve all had post-docs who couldn’t tie their own shoe-laces, never mind write papers/code, and we still have to publish.  I happen to know the PI involved; they’re not a bad scientist, they’re just responding to the environment in which they exist.

Why am I bothering to write this then?

It’s cathartic; I am angry at the world and the way everything is set up to allow – no, to encourage – this type of work.  I’m not angry at the PI – they’re just responding to the pressures we all feel.

A completely unworkable solution

This wouldn’t be a blog post if I didn’t propose a completely unworkable solution that assumes a perfect world full of nice fluffy people.  Anyway, here it is: the bioinformatics police.

The bioinformatics police are a voluntary organisation of kindly, helpful, experienced bioinformaticians.  Every month, the entire bioinformatics community votes for a piece of code or software tool that they would like the bioinformatics police to investigate.  When one is selected, three members of the bioinformatics police independently try to download, install and run the code/tool on a vanilla Ubuntu install.  If they are successful, they document how it is done.  If they are not successful, they contact the PI and work with them to improve the code until it is ready for the public to enjoy.
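For what it’s worth, that first check needn’t be elaborate.  Something like the sketch below (Python, and entirely hypothetical – the tool name, example data and output path are placeholders, not any real published tool) is roughly the level of scrutiny I have in mind: run the thing on its own example data and see whether it falls over.

    import subprocess
    import sys
    from pathlib import Path

    # All names here are made up for illustration; substitute the tool under review.
    TOOL = "./run_tool.sh"                      # hypothetical entry point
    EXAMPLE_INPUT = Path("example/input.fa")    # hypothetical example data shipped with the tool
    OUTPUT = Path("example/output.txt")         # hypothetical output the documentation promises

    def smoke_test() -> bool:
        """Does the tool run on its own example data and produce some output?"""
        if not EXAMPLE_INPUT.exists():
            print("FAIL: no example data shipped with the tool")
            return False
        result = subprocess.run(
            [TOOL, str(EXAMPLE_INPUT), "-o", str(OUTPUT)],
            capture_output=True, text=True, timeout=600,
        )
        if result.returncode != 0:
            print(f"FAIL: exited with status {result.returncode}")
            print(result.stderr[-500:])         # keep the tail of stderr for the report
            return False
        if not OUTPUT.exists() or OUTPUT.stat().st_size == 0:
            print("FAIL: ran to completion but produced no output")
            return False
        print("PASS: ran on its own example data and produced non-empty output")
        return True

    if __name__ == "__main__":
        sys.exit(0 if smoke_test() else 1)

If that passes on a vanilla Ubuntu install, the testers write down how they did it; if it fails, the output above is the starting point for working with the PI.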

How does that sound?  I bet you thought I was going to suggest retraction, hanging or public humiliation, didn’t you?

Here’s to a perfect world!

Update – 19th January 2013

Just a quick update on the above, which caused quite a stir in certain circles.  I thought it was quite obvious that I wrote the above whilst very angry, and, given my mention of the Dalai Lama and Feng Shui, that I was being somewhat sarcastic.  It’s really important that you don’t completely miss the point, though 🙂

For the record, I’m not for one moment suggesting that good coding practices aren’t important.  Of course they are, and in an ideal world, everyone would use them.  My point is that they are not the most important aspects of scientific software.

Consider the following fictitious situation.  Software A is developed according to all of the rules of good coding practice; it is efficient and well documented… but it turns out that the algorithm used is flawed and, more often than not, the answers the code produces are wrong.  In contrast, software B is a hacky Perl script which reads entire genomic datasets into a hash, uses tons of RAM and has no comments in the code to speak of.  However, the algorithm in this script is much better, and more often than not, the answers it produces are correct.

Which do you want to use on your data?  Which should get published?  On the latter point, I’m betting that most peer reviewers judge whether or not software reaches the correct answer, and pay little attention to the underlying coding practices.  I have no problem if you want to change this paradigm and include good coding practice in the peer review process – but priority number one has to be the science.  It is absurd to suggest otherwise, as we may then see well-engineered software being published that serves up incorrect answers (again, you might say “better than poorly engineered software that gets the answers wrong”, and I would agree – that’s what this post is about :))

Some of you may point out that well-engineered and tested code is less likely to be buggy and less likely to make mistakes, and I would agree, up to a point.  But no amount of good coding can make up for bad biology, and so I maintain that in scientific software, the science is more important than the coding practices.  The latter, of course, are still important 🙂

38 Comments

  1. I feel the problem stems from there being no single organization to take the reins and really be the conduit for all bioinformatics.  For a long time I have hoped that Illumina would really become the development center on this, but that hasn’t really happened.  Thus the problem you described: everyone writes their own from scratch, which is never a good thing.  I think when we get a central organization that we can work within the confines of, things will get better.

  2. Having encountered way too much crappy bioinformatics software, I feel your pain. But I do think that using good coding practices would be the way to go. Because many of the problems I’ve seen in bioinformatics software did not lead to the software failing completely – far worse, it led to the software _appearing_ to work, but producing wrong results, in subtle ways. Transient and subtle memory corruption errors, silently using standard file formats in slightly non-standard ways, you name it. When software fails completely, at least you know there’s something wrong.

    You’re right, of course, that writing robust, dependable code is a full-time job. An obvious solution would be to concede to that and _make_ it a full-time job. I do realize that this may be somewhat wishful thinking, what with budget cuts and all, but it might also be an issue of priorities. Because after all, in bench work, there are TAs – so why doesn’t bioinformatics have TAs, too?

    I’m not entirely sure how serious you are with the bioinformatics police thing. Either way, it doesn’t change the fact that developing dependable software is a full time job. Plus, it’s _a lot_ more work to fix a piece of shoddy software afterwards, than it is to develop robust software in the first place.

    But now that I think about it, it might still be nice to have a volunteer task force that rewrites extant, crappy bioinformatics software. Assuming such a task force existed, which software would be highest on your personal list to be targeted for reengineering?

    • I just realized that I probably used the wrong term – TA is a Teaching Assistant, no? I was thinking of Technical Assistant, or possibly lab technician.

    • One of the issues is REF and the way we are assessed.  Producing good code should be recognised as an impact and funded, but it isn’t; the papers you publish are.  Therefore the emphasis is on papers, not software, and on lots of papers, not continued development of the same tool.

      The issue of code being used inappropriately, or of errors occurring silently without being reported, has come up a few times.  This is rife and I have no solution.  For a long time, I felt that too many people in bioinformatics thought the answer to every question was to run BLAST, with default parameters.  With more complicated algorithms, the opportunities for misuse are far more abundant.

  3. This is the kind of situation that led to Potti.  Do you really think that the original authors understand how their own code works?  Or is this analysis the product of a single programmer working in isolation, who is rewarded with big smiles, featured lab presentations and ultimately publications when the analysis “works” and produces striking results, but gets a cold shoulder when negative findings are presented?  Given the pressure to produce more results and the obviously stressful and time-consuming process of maintaining and evolving code like this, it is way too easy for coders to fool themselves.

    • My impression of this code is that it was very much developed for internal use, but then the pressure to publish made the PI put it out there, and it’s not really built for public consumption

  4. Where I work this is called QA.

  5. When are there openings for this job?

  6. I’m the sysadmin for a bioinformatics group and I’d happily join the bioinformatics police, because there are a lot of software packages whose installation procedures are an atrocious waste of time.  It’s not that bloody hard to document software and make it easy to install.

    • But scientists are not rewarded for doing it, nor are they punished for not doing it

      • I’m well aware of this (I was once one of you). But it’s still immensely frustrating.

        Bad development practices don’t just hurt other users of your code.  They also lead to situations where you have pipelines or externally-facing web servers that are impossible to maintain.

      • Well, you are punished if your software doesn’t work or is difficult to get to work: people don’t use it (as much).

        The bioinformatics police would just bring this fact to people’s attention.

    • Sorry Tom.  You are making us better step by step, even if it appears to be a Sisyphean task at times. 🙂

      If software is to be published then it should be simple and straightforward to install and run on the test data.  Not some arbitrary tarball with no docs, whose dependency hell is impossible to satisfy.  We did threaten to use our PI as the ultimate arbiter of whether code was to be published, by having him install and run it without assistance.

  7. I agree that a tool should work as a priority and the rest is an added bonus. However, this post highlights a more serious problem: how did this tool pass review twice in high impact factor [sic] journals? Assuming there isn’t a problem at your end (of course not!), it suggests at least six reviewers didn’t test the code. This is a failing on the reviewers and journals.

    This will keep happening until journals insist that code goes through some minimal testing.

    I think the journals/reviewers should become the bioinformatics police.

    • I agree in part, but realistically, peer reviewers won’t do this. I didn’t for my most recent review for Bioinformatics. Do you download, install and run every piece of code you review? Perhaps journals should demand it.

  8. I get your frustration at bad code that won’t install.  But at least you know from the off that it’s a dud.  Worse is code that installs, runs and produces an answer, but one that bears no resemblance to what the published paper says it should.  I’d say the second kind is worse by an order of magnitude; it’s only after chasing false leads and lots of head-scratching about inconsistent results that you think to start debugging the bad code.

    What I don’t get is why you think good coding practice won’t help.  OK, if someone manages to publish code (twice!) that doesn’t even install, then all the unit tests in the world wouldn’t help.  But why would taking some time to think about objects, data structures, test data and documentation not help generally?  Until this becomes the norm in this field, we will continue to face failing, unmaintained, non-extensible code.

    • It’s not that I think good practice won’t help; I just recognise that as scientists we have other stuff to do, and we can’t do full-time software engineering plus data analysis, plus interpretation, plus writing papers and grants.  The compromise is that we write software that works, but may not be perfectly engineered.

  9. The goal of Software Carpentry (http://software-carpentry.org) is to teach working scientists who already have too much to do a few basic coding practices so that they don’t wind up in this mess. We run two-day workshops whenever and wherever we can, and we’d be very happy to run some for would-be bioinformatics patrol officers 🙂 If you’re interested, please mail info@software-carpentry.org for more details.

  10. In my opinion there is a conceptual discrepancy between ‘scientific software’ and ‘research outcome’.  In the bioinformatics community, published software is seen as a piece of finished work.  Once it is published, the wheel spins on, and the authors move to the next project.
    But as soon as you publish a program, the life cycle of the software has only begun.  OK, if the software is useless it will be quickly forgotten.  But if it turns out to be interesting you will have users.  Users will report bugs, come up with suggestions on how the program might work better for them, and request attention from the authors.  Now the authors face a dilemma: slow their new projects in order to take care of their user base, or put off their peers.  Most scientists try to do both – welcome to software quality hell!
    Making software is a commitment that does not stop with an accepted publication.  But how can you maintain good software and still do good science?  I think you need to either a) focus on making a few, ultra-sharp tools, b) focus on the biological target, or c) share your code not after a year when the application note is written, but rather when you have a first dirty script after a week and can prove it provides some added value (see “Lean Entrepreneurship” on the latter).
    Finally, as a reader or reviewer, when you read the sentence in a manuscript “We believe our tool X will be useful in area Y”, you can safely read it as “This is the first time we have exposed our program to the public, and we have no idea whether it has practical value because we have no data to back up our claims.”  Ask for data, as you would in any research.  Your job as a policeman will be done, then.

  11. You may be interested in the Bioinformatics Testing Consortium, then: http://bytesizebio.net/index.php/2012/08/24/can-we-make-research-software-accountable/ (Scroll down to last paragraph for the business bit) it may be actually workable… also, more thoughts: http://bytesizebio.net/index.php/2012/09/04/should-research-code-be-released-as-part-of-the-peer-review-process/

  12. Do you (author and readers) actually want this to go somewhere, or is it just an open “what if” discussion?  I ask this honestly, because I really think this is a workable and possibly fruitful idea.
    In my opinion, community-based review carried out by a community-run group is the best way to keep track of what is done by the community itself (can I use that word one more time?).
    Your idea of selecting a piece of software every month and three people to test it is great.
    Any thoughts on how these testers should be selected?  I think anyone should have the chance, but this should be an open process, and the software creator(s) must also be given a space to respond publicly.
    I think a forum format would be good, one thread per software tool.  The first posts would be from the testers, and then the thread could be opened for other users or the programmers to discuss the issues raised.
    Any more ideas?  Or am I the only one who thinks this could work?

    • I’m glad you think this is a good idea, however, my initial feeling is that we all have enough to do already (work + peer review + teaching) and that asking for volunteers to test and install software is a step too far. However, if you want to take the ball and run with it, please do so 🙂 A forum or wiki would work, and perhaps SeqAnswers is a good place to start?

  13. “Works” is not a simple yes/no concept. I think there’s a hierarchy:
    0. Totally broken, doesn’t work at all.
    1. Produces plausible results, sometimes, with lots of manual hacking by the original author, in a single, heavily customised environment. Not at all portable.
    2. Can install somewhere other than its original home, and run without crashing. No guarantees on output quality.
    3. As for 2, but with some unit tests, including some sort of realistic input data.
    4. As for 3, but with very thorough unit tests, exploration of failure modes, etc.
    5. As for 4, but effectively handles all kinds of “real world” data. Gold standard, very reliable, seldom fails and provides useful troubleshooting output if it does.

    For (5) I’m thinking of something like Samtools — which probably needs a long time with a significant user base and somebody actively maintaining the code. I think journals should be asking at least for stage (3) before publication, but it sounds as if they are content with stage (1).

  14. I share your frustration with poor or broken bioinformatics software.  But I think your argument is based on a false premise.  The use of “good software practices” is not the goal.  The goal is working software – but writing good, working software is *hard*.  That is the whole point of best software practices: they help guide the coder toward writing good software!

    So, let’s encourage the use of version control and unit tests in the hope that one day all bioinformatics software will install easily, and run cleanly!

    • Yes — in my comment above, I was thinking our host’s opening rant could be taken to mean that “can install and run without crashing” is sufficient, when in fact it is necessary but not sufficient.

      What we really need is “can install and run without crashing, *and* have reasonable assurance it is correctly doing what it claims to do.” What constitutes “reasonable assurance” is open to discussion, but in practice it is likely to be equivalent to a unit test — “when I run it with this input, it produces these outputs, which I can verify to be the correct ones”.
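      In practice, that might be as small as a check along these lines – a minimal sketch, with the tool name, input file and checksum invented purely for illustration:

        import hashlib
        import subprocess

        def test_known_input_gives_verified_output():
            # Hypothetical golden-output test: run the tool on a small, known input
            # and compare the result to a previously hand-verified output.
            result = subprocess.run(
                ["./some_tool", "tests/known_input.fa"],
                capture_output=True, timeout=300,
            )
            assert result.returncode == 0, "tool crashed on the known input"
            digest = hashlib.sha256(result.stdout).hexdigest()
            assert digest == "checksum-of-the-hand-verified-output"  # placeholder value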

  15. I think there is a silver lining on the horizon with the birth of OMICtools, an iTunes- or Google Play-like website!  The OMICtools website (http://omictools.com) allows anyone to rate bioinformatic tools, write reviews, report a problem or ask a question.  Anyone can submit tools, which are arranged in various categories.  This means that we can have all eyes on these tools, just like the new trend of open/public peer review (preprint servers).  So anyone can be part of the bioinformatics police, including both experts and newbies.
    Hitherto, one had to scan various forums such as biostars.org, SeqAnswers, stackoverflow, reddit etc to find reported problems and suggested solutions concerning particular bioinformatic tools. I hope that OMICtools can be adopted by the community to put more pressure on tool developers to maintain them well, especially when the low rating and bad reviews start to influence funding or tool adoption/usage. I know this is a long shot but a start-up culture applied to genomics would do us some good as resources like OMICtools start to come up.
    Here is the OMICtools publication http://database.oxfordjournals.org/content/2014/bau069.full?keytype=ref&ijkey=1zt9VjS71cjnLmv
    What do you think?
