UPDATE 7th April 2015: the licensing of GATK to commercial partners will now be carried out directly by Broad, with Appistry not involved.

I realised I was writing a poorly structured rant, and so instead of writing this really long, angry, idiotic blog post, I’m going to put a list of (short) bullet points.

This post is about GATK and the news that commercial entities will require a license to use the software.  This is why I feel that the licensing of GATK to commercial entities is bad:

1. Bioinformatics is special

In fact, it’s amazing.  More often than not, we completely open our code, and let anyone use it for any purpose.  This is awesome.  It means our methods are there for anyone to examine, in all their visceral glory/ugliness.  No other scientific method is as open to scrutiny, which is a fantastic achievement.

2. We give away our competitive advantage

Because we adhere to (1) so much, we are giving away our competitive advantage.  This benefits science immensely.  Instead of those fantastic methods sitting in the hands of the few that developed them, they are now available for everyone to use.  I’ll say it again: this benefits science immensely.  It happens far less in the lab.  Lab techniques are published, yes, but stories of others being unable to reproduce results are common.  Lab techniques can be poorly described, or crucial details left out.  Not so in open-source bioinformatics.  We do something amazing that allows us to crunch data in a way that no one else has ever been able to do – and then we give it away.  So that everyone can benefit from it.  Just stop and think about that for a second.

Seriously, it’s incredible that we do this, and I want to protect it.

3. Commercial science is not bad science

We need commercial entities doing science.  It’s easy to sit in our academic ivory towers and believe we are some kind of benevolent force of nature, carrying out science for the good of humanity, and that commercial science is of a lower order.  It’s not true.  We need commercial companies, we need drugs and treatments, and disease-resistant crops, and we need men and women who are driven by money and commercial success to develop them.

Commercial entities create jobs, they employ our friends and family, they drive the economy.

There is absolutely nothing wrong with academic science publishing code that benefits commercial companies.  And there is nothing wrong with us doing that freely, with no expectation of a return on our efforts.

Just as we are doing good by giving our papers and our code freely to the public, we are also doing good by giving our code and papers freely to commercial companies.

4. Not all commercial companies make $billions

This is fairly obvious.  Sure, you might think that GSK or Roche can spare a few $1000s to buy a GATK license, but others will be small start-ups, spin-outs, people who are just trying to get ahead, trying to make a living.  Small companies with just a few employees, trying to make a difference in a competitive market.

Why should we not let them benefit from our algorithms in the same way academic science does?  If they end up curing a disease, then we have all won.  Humanity has won.  And if they make a little money out of that, is that a bad thing?  No.

5. Broad’s current model won’t work

Currently, academic institutes who collaborate with a commercial entity will not need a license to do so.  This will change, rapidly.  It has to.  Otherwise every single commercial entity will simply partner up with an academic to avoid the license fee.  When Broad realise this, they will change the license.  Again.

6. Support is not the issue

The stated problem is that supporting commercial entities was becoming difficult on limited resources, hence the license.  If support was the issue, then open up the code, give it away free and then charge for support.  Red Hat do this.  MySQL do this.  It works.  I have asked why Broad do not take up this model a few times, and I have not received a valid reply.

7. The role of Broad is not to make money

Their role is to carry out great science (which they do) that benefits the USA, and the rest of the World.  That’s why governments fund academic research, to do stuff that’s important that will benefit us all.

They’re not there to sell stuff.  They’re not there to make money.

You’re going to challenge me on this and say that every university has an office dedicated to the commercialisation of research.  Sure.  So spin out GATK as a software company and sell it.  See how far you get.  See how much money you don’t make.  And then see how much everyone has failed to benefit from the process.

Ask yourself this: Why don’t the NCBI sell BLAST?  Why doesn’t EBI license Ensembl?  Why didn’t Sanger sell BWA to companies?

Because it’s not their job to sell stuff.  It’s their job to carry out science that benefits everyone, commercial companies included.

And yes, I’m aware that WUBLAST was commercialised: http://www.advbiocomp.com/blast.html.

Let’s all just sit and think about what an amazing success that has been.

Update – 13:20 GMT, 28/01/2013

It has been pointed out by Daniel MacArthur, and others agree with him, that part of the problem is start-up companies who add very little value and simply try to make money from OA software with a pretty front-end/GUI.


Whilst at first glance, this argument seems to stand up, actually if you work through the options, it really doesn’t.

Firstly, I am yet to see evidence these companies exist.

Secondly, if they can offer interpretation, customer service and data delivery through a nice web front-end – what is wrong with that?  They are still offering something Broad does not.

Thirdly, there are other options to GATK.  There are other free tools that do what GATK does.  If these companies truly add no value and are simply after a fast buck, they will just use something else.

Perhaps that is what Broad want – but why would you want that?  There are three possibilities here:

  1. The other tools are better than GATK.  Then Broad have nothing to sell.
  2. The other tools are as good as GATK.  Then Broad have nothing to sell.
  3. GATK is better than everything else.

If (3) is the case (and it’s a big, unproven “if”), then all Broad are doing is denying paying members of the public access to the best algorithm.  As I said above, if these “vampiric” companies exist solely to offer free tools to a gullible public, then they won’t pay for GATK, they’ll just use Samtools/DINDEL/SoapSNP etc etc etc.  So instead of those gullible members of the public getting access to the best data, they get access to data that is slightly less good.  The companies still exist.  The public still pays.

You haven’t changed anything except that fewer people benefit from GATK.