A quick micro-post! I was just having coffee with a guy from EPCC (Edinburgh Parallel Computing Centre), and he said: “We have lots of compute power, and you guys have lots of data, it’s a marriage made in heaven!” (By “you guys”, he meant biologists.)

And of course he is right, but also, sadly, wrong. This is how conversations tend to go between HPC providers and biologists:

  • Biologist: I have lots of data, please help!  Can I use your HPC?
  • HPC: Yes, of course you can.  What resources do you need?  How much RAM?  How many processors?
  • Biologist: Erm, I don’t know, and I won’t know until I run the software and complete the analysis.
  • HPC: Hmmm, well, you can’t have access until you know how much you want…

Note: This is not a criticism of EPCC!  I have had the above conversation about 30 times with different HPC providers, going back to when the cloud was called the grid.  It’s a perpetual problem.

So, what’s the solution?  Why isn’t this a marriage made in heaven?  Comments please!