A quick micro-post! I was just having coffee with a guy from EPCC (Edinburgh Parallel Computing Centre), and he said: “We have lots of compute power, and you guys have lots of data, it’s a marriage made in heaven!” (By “you guys”, he meant biologists.)
And of course he is right, but also, sadly, wrong. This is how conversations tend to go between HPC providers and biologists:
- Biologist: I have lots of data, please help! Can I use your HPC?
- HPC: Yes, of course you can. What resources do you need? How much RAM? How many processors?
- Biologist: Erm, I don’t know, and I won’t know until I run the software and complete the analysis.
- HPC: Hmmm, well, you can’t have access until you know how much you want…
Note: This is not a criticism of EPCC! I have had the above conversation about 30 times with different HPC providers, going back to when the cloud was called the grid. It’s a perpetual problem.
So, what’s the solution? Why isn’t this a marriage made in heaven? Comments please!