I am trying to crowd-source 101 fun bioinformatics facts!  Please contribute in the comments and I will add them below:

  1. BLAST is so fast, the authors had to deliberately slow down the code so it doesn’t overheat the servers
  2. GCG, the old bioinformatics package, was named after the authors kept high-fiving each other, shouting “good code guys!”
  3. Bowtie is named so because “it is almost impossible to tie”, referring to code to avoid a “race condition” when using multiple processors
  4. TopHat is so named because it was the first spliced RNA-Seq aligner, and when it worked first time, the authors shouted “Top that!”
  5. Over 1 billion people have searched the NCBI protein database for their own name
  6. The EBI is an elaborate front-end to NCBI services
  7. The SRA (short read archive) is the best known of the archives, and not many people know or use the MRA (medium read archive), the KLRA (kinda long read archive) and the LRA (long read archive)
  8. Europe PubMed Central has only ever been accessed by people accidentally clicking on links.  100% of visitors immediately bounce to pubmed.com
  9. There are now more journals than papers
  10. The HGAP assembler is actually an elaborate front-end hiding three thousand slave labourers all running GAP4 (via @IanGoodhead)
  11. HPC actually means “Homunculus Powered Computing”, and all servers are actually just mechanical turks full of leprechauns (via @froggleston)
  12. The biosemantics.org group of LUMC is doing psipred protein folding on 1kWh household radiators (128 cores each) https://vimeo.com/122893200 (via Eric Feliksik)
  13. The ‘p’ in p-value actually stands for p-otentially interesting!
  14. Velvet is so named because @dzerbino wore velvet gloves when coding it (via @pathogenomenick)
  15. The @PacBio machines are so large because inside there’s an Illumina machine + a bioinformatician running assemblies
  16. The Cloud is actually just a cloud. That’s it. A real cloud (via @froggleston)
  17. If you plug in a @nanopore MinION and hit left,right,up,up,A,B,down, it’ll transform into a lifesize statue of @Clive_G_Brown (via @froggleston)
  18. DDBJ have their data centre in a volcano, and are basically a front for Osato Chemicals (SPECTRE) (via @SCEdmunds)
  19. Hidden Markov Models were initially developed to find Waldo shawnhymel.com/portfolio/413/
  20. 99.5% of people who cite Altschul et al have never read the paper
  21. Bioinformatics Applications Notes have to be automatically generated by the software they describe
  22. BGI exclusively publish in Nature journals because their papers are first rejected by Gigascience
  23. BGI actually only have one HiSeq, but it is made to look like hundreds by a setup of mirrors, like that bit in Enter the Dragon (via @froggleston)
  24. The consumption rate of coffee (+ beer 🍻) among bioinformaticians from around the world is increasing every year. TRUE FACT! (via @NazeefaFatima)
  25. “EBI” actually stands for “European Bureau of Investigation”. It’s a front for the EU secret service, collecting genomic info (via @klmr)
  26. There are only 3 facts in 101 (via @mcaccamo)
  27. If all you have is a hmmer everything looks like it can be resolved with Viterbi (via @mcaccamo)
  28. Hidden Markov Models are like the recipe for Kentucky Fried Chicken.  There are only three people in the world who understand small parts of how HMMs work, and only when they get together do they know the full picture
  29. The “e” in e-value stands for “excellent”, as in “that’s an excellent BLAST hit”
  30. The Burrows-Wheeler transform, used in BWA and Bowtie, saves memory by transforming the DNA sequence data into a parallel dimension, meaning it ceases to exist in 4D space/time in this Universe
  31. Base qualities are called “Phred” scores in honour of Fred Sanger, who developed DNA sequencing. #101bioinfofunfacts (via @torstenseemann)
  32. In a recent public survey of the 100 most desirable jobs, bioinformatician was a close second to astronaut (via @dynomics)
  33. Heng Li writes all his code in x86 assembly language, and uses a C decompiler before releasing it. @lh3lh3 (via @torstenseemann)
  34. The EBI secretly funds the Perl Foundation to ensure its legacy internal software infrastructure won’t collapse (via @torstenseemann)
  35. Illumina reads are short because, before the development of BaseSpace, they were delivered via Twitter (via @RoyChaudhuri)
  36. Pet Bioinformaticians are paid with cuddling #101bioinfofunfacts
  37. Python was conceived in the 1980s by @gvanrossum & named after his favourite British comedy, Monty Python’s Flying Circus
  38. The word “ELVIS” appears 35 times in human peps (GRCh38). “ELVISLIVES” appears 0 times. The king has left the genome #slowday (via @rdemes)
  39. The Tuxedo suite is so named because only the ‘privileged’ know how to use it! #bioinformaticsfun
  40. It’s easy! You only have to download this database in which all the genes have only one ID and you can retrieve the IDs in the most important databases (via @jorjial)
  41. If you stand in front of a mirror and say ‘HiSeq’ 3 times, an Illumina staff member will show up holding the HiSeq X Ten system (via @NazeefaFatima)
  42. @BenLangmead wrote Bowtie while wearing a tuxedo but he did all the testing in zip-up onesie Batman pajamas (via @coletrapnell)
  43. Spike-ins are like gold (via @nomad421)
  44. Do you need more hard disk space to store the data and do the analysis? Sure! Let’s buy 10 hard disks of 3 TB each at the supermarket (via @jorjial)
  45. This could be the basis for 10.1 papers in PLOS Comp. Biol. (via @kbradnam)
  46. All bioinformatics problems can be solved through the medium of twitter, snide and ranting 😉 (via @guyleonard)
  47. Installing TopHat with the option --reverse will install HotTap, a program that spews vapid results on a random science hot topic (via @CamLBerthelot)
  48. SOLiD sequencers generated colour-space sequence using an algorithm based on the once popular “Simon Says” handheld game (via @iandcalling)
  49. CriMap was called CriMap because users do an awful lot of crying before they get a half decent map (via @dj_de_koning)
  50. A single anonymous donor, RP11, accounts for 72 percent of the human reference genome (via CanGenom)
  51. If you amass the de-bugging tears of a bioinformatician it is enough to fill an Olympic size swimming pool annually (via @paulhoskisson)
  52. FASTA 80 character line wrapping was invented to standardise data sharing using MS Word (via @IanGoodhead)
  53. Nine out of ten bioinformaticians prefer Excel
  54. If you’ve never shown the NIH sequencing costs plot in a talk or lecture, you’re not a real bioinformatician pic.twitter.com/jQzG7MGosd (via @AliciaOshlack)
  55. Illumina is short for Illuminati, the shadowy organisation that controls sequencing worldwide. (via @neilfws)
  56. Every time you run a closed source bioinformatics tool, a PhD student’s soul is sacrificed to the Blood God. (via @froggleston)
  57. The number of replicates needed for your RNA-seq experiment equals the impact factor of the journal you want to publish in (via @torstenseemann)
  58. NCBI’s bacterial annotation takes 6 weeks because it’s done manually by work experience students pasting ORFs into web BLAST (via @torstenseemann)
  59. The majority of bioinformaticians can’t pronounce “de Bruijn” properly (see also thegenomefactory.blogspot.sg/2013/08/how-to… @torstenseemann) (via @rvaerle)
  60. Oxford Nanopore plans to introduce a new FASTQ encoding scheme using an ASCII offset of 48 with optional emoji (via @torstenseemann)
  61. The HMMer package was so named when someone asked how it worked, and the developers said “Hmmmm… errr….” (via @mgollery)
  62. 63% of Bioinformaticists were Biologists to start with, but they realized that the cold room is really COLD! (via @mgollery)
  63. It has been calculated that there are twice as many data formats as there are Bioinformaticians (via @mgollery)
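
Fact 38 is one of the few on this list that anyone can actually check. Below is a minimal sketch of how such a count might be done, assuming the peptides arrive as FASTA text. The helper names `read_fasta` and `count_word` and the toy records are made up for illustration, and this is not necessarily how the original count was produced.

```python
def read_fasta(lines):
    """Yield one sequence string per FASTA record, joining wrapped lines."""
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if it had any sequence.
            if chunks:
                yield "".join(chunks)
            chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        yield "".join(chunks)


def count_word(sequences, word):
    """Count occurrences of `word` (overlaps included) across all sequences."""
    total = 0
    for seq in sequences:
        start = seq.find(word)
        while start != -1:
            total += 1
            start = seq.find(word, start + 1)
    return total


# Toy input (hypothetical records, not real human peptides).
toy = [">pep1", "MKELVISAAELV", "ISQ", ">pep2", "ELVISLIVESK"]
peptides = list(read_fasta(toy))

print(count_word(peptides, "ELVIS"))
print(count_word(peptides, "ELVISLIVES"))
```

Joining the wrapped lines before searching matters here: a word split across an 80-character line break would otherwise be missed. For the real thing, `toy` would be replaced by an open file handle on a peptide FASTA file of your choosing.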