
How journals could “add value”

I wrote a piece for Genome Biology (you may have read it) about open science.  I said a lot of things in there, but one thing I want to focus on is how journals could “add value”.  As brief background: if you’re going to make money from academic publishing (and I have no problem if that’s what you want to do), then I think you should “add value”.  Open science and open access are coming: open-access journals are increasingly popular (and cheap!), preprint servers are more popular, and green and gold open-access policies are being implemented.  Essentially, people are going to stop paying to access research articles pretty soon – think a 5–10 year time frame.

So what can journals do to “add value”?  What can they do that will make us want to pay to access them?  Here are a few ideas, most of which focus on going beyond the PDF:

  1. Live figures: This is something F1000Research are already doing, and there’s no reason other journals couldn’t do the same.  Wherever there is a figure, let readers interact with it – change the colours, axes, chart type and so on.
  2. Re-run analyses in a browser: I think this is something Gigascience have attempted, and it would be an incredible training opportunity.  Let readers repeat the entire data analysis, in a browser, using e.g. Galaxy or an IPython Notebook.
  3. Re-run analyses from the command-line: as above, but provide a VM and a list of commands that readers can run to repeat the analysis
  4. Re-run analyses as an R package: imagine it – every paper has a downloadable R package, with the data included, that allows readers to explore the data within the wonderful statistical and visualisation environment that is R.
  5. Add/remove data – “what happens to graphs, tables and statistics if I remove these data points?”
  6. Show discarded data – so the data had outliers that were cleaned/removed?  Show them to me.  Where are they in the graphs?  Where are they in the tables?  Why were they discarded?
  7. What would my data look like? – basically, if the reader has a dataset which is similar to that analysed in the paper, the reader can upload that dataset (in agreed format) and see how the graphs, tables and stats change
  8. Enforce conventions – when you’re parsing that Word document and converting to PDF, pick up gene names and automatically suggest community-standard symbols (e.g. HGNC)
  9. Enforce data submission (or do it for the authors).  Yes, do not publish unless the data are submitted to a public archive.  In fact, help the authors do it!
  10. Find similar data in…. – help readers find similar (public) datasets in public databases
  11. Actually check the statistics.  Yes you, the journal.  Most biologists aren’t qualified to check the statistics and experimental design, or do power analysis, so you do it.  Employ some statisticians!
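Ideas 5 and 6 could be prototyped with very little code. As a minimal sketch (the function name, the IQR rule and the 1.5× cut-off are my illustrative assumptions, not anything a journal actually runs), a “live” summary table could recompute its statistics with and without flagged outliers, and report the discarded points instead of hiding them:

```python
import statistics

def summarize(values, drop_outliers=False, k=1.5):
    """Toy sketch of ideas 5 and 6: summarise a dataset, optionally
    excluding IQR-flagged outliers, and report what was discarded
    rather than silently removing it. (Hypothetical helper - the
    IQR rule and k=1.5 are assumptions for illustration.)"""
    vals = sorted(values)
    q1, _, q3 = statistics.quantiles(vals, n=4)   # quartile cut points
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in vals if v < lo or v > hi]
    kept = [v for v in vals if lo <= v <= hi] if drop_outliers else vals
    return {
        "n": len(kept),
        "mean": statistics.mean(kept),
        "sd": statistics.stdev(kept),
        "discarded": outliers if drop_outliers else [],
    }

data = [4.1, 4.3, 3.9, 4.0, 4.2, 12.5]       # one obvious outlier
print(summarize(data))                        # stats on the full data
print(summarize(data, drop_outliers=True))    # stats with outlier removed, and listed
```

A reader toggling a checkbox in a live figure would, behind the scenes, just be flipping `drop_outliers` and re-rendering – the point being that the discarded points stay visible and queryable.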

OK, so I’m not pretending any of the above is easy, but I am unsure why none of it is happening – some publishers make HUGE PROFITS, so why on earth have they not updated their product?  Imagine if Apple were still trying to sell version 1 of the iPod – no-one would buy it.  Most products get updated frequently to keep customers happy, yet journals have stuck with the PDF/print version for decades.  Time for an update, especially if you want to keep the money flowing in.


  1. This totally makes sense to me, and I’m really glad to see that modern-day publishers are pushing for technological advances in publishing rather than sticking with the tired jaded status quo.

    I’d like to suggest a point 12: Analyse my data with the analysis that the authors used.

    We’re actively working to do 2, 8, 9 and 10 in COPO (http://bit.ly/1GD3ejx). I’m currently writing a blog post about it but haven’t finished it yet!

  2. Cool!

    I think this is what I meant in point 7. This all needs to happen 🙂

  3. Thanks for flagging our Galaxy examples, but did you see we’ve also had nice examples of point 4, where we published a paper generated in knitr, alongside all of the R code, data and files used to generate the paper? http://blogs.biomedcentral.com/gigablog/2014/04/16/qa-on-dynamic-documents/ For points 8 onwards, having curators and data scientists in-house means we do the data curating and brokering, take (citeable) snapshots of the software (everything mandated and OSI-compliant), and have them do any other necessary reproducibility checks (test VMs, workflows, etc.), although it does slow down the publication process a bit at the end (see the review process of VirAMP, for example: https://publons.com/publon/71300/). Hopefully people think the actually-reusable end product is worth it, though.

  4. I’m a big fan of #1–3, and having tried various bits and bobs of reproducible analyses (version control, single-point-of-control make, IPython Notebook, cloud/VMs, Docker), I think there might be a real opportunity for journals to *coordinate* infrastructure. You touch on this obliquely in points #8 and #9, too.

    I don’t think we can reasonably expect journals to be at the forefront of technological innovation, but it sure would be nice if (e.g.) they could help marshal resources for me to publish papers in their full technological glory.

    Random thoughts:

    The journal might coordinate block grants/efforts for Amazon Web Services, with figshare data storage and GitHub repositories, so that people didn’t have to arrange all of that themselves.

    The journal could coordinate training and/or hackathon events at conferences for people who want to refactor their analysis code to make it replicable, automated, Dockerized, etc.

    The journal could organize global code review or replication “parties” that would result in papers getting published based on passing “peer review” in these areas, OR “merely” on badges getting awarded.

    Scientists can coordinate a lot of this on their own but they don’t have the visibility to draw in people from groups that aren’t composed of the usual suspects. Journals do.

  5. I think journals could get back into the business of “editing” and “publishing” – employing language, copy, graphical and web-technology editors to help authors produce high-quality online papers. Right now this kind of thing is limited to the glam journals; most of the others have jettisoned these functions. They force scientists to be editors, graphic designers and copy editors, and to spend all their time formatting references and meeting image-format requirements. Journals could also help authors with things like sharing, archiving, etc. (#8–9 above, but more broadly). Scientists shouldn’t have to become experts in things like SEO and web discoverability.

  6. Gigascience are definitely one of the innovators in this space. What you need is a really high-profile paper that hits a lot of the points above, to make everyone sit up and take notice.

  7. You don’t like the R package idea? 😉

    Some of the publishers make $1bn profit per year. At that level, they *could* be on the forefront of tech innovation.

    These are not new ideas – Mike Eisen spoke about many of them at an MGED meeting in 2002.

  8. The pressure on the authors should be on making their research available and reproducible; journals should encourage it!

  9. Something pretty basic that hasn’t been mentioned, along the lines of the comment of @noamross, is to check that the references make sense, that the intellectual foundation is valid, and that the claims are supported by data.

  10. “some publishers make HUGE PROFITS, why on earth have they not updated their product?”

    Because they are making HUGE PROFITS as it is. Why rock the boat? Yes, their traditional business model may be nearing its end, but traditionally companies don’t make the leap. When I was growing up in the 1970s in the US, we all shopped via the Sears catalog and had goods shipped to us. It was basically Amazon before Amazon, and yet they didn’t make the leap to the Internet until well after Amazon established itself.

  11. Reblogged this on A librarian abroad and commented:
    Some great ideas of what academic authors & readers can ask publishers for. Especially regarding handling & presenting data.

  12. You make some excellent points! I hope we see this happen in the near future. It will make authors far more accountable for their data (and analyses) and hence their conclusions, which can only ever be a good thing!

  13. Open peer review, anyone? A brilliant idea.

    It is surprising how many mistakes can slip into peer reviewed science. The input of any interested reader could increase the quality of published science.


© 2018 Opiniomics
