I have to admit to being a bit of a snob when it comes to graphs and charts in scientific papers and presentations. It’s not like I think I am particularly good at it – I’m OK – it’s just that I know what’s bad. I’ve seen folk screenshot multiple Excel graphs so they can paste them into a PowerPoint table to create multi-panel plots… and it kind of makes me want to scream. I’m sorry, I really am, but when I see Excel plots in papers I judge the authors, and I don’t mean in a good way. I can’t help it. Plotting good graphs is an art, and sticking with the metaphor, Excel is paint-by-numbers and R is a blank canvas, waiting for something beautiful to be created; Excel is limiting, whereas R sets you free.

Readers of this blog will know that I like to take fabulous plots that I find and recreate them. Well, let’s do that again 🙂

I saw this tweet by Trevor Branch and found it intriguing:

Revising the spaghetti plot: small multiples with gray in the background for the other lines https://t.co/RX2IrEAW0M pic.twitter.com/ayIG1StMDY

— Trevor A. Branch (@TrevorABranch) August 24, 2016

It shows two plots of the same data. The Excel plot:

And the multi plot:

You’re clearly supposed to think the latter is better, and I do. Perhaps disappointingly, though, the top graph would be easy to plot in Excel, whereas I’m guessing most people would find it impossible to create the bottom one (in Excel or otherwise).

Well, I’m going to show you how to create both, in R. All the code is now on GitHub!

**The Excel Graph**

Now, I’ve shown you how to create Excel-like graphs in R before, and we’ll use some of the same tricks again.

First we set up the data:

```
# set up the data
df <- data.frame(Circulatory=c(32,26,19,16,14,13,11,11),
                 Mental=c(11,11,18,24,23,24,26,23),
                 Musculoskeletal=c(17,18,13,16,12,18,20,26),
                 Cancer=c(10,15,15,14,16,16,14,14))
rownames(df) <- seq(1975,2010,by=5)
df
```
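Before plotting, it can be worth a quick look at where each series starts and ends, since those are the numbers both charts lean on. A minimal sketch in base R; the names `first`, `last` and `change` are mine and not part of the original code:

```r
# same data as above, repeated so this block runs on its own
df <- data.frame(Circulatory=c(32,26,19,16,14,13,11,11),
                 Mental=c(11,11,18,24,23,24,26,23),
                 Musculoskeletal=c(17,18,13,16,12,18,20,26),
                 Cancer=c(10,15,15,14,16,16,14,14))
rownames(df) <- seq(1975,2010,by=5)

# first and last rows as named vectors (illustrative helper names)
first <- unlist(df[1, ])
last <- unlist(df[nrow(df), ])

# change between the first and last year, in percentage points
change <- last - first
print(sort(change))

# which category is highest at the start and lowest at the end
print(names(which.max(first)))
print(names(which.min(last)))
```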

Now let's plot the graph:

```
# set up colours and points
cols <- c("darkolivegreen3","darkcyan","mediumpurple2","coral3")
pch <- c(17,18,8,15)
# we have one point on X axis for each row of df (nrow(df))
# we then add 2.5 to make room for the legend
xmax <- nrow(df) + 2.5
# make the borders smaller
par(mar=c(3,3,0,0))
# plot an empty graph
plot(1:nrow(df), 1:nrow(df), pch="",
     xlab=NA, ylab=NA, xaxt="n", yaxt="n",
     ylim=c(0,35), bty="n", xlim=c(1,xmax))
# add horizontal lines
for (i in seq(0,35,by=5)) {
  lines(1:nrow(df), rep(i,nrow(df)), col="grey")
}
# add points and lines
# for each dataset
for (i in 1:ncol(df)) {
  points(1:nrow(df), df[,i], pch=pch[i],
         col=cols[i], cex=1.5)
  lines(1:nrow(df), df[,i], col=cols[i],
        lwd=4)
}
# add bottom axes: year labels without ticks,
# then unlabelled ticks between the years
axis(side=1, at=1:nrow(df), tick=FALSE,
     labels=rownames(df))
axis(side=1, at=seq(0.5,nrow(df)+0.5,by=1),
     tick=TRUE, labels=FALSE)
# add left axis (las=1 keeps the labels horizontal)
axis(side=2, at=seq(0,35,by=5), tick=TRUE,
     las=1, labels=paste(seq(0,35,by=5),"%",sep=""))
# add legend
legend(8.5,25,legend=colnames(df), pch=pch,
       col=cols, cex=1.5, bty="n", lwd=3, lty=1)
```

And here is the result:

Not bad eh? Actually, yes, very bad; but also *very Excel!*
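If you don’t need the full Excel look, base R’s `matplot()` gets you most of the way in far fewer lines, because it draws one line per column of a matrix. This is a rough sketch rather than a faithful recreation: it puts the real years on the x axis and lets R place the legend, and the variable `years` is my own addition:

```r
# same data and styling as above, repeated so this block runs on its own
df <- data.frame(Circulatory=c(32,26,19,16,14,13,11,11),
                 Mental=c(11,11,18,24,23,24,26,23),
                 Musculoskeletal=c(17,18,13,16,12,18,20,26),
                 Cancer=c(10,15,15,14,16,16,14,14))
rownames(df) <- seq(1975,2010,by=5)
cols <- c("darkolivegreen3","darkcyan","mediumpurple2","coral3")
pch <- c(17,18,8,15)

# use the years themselves as x values (illustrative variable name)
years <- as.numeric(rownames(df))

# one call draws lines and points for every column
matplot(years, as.matrix(df), type="o", lty=1, lwd=4,
        pch=pch, col=cols, ylim=c(0,35),
        xlab="", ylab="", bty="n", las=1)

# let R position the legend in the top right corner
legend("topright", legend=colnames(df), pch=pch,
       col=cols, lty=1, lwd=3, bty="n")
```

The trade-off is control: the longer version above decides exactly where every tick, gridline and label goes, which is what makes it look so convincingly like Excel.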

**The multi-plot**

Plotting multi-panel figures in R is *sooooooo* easy! Here we go for the alternate multi-plot. We use the same data.

```
# split into 2 rows and 2 cols
split.screen(c(2,2))
# keep track of which screen we are
# plotting to
scr <- 1
# iterate over columns
for (i in 1:ncol(df)) {
  # select screen
  screen(scr)
  # reduce margins
  par(mar=c(3,2,1,1))
  # empty plot
  plot(1:nrow(df), 1:nrow(df), pch="", xlab=NA,
       ylab=NA, xaxt="n", yaxt="n", ylim=c(0,35),
       bty="n")
  # plot all data in grey
  for (j in 1:ncol(df)) {
    lines(1:nrow(df), df[,j],
          col="grey", lwd=3)
  }
  # plot selected in blue
  lines(1:nrow(df), df[,i], col="blue4", lwd=4)
  # add blobs
  points(c(1,nrow(df)), c(df[1,i], df[nrow(df),i]),
         pch=16, cex=2, col="blue4")
  # add numbers
  mtext(df[1,i], side=2, at=df[1,i], las=2)
  mtext(df[nrow(df),i], side=4, at=df[nrow(df),i],
        las=2)
  # add title
  title(colnames(df)[i])
  # add axes if we are one of
  # the bottom two plots
  if (scr >= 3) {
    axis(side=1, at=1:nrow(df), tick=FALSE,
         labels=rownames(df))
  }
  # next screen
  scr <- scr + 1
}
# close multi-panel image
close.screen(all=TRUE)
```

And here is the result:

And there we have it.
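`split.screen()` is not the only way to get a grid of panels in base R. As an alternative sketch (not the code used for the figure above), `par(mfrow=c(2,2))` fills a 2 x 2 grid row by row, and `matplot()` draws all the grey background lines in one call. The variable names `op` and `x` are mine, and I have widened the margins slightly to leave room for the titles and end labels:

```r
# same data as above, repeated so this block runs on its own
df <- data.frame(Circulatory=c(32,26,19,16,14,13,11,11),
                 Mental=c(11,11,18,24,23,24,26,23),
                 Musculoskeletal=c(17,18,13,16,12,18,20,26),
                 Cancer=c(10,15,15,14,16,16,14,14))
rownames(df) <- seq(1975,2010,by=5)

# x positions, one per row of df
x <- seq_len(nrow(df))

# 2 x 2 grid filled row by row; par() returns the old
# settings so they can be restored afterwards
op <- par(mfrow=c(2,2), mar=c(3,2,2,2))
for (i in seq_along(df)) {
  # every series in grey as the background
  matplot(x, as.matrix(df), type="l", lty=1, lwd=3, col="grey",
          ylim=c(0,35), xlab="", ylab="", xaxt="n", yaxt="n",
          bty="n", main=colnames(df)[i])
  # the selected series on top, with blobs at both ends
  lines(x, df[[i]], col="blue4", lwd=4)
  points(range(x), df[c(1, nrow(df)), i], pch=16, cex=2, col="blue4")
  # start and end values in the left and right margins
  mtext(df[1, i], side=2, at=df[1, i], las=2)
  mtext(df[nrow(df), i], side=4, at=df[nrow(df), i], las=2)
  # year labels on the bottom row only (panels 3 and 4)
  if (i > 2) {
    axis(side=1, at=x, tick=FALSE, labels=rownames(df))
  }
}
# restore the previous graphics settings
par(op)
```

With `mfrow` there is no screen counter to maintain, since each new high-level plot simply moves to the next cell of the grid.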

So which do you prefer?

25th August 2016 at 9:51 pm

I have made just for fun a version with ggplot: https://gist.github.com/Artjom-Metro/0d2861f98aec7ab7544de2ae2b3d1e44

Maybe I should consider using R base plotting more often in the future…

25th August 2016 at 10:16 pm

This is soooooo static.

https://plot.ly/ggplot2/

🙂

26th August 2016 at 2:05 am

I have to confess, this particular example leaves me very conflicted. The faceted plot is clearly more aesthetically pleasing and more in keeping with good practice. But if the aim is to compare quickly the values from the 4 categories, I think many people would find that the original does so quite well, despite the ugliness. I can say straight off, for example, that “circulatory starts highest and ends lowest.”

26th August 2016 at 1:57 pm

Ehm. Error bars?

30th August 2016 at 4:39 pm

Yeah, so point of post was to recreate graphs… they don’t have error bars so I didn’t re-create them!