Darren Dahly PhD Statistical Epidemiology

Heat Maps in R

Heatmaps are a nice way of displaying the values of lots of variables (columns) for lots of observations (rows). There are many ways to make them, but I prefer to use ggplot2 whenever possible. Use geom_tile, and treat the variable identifying each observation (e.g. a unique ID number) as a factor, so that each observation gets its own row in the plot. Your data will likely need reshaping to a “long” format (one row per observation and variable), which I’ve done below with tidyr::gather.

library(tidyr)
library(ggplot2)
library(RColorBrewer)

# Simulate 500 respondents answering 6 questions on a 1-5 scale
set.seed(1)  # arbitrary seed, only so the simulated data are reproducible
data <- data.frame("ID" = 1:500, 
                   "A"  = sample(1:5, 500, replace = TRUE),
                   "B"  = sample(1:5, 500, replace = TRUE),
                   "C"  = sample(1:5, 500, replace = TRUE),
                   "D"  = sample(1:5, 500, replace = TRUE),
                   "E"  = sample(1:5, 500, replace = TRUE),
                   "G"  = sample(1:5, 500, replace = TRUE))
  
dataLong <- gather(data, question, value, A:G)
   
# Set levels explicitly so the labels stay aligned even if a value never occurs
dataLong$value <- factor(dataLong$value, 
                         levels = 1:5,
                         labels = c("Weekly", 
                                    "Monthly/Quarterly", 
                                    "Yearly",
                                    "Not yet", 
                                    "Not my job"))
  
# One tile per person (row) and question (column), coloured by response
ggplot(dataLong, aes(x = question, y = as.factor(ID))) + 
  geom_tile(aes(fill = value)) +
  ylab("Each row is a person (n = 500)") +
  xlab("Survey Question") +
  theme_bw() + 
  theme(text = element_text(color = "black", family = "serif"), 
        strip.background = element_blank(),
        panel.border = element_blank(), 
        panel.grid.major.y = element_blank(),
        panel.grid.major.x = element_blank(), 
        axis.text.x = element_text(angle = 90), 
        axis.text.y = element_blank(),   # hide the 500 ID labels
        axis.ticks.y = element_blank()) +
  scale_fill_brewer("", palette = "RdBu")

[Figure: the resulting heatmap]
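
A side note on the reshaping step: gather still works, but tidyr now marks it as superseded in favour of pivot_longer. Here is a minimal sketch of the same kind of reshape on a toy data frame. The object names wide and long and the values in them are made up purely for illustration.

```r
library(tidyr)

# Toy "wide" data: one row per person, one column per question
# (hypothetical values, not the simulated survey data)
wide <- data.frame(ID = 1:3,
                   A  = c(1, 2, 3),
                   B  = c(5, 4, 3))

# Reshape to "long": one row per person-question pair
long <- pivot_longer(wide,
                     cols      = A:B,          # columns to stack
                     names_to  = "question",   # old column names go here
                     values_to = "value")      # cell values go here

print(long)
```

For the survey data, the equivalent call should be pivot_longer(data, cols = A:G, names_to = "question", values_to = "value"). The rows come out in a different order than with gather (person by person rather than question by question), which makes no difference to the plot.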