将热力图与表格结合

R
DT
Author

Rui

Published

May 19, 2024

https://www.r-bloggers.com/2023/04/heatmap-formatting-of-a-table-with-dt/

数据处理

library(tidyverse)
library(DT)
# 读取数据
dat0 <- read.csv("data/TempCitiesUSA.csv")
head(dat0)
##   state      city month temp.f
## 1    AK ANCHORAGE     1   15.8
## 2    AK ANCHORAGE     2   18.7
## 3    AK ANCHORAGE     3   25.9
## 4    AK ANCHORAGE     4   36.3
## 5    AK ANCHORAGE     5   46.9
## 6    AK ANCHORAGE     6   54.7

可以看出这是一个长数据。接下来将温度单位转换为摄氏度,并将数据转换为宽数据。

# 温度单位转换为摄氏度
dat0[["temp.f"]] <- round((dat0[["temp.f"]] - 32) / 1.8, digits = 1L)

# 长数据转换为宽数据
dat <- dat0 %>% 
  pivot_wider(
    names_from  = month,
    values_from = temp.f
)

将第 3 至第 6 列的列名改成月份:

# 修改列名
colnames(dat)[3:14] <- month.abb
glimpse(dat, width = 65)
## Rows: 261
## Columns: 14
## $ state <chr> "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK", "…
## $ city  <chr> "ANCHORAGE", "ANNETTE", "BARROW", "BETHEL", "FAIR…
## $ Jan   <dbl> -9.0, 1.7, -25.4, -14.1, -23.2, -4.8, -3.5, -9.2,…
## $ Feb   <dbl> -7.4, 2.8, -26.6, -13.6, -19.9, -3.9, -1.7, -9.1,…
## $ Mar   <dbl> -3.4, 4.2, -25.4, -9.7, -11.6, -1.4, 0.9, -4.7, 0…
## $ Apr   <dbl> 2.4, 6.6, -18.1, -3.4, -0.2, 2.4, 4.9, 0.6, 2.9, …
## $ May   <dbl> 8.3, 9.7, -6.6, 5.2, 9.3, 6.5, 8.8, 6.4, 6.4, -0.…
## $ Jun   <dbl> 12.6, 12.4, 1.7, 10.8, 15.4, 10.0, 12.2, 10.5, 9.…
## $ Jul   <dbl> 14.7, 14.6, 4.7, 13.3, 16.9, 12.3, 13.8, 13.2, 12…
## $ Aug   <dbl> 13.6, 14.8, 3.7, 12.0, 13.4, 12.1, 13.2, 12.7, 12…
## $ Sep   <dbl> 9.0, 12.1, -0.4, 7.4, 6.9, 8.8, 10.0, 8.7, 9.7, 5…
## $ Oct   <dbl> 1.2, 8.1, -9.7, -1.1, -4.7, 3.2, 5.7, 0.7, 4.6, -…
## $ Nov   <dbl> -5.7, 4.3, -18.3, -8.1, -16.5, -1.4, 0.7, -4.9, 1…
## $ Dec   <dbl> -8.1, 2.4, -23.7, -12.6, -21.1, -3.4, -1.8, -8.2,…

month.add 是 R 中的内置常量,为 1 月至 12 月英文简称的字符串向量。

head(dat)
## # A tibble: 6 × 14
##   state city     Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov
##   <chr> <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AK    ANCHO…  -9    -7.4  -3.4   2.4   8.3  12.6  14.7  13.6   9     1.2  -5.7
## 2 AK    ANNET…   1.7   2.8   4.2   6.6   9.7  12.4  14.6  14.8  12.1   8.1   4.3
## 3 AK    BARROW -25.4 -26.6 -25.4 -18.1  -6.6   1.7   4.7   3.7  -0.4  -9.7 -18.3
## 4 AK    BETHEL -14.1 -13.6  -9.7  -3.4   5.2  10.8  13.3  12     7.4  -1.1  -8.1
## 5 AK    FAIRB… -23.2 -19.9 -11.6  -0.2   9.3  15.4  16.9  13.4   6.9  -4.7 -16.5
## 6 AK    HOMER   -4.8  -3.9  -1.4   2.4   6.5  10    12.3  12.1   8.8   3.2  -1.4
## # ℹ 1 more variable: Dec <dbl>

获取所有气温数据中的最低温与最高温:

# 可以直接在长数据中获取
lowest  <- min(dat0[["temp.f"]])
highest <- max(dat0[["temp.f"]])

数值映射

为了方便将气温映射为颜色,我们需要对气温数值先进行统一的处理。最好将不同地区不同月份的气温统一映射在 [0, 1] 区间内,0 代表最低气温,1 代表最高气温,如此可以方便之后根据映射后的数值绘制颜色。

但要注意的是 0 摄氏度非常特殊,它在映射后应该处于[0, 1] 区间的中点,即 0.5。而常用的标准化、极大极小值变换法不能控制 0 变换后的数值,所以不能适用。

这里介绍一种方法叫做样条插值。它可以将一系列离散的点近似为一条光滑的曲线(用一条光滑的曲线将离散的点连起来),其实也就是通过一串离散的点找到穿过它们的函数(插值函数)。目前,我们有 (lowest,0), (0,0.5),(highest,1) 这 3 个点,想找到一条穿过它们的函数并不困难。

可以尝试使用 1 月份的气温进行试验:

x <- dat[["Jan"]]

# function to map from (lowest, highest) to (0, 1), mapping 0 to 0.5
interpfun <- splinefun(
  c(lowest, 0, highest),
  c(0, 0.5, 1)
) 
# map the January data
y <- interpfun(x)
y
##   [1] 0.34198568 0.52856574 0.02470358 0.24737848 0.06946654 0.41714589
##   [7] 0.43990607 0.33834463 0.47788087 0.14910668 0.10358904 0.23790602
##  [13] 0.43990607 0.29235601 0.21311276 0.40302127 0.44164699 0.57146693
##  [19] 0.66373748 0.63245544 0.91211154 0.55507919 0.57472758 0.57635578
##  [25] 0.47788087 0.69749680 0.67609249 0.52020632 0.73057421 0.64346826
##  [31] 0.55507919 0.62771452 0.72162073 0.72162073 0.73206153 0.53023340
##  [37] 0.62296093 0.62929624 0.72759535 0.65752615 0.68223619 0.68835734
##  [43] 0.67609249 0.62771452 0.64346826 0.90688083 0.33104563 0.46242815
##  [49] 0.47274264 0.44338650 0.47445680 0.47959080 0.43990607 0.49661256
##  [55] 0.52689667 0.49491672 0.68529958 0.73354744 0.78610176 0.69901511
##  [61] 0.68835734 0.82850220 0.81169418 0.75420220 0.67916716 0.67763053
##  [67] 0.75713034 0.77023722 0.79610861 0.59417308 0.59739427 0.63403292
##  [73] 0.62296093 0.65596980 0.83683012 0.83959482 0.83959482 0.38880648
##  [79] 0.37091120 0.34562110 0.48300642 0.51517376 0.42768017 0.39414759
##  [85] 0.40832857 0.37450152 0.43467485 0.48982076 0.41890512 0.44686130
##  [91] 0.41714589 0.44859659 0.48129931 0.45897871 0.45379399 0.48300642
##  [97] 0.51853021 0.50000000 0.51012852 0.66373748 0.66992626 0.68376859
## [103] 0.63087654 0.44338650 0.47445680 0.41890512 0.90688083 0.91341570
## [109] 0.50338181 0.27745367 0.40124935 0.36191072 0.42768017 0.39770129
## [115] 0.40832857 0.36191072 0.39947603 0.29792114 0.41890512 0.31637999
## [121] 0.26621774 0.20543617 0.31454044 0.30162418 0.26996868 0.46070413
## [127] 0.45206293 0.49661256 0.47788087 0.61819465 0.62771452 0.57798258
## [133] 0.42417438 0.29049814 0.40124935 0.38523870 0.39770129 0.41890512
## [139] 0.53522792 0.62771452 0.58933072 0.57146693 0.62771452 0.28491611
## [145] 0.24926875 0.23410717 0.26246116 0.40832857 0.40832857 0.38880648
## [151] 0.41538524 0.40832857 0.40124935 0.42768017 0.39236863 0.38523870
## [157] 0.23220563 0.50169161 0.49321948 0.53522792 0.51853021 0.57309796
## [163] 0.43816374 0.43467485 0.63560900 0.51517376 0.48129931 0.40656088
## [169] 0.40124935 0.42768017 0.48982076 0.49830698 0.50507060 0.50169161
## [175] 0.42241937 0.41009485 0.43467485 0.43990607 0.46415075 0.44512461
## [181] 0.42592798 0.42241937 0.43292830 0.54352393 0.54020975 0.59578438
## [187] 0.57146693 0.56492874 0.51685269 0.57309796 0.57635578 0.55178473
## [193] 0.45379399 0.44512461 0.48471212 0.50338181 0.45725188 0.43816374
## [199] 0.45206293 0.88306363 0.88306363 0.91080598 0.88439879 0.87636673
## [205] 0.46931011 0.61501009 0.58123195 0.29235601 0.32555658 0.40832857
## [211] 0.32372408 0.56820066 0.55178473 0.57309796 0.54517891 0.60542260
## [217] 0.53522792 0.64503588 0.66373748 0.74243329 0.71412075 0.61022269
## [223] 0.62613140 0.67301219 0.61978482 0.71111089 0.32921735 0.67763053
## [229] 0.55672430 0.60221549 0.68070238 0.61819465 0.66528678 0.68988411
## [235] 0.62771452 0.57798258 0.46242815 0.47274264 0.52355431 0.57472758
## [241] 0.54020975 0.53522792 0.36371363 0.55672430 0.57960797 0.58771379
## [247] 0.58123195 0.45552364 0.47274264 0.34016586 0.34380410 0.35649353
## [253] 0.39058826 0.48471212 0.51349342 0.46759174 0.50675798 0.40656088
## [259] 0.44164699 0.38702329 0.39770129

splinefun 是一个非常特殊的函数。splinefun(x, y) 用于计算样条插值函数,其返回的结果是类型为 R 函数的插值函数(还是个函数), 得到的插值函数可以用来计算 x 处的插值函数值及导数值, 在调用生成的插值函数时加选项 deriv= 可以指定导数阶数, 最多 3 阶

  • 使用 splinefun 函数以及 (lowest,0), (0,0.5),(highest,1) 这 3 个点找到一个插值函数,并命名为 interpfun

  • 利用 interpfun 将向量 x 的所有值映射在 [0,1] 区间内,且 0 映射为 0.5,最小值映射为 0,最大值映射为 1

颜色映射

将数值映射后的 y 向量再进行颜色映射。蓝色代表低温,温度越低颜色越深;红色代表高温,温度越高颜色越深;白色代表 0 摄氏度。

colfunc <- colorRamp(c("blue", "white", "red"))
cols <- colfunc(y) # 不能有NA值

将颜色的编码从rgb格式转换为hex,后者在前端更常用

clrs <- rgb(cols[, 1L], cols[, 2L], cols[, 3L], maxColorValue = 255)

对每一个月份的气温进行映射

使用 apply 家族中的 lapply 函数对每一个月份都进行数值映射和颜色映射。

Colors <- lapply(dat[, month.abb], function(x) {
  y <- interpfun(x)
  cols <- colfunc(y)
  rgb(cols[, 1L], cols[, 2L], cols[, 3L], maxColorValue = 255)
})

str(Colors)
## List of 12
##  $ Jan: chr [1:261] "#AEAEFF" "#FFF0F0" "#0C0CFF" "#7E7EFF" ...
##  $ Feb: chr [1:261] "#BDBDFF" "#FFE7E7" "#0000FF" "#8282FF" ...
##  $ Mar: chr [1:261] "#E1E1FF" "#FFDBDB" "#0C0CFF" "#A7A7FF" ...
##  $ Apr: chr [1:261] "#FFEAEA" "#FFC7C7" "#5757FF" "#E1E1FF" ...
##  $ May: chr [1:261] "#FFB9B9" "#FFAEAE" "#C4C4FF" "#FFD3D3" ...
##  $ Jun: chr [1:261] "#FF9797" "#FF9999" "#FFF0F0" "#FFA5A5" ...
##  $ Jul: chr [1:261] "#FF8787" "#FF8888" "#FFD7D7" "#FF9292" ...
##  $ Aug: chr [1:261] "#FF9090" "#FF8787" "#FFDFDF" "#FF9C9C" ...
##  $ Sep: chr [1:261] "#FFB4B4" "#FF9B9B" "#FBFBFF" "#FFC1C1" ...
##  $ Oct: chr [1:261] "#FFF4F4" "#FFBBBB" "#A7A7FF" "#F5F5FF" ...
##  $ Nov: chr [1:261] "#CCCCFF" "#FFDADA" "#5555FF" "#B6B6FF" ...
##  $ Dec: chr [1:261] "#B6B6FF" "#FFEAEA" "#1E1EFF" "#8C8CFF" ...

至此,所有气温数值都映射成了颜色数值。

绘制表格

使用 {DT} 包绘制表格,并让数据按照州进行汇总。

library(DT)
dtable <- datatable(
  dat, rownames = FALSE, extensions = "RowGroup", 
  options = list(
    rowGroup = list(dataSrc = list(0)),
    columnDefs = list( # hide the grouping column
      list(targets = 0, visible = FALSE)
    )
  )
)
for(month in month.abb) {
  dtable <- dtable %>%
    formatStyle(
      month, 
      backgroundColor = styleEqual(dat[[month]], Colors[[month]])
    )
}
dtable