Reading multiple xlsx files each with multiple sheets - purrr

Multi tool use
Reading multiple xlsx files each with multiple sheets - purrr
I have multiple excel files, each with different worksheets. I have tried to use readxl and map to import it to R. However, I was only able to do it using a for loop. The code below works fine but I would like to know if there is a clever way to do this. I keep thinking that I could have done it with map2 but I am missing something.
library(tidyverse)
library(readxl)
library(writexl)
### As a first step, I get all the files from my project folder and create an empty list for looping purposes
files <- list.files(pattern = ".xlsx")
data_xlsx <- list()
### I then use seq_along in all the files and map_df to read the each excel file
for (i in seq_along(files)) {
data_xlsx[[i]] <- files[i] %>%
excel_sheets() %>%
set_names() %>%
map_df(
~ read_xlsx(path = files[i], sheet = .x, range = "H3"),
.id = "sheet")
}
# I use the code below to get the files name into the list
data_xlsx <- set_names(data_xlsx, files)
# This final code is just to transform the list into a data frame with a column with the name of the files
data_xlsx_df <- map2_df(data_xlsx, files, ~update_list(.x, file = .y))
Created on 2018-07-01 by the reprex package (v0.2.0).
1 Answer
1
You can use nested map_df
calls to replace the for loop. As far as I know map2
can only operate on two lists of length n
and return a list of length n
, I don't think it's a way to generate a length n * m
list from two lists of length n
and m
.
map_df
map2
n
n
n * m
n
m
files <- list.files(pattern = ".xlsx")
data_xlsx_df <- map_df(set_names(files), function(file) {
file %>%
excel_sheets() %>%
set_names() %>%
map_df(
~ read_xlsx(path = file, sheet = .x, range = "H3"),
.id = "sheet")
}, .id = "file")
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.