Read data from an Excel file or Workbook object into a data.frame
Usage
read_xlsx(
xlsxFile,
sheet,
startRow = 1,
startCol = NULL,
rowNames = FALSE,
colNames = TRUE,
skipEmptyRows = FALSE,
skipEmptyCols = FALSE,
rows = NULL,
cols = NULL,
detectDates = TRUE,
namedRegion,
na.strings = "#N/A",
na.numbers = NA,
check.names = FALSE,
sep.names = ".",
fillMergedCells = FALSE,
...
)
Arguments
- xlsxFile
An xlsx file, Workbook object or URL to xlsx file.
- sheet
The name or index of the sheet to read data from.
- startRow
first row to begin looking for data.
- startCol
first column to begin looking for data.
- rowNames
If
TRUE
, first column of data will be used as row names.- colNames
If
TRUE
, the first row of data will be used as column names.- skipEmptyRows
If
TRUE
, empty rows are skipped else empty rows after the first row containing data will return a row of NAs.- skipEmptyCols
If
TRUE
, empty columns are skipped.- rows
A numeric vector specifying which rows in the Excel file to read. If NULL, all rows are read.
- cols
A numeric vector specifying which columns in the Excel file to read. If NULL, all columns are read.
- detectDates
If
TRUE
, attempt to recognize dates and perform conversion.- namedRegion
A named region in the Workbook. If not NULL startRow, rows and cols parameters are ignored.
- na.strings
A character vector of strings which are to be interpreted as NA. Blank cells will be returned as NA.
- na.numbers
A numeric vector of digits which are to be interpreted as NA. Blank cells will be returned as NA.
- check.names
logical. If TRUE then the names of the variables in the data frame are checked to ensure that they are syntactically valid variable names
- sep.names
(unimplemented) One character which substitutes blanks in column names. By default, "."
- fillMergedCells
If TRUE, the value in a merged cell is given to all cells within the merge.
- ...
additional arguments passed to
wb_to_df()
Details
Formulae written using write_formula to a Workbook object will not get picked up by read_xlsx(). This is because only the formula is written and left to be evaluated when the file is opened in Excel. Opening, saving and closing the file with Excel will resolve this.
Examples
xlsxFile <- system.file("extdata", "openxlsx2_example.xlsx", package = "openxlsx2")
read_xlsx(xlsxFile = xlsxFile)
#> Var1 Var2 NA Var3 Var4 Var5 Var6 Var7 Var8
#> 3 TRUE 1 NA 1 a 2023-05-29 3209324 This #DIV/0! 01:27:15
#> 4 TRUE NA NA #NUM! b 2023-05-23 <NA> 0 14:02:57
#> 5 TRUE 2 NA 1.34 c 2023-02-01 <NA> #VALUE! 23:01:02
#> 6 FALSE 2 NA <NA> #NUM! <NA> <NA> 2 17:24:53
#> 7 FALSE 3 NA 1.56 e <NA> <NA> <NA> <NA>
#> 8 FALSE 1 NA 1.7 f 2023-03-02 <NA> 2.7 08:45:58
#> 9 NA NA NA <NA> <NA> <NA> <NA> <NA> <NA>
#> 10 FALSE 2 NA 23 h 2023-12-24 <NA> 25 <NA>
#> 11 FALSE 3 NA 67.3 i 2023-12-25 <NA> 3 <NA>
#> 12 NA 1 NA 123 <NA> 2023-07-31 <NA> 122 <NA>