# Add formulas to a workbook

Source:`vignettes/openxlsx2_formulas_manual.Rmd`

`openxlsx2_formulas_manual.Rmd`

Below you find various examples how to create formulas with
`openxlsx2`

. Though, before we start with the examples, let
us begin with a word of warning. Please be aware, while it is possible
to create all these formulas, they are not evaluated unless they are
opened in spreadsheet software. Even worse, if there are cells
containing the result of some formula, it can not be trusted unless the
formula is evaluated in spreadsheet software.

This can be shown in a simple example: We have a spreadsheet with a
formula `A1 + B1`

. This formula was evaluated with
spreadsheet software as `A1 + B1 = 2`

. Therefore if we read
the cell, we see the value 2. Lets recreate this output in
`openxlsx2`

```
# Create artificial xlsx file
wb <- wb_workbook()$add_worksheet()$add_data(x = t(c(1, 1)), col_names = FALSE)$
add_formula(dims = "C1", x = "A1 + B1")
# Users should never modify cc as shown here
wb$worksheets[[1]]$sheet_data$cc$v[3] <- 2
# we expect a value of 2
wb_to_df(wb, col_names = FALSE)
#> A B C
#> 1 1 1 2
```

Now, lets assume we modify the data in cell `A1`

to 2.

```
wb$add_data(x = 2)
# we expect 3
wb_to_df(wb, col_names = FALSE)
#> A B C
#> 1 2 1 2
```

What happened? Even though we see cells `A1`

and
`B1`

show a value of `2`

and `1`

our
formula in `C1`

was not updated. It still shows a value of
`2`

. This is because `openxlsx2`

does not evaluate
formulas and workbooks on a more general scale. In the open xml style
the cell looks something like this:

```
<c r="C1">
<f>A1 + B1</f>
<v>2</v>
</c>
```

And when we read from this cell, we always return the value of
`v`

. In this case it is obvious, but still wrong and it is a
good idea to check if underlying fields contain formulas.

```
wb_to_df(wb, col_names = FALSE, show_formula = TRUE)
#> A B C
#> 1 2 1 A1 + B1
```

If `openxlsx2`

writes formulas, as shown in the examples
below, the fields will be entirely blank. These fields will only be
evaluated and filled, once the output file is opened in spreadsheet
software.

The only way to avoid surprises is to be aware of this all the time and similar, checking for similar things all the time.

## Simple formulas

```
wb <- wb_workbook()$add_worksheet()$
add_data(x = head(cars))$
add_formula(x = "SUM(A2, B2)", dims = "D2")$
add_formula(x = "A2 + B2", dims = "D3")
# wb$open()
```

## Array formulas

```
wb <- wb_workbook()$add_worksheet()$
add_data(x = head(cars))$
add_formula(x = "A2:A7 * B2:B7", dims = "C2:C7", array = TRUE)
# wb$open()
```

## Array formulas creating multiple fields

In the example below we want to use `MMULT()`

which
creates a matrix multiplication. This requires us to write an array
formula and to specify the region where the output will be written
to.

```
m1 <- matrix(1:6, ncol = 2)
m2 <- matrix(7:12, nrow = 2)
wb <- wb_workbook()$add_worksheet()$
add_data(x = m1)$
add_data(x = m2, dims = wb_dims(from_col = 4))$
add_formula(x = "MMULT(A2:B4, D2:F3)", dims = "H2:J4", array = TRUE)
# wb$open()
```

Similar a the coefficients of a linear regression

```
# we expect to find this in D1:E1
coef(lm(head(cars)))
#> (Intercept) dist
#> 5.2692308 0.1153846
wb <- wb_workbook()$add_worksheet()$
add_data(x = head(cars))$
add_formula(x = "LINEST(A2:A7, B2:B7, TRUE)", dims = "D2:E2", array = TRUE)
# wb$open()
```

## cells metadata (cm) formulas

Similar to array formulas, these cell metadata (cm) formulas hide to
the user that they are array formulas. Using these is implemented in
`openxlsx2`

> 0.6.1:

```
wb <- wb_workbook()$add_worksheet()$
add_data(x = head(cars))$
add_formula(x = "SUM(ABS(A2:A7))", dims = "D2", cm = TRUE)
#> Warning in write_data2(wb = wb, sheet = sheet, data = x, name = name, colNames
#> = colNames, : modifications with cm formulas are experimental. use at own risk
# wb$open()
```

##
`dataTable`

formulas^{1}

#####
`dataTable`

formula differences

A | B | C | |
---|---|---|---|

1 | sales_price | COGS | sales_quantity |

2 | 20 | 5 | 1 |

3 | 30 | 11 | 2 |

4 | 40 | 13 | 3 |

Given a basic table like the above, a similarly basic formula for
`total_sales`

would be “= A2 * C2” with the row value
changing at each row.

An implementation for this formula using
`wb_add_formula()`

would look this (taken from current
documentation) lets say we’ve read in the data and assigned it to the
table `company_sales`

```
## creating example data
company_sales <- data.frame(
sales_price = c(20, 30, 40),
COGS = c(5, 11, 13),
sales_quantity = c(1, 2, 3)
)
## write in the formula
company_sales$total_sales <- paste(paste0("A", 1:3 + 1L), paste0("C", 1:3 + 1L), sep = " * ")
## add the formula class
class(company_sales$total_sales) <- c(class(company_sales$total_sales), "formula")
## write a workbook
wb <- wb_workbook()$
add_worksheet("Total Sales")$
add_data_table(x = company_sales)
```

Then we create the workbook, worksheet, and use
`wb_add_data_table()`

.

One of the advantages of the open xml `dataTable`

syntax
is that we don’t have to specify row numbers or columns as letters. The
table also grows dynamically, adding new rows as new data is appended
and extending formulas to the new rows. These `dataTable`

have named columns that we can use instead of letters. When writing the
formulas within the `dataTable`

we would use the following
syntax `[@[column_name]]`

to reference the current row. So
the “total_sales” formula written in open xml in `dataTable`

would look like this;
`=[@[sales_price]] * [@[sales_quantity]]`

If we are writing the formula outside of the `dataTable`

we have to reference the table name. In this case lets say the table
name is ‘daily_sales’
`=daily_sales[@[sales_price]] * daily_sales[@[sales_quantity]]`

However, if we were to pass this as the text for the formula to be written it would cause an error because the syntax that open xml requires for selecting the current row is different.

In open xml the `dataTable`

formula looks like this:

```
<calculatedColumnFormula>
daily_sales[[#This Row],[sales_price]]*daily_sales[[#ThisRow],[sales_quantity]]
</calculatedColumnFormula>
```

Now we can see that open xml replaces `[@[sales_price]]`

with `daily_sales[[#This Row],[sales_price]]`

We must then
use this syntax when writing formulas for `dataTable`

```
## Because we want the `dataTable` formula to propagate down the entire column of the data
## we can assign the formula by itself to any column and allow that single string to be repeated for each row.
## creating example data
example_data <-
data.frame(
sales_price = c(20, 30, 40),
COGS = c(5, 11, 13),
sales_quantity = c(1, 2, 3)
)
## base R method
example_data$gross_profit <- "daily_sales[[#This Row],[sales_price]] - daily_sales[[#This Row],[COGS]]"
example_data$total_COGS <- "daily_sales[[#This Row],[COGS]] * daily_sales[[#This Row],[sales_quantity]]"
example_data$total_sales <- "daily_sales[[#This Row],[sales_price]] * daily_sales[[#This Row],[sales_quantity]]"
example_data$total_gross_profit <- "daily_sales[[#This Row],[total_sales]] - daily_sales[[#This Row],[total_COGS]]"
class(example_data$gross_profit) <- c(class(example_data$gross_profit), "formula")
class(example_data$total_COGS) <- c(class(example_data$total_COGS), "formula")
class(example_data$total_sales) <- c(class(example_data$total_sales), "formula")
class(example_data$total_gross_profit) <- c(class(example_data$total_gross_profit), "formula")
```

```
wb$
add_worksheet("Daily Sales")$
add_data_table(
x = example_data,
table_style = "TableStyleMedium2",
table_name = "daily_sales"
)
```

And if we open the workbook to view the table we created we can see that the formula has worked.

A | B | C | D | E | F | G | |
---|---|---|---|---|---|---|---|

1 | sales_price | COGS | sales_quantity | gross_profit | total_COGS | total_sales | total_gross_profit |

2 | 20 | 5 | 1 | 15 | 5 | 20 | 15 |

3 | 30 | 11 | 2 | 19 | 22 | 60 | 38 |

4 | 40 | 13 | 3 | 27 | 39 | 120 | 81 |

We can also see that it has replaced `[#This Row]`

with
`@`

.

A | B | C | D | E | F | G | |
---|---|---|---|---|---|---|---|

1 | sales_price | COGS | sales_quantity | gross_profit | total_COGS | total_sales | total_gross_profit |

2 | 20 | 5 | 1 | =[@sales_price] - [@COGS] | =[@COGS] * [@sales_quantity] | =[@sales_price] * [@sales_quantity] | =[@[total_sales]] - [@[total_COGS]] |

3 | 30 | 11 | 2 | =[@sales_price] - [@COGS] | =[@COGS] * [@sales_quantity] | =[@sales_price] * [@sales_quantity] | =[@[total_sales]] - [@[total_COGS]] |

4 | 40 | 13 | 3 | =[@sales_price] - [@COGS] | =[@COGS] * [@sales_quantity] | =[@sales_price] * [@sales_quantity] | =[@[total_sales]] - [@[total_COGS]] |

For completion, the formula as we wrote it appears as;

D | E | F | G |
---|---|---|---|

gross_profit | total_COGS | total_sales | total_gross_profit |

=gross_profit[[#This Row],[sales_price]] - gross_profit[[#This Row],[COGS]] | =gross_profit[[#This Row],[COGS]] * gross_profit[[#This Row],[sales_quantity]] | =gross_profit[[#This Row],[sales_price]] * gross_profit[[#This Row],[sales_quantity]] | =gross_profit[[#This Row],[total_sales]] - gross_profit[[#This Row],[total_COGS]] |

=gross_profit[[#This Row],[sales_price]] - gross_profit[[#This Row],[COGS]] | =gross_profit[[#This Row],[COGS]] * gross_profit[[#This Row],[sales_quantity]] | =gross_profit[[#This Row],[sales_price]] * gross_profit[[#This Row],[sales_quantity]] | =gross_profit[[#This Row],[total_sales]] - gross_profit[[#This Row],[total_COGS]] |

=gross_profit[[#This Row],[sales_price]] - gross_profit[[#This Row],[COGS]] | =gross_profit[[#This Row],[COGS]] * gross_profit[[#This Row],[sales_quantity]] | =gross_profit[[#This Row],[sales_price]] * gross_profit[[#This Row],[sales_quantity]] | =gross_profit[[#This Row],[total_sales]] - gross_profit[[#This Row],[total_COGS]] |

```
#### sum dataTable examples
wb$add_worksheet("sum_examples")
### Note: dataTable formula do not need to be used inside of dataTables. dataTable formula are for referencing the data within the dataTable.
### Note: dataTable formula do not need to be used inside of dataTables. dataTable formula are for referencing the data within the dataTable.
sum_examples <- data.frame(
description = c("sum_sales_price", "sum_product_Price_Quantity"),
formula = c("", "")
)
wb$add_data(x = sum_examples)
# add formulas
wb$add_formula(x = "sum(daily_sales[[#Data],[sales_price]])", dims = "B2")
wb$add_formula(x = "sum(daily_sales[[#Data],[sales_price]] * daily_sales[[#Data],[sales_quantity]])", dims = "B3", array = TRUE)
#### dataTable referencing
wb$add_worksheet("dt_references")
### Adding the headers by themselves.
wb$add_formula(
x = "daily_sales[[#Headers],[sales_price]:[total_gross_profit]]",
dims = "A1:G1",
array = TRUE
)
### Adding the raw data by reference and selecting them directly.
wb$add_formula(
x = "daily_sales[[#Data],[sales_price]:[total_gross_profit]]",
start_row = 2,
dims = "A2:G4",
array = TRUE
)
# wb$open()
```