Package 'keyholder'

Title: Store Data About Rows
Description: Tools for keeping track of information, named "keys", about rows of data-frame-like objects. This is done by creating a special attribute "keys", which is updated after every change in rows (subsetting, ordering, etc.). This package is designed to work closely with the 'dplyr' package.
Authors: Evgeni Chasnovski [aut, cre]
Maintainer: Evgeni Chasnovski <[email protected]>
License: MIT + file LICENSE
Built: 2024-10-03 03:58:47 UTC

keyholder: Store Data About Rows


keyholder offers a set of tools for storing information about rows of data-frame-like objects. The common use cases are:

  • Track rows of a data frame without changing it.

  • Store columns so they can be restored in the data frame later.

  • Hide columns for convenient use of dplyr's *_if scoped variants of verbs.
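
The first and third use cases can be sketched as follows. This is an illustrative example rather than an excerpt from the package; it assumes that dplyr and keyholder are installed, uses the built-in mtcars data, and the object names (tracked, original_rows, rounded) are made up for the sketch.

```r
library(dplyr)
library(keyholder)

# Use case 1: track rows without changing the data frame.
# use_id() stores the original row numbers as the key '.id'.
tracked <- mtcars %>%
  use_id() %>%
  filter(cyl == 4) %>%
  arrange(mpg)

# Original row positions of the surviving rows, in their new order
original_rows <- pull_key(tracked, .id)

# Use case 3: hide columns from a scoped verb, then bring them back.
# 'vs' and 'am' are moved into keys, so mutate_if() does not touch them.
rounded <- mtcars %>%
  as_tibble() %>%
  key_by(vs, am, .exclude = TRUE) %>%
  mutate_if(is.numeric, round) %>%
  restore_keys(vs, am, .remove = TRUE, .unkey = TRUE)
```

Here original_rows can be used to index back into mtcars, and rounded should again contain vs and am with their original values.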



Key by selection of variables


These functions key a data frame by a selection of variables, using the corresponding scoped variant of select. The appropriate data frame is first selected with the scoped function and then assigned as keys.


key_by_all(.tbl, .funs = list(), ..., .add = FALSE, .exclude = FALSE)

key_by_if(.tbl, .predicate, .funs = list(), ..., .add = FALSE,
  .exclude = FALSE)

key_by_at(.tbl, .vars, .funs = list(), ..., .add = FALSE,
  .exclude = FALSE)



.tbl

Reference data frame.


.funs

Parameter for scoped functions.


...

Parameter for scoped functions.


.add

Whether to add keys to (possibly) existing ones. If FALSE, keys will be overridden.


.exclude

Whether to exclude key variables from .tbl.


.predicate

Parameter for scoped functions.


.vars

Parameter for scoped functions.

See also: Not scoped key_by()


mtcars %>% key_by_all(.funs = toupper)

mtcars %>% key_by_if(rlang::is_integerish, toupper)

mtcars %>% key_by_at(c("vs", "am"), toupper)

Keyed object


Utility functions for keyed objects, which are implemented with class keyed_df. A keyed object should be a data frame that inherits from keyed_df and contains a data frame of keys in the attribute 'keys'.




## S3 method for class 'keyed_df'
print(x, ...)

## S3 method for class 'keyed_df'
x[i, j, ...]



Object to check.


Object to print or extract elements.


Further arguments passed to or from other methods.

i, j

Arguments for [.



mtcars %>% key_by(vs) %>% is_keyed_df

# Not valid keyed_df: the class is set but there is no 'keys' attribute
df <- mtcars
class(df) <- c("keyed_df", "data.frame")
is_keyed_df(df)

One-table verbs from dplyr for keyed_df


Methods defined for dplyr's generic single-table functions. Most of them preserve the 'keyed_df' class and the 'keys' attribute; the exceptions are summarise (with its scoped variants), distinct and do, which remove them. These methods also modify rows in keys according to the row modifications in the reference data frame (if any).


## S3 method for class 'keyed_df'
select(.data, ...)

## S3 method for class 'keyed_df'
rename(.data, ...)

## S3 method for class 'keyed_df'
mutate(.data, ...)

## S3 method for class 'keyed_df'
transmute(.data, ...)

## S3 method for class 'keyed_df'
summarise(.data, ...)

## S3 method for class 'keyed_df'
group_by(.data, ...)

## S3 method for class 'keyed_df'
ungroup(x, ...)

## S3 method for class 'keyed_df'
rowwise(data, ...)

## S3 method for class 'keyed_df'
distinct(.data, ..., .keep_all = FALSE)

## S3 method for class 'keyed_df'
do(.data, ...)

## S3 method for class 'keyed_df'
arrange(.data, ..., .by_group = FALSE)

## S3 method for class 'keyed_df'
filter(.data, ...)

## S3 method for class 'keyed_df'
slice(.data, ...)


.data, data, x

A keyed object.


...

Appropriate arguments for functions.


.keep_all

Parameter for dplyr::distinct.


.by_group

Parameter for dplyr::arrange.


dplyr::transmute() is supported implicitly with dplyr::mutate() support.

dplyr::rowwise() is not supposed to be generic in dplyr. Use rowwise.keyed_df directly.

All scoped variants of present functions are also supported.
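
A short sketch of how keys follow row changes, assuming dplyr and keyholder are attached. The object names (keyed, kept, collapsed) are illustrative only.

```r
library(dplyr)
library(keyholder)

keyed <- mtcars %>% as_tibble() %>% key_by(vs, am)

# filter() keeps the class and subsets the keys along with the rows
kept <- keyed %>% filter(mpg > 25)
is_keyed_df(kept)
nrow(keys(kept)) == nrow(kept)

# summarise() is one of the verbs that drop keys
collapsed <- keyed %>% summarise(mean_mpg = mean(mpg))
has_keys(collapsed)
```

Based on the description above, the filtered result should still be keyed with one key row per remaining data row, while the summarised result should report no keys.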

See also: Two-table verbs


mtcars %>% key_by(vs, am) %>% dplyr::mutate(gear = 1)

Two-table verbs from dplyr for keyed_df


Methods defined for dplyr's generic join functions. All of them preserve the 'keyed_df' class and the 'keys' attribute of the first argument. These methods also modify rows in keys according to the row modifications in the first argument (if any).


## S3 method for class 'keyed_df'
inner_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)

## S3 method for class 'keyed_df'
left_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)

## S3 method for class 'keyed_df'
right_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)

## S3 method for class 'keyed_df'
full_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)

## S3 method for class 'keyed_df'
semi_join(x, y, by = NULL, copy = FALSE, ...)

## S3 method for class 'keyed_df'
anti_join(x, y, by = NULL, copy = FALSE, ...)


x, y, by, copy, suffix, ...

Parameters for join functions.

See also: One-table verbs


dplyr::band_members %>% key_by(band) %>%
  dplyr::semi_join(dplyr::band_instruments, by = "name")

Add id column and key


Functions for creating id column and key.





key_by_id(.tbl, .add = FALSE, .exclude = FALSE)



.tbl

Reference data frame.


x

Character vector of names.

.add, .exclude

Parameters for key_by().


use_id() assigns as keys a tibble with a column '.id' that holds the row numbers of .tbl.

compute_id_name() computes a name that differs from every element in x, using the following algorithm: if '.id' is not present in x, it is returned; if it is taken, '.id1' is checked; if that is taken, '.id11' is checked, and so on.

add_id() creates a column with a unique name (computed with compute_id_name()) and row numbers as values (grouping is ignored). It then places this column first.

key_by_id() is similar to add_id(): it creates a column with a unique name and row numbers as values (grouping is ignored) and calls key_by() to use this column as a key. If .add is FALSE, the unique name is computed from the column names of .tbl; if TRUE, from the column names of both .tbl and its keys.
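
The naming rule described for compute_id_name() can be sketched in base R. The function sketch_id_name() below is a hypothetical re-implementation written for illustration; it is not the package's own code and only mirrors the rule as stated in the text.

```r
# Hypothetical helper: mirrors the documented naming rule, base R only
sketch_id_name <- function(x) {
  candidate <- ".id"
  # Append "1" until the candidate no longer clashes with any name in x
  while (candidate %in% x) {
    candidate <- paste0(candidate, "1")
  }
  candidate
}

sketch_id_name(c("mpg", "cyl"))          # ".id"
sketch_id_name(c(".id", "mpg"))          # ".id1"
sketch_id_name(c(".id", ".id1", "mpg"))  # ".id11"
```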


mtcars %>% use_id()

mtcars %>% add_id()

mtcars %>% key_by_id(.exclude = TRUE)

Operate on a selection of keys


keyholder offers scoped variants (with suffixes _all, _if and _at) of the following functions: key_by(), remove_keys(), restore_keys() and rename_keys().



Parameter for scoped functions.


Parameter for scoped functions.


Parameter for scoped functions.


Parameter for scoped functions.

See also: Not scoped manipulation functions

Not scoped key_by()

Supported functions


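keyholder supports the following functions:

  • The base R subsetting operator [ (see Keyed object).

  • One-table verbs from dplyr.

  • Two-table verbs from dplyr.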

Get keys


Functions for getting information about keys.







Reference data frame.


keys() always returns a tibble of keys. If there are no keys, it returns a tibble with as many rows as .tbl and zero columns. raw_keys() is just a wrapper for attr(.tbl, "keys"). To find out whether .tbl has keys, use has_keys().
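
The three accessors can be compared side by side. This is a minimal sketch that assumes keyholder is installed and uses mtcars as the reference data frame.

```r
library(keyholder)

df <- key_by(mtcars, vs, am)

# Tibble of keys with one row per row of df
keys(df)

# The raw attribute, as stored on the object
raw_keys(df)

# TRUE for a keyed object, FALSE for a plain data frame
has_keys(df)
has_keys(mtcars)

# Without keys, keys() still returns a tibble: all rows, zero columns
dim(keys(mtcars))
```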

See also: Set keys, Manipulate keys





df <- key_by(mtcars, vs, am)


Manipulate keys


Functions to manipulate keys.


remove_keys(.tbl, ..., .unkey = FALSE)

restore_keys(.tbl, ..., .remove = FALSE, .unkey = FALSE)

pull_key(.tbl, var)

rename_keys(.tbl, ...)



.tbl

Reference data frame.


...

Variables to be used for operations, defined in the same way as in dplyr::select().


.unkey

Whether to unkey() .tbl in case there are no keys left.


.remove

Whether to remove keys after restoring.


var

Parameter for dplyr::pull().


remove_keys() removes keys defined with ....

restore_keys() transfers keys defined with ... into .tbl and removes them from keys if .remove == TRUE. If .tbl is grouped the following happens:

  • If restored keys don't contain grouping variables then groups don't change;

  • If restored keys contain grouping variables, the result is regrouped based on the restored values. In other words, restoring keys takes precedence over the rule of not modifying grouping variables. This follows the idea behind keys: they contain information about rows, and by restoring them you want that information to be available.

pull_key() extracts one specified column from keys with dplyr::pull().

rename_keys() renames columns in keys using dplyr::rename().

See also: Get keys, Set keys

Scoped functions


df <- mtcars %>% dplyr::as_tibble() %>%
  key_by(vs, am, .exclude = TRUE)
df %>% remove_keys(vs)

df %>% remove_keys(dplyr::everything())

df %>% remove_keys(dplyr::everything(), .unkey = TRUE)

df %>% restore_keys(vs)

df %>% restore_keys(vs, .remove = TRUE)

df %>% restore_keys(dplyr::everything(), .remove = TRUE)

df %>% restore_keys(dplyr::everything(), .remove = TRUE, .unkey = TRUE)

# Restoring on grouped data frame
df_grouped <- df %>% dplyr::mutate(vs = 1) %>% dplyr::group_by(vs)
df_grouped %>% restore_keys(dplyr::everything())

# Pulling
df %>% pull_key(vs)

# Renaming
df %>% rename_keys(Vs = vs)

Set keys


A key is a vector whose goal is to provide information about rows in the reference data frame. Its length should always be equal to the number of rows in the data frame. Keys are stored as a tibble in the attribute "keys", so one data frame can have multiple keys. A data frame with keys is implemented as class keyed_df.


keys(.tbl) <- value

assign_keys(.tbl, value)

key_by(.tbl, ..., .add = FALSE, .exclude = FALSE)




.tbl

Reference data frame.


value

Values of keys (converted to tibble).


...

Variables to be used as keys, defined in the same way as in dplyr::select().


.add

Whether to add keys to (possibly) existing ones. If FALSE, keys will be overridden.


.exclude

Whether to exclude key variables from .tbl.


key_by() ignores grouping when creating keys. Also, if .add == TRUE and the names of some added keys match the names of existing keys, the new ones override the old ones.

The value for keys<- should not be NULL because it is converted to a tibble with zero rows. To remove keys, use unkey(), remove_keys() or restore_keys(). assign_keys() is a wrapper for keys<- that is more suitable for piping.
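
The two ways of assigning keys can be compared directly. This is a small sketch assuming dplyr and keyholder are installed; row_ids, df_a and df_b are illustrative names.

```r
library(dplyr)
library(keyholder)

df <- as_tibble(mtcars)
row_ids <- data.frame(row_id = seq_len(nrow(df)))

# Replacement form: modifies the object it is applied to
df_a <- df
keys(df_a) <- row_ids

# Pipe-friendly form: returns the keyed object
df_b <- df %>% assign_keys(row_ids)

# Both approaches should store the same keys
identical(keys(df_a), keys(df_b))

# unkey() is one of the documented ways to remove keys again
has_keys(unkey(df_a))
```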

See also: Get keys, Manipulate keys

Scoped key_by()


df <- dplyr::as_tibble(mtcars)

# Value is converted to tibble
keys(df) <- 1:nrow(df)

# This will throw an error
## Not run: 
keys(df) <- 1:10

## End(Not run)

# Use 'vs' and 'am' as keys
df %>% key_by(vs, am)

df %>% key_by(vs, am, .exclude = TRUE)

df %>% key_by(vs) %>% key_by(am, .add = TRUE, .exclude = TRUE)

# Override keys
df %>% key_by(vs, am) %>% dplyr::mutate(vs = 1) %>%
  key_by(gear, vs, .add = TRUE)

# Use select helpers
df %>% key_by(dplyr::one_of(c("vs", "am")))

df %>% key_by(dplyr::everything())

Remove selection of keys


These functions remove a selection of keys using the corresponding scoped variant of select. The .funs argument is omitted because it would be redundant.


remove_keys_all(.tbl, ..., .unkey = FALSE)

remove_keys_if(.tbl, .predicate, ..., .unkey = FALSE)

remove_keys_at(.tbl, .vars, ..., .unkey = FALSE)



.tbl

Reference data frame.


...

Parameter for scoped functions.


.unkey

Whether to unkey() .tbl in case there are no keys left.


.predicate

Parameter for scoped functions.


.vars

Parameter for scoped functions.


df <- mtcars %>% dplyr::as_tibble() %>% key_by(vs, am, disp)
df %>% remove_keys_all()

df %>% remove_keys_all(.unkey = TRUE)

df %>% remove_keys_if(rlang::is_integerish)

df %>% remove_keys_at(c("vs", "am"))

Rename selection of keys


These functions rename a selection of keys using the corresponding scoped variant of rename.


rename_keys_all(.tbl, .funs = list(), ...)

rename_keys_if(.tbl, .predicate, .funs = list(), ...)

rename_keys_at(.tbl, .vars, .funs = list(), ...)



.tbl

Reference data frame.


.funs

Parameter for scoped functions.


...

Parameter for scoped functions.


.predicate

Parameter for scoped functions.


.vars

Parameter for scoped functions.
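
This section has no examples, so here is a sketch modelled on the examples of the other scoped variants. It assumes dplyr, rlang and keyholder are installed and reuses the same keyed mtcars tibble.

```r
library(dplyr)
library(keyholder)

df <- mtcars %>% as_tibble() %>% key_by(vs, am, disp)

# Rename every key
renamed_all <- df %>% rename_keys_all(.funs = toupper)
names(keys(renamed_all))

# Rename only keys that satisfy the predicate
renamed_if <- df %>% rename_keys_if(rlang::is_integerish, .funs = toupper)
names(keys(renamed_if))

# Rename only the listed keys
renamed_at <- df %>% rename_keys_at(c("vs", "am"), .funs = toupper)
names(keys(renamed_at))
```

Only the names of the keys are affected; the columns of the reference data frame keep their original names.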

Restore selection of keys


These functions restore a selection of keys using the corresponding scoped variant of select. The .funs argument can be used to rename some keys (without touching the actual keys) before restoring.


restore_keys_all(.tbl, .funs = list(), ..., .remove = FALSE,
  .unkey = FALSE)

restore_keys_if(.tbl, .predicate, .funs = list(), ..., .remove = FALSE,
  .unkey = FALSE)

restore_keys_at(.tbl, .vars, .funs = list(), ..., .remove = FALSE,
  .unkey = FALSE)



.tbl

Reference data frame.


.funs

Parameter for scoped functions.


...

Parameter for scoped functions.


.remove

Whether to remove keys after restoring.


.unkey

Whether to unkey() .tbl in case there are no keys left.


.predicate

Parameter for scoped functions.


.vars

Parameter for scoped functions.


df <- mtcars %>% dplyr::as_tibble() %>% key_by(vs, am, disp)
# Just restore all keys
df %>% restore_keys_all()

# Restore all keys with renaming and without touching actual keys
df %>% restore_keys_all(.funs = toupper)

# Restore with renaming and removing
df %>%
  restore_keys_all(.funs = toupper, .remove = TRUE)

# Restore with renaming, removing and unkeying
df %>%
  restore_keys_all(.funs = toupper, .remove = TRUE, .unkey = TRUE)

# Restore with renaming keys satisfying the predicate
df %>%
  restore_keys_if(rlang::is_integerish, .funs = toupper)

# Restore with renaming specified keys
df %>%
  restore_keys_at(c("vs", "disp"), .funs = toupper)