Package 'keyholder' reference manual

Title:	Store Data About Rows
Description:	Tools for keeping track of information, named "keys", about rows of data-frame-like objects. This is done by creating a special attribute "keys", which is updated after every change in rows (subsetting, ordering, etc.). This package is designed to work closely with the 'dplyr' package.
Authors:	Evgeni Chasnovski [aut, cre]
Maintainer:	Evgeni Chasnovski
License:	MIT + file LICENSE
Version:	0.1.7.9000
Built:	2025-04-01 03:46:48 UTC
Source:	https://github.com/echasnovski/keyholder

keyholder: Store Data About Rows

Description

keyholder offers a set of tools for storing information about rows of data-frame-like objects. The common use cases are:

Track rows of a data frame without changing it.
Store columns so they can be restored in the data frame later.
Hide columns for convenient use of dplyr's *_if scoped variants of verbs.
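
The third use case can be sketched as follows. This is a minimal illustration, assuming dplyr and keyholder are installed; it uses only functions documented in this manual, and the names center and centered are hypothetical.

```r
library(dplyr)
library(keyholder)

# Hypothetical transformation that would alter every numeric column
center <- function(x) x - mean(x)

# Hide 'vs' and 'am' in keys so that mutate_if() does not touch them,
# then move them back into the data frame and drop the keys.
centered <- mtcars %>%
  as_tibble() %>%
  key_by(vs, am, .exclude = TRUE) %>%
  mutate_if(is.numeric, center) %>%
  restore_keys(vs, am, .remove = TRUE, .unkey = TRUE)

names(centered)
```

Because vs and am are stored as keys while mutate_if() runs, they should come back with their original 0/1 values.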

Details

To learn more about keyholder:

Browse vignettes with browseVignettes(package = "keyholder").
See how to set keys.
See the list of supported functions.

Author(s)

Maintainer: Evgeni Chasnovski

Key by selection of variables

Description

These functions perform keying by a selection of variables, using the corresponding scoped variant of select. The appropriate data frame is first selected with the scoped function and then assigned as keys.

Usage

key_by_all(.tbl, .funs = list(), ..., .add = FALSE, .exclude = FALSE)

key_by_if(.tbl, .predicate, .funs = list(), ..., .add = FALSE,
  .exclude = FALSE)

key_by_at(.tbl, .vars, .funs = list(), ..., .add = FALSE,
  .exclude = FALSE)

Arguments

`.tbl`	Reference data frame.
`.funs`	Parameter for scoped functions.
`...`	Parameter for scoped functions.
`.add`	Whether to add keys to (possibly) existing ones. If `FALSE`, existing keys will be overridden.
`.exclude`	Whether to exclude key variables from `.tbl`.
`.predicate`	Parameter for scoped functions.
`.vars`	Parameter for scoped functions.

Examples

mtcars %>% key_by_all(.funs = toupper)

mtcars %>% key_by_if(rlang::is_integerish, toupper)

mtcars %>% key_by_at(c("vs", "am"), toupper)

Keyed object

Description

Utility functions for keyed objects, which are implemented with class keyed_df. A keyed object should be a data frame which inherits from keyed_df and contains a data frame of keys in attribute 'keys'.

Usage

is_keyed_df(.tbl)

is.keyed_df(.tbl)

## S3 method for class 'keyed_df'
print(x, ...)

## S3 method for class 'keyed_df'
x[i, j, ...]

Arguments

`.tbl`	Object to check.
`x`	Object to print or extract elements.
`...`	Further arguments passed to or from other methods.
`i`, `j`	Arguments for `[`.

Examples

is_keyed_df(mtcars)

mtcars %>% key_by(vs) %>% is_keyed_df

# Not valid keyed_df
df <- mtcars
class(df) <- c("keyed_df", "data.frame")
is_keyed_df(df)

One-table verbs from dplyr for keyed_df

Description

Methods for dplyr's generic single-table functions. Most of them preserve the 'keyed_df' class and the 'keys' attribute; the exceptions are summarise (with its scoped variants), distinct and do, which remove them. When a method modifies rows of the reference data frame, it modifies the rows of the keys in the same way.
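
The idea of keeping keys aligned with rows can be sketched in base R. This is only an illustration under stated assumptions, not keyholder's actual implementation: the helper subset_rows_with_keys() is hypothetical, and a plain data frame stands in for the tibble of keys.

```r
# Hypothetical helper (not part of keyholder): select rows of a data frame
# and apply the same row selection to the data frame stored in attribute "keys".
subset_rows_with_keys <- function(tbl, i) {
  keys <- attr(tbl, "keys")
  res <- tbl[i, , drop = FALSE]
  if (!is.null(keys)) {
    attr(res, "keys") <- keys[i, , drop = FALSE]
  }
  res
}

df <- head(mtcars)
attr(df, "keys") <- data.frame(row = seq_len(nrow(df)))

# Reorder by 'mpg'; the keys follow the same permutation
sorted <- subset_rows_with_keys(df, order(df$mpg))
attr(sorted, "keys")$row
```

After reordering, the stored row numbers still identify which original row each result row came from.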

Usage

## S3 method for class 'keyed_df'
select(.data, ...)

## S3 method for class 'keyed_df'
rename(.data, ...)

## S3 method for class 'keyed_df'
mutate(.data, ...)

## S3 method for class 'keyed_df'
transmute(.data, ...)

## S3 method for class 'keyed_df'
summarise(.data, ...)

## S3 method for class 'keyed_df'
group_by(.data, ...)

## S3 method for class 'keyed_df'
ungroup(x, ...)

## S3 method for class 'keyed_df'
rowwise(data, ...)

## S3 method for class 'keyed_df'
distinct(.data, ..., .keep_all = FALSE)

## S3 method for class 'keyed_df'
do(.data, ...)

## S3 method for class 'keyed_df'
arrange(.data, ..., .by_group = FALSE)

## S3 method for class 'keyed_df'
filter(.data, ...)

## S3 method for class 'keyed_df'
slice(.data, ...)

Arguments

`.data`, `data`, `x`	A keyed object.
`...`	Appropriate arguments for functions.
`.keep_all`	Parameter for dplyr::distinct.
`.by_group`	Parameter for dplyr::arrange.

Details

dplyr::transmute() is supported implicitly with dplyr::mutate() support.

dplyr::rowwise() is not supposed to be generic in dplyr. Use rowwise.keyed_df directly.

All scoped variants of present functions are also supported.

Examples

mtcars %>% key_by(vs, am) %>% dplyr::mutate(gear = 1)

Two-table verbs from dplyr for keyed_df

Description

Methods for dplyr's generic join functions. All of them preserve the 'keyed_df' class and the 'keys' attribute of the first argument. When a join modifies rows of the first argument, it modifies the rows of the keys in the same way.

Usage

## S3 method for class 'keyed_df'
inner_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)

## S3 method for class 'keyed_df'
left_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)

## S3 method for class 'keyed_df'
right_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)

## S3 method for class 'keyed_df'
full_join(x, y, by = NULL, copy = FALSE,
  suffix = c(".x", ".y"), ...)

## S3 method for class 'keyed_df'
semi_join(x, y, by = NULL, copy = FALSE, ...)

## S3 method for class 'keyed_df'
anti_join(x, y, by = NULL, copy = FALSE, ...)

Arguments

`x`, `y`, `by`, `copy`, `suffix`, `...`	Parameters for join functions.

Examples


dplyr::band_members %>% key_by(band) %>%
  dplyr::semi_join(dplyr::band_instruments, by = "name") %>%
  keys()

Add id column and key

Description

Functions for creating id column and key.

Usage

use_id(.tbl)

compute_id_name(x)

add_id(.tbl)

key_by_id(.tbl, .add = FALSE, .exclude = FALSE)

Arguments

`.tbl`	Reference data frame.
`x`	Character vector of names.
`.add`, `.exclude`	Parameters for `key_by()`.

Details

use_id() assigns as keys a tibble with column '.id' and row numbers of .tbl as values.

compute_id_name() computes a name that differs from every element of x using the following algorithm: if '.id' is not present in x, it is returned; otherwise '.id1' is checked, then '.id11', and so on until an unused name is found.
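
The algorithm just described can be sketched in base R. The function id_name_sketch() is a hypothetical re-implementation for illustration only and may differ from the package's code; use compute_id_name() in practice.

```r
# Hypothetical re-implementation of the described algorithm
id_name_sketch <- function(x) {
  candidate <- ".id"
  # Append "1" until the candidate no longer clashes with any name in 'x'
  while (candidate %in% x) {
    candidate <- paste0(candidate, "1")
  }
  candidate
}

id_name_sketch(c("mpg", "cyl"))          # ".id"
id_name_sketch(c(".id", "mpg"))          # ".id1"
id_name_sketch(c(".id", ".id1", "mpg"))  # ".id11"
```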

add_id() creates a column with a unique name (computed with compute_id_name()) and row numbers as values (grouping is ignored), and places it as the first column.

key_by_id() is similar to add_id(): it creates a column with a unique name and row numbers as values (grouping is ignored) and calls key_by() to use this column as a key. If .add is FALSE, the unique name is computed based on the column names of .tbl; if TRUE, it is based on the column names of both .tbl and its keys.

Examples

mtcars %>% use_id()

mtcars %>% add_id()

mtcars %>% key_by_id(.exclude = TRUE)

Operate on a selection of keys

Description

keyholder offers scoped variants of the following functions:

key_by(). See key_by_all().
remove_keys(). See remove_keys_all().
restore_keys(). See restore_keys_all().
rename_keys(). See rename_keys_all().

Arguments

`.funs`	Parameter for scoped functions.
`.vars`	Parameter for scoped functions.
`.predicate`	Parameter for scoped functions.
`...`	Parameter for scoped functions.

Supported functions

Description

keyholder supports the following functions:

Base subsetting with [.
dplyr one table verbs.
dplyr two table verbs.

Get keys

Description

Functions for getting information about keys.

Usage

keys(.tbl)

raw_keys(.tbl)

has_keys(.tbl)

Arguments

`.tbl`	Reference data frame.

Value

keys() always returns a tibble of keys. If there are no keys, it returns a tibble with the same number of rows as .tbl and zero columns. raw_keys() is just a wrapper for attr(.tbl, "keys"). To check whether .tbl has keys, use has_keys().

Examples

keys(mtcars)

raw_keys(mtcars)

has_keys(mtcars)

df <- key_by(mtcars, vs, am)
keys(df)

has_keys(df)

Manipulate keys

Description

Functions to manipulate keys.

Usage

remove_keys(.tbl, ..., .unkey = FALSE)

restore_keys(.tbl, ..., .remove = FALSE, .unkey = FALSE)

pull_key(.tbl, var)

rename_keys(.tbl, ...)

Arguments

`.tbl`	Reference data frame.
`...`	Variables to be used for operations defined in similar fashion as in `dplyr::select()`.
`.unkey`	Whether to `unkey()` `.tbl` in case there are no keys left.
`.remove`	Whether to remove keys after restoring.
`var`	Parameter for `dplyr::pull()`.

Details

remove_keys() removes keys defined with ....

restore_keys() transfers keys defined with ... into .tbl and removes them from keys if .remove == TRUE. If .tbl is grouped, the following happens:

If the restored keys don't contain grouping variables, the groups don't change;
If the restored keys contain grouping variables, the result is regrouped based on the restored values. In other words, restoring keys takes precedence over the rule of not modifying grouping variables. This follows the idea behind keys: they contain information about rows, and by restoring them you ask for that information to be available.

pull_key() extracts one specified column from keys with dplyr::pull().

rename_keys() renames columns in keys using dplyr::rename().

Examples

df <- mtcars %>% dplyr::as_tibble() %>%
  key_by(vs, am, .exclude = TRUE)
df %>% remove_keys(vs)

df %>% remove_keys(dplyr::everything())

df %>% remove_keys(dplyr::everything(), .unkey = TRUE)


df %>% restore_keys(vs)

df %>% restore_keys(vs, .remove = TRUE)


df %>% restore_keys(dplyr::everything(), .remove = TRUE)

df %>% restore_keys(dplyr::everything(), .remove = TRUE, .unkey = TRUE)


# Restoring on grouped data frame
df_grouped <- df %>% dplyr::mutate(vs = 1) %>% dplyr::group_by(vs)
df_grouped %>% restore_keys(dplyr::everything())

# Pulling
df %>% pull_key(vs)

# Renaming
df %>% rename_keys(Vs = vs)

Set keys

Description

A key is a vector whose purpose is to provide information about rows in the reference data frame. Its length should always be equal to the number of rows in the data frame. Keys are stored as a tibble in attribute "keys", so one data frame can have multiple keys. A data frame with keys is implemented as class keyed_df.
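
The storage model described above can be sketched in base R. This is an assumption-laden illustration rather than the package's code: set_keys_sketch() is a hypothetical helper, and a plain data frame is used in place of a tibble.

```r
# Hypothetical helper (not part of keyholder): store keys in attribute "keys"
# after checking that there is exactly one key row per data frame row.
set_keys_sketch <- function(tbl, value) {
  value <- as.data.frame(value)
  if (nrow(value) != nrow(tbl)) {
    stop("Keys should have the same number of rows as the data frame.")
  }
  attr(tbl, "keys") <- value
  tbl
}

# Two keys at once: each column of the stored data frame is one key
keyed <- set_keys_sketch(mtcars, mtcars[, c("vs", "am")])
names(attr(keyed, "keys"))

# A key of the wrong length is rejected with an error
wrong <- tryCatch(set_keys_sketch(mtcars, 1:10), error = function(e) "error")
```

Storing the keys as a data frame rather than a single vector is what allows several keys to coexist on one object.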

Usage

keys(.tbl) <- value

assign_keys(.tbl, value)

key_by(.tbl, ..., .add = FALSE, .exclude = FALSE)

unkey(.tbl)

Arguments

`.tbl`	Reference data frame.
`value`	Values of keys (converted to tibble).
`...`	Variables to be used as keys defined in similar fashion as in `dplyr::select()`.
`.add`	Whether to add keys to (possibly) existing ones. If `FALSE`, existing keys will be overridden.
`.exclude`	Whether to exclude key variables from `.tbl`.

Details

key_by() ignores grouping when creating keys. Also, if .add == TRUE and the names of some added keys match the names of existing keys, the new ones override the old ones.

The value for keys<- should not be NULL because it is converted to a tibble with zero rows. To remove keys, use unkey(), remove_keys() or restore_keys(). assign_keys() is a wrapper for keys<- that is more suitable for piping.

Examples

df <- dplyr::as_tibble(mtcars)

# Value is converted to tibble
keys(df) <- 1:nrow(df)

# This will throw an error
## Not run: 
keys(df) <- 1:10

## End(Not run)

# Use 'vs' and 'am' as keys
df %>% key_by(vs, am)

df %>% key_by(vs, am, .exclude = TRUE)

df %>% key_by(vs) %>% key_by(am, .add = TRUE, .exclude = TRUE)

# Override keys
df %>% key_by(vs, am) %>% dplyr::mutate(vs = 1) %>%
  key_by(gear, vs, .add = TRUE)

# Use select helpers
df %>% key_by(dplyr::one_of(c("vs", "am")))

df %>% key_by(dplyr::everything())

Remove selection of keys

Description

These functions remove a selection of keys using the corresponding scoped variant of select. The .funs argument is omitted because it would be redundant.

Usage

remove_keys_all(.tbl, ..., .unkey = FALSE)

remove_keys_if(.tbl, .predicate, ..., .unkey = FALSE)

remove_keys_at(.tbl, .vars, ..., .unkey = FALSE)

Arguments

`.tbl`	Reference data frame.
`...`	Parameter for scoped functions.
`.unkey`	Whether to `unkey()` `.tbl` in case there are no keys left.
`.predicate`	Parameter for scoped functions.
`.vars`	Parameter for scoped functions.

Examples

df <- mtcars %>% dplyr::as_tibble() %>% key_by(vs, am, disp)
df %>% remove_keys_all()

df %>% remove_keys_all(.unkey = TRUE)

df %>% remove_keys_if(rlang::is_integerish)

df %>% remove_keys_at(c("vs", "am"))

Rename selection of keys

Description

These functions rename a selection of keys using the corresponding scoped variant of rename.

Usage

rename_keys_all(.tbl, .funs = list(), ...)

rename_keys_if(.tbl, .predicate, .funs = list(), ...)

rename_keys_at(.tbl, .vars, .funs = list(), ...)

Arguments

`.tbl`	Reference data frame.
`.funs`	Parameter for scoped functions.
`...`	Parameter for scoped functions.
`.predicate`	Parameter for scoped functions.
`.vars`	Parameter for scoped functions.

Restore selection of keys

Description

These functions restore a selection of keys using the corresponding scoped variant of select. The .funs argument can be used to rename some keys before restoring, without touching the actual keys.

Usage

restore_keys_all(.tbl, .funs = list(), ..., .remove = FALSE,
  .unkey = FALSE)

restore_keys_if(.tbl, .predicate, .funs = list(), ..., .remove = FALSE,
  .unkey = FALSE)

restore_keys_at(.tbl, .vars, .funs = list(), ..., .remove = FALSE,
  .unkey = FALSE)

Arguments

`.tbl`	Reference data frame.
`.funs`	Parameter for scoped functions.
`...`	Parameter for scoped functions.
`.remove`	Whether to remove keys after restoring.
`.unkey`	Whether to `unkey()` `.tbl` in case there are no keys left.
`.predicate`	Parameter for scoped functions.
`.vars`	Parameter for scoped functions.

Examples

df <- mtcars %>% dplyr::as_tibble() %>% key_by(vs, am, disp)
# Just restore all keys
df %>% restore_keys_all()

# Restore all keys with renaming and without touching actual keys
df %>% restore_keys_all(.funs = toupper)

# Restore with renaming and removing
df %>%
  restore_keys_all(.funs = toupper, .remove = TRUE)

# Restore with renaming, removing and unkeying
df %>%
  restore_keys_all(.funs = toupper, .remove = TRUE, .unkey = TRUE)

# Restore with renaming keys satisfying the predicate
df %>%
  restore_keys_if(rlang::is_integerish, .funs = toupper)

# Restore with renaming specified keys
df %>%
  restore_keys_at(c("vs", "disp"), .funs = toupper)
