
Filters the HGNC data set by a keyword or regular expression, searching the columns that contain gene names or symbols. By default, the search covers symbol, name, alias_symbol, alias_name, prev_symbol and prev_name. The function also searches inside list-columns and returns a gene entry if at least one of its strings matches the keyword.
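The "at least one string matches" rule for list-columns can be illustrated with a minimal base R sketch. This is not the package's implementation: `matches_any()` is a hypothetical helper, `toy` is a small hand-made data frame with illustrative values (not an HGNC extract), and case-sensitive matching via `grepl()` is an assumption.

```r
# Toy stand-in for the HGNC tibble: one character column and one list-column.
# Values are for illustration only.
toy <- data.frame(symbol = c("TP53", "WRAP53", "BRCA1"), stringsAsFactors = FALSE)
toy$alias_symbol <- list(c("p53", "LFS1"), c("TCAB1", "WDR79"), c("RNF53", "BRCC1"))

# Hypothetical helper: returns TRUE for each row where at least one string
# in any of `cols` matches `pattern` (a regular expression).
matches_any <- function(df, pattern, cols) {
  per_col <- lapply(cols, function(col) {
    # Works for both character columns and list-columns: each element is
    # a character vector of length >= 0, and one hit is enough.
    vapply(df[[col]], function(x) any(grepl(pattern, x)), logical(1),
           USE.NAMES = FALSE)
  })
  # A row is kept if any of the searched columns produced a hit.
  Reduce(`|`, per_col)
}

# Search both columns, then only `symbol`, then anchor the pattern.
toy[matches_any(toy, "53", c("symbol", "alias_symbol")), ]
toy[matches_any(toy, "53", "symbol"), ]
toy[matches_any(toy, "^TP53$", c("symbol", "alias_symbol")), ]
```

In this sketch, the BRCA1 row is kept by the first call only because one of its aliases contains "53"; restricting the search to `symbol` drops it, and anchoring the pattern with `^` and `$` leaves a single row.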

Usage

filter_by_keyword(
  tbl,
  keyword,
  cols = c("symbol", "name", "alias_symbol", "alias_name", "prev_symbol", "prev_name")
)

Arguments

tbl

A tibble containing the HGNC data set, typically obtained with import_hgnc_dataset().

keyword

A keyword or a regular expression to be used as search criterion.

cols

Columns to search. Defaults to the six name and symbol columns shown in Usage.

Value

A tibble containing the rows of the HGNC data set that match the keyword.

Examples

if (FALSE) { # \dontrun{
# Start by retrieving the HGNC data set
hgnc_tbl <- import_hgnc_dataset()

# Search for entries containing "TP53" in the HGNC data set
hgnc_tbl |>
  filter_by_keyword('TP53') |>
  dplyr::select(1:4)

# The same as above but restrict the search to the `symbol` column
hgnc_tbl |>
  filter_by_keyword('TP53', cols = 'symbol') |>
  dplyr::select(1:4)

# Match "TP53" exactly in the `symbol` column
hgnc_tbl |>
  filter_by_keyword('^TP53$', cols = 'symbol') |>
  dplyr::select(1:4)

# `filter_by_keyword()` is vectorised over `keyword`
hgnc_tbl |>
  filter_by_keyword(c('^TP53$', '^PIK3CA$'), cols = 'symbol') |>
  dplyr::select(1:4)
} # }