Filter the HGNC data set by a keyword (or a regular expression) that is
looked up in the columns containing gene names or symbols. By default, the
search covers symbol, name, alias_symbol, alias_name, prev_symbol and
prev_name. Note that this function dives into list-columns for matching and
returns a gene entry if at least one of the strings matches the keyword.
Usage
filter_by_keyword(
  tbl,
  keyword,
  cols = c("symbol", "name", "alias_symbol", "alias_name", "prev_symbol", "prev_name")
)
Arguments
- tbl
A tibble containing the HGNC data set, typically obtained with import_hgnc_dataset().
- keyword
A keyword or a regular expression to be used as search criterion.
- cols
Columns to be looked up.
Value
A tibble of the HGNC data set filtered by observations matching the keyword.
Examples
if (FALSE) { # \dontrun{
# Start by retrieving the HGNC data set
hgnc_tbl <- import_hgnc_dataset()

# Search for entries containing "TP53" in the HGNC data set
hgnc_tbl |>
  filter_by_keyword('TP53') |>
  dplyr::select(1:4)

# The same as above but restrict the search to the `symbol` column
hgnc_tbl |>
  filter_by_keyword('TP53', cols = 'symbol') |>
  dplyr::select(1:4)

# Match "TP53" exactly in the `symbol` column
hgnc_tbl |>
  filter_by_keyword('^TP53$', cols = 'symbol') |>
  dplyr::select(1:4)

# `filter_by_keyword()` is vectorised over `keyword`
hgnc_tbl |>
  filter_by_keyword(c('^TP53$', '^PIK3CA$'), cols = 'symbol') |>
  dplyr::select(1:4)
} # }
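
The "at least one of the strings matches" rule for list-columns can be illustrated on a toy table without downloading the HGNC data set. This is a minimal base-R sketch of the matching semantics only, not the package's implementation: the table `toy`, its rows, and the helper `any_match()` are hypothetical and invented for illustration.

```r
# Toy stand-in for the HGNC table (invented rows, not real HGNC records).
toy <- data.frame(symbol = c("GENE1", "GENE2", "GENE3"),
                  stringsAsFactors = FALSE)
# `alias_symbol` is a list-column: each cell holds zero or more strings.
toy$alias_symbol <- list(c("ABC1", "XYZ"), character(0), "ABC2")

# Hypothetical helper: TRUE for a row if any string in its cell matches.
# Works for plain character columns and for list-columns alike.
any_match <- function(column, pattern) {
  vapply(column,
         function(cell) any(grepl(pattern, cell)),
         logical(1),
         USE.NAMES = FALSE)
}

# A row is kept if the pattern matches in at least one searched column.
keep <- any_match(toy$symbol, "^ABC") | any_match(toy$alias_symbol, "^ABC")
result <- toy[keep, ]
result$symbol
```

Here the first and third rows are kept because one of their aliases starts with "ABC", while the second row, whose alias cell is empty, is dropped.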