Besides providing a name for every human gene, the HGNC complete gene set also includes mappings of HGNC symbols to gene entries in other popular databases or resources, e.g. to Entrez gene or UCSC gene identifiers.

The helper crosswalk() provides an easy interface to crosswalk (i.e. map or translate) across identifiers. It matches exact terms against one column of an imported HGNC complete gene set and returns the corresponding values from another column.
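
To make the idea of exact matching concrete, here is a minimal base R sketch of a column-to-column lookup. It is only an illustration and not how crosswalk() is implemented: `toy_dt` is a hand-made two-row stand-in for the data set (values taken from the examples in this article), and `lookup()` is a hypothetical helper.

```r
# Hand-made stand-in for an imported HGNC data set (assumption: the real
# table has many more rows and columns than these two).
toy_dt <- data.frame(
  symbol = c("A1BG", "A1BG-AS1"),
  hgnc_id = c("HGNC:5", "HGNC:37133"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: find each value in the `from` column by exact match
# and return the entry of the `to` column on the same row.
lookup <- function(value, from, to, data) {
  data[[to]][match(value, data[[from]])]
}

lookup(c("A1BG", "A1BG-AS1"), from = "symbol", to = "hgnc_id", data = toy_dt)
#> [1] "HGNC:5"     "HGNC:37133"

# match() is case-sensitive, so in this sketch a non-exact term yields NA.
lookup("a1bg", from = "symbol", to = "hgnc_id", data = toy_dt)
#> [1] NA
```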

Gene symbol to HUGO identifier and back

# Example bundled data set.
hgnc_dt <- hgnc_dataset_example()

# From gene symbol to HUGO identifier
crosswalk(
  c("A1BG", "A1BG-AS1"),
  from = "symbol",
  to = "hgnc_id",
  hgnc_dataset = hgnc_dt
)
#> [1] "HGNC:5"     "HGNC:37133"

# and back.
crosswalk(
  c("HGNC:5", "HGNC:37133"),
  from = "hgnc_id",
  to = "symbol",
  hgnc_dataset = hgnc_dt
)
#> [1] "A1BG"     "A1BG-AS1"

Gene symbol to aliases

Note that this is typically a one-to-many mapping, so the result is a list instead of a character vector.

crosswalk(
  c("A1BG", "A1BG-AS1", "A1CF"),
  from = "symbol",
  to = "alias_symbol",
  hgnc_dataset = hgnc_dt
)
#> [[1]]
#> [1] NA
#> 
#> [[2]]
#> [1] "FLJ23569"
#> 
#> [[3]]
#> [1] "ACF"       "ASP"       "ACF64"     "ACF65"     "APOBEC1CF"
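
A list result often needs reshaping before it can be stored in a table column. The sketch below shows two common base R options, assuming the result has the shape printed above; the `aliases` object is re-created by hand here (with names added for convenience) rather than taken from a crosswalk() call.

```r
# Hand-made copy of the list printed above, named by the queried symbols
# (the names are an assumption added for this illustration).
aliases <- list(
  NA_character_,
  "FLJ23569",
  c("ACF", "ASP", "ACF64", "ACF65", "APOBEC1CF")
)
names(aliases) <- c("A1BG", "A1BG-AS1", "A1CF")

# Option 1: one string per gene, aliases separated by "|"; genes without
# aliases stay NA instead of becoming the string "NA".
alias_chr <- vapply(
  aliases,
  function(x) if (all(is.na(x))) NA_character_ else paste(x, collapse = "|"),
  character(1)
)

# Option 2: long format, one row per (symbol, alias) pair.
alias_long <- data.frame(
  symbol = rep(names(aliases), lengths(aliases)),
  alias_symbol = unlist(aliases, use.names = FALSE),
  stringsAsFactors = FALSE
)
nrow(alias_long)
#> [1] 7
```

The collapsed form keeps one element per queried symbol, whereas the long form is easier to filter or join on individual aliases.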

Gene symbol to locus group

# From gene symbol to locus group
crosswalk(
  c("A1BG", "A1BG-AS1"),
  from = "symbol",
  to = "locus_group",
  hgnc_dataset = hgnc_dt
)
#> [1] "protein-coding gene" "non-coding RNA"

Gene symbol to locus type

# From gene symbol to locus type
crosswalk(
  c("A1BG", "A1BG-AS1"),
  from = "symbol",
  to = "locus_type",
  hgnc_dataset = hgnc_dt
)
#> [1] "gene with protein product" "RNA, long non-coding"

Gene symbol to Entrez gene id

# From gene symbol to Entrez gene id
crosswalk(
  c("A1BG", "A1BG-AS1"),
  from = "symbol",
  to = "entrez_id",
  hgnc_dataset = hgnc_dt
)
#> [1]      1 503538

Gene symbol to Ensembl gene id

# From gene symbol to Ensembl gene id
crosswalk(
  c("A1BG", "A1BG-AS1"),
  from = "symbol",
  to = "ensembl_gene_id",
  hgnc_dataset = hgnc_dt
)
#> [1] "ENSG00000121410" "ENSG00000268895"

Gene symbol to MGI marker id

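Map human gene symbols to the identifiers of their mouse homologs, i.e. MGI marker ids. A human gene may have none, one or several mouse homologs, so the result is again a list. If you also need to work with mouse gene names, you might find the package mgi.report.reader useful.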

# From gene symbol to MGI marker id
crosswalk(
  c("A1BG", "A1BG-AS1", "AADACL4"),
  from = "symbol",
  to = "mgd_id",
  hgnc_dataset = hgnc_dt
)
#> [[1]]
#> [1] "MGI:2152878"
#> 
#> [[2]]
#> [1] NA
#> 
#> [[3]]
#> [1] "MGI:2685282" "MGI:2685284" "MGI:2685880" "MGI:3650257" "MGI:3650721"
#> [6] "MGI:3652194"
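
When working with homologs it can help to know how many mouse markers each human gene maps to. The following is a small base R sketch under the assumption that the result has the shape printed above; `mgd` is re-created by hand (with names added for this illustration) rather than taken from a crosswalk() call.

```r
# Hand-made copy of the list printed above, named by the queried symbols.
mgd <- list(
  "MGI:2152878",
  NA_character_,
  c("MGI:2685282", "MGI:2685284", "MGI:2685880",
    "MGI:3650257", "MGI:3650721", "MGI:3652194")
)
names(mgd) <- c("A1BG", "A1BG-AS1", "AADACL4")

# Count the non-missing MGI ids per human gene.
n_homologs <- vapply(mgd, function(x) sum(!is.na(x)), integer(1))

# Human genes with exactly one mouse homolog in this toy result.
names(n_homologs)[n_homologs == 1L]
#> [1] "A1BG"
```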