This help page provides information about most of the
functionality related to protein annotations in ensembldb.
The proteins method retrieves protein related annotations from
an EnsDb database.
The listUniprotDbs method lists all Uniprot database
names in the EnsDb.
The listUniprotMappingTypes method lists all methods
that were used for the mapping of Uniprot IDs to Ensembl protein IDs.
The listProteinColumns function conveniently lists all
database columns containing protein annotations in
an EnsDb database.
# S4 method for EnsDb
proteins(
object,
columns = listColumns(object, "protein"),
filter = AnnotationFilterList(),
order.by = "",
order.type = "asc",
return.type = "DataFrame"
)
# S4 method for EnsDb
listUniprotDbs(object)
# S4 method for EnsDb
listUniprotMappingTypes(object)
listProteinColumns(object)
object
The EnsDb object.
columns
For proteins: character vector defining the columns to
be extracted from the database. Can be any column(s) listed by the
listColumns method.
filter
For proteins: a filter object extending
AnnotationFilter or a list of such objects to select
specific entries from the database. See Filter-classes for
documentation of the available filters and use
supportedFilters to get the full list of supported filters.
order.by
For proteins: a character vector specifying the
column(s) by which the result should be ordered.
order.type
For proteins: whether the results should be ordered
ascending (order.type = "asc") or descending
(order.type = "desc").
return.type
For proteins: character of length one specifying
the type of the returned object. Can be either "DataFrame",
"data.frame" or "AAStringSet".
The proteins method returns protein related annotations from
an EnsDb object. Its return.type argument defines the type of
the returned object. Note that if
return.type = "AAStringSet", additional annotation columns are
stored in a DataFrame that can be accessed with the mcols
method on the returned object.
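As a minimal sketch of the return.type = "AAStringSet" case described above, the following retrieves the protein sequences for one gene and then reads the remaining annotation columns with mcols. The gene name ZBTB16 and the EnsDb.Hsapiens.v86 package are taken from the examples on this page; the variable names are illustrative only.

```r
library(ensembldb)
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

if (hasProteinData(edb)) {
    ## Return the protein sequences as an AAStringSet instead of a DataFrame
    prts <- proteins(edb, filter = GeneNameFilter("ZBTB16"),
                     return.type = "AAStringSet")
    ## The sequences themselves
    prts
    ## The additional annotation columns, one row per sequence
    mcols(prts)
}
```

Returning an AAStringSet is mainly useful when the sequences are passed on to other sequence-based tools, while the DataFrame form is easier to inspect or join with other tables.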
The listProteinColumns function returns a character vector
with the column names containing protein annotations or throws an error
if no such annotations are available.
The proteins method performs the query starting from the
protein tables and can hence return all annotations from the
database that are related to proteins and to the transcripts encoding
them. Because proteins thus queries only
annotations for protein-coding transcripts, the genes or
transcripts methods have to be used to retrieve annotations
for non-coding transcripts.
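The difference between the two query entry points can be illustrated with a small comparison, given here as a sketch rather than a recommended workflow. It fetches all transcripts of a gene with transcripts and the protein-encoding ones with proteins, then lists the transcripts that have no protein entry. The gene name and the variable names (flt, txs, prts) are chosen for illustration.

```r
library(ensembldb)
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

if (hasProteinData(edb)) {
    flt <- GeneNameFilter("ZBTB16")
    ## All transcripts of the gene, whether or not they encode a protein
    txs <- transcripts(edb, filter = flt,
                       columns = c("tx_id", "tx_biotype"),
                       return.type = "data.frame")
    ## Only transcripts with an annotated protein
    prts <- proteins(edb, filter = flt,
                     columns = c("tx_id", "protein_id"),
                     return.type = "data.frame")
    ## Transcripts that proteins() does not report
    txs[!txs$tx_id %in% prts$tx_id, ]
}
```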
library(ensembldb)
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86
## Get all proteins from the database for the gene ZBTB16, if protein
## annotations are available
if (hasProteinData(edb))
    proteins(edb, filter = GeneNameFilter("ZBTB16"))
#> DataFrame with 5 rows and 4 columns
#>             tx_id      protein_id       protein_sequence   gene_name
#>       <character>     <character>            <character> <character>
#> 1 ENST00000335953 ENSP00000338157 MDLTKMGMIQLQNPSHPTGL..      ZBTB16
#> 2 ENST00000544220 ENSP00000437716 MDLTKMGMIQLQNPSHPTGL..      ZBTB16
#> 3 ENST00000535700 ENSP00000443013 MDLTKMGMIQLQNPSHPTGL..      ZBTB16
#> 4 ENST00000392996 ENSP00000376721 MDLTKMGMIQLQNPSHPTGL..      ZBTB16
#> 5 ENST00000539918 ENSP00000445047 XGGLLPQGFIQRELFSKLGE..      ZBTB16
## List the names of all Uniprot databases from which Uniprot IDs are
## available in the EnsDb
if (hasProteinData(edb))
    listUniprotDbs(edb)
#> [1] "SWISSPROT" "SPTREMBL"
## List the type of all methods that were used to map Uniprot IDs to Ensembl
## protein IDs
if (hasProteinData(edb))
    listUniprotMappingTypes(edb)
#> [1] "DIRECT" "SEQUENCE_MATCH"
## List all columns containing protein annotations
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86
if (hasProteinData(edb))
    listProteinColumns(edb)
#>  [1] "tx_id"             "protein_id"            "protein_sequence"
#>  [4] "protein_domain_id" "protein_domain_source" "interpro_accession"
#>  [7] "prot_dom_start"    "prot_dom_end"          "uniprot_id"
#> [10] "uniprot_db"        "uniprot_mapping_type"
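
The columns, filter and order.by arguments of proteins can be combined in a single call. The sketch below is one possible combination: it restricts the result to Uniprot IDs from the SWISSPROT database for the gene ZBTB16 and orders the rows by protein ID. It assumes that the UniprotDbFilter constructor is available in the installed ensembldb version; the variable name res is illustrative.

```r
library(ensembldb)
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

if (hasProteinData(edb)) {
    ## Combine two filters in a list; both have to match
    res <- proteins(edb,
                    filter = list(GeneNameFilter("ZBTB16"),
                                  UniprotDbFilter("SWISSPROT")),
                    columns = c("protein_id", "uniprot_id", "uniprot_db",
                                "uniprot_mapping_type"),
                    order.by = "protein_id",
                    return.type = "data.frame")
    res
}
```

Because one Ensembl protein can map to several Uniprot IDs, requesting Uniprot columns can yield more than one row per protein.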