This help page describes most of the functionality related to protein annotations in ensembldb.
The proteins method retrieves protein-related annotations from an EnsDb database.
The listUniprotDbs method lists all Uniprot database names in the EnsDb.
The listUniprotMappingTypes method lists all methods that were used to map Uniprot IDs to Ensembl protein IDs.
The listProteinColumns function conveniently lists the names of all database columns containing protein annotations in an EnsDb database.
# S4 method for EnsDb
proteins(
    object,
    columns = listColumns(object, "protein"),
    filter = AnnotationFilterList(),
    order.by = "",
    order.type = "asc",
    return.type = "DataFrame"
)

# S4 method for EnsDb
listUniprotDbs(object)

# S4 method for EnsDb
listUniprotMappingTypes(object)

listProteinColumns(object)
object: The EnsDb object.
columns: For proteins: a character vector defining the columns to be extracted from the database. Can be any column(s) listed by the listColumns method.
filter: For proteins: a filter object extending AnnotationFilter, or a list of such objects, to select specific entries from the database. See Filter-classes for documentation of the available filters and use supportedFilters to get the full list of supported filters.
order.by: For proteins: a character vector specifying the column(s) by which the result should be ordered.
order.type: For proteins: whether the results should be ordered ascending (order.type = "asc") or descending (order.type = "desc").
return.type: For proteins: a character of length one specifying the type of the returned object. Can be one of "DataFrame", "data.frame" or "AAStringSet".
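The arguments above can be combined in a single call. The following is a minimal sketch, not an official example: the helper name proteins_for_gene is hypothetical and not part of ensembldb, the column names are taken from the example output on this page, and the query is only run if ensembldb and EnsDb.Hsapiens.v86 are installed.

```r
# Hypothetical helper (not part of ensembldb): fetch selected protein
# annotations for one gene as a base data.frame, ordered by protein ID.
proteins_for_gene <- function(edb, gene, descending = FALSE) {
    ensembldb::proteins(
        edb,
        columns = c("tx_id", "protein_id", "gene_name"),
        filter = AnnotationFilter::GeneNameFilter(gene),
        order.by = "protein_id",
        order.type = if (descending) "desc" else "asc",
        return.type = "data.frame"
    )
}

# Only run the query if the annotation packages are installed.
if (requireNamespace("ensembldb", quietly = TRUE) &&
    requireNamespace("EnsDb.Hsapiens.v86", quietly = TRUE)) {
    edb <- EnsDb.Hsapiens.v86::EnsDb.Hsapiens.v86
    if (ensembldb::hasProteinData(edb))
        print(proteins_for_gene(edb, "ZBTB16", descending = TRUE))
}
```

Requesting return.type = "data.frame" here is only a convenience for base R users; the default "DataFrame" works the same way.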
The proteins method returns protein-related annotations from an EnsDb object; its return.type argument defines the type of the returned object. Note that if return.type = "AAStringSet", additional annotation columns are stored in a DataFrame that can be accessed with the mcols method on the returned object.
The listProteinColumns function returns a character vector with the names of the columns containing protein annotations, or throws an error if no such annotations are available.
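As a sketch of the "AAStringSet" return type, the hypothetical helper below (protein_sequences_for_gene is not part of ensembldb) returns the sequences together with the annotation columns obtained through mcols. It assumes that S4Vectors, which provides the mcols generic, is installed alongside ensembldb.

```r
# Hypothetical helper (not part of ensembldb): return the protein sequences
# of a gene as an AAStringSet together with their annotation columns.
protein_sequences_for_gene <- function(edb, gene) {
    seqs <- ensembldb::proteins(
        edb,
        filter = AnnotationFilter::GeneNameFilter(gene),
        return.type = "AAStringSet"
    )
    # The non-sequence columns are stored as metadata columns (a DataFrame).
    list(sequences = seqs, annotations = S4Vectors::mcols(seqs))
}

# Only run the query if the annotation packages are installed.
if (requireNamespace("ensembldb", quietly = TRUE) &&
    requireNamespace("EnsDb.Hsapiens.v86", quietly = TRUE)) {
    edb <- EnsDb.Hsapiens.v86::EnsDb.Hsapiens.v86
    if (ensembldb::hasProteinData(edb)) {
        res <- protein_sequences_for_gene(edb, "ZBTB16")
        print(res$sequences)
        print(res$annotations)
    }
}
```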
The proteins method performs the query starting from the protein tables and can hence return all annotations from the database that are related to proteins and to the transcripts encoding these proteins. Because proteins thus only queries annotations for protein-coding transcripts, the genes or transcripts methods have to be used to retrieve annotations for non-coding transcripts.
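To illustrate the last point, the sketch below uses the transcripts method to list transcripts of a gene that are not annotated as protein coding. The helper name noncoding_transcripts_for_gene is hypothetical, and treating every transcript whose tx_biotype is not "protein_coding" as non-coding is a simplifying assumption made for this example only.

```r
# Hypothetical helper (not part of ensembldb): list transcripts of a gene
# whose biotype is anything other than "protein_coding".
noncoding_transcripts_for_gene <- function(edb, gene) {
    ensembldb::transcripts(
        edb,
        columns = c("tx_id", "tx_biotype", "gene_name"),
        # Both filters have to match (filters are combined with a logical AND).
        filter = AnnotationFilter::AnnotationFilterList(
            AnnotationFilter::GeneNameFilter(gene),
            AnnotationFilter::TxBiotypeFilter("protein_coding", condition = "!=")
        ),
        return.type = "data.frame"
    )
}

# Only run the query if the annotation packages are installed.
if (requireNamespace("ensembldb", quietly = TRUE) &&
    requireNamespace("EnsDb.Hsapiens.v86", quietly = TRUE)) {
    edb <- EnsDb.Hsapiens.v86::EnsDb.Hsapiens.v86
    print(noncoding_transcripts_for_gene(edb, "ZBTB16"))
}
```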
library(ensembldb)
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86
## Get all proteins from the database for the gene ZBTB16, if protein
## annotations are available
if (hasProteinData(edb))
    proteins(edb, filter = GeneNameFilter("ZBTB16"))
#> DataFrame with 5 rows and 4 columns
#> tx_id protein_id protein_sequence gene_name
#> <character> <character> <character> <character>
#> 1 ENST00000335953 ENSP00000338157 MDLTKMGMIQLQNPSHPTGL.. ZBTB16
#> 2 ENST00000544220 ENSP00000437716 MDLTKMGMIQLQNPSHPTGL.. ZBTB16
#> 3 ENST00000535700 ENSP00000443013 MDLTKMGMIQLQNPSHPTGL.. ZBTB16
#> 4 ENST00000392996 ENSP00000376721 MDLTKMGMIQLQNPSHPTGL.. ZBTB16
#> 5 ENST00000539918 ENSP00000445047 XGGLLPQGFIQRELFSKLGE.. ZBTB16
## List the names of all Uniprot databases from which Uniprot IDs are
## available in the EnsDb
if (hasProteinData(edb))
    listUniprotDbs(edb)
#> [1] "SWISSPROT" "SPTREMBL"
## List the type of all methods that were used to map Uniprot IDs to Ensembl
## protein IDs
if (hasProteinData(edb))
    listUniprotMappingTypes(edb)
#> [1] "DIRECT" "SEQUENCE_MATCH"
## List all columns containing protein annotations
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86
if (hasProteinData(edb))
    listProteinColumns(edb)
#> [1] "tx_id" "protein_id" "protein_sequence"
#> [4] "protein_domain_id" "protein_domain_source" "interpro_accession"
#> [7] "prot_dom_start" "prot_dom_end" "uniprot_id"
#> [10] "uniprot_db" "uniprot_mapping_type"
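The Uniprot columns listed above can also be used for filtering. The following sketch restricts the result to Uniprot IDs from one of the databases reported by listUniprotDbs. The helper name uniprot_ids_for_gene is hypothetical, and the example assumes the UniprotDbFilter class exported by ensembldb; no output is shown because the result depends on the EnsDb version.

```r
# Hypothetical helper (not part of ensembldb): Uniprot IDs mapped to the
# proteins of a gene, restricted to a single Uniprot database.
uniprot_ids_for_gene <- function(edb, gene, db = "SWISSPROT") {
    ensembldb::proteins(
        edb,
        columns = c("protein_id", "uniprot_id", "uniprot_db",
                    "uniprot_mapping_type"),
        # Both filters have to match (filters are combined with a logical AND).
        filter = AnnotationFilter::AnnotationFilterList(
            AnnotationFilter::GeneNameFilter(gene),
            ensembldb::UniprotDbFilter(db)
        ),
        return.type = "data.frame"
    )
}

# Only run the query if the annotation packages are installed.
if (requireNamespace("ensembldb", quietly = TRUE) &&
    requireNamespace("EnsDb.Hsapiens.v86", quietly = TRUE)) {
    edb <- EnsDb.Hsapiens.v86::EnsDb.Hsapiens.v86
    if (ensembldb::hasProteinData(edb))
        print(uniprot_ids_for_gene(edb, "ZBTB16"))
}
```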