This help page provides information about most of the functionality related to protein annotations in ensembldb.

The proteins method retrieves protein-related annotations from an EnsDb database.

The listUniprotDbs method lists all Uniprot database names in the EnsDb.

The listUniprotMappingTypes method lists all methods that were used for the mapping of Uniprot IDs to Ensembl protein IDs.

The listProteinColumns function lists the names of all database columns that contain protein annotations in an EnsDb database.

# S4 method for EnsDb
proteins(
  object,
  columns = listColumns(object, "protein"),
  filter = AnnotationFilterList(),
  order.by = "",
  order.type = "asc",
  return.type = "DataFrame"
)

# S4 method for EnsDb
listUniprotDbs(object)

# S4 method for EnsDb
listUniprotMappingTypes(object)

listProteinColumns(object)

Arguments

object

The EnsDb object.

columns

For proteins: character vector defining the columns to be extracted from the database. Can be any column(s) listed by the listColumns method.

filter

For proteins: a filter object extending AnnotationFilter, or a list of such objects, to select specific entries from the database. See Filter-classes for documentation of the available filters, and use supportedFilters to get the full list of supported filters.

order.by

For proteins: a character vector specifying the column(s) by which the result should be ordered.

order.type

For proteins: whether the results should be ordered ascending (order.type = "asc") or descending (order.type = "desc").

return.type

For proteins: character of length one specifying the type of the returned object. Can be either "DataFrame", "data.frame" or "AAStringSet".
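
The arguments above can be combined in a single call. The following is a minimal sketch, assuming the EnsDb.Hsapiens.v86 package is installed and provides protein data; the selected columns and the gene name are only illustrative choices.

```r
library(ensembldb)
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

## Restrict the query to a few columns, filter by gene name and return the
## result as a plain data.frame ordered by descending protein ID.
if (hasProteinData(edb)) {
    res <- proteins(
        edb,
        columns = c("tx_id", "protein_id", "uniprot_id"),
        filter = GeneNameFilter("ZBTB16"),
        order.by = "protein_id",
        order.type = "desc",
        return.type = "data.frame"
    )
    print(head(res))
}
```

Because a protein can map to several Uniprot IDs, requesting uniprot_id may yield more than one row per protein.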

Value

The proteins method returns protein-related annotations from an EnsDb object; its return.type argument defines the type of the returned object. Note that if return.type = "AAStringSet", additional annotation columns are stored in a DataFrame that can be accessed with the mcols method on the returned object.
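
As a short sketch of the "AAStringSet" return type, assuming EnsDb.Hsapiens.v86 is installed and provides protein data, the sequences and their accompanying annotation columns can be inspected as follows:

```r
library(ensembldb)
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

if (hasProteinData(edb)) {
    ## Return the protein sequences as an AAStringSet
    seqs <- proteins(edb, filter = GeneNameFilter("ZBTB16"),
                     return.type = "AAStringSet")
    print(seqs)
    ## The remaining annotation columns are kept as metadata columns
    print(mcols(seqs))
}
```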

The listProteinColumns function returns a character vector with the column names containing protein annotations or throws an error if no such annotations are available.

Details

The proteins method starts its query from the protein tables and can therefore return all annotations in the database that relate to proteins and to the transcripts encoding them. Because proteins only queries annotations for protein-coding transcripts, the genes or transcripts methods have to be used to retrieve annotations for non-coding transcripts.
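
To illustrate the difference, the sketch below compares the transcripts returned by transcripts and by proteins for the same gene. It assumes EnsDb.Hsapiens.v86 is installed and provides protein data; the variable names are illustrative only.

```r
library(ensembldb)
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

if (hasProteinData(edb)) {
    ## All transcripts annotated to the gene, coding or not
    txs <- transcripts(edb, filter = GeneNameFilter("ZBTB16"),
                       columns = c("tx_id", "tx_biotype"),
                       return.type = "data.frame")
    ## Only transcripts that encode a protein
    prts <- proteins(edb, filter = GeneNameFilter("ZBTB16"),
                     columns = c("tx_id", "protein_id"),
                     return.type = "data.frame")
    ## Transcripts of the gene that proteins() does not report
    no_protein <- setdiff(txs$tx_id, prts$tx_id)
    print(txs[txs$tx_id %in% no_protein, ])
}
```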

Author

Johannes Rainer

Examples

library(ensembldb)
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86
## Get all proteins from the database for the gene ZBTB16, if protein
## annotations are available
if (hasProteinData(edb))
    proteins(edb, filter = GeneNameFilter("ZBTB16"))
#> DataFrame with 5 rows and 4 columns
#>             tx_id      protein_id       protein_sequence   gene_name
#>       <character>     <character>            <character> <character>
#> 1 ENST00000335953 ENSP00000338157 MDLTKMGMIQLQNPSHPTGL..      ZBTB16
#> 2 ENST00000544220 ENSP00000437716 MDLTKMGMIQLQNPSHPTGL..      ZBTB16
#> 3 ENST00000535700 ENSP00000443013 MDLTKMGMIQLQNPSHPTGL..      ZBTB16
#> 4 ENST00000392996 ENSP00000376721 MDLTKMGMIQLQNPSHPTGL..      ZBTB16
#> 5 ENST00000539918 ENSP00000445047 XGGLLPQGFIQRELFSKLGE..      ZBTB16

## List the names of all Uniprot databases from which Uniprot IDs are
## available in the EnsDb
if (hasProteinData(edb))
    listUniprotDbs(edb)
#> [1] "SWISSPROT" "SPTREMBL" 


## List the type of all methods that were used to map Uniprot IDs to Ensembl
## protein IDs
if (hasProteinData(edb))
    listUniprotMappingTypes(edb)
#> [1] "DIRECT"         "SEQUENCE_MATCH"


## List all columns containing protein annotations
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86
if (hasProteinData(edb))
    listProteinColumns(edb)
#>  [1] "tx_id"                 "protein_id"            "protein_sequence"     
#>  [4] "protein_domain_id"     "protein_domain_source" "interpro_accession"   
#>  [7] "prot_dom_start"        "prot_dom_end"          "uniprot_id"           
#> [10] "uniprot_db"            "uniprot_mapping_type"