Utility functions related to RNA/DNA sequences, such as extracting RNA/DNA sequences for features defined in an EnsDb.

# S4 method for EnsDb
getGenomeFaFile(x, pattern="dna.toplevel.fa")

# S4 method for EnsDb
getGenomeTwoBitFile(x)

Arguments

(In alphabetic order)

pattern

For method getGenomeFaFile: the pattern to be used to identify the fasta file representing genomic DNA sequence.

x

An EnsDb instance.

Methods and Functions

getGenomeFaFile

Returns a FaFile-class object (defined in Rsamtools) with the genomic sequence of the genome build matching the Ensembl version of the EnsDb object. The file is retrieved using the AnnotationHub package, so an internet connection is required, at least for the first invocation, to locate and download the file; subsequent calls load the cached file instead. If no fasta file is available for the exact Ensembl version, the function tries to identify a file matching the species and genome build of the closest Ensembl release and returns that instead. See the vignette for an example of working with such files.

getGenomeTwoBitFile

Returns a TwoBitFile-class object (defined in the rtracklayer package) with the genomic sequence of the genome build matching the Ensembl version of the EnsDb object. The file is retrieved from AnnotationHub and hence requires an active internet connection, at least for the first query, to download the respective resource. If no DNA sequence matching the Ensembl version of x is available, the function tries to find the genomic sequence of the best matching genome build (closest Ensembl release) and returns that.

See the ensembldb vignette for details.

Value

For getGenomeFaFile: a FaFile-class object with the genomic DNA sequence.

For getGenomeTwoBitFile: a TwoBitFile-class object with the genome sequence.

Author

Johannes Rainer

Examples


## Loading an EnsDb for Ensembl version 86 (genome GRCh38):
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

if (FALSE) {
    ## Retrieve a TwoBitFile with the genomic DNA sequence matching the organism,
    ## genome release version and, if possible, the Ensembl version of the
    ## EnsDb object.
    Dna <- getGenomeTwoBitFile(edb)
    ## Extract the transcript sequence for all transcripts encoded on chromosome
    ## Y.
    ##extractTranscriptSeqs(Dna, edb, filter=SeqNameFilter("Y"))

}
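The getGenomeFaFile method described above can be used in a similar way. The following is a minimal sketch, not part of the original examples, assuming that the EnsDb.Hsapiens.v86 and Rsamtools packages are installed and that AnnotationHub provides a suitable fasta file. The variable names (gns, geneSeqs) are illustrative only. As in the example above, the part that needs an internet connection is wrapped in if (FALSE) so that it is not run automatically.

```r
## Loading an EnsDb for Ensembl version 86 (genome GRCh38):
library(EnsDb.Hsapiens.v86)
library(Rsamtools)
edb <- EnsDb.Hsapiens.v86

if (FALSE) {
    ## Retrieve a FaFile with the genomic DNA sequence. The first call
    ## downloads the file through AnnotationHub; the returned file may
    ## come from the closest Ensembl release if no exact match exists.
    Dna <- getGenomeFaFile(edb)
    ## Get the genes encoded on chromosome Y.
    gns <- genes(edb, filter = SeqNameFilter("Y"))
    ## Keep only genes on sequences that are present in the fasta file.
    gns <- gns[seqnames(gns) %in% seqnames(seqinfo(Dna))]
    ## Extract the genomic sequence (exons and introns) of each gene.
    geneSeqs <- getSeq(Dna, gns)
}
```

Subsetting to sequence names present in the fasta file before calling getSeq avoids errors for features on sequences that the retrieved file does not contain.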