1 Institute for Biomedicine, Eurac Research, Italy
2 Helmholtz Center Munich, Germany
https://github.com/jorainer/MetaboAnnotationIntro

  • Simplify annotation process and handling of matched results.
  • matchMz and matchSpectra functions.
  • Matching is configured with a dedicated Param object.
  • Tutorial.

matchMz

  • Annotation using mass or m/z and/or retention time.
  • matchMz(query, target, param)
  • query: features to annotate. Can be numeric, data.frame or SummarizedExperiment.
  • target: annotations, can be numeric, data.frame, CompDb (not yet).
  • param:
    • MzParam: match query and target m/z values.
    • MzRtParam: same as above with additional retention times.
    • Mass2MzParam: target provides exact masses. m/z for (specified) adducts are calculated and matched.
    • Mass2MzRtParam: same as above with additional retention times.
    • … suggest your own …
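
To make the tolerance settings concrete, here is a minimal base R sketch of m/z matching with an absolute tolerance plus a relative (ppm) tolerance. It is an illustration only, not the package's implementation: the function name match_mz_sketch is hypothetical, and taking the ppm relative to the target value and reporting the signed difference as score are assumptions.

```r
# Hypothetical helper (not part of MetaboAnnotation): match numeric m/z
# values, accepting a pair if |query - target| <= tolerance + ppm part.
match_mz_sketch <- function(query, target, tolerance = 0, ppm = 5) {
  # All pairwise differences: rows = query, columns = target.
  diffs <- outer(query, target, "-")
  # Allowed deviation per target value (assumption: ppm relative to target).
  allowed <- matrix(tolerance + ppm * target / 1e6,
                    nrow = length(query), ncol = length(target),
                    byrow = TRUE)
  hits <- which(abs(diffs) <= allowed, arr.ind = TRUE)
  data.frame(query_idx = hits[, "row"],
             target_idx = hits[, "col"],
             score = diffs[hits])
}

query_mz <- c(102.1281, 180.0634)
target_mz <- c(102.1279, 363.2800)
res <- match_mz_sketch(query_mz, target_mz, tolerance = 0, ppm = 5)
res
```

Only the first query value falls within 5 ppm of a target value, so the result holds a single query/target index pair.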

The result

  • Matched object: contains query, target and parameter (reproducibility).

matchSpectra

  • Match query MS2 spectra against reference.
  • matchSpectra(query, target, param)
  • query: Spectra.
  • target: Spectra (e.g. representing MassBank data).
  • param:
    • CompareSpectraParam: match spectra with a similarity score above a threshold. Candidates can be pre-filtered by precursor m/z or by the presence of a certain peak.
    • MatchForwardReverseParam: same as above, but also calculates the reverse score.
    • … suggest your own …
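
The idea of "score above threshold" can be sketched in base R as follows. This is a conceptual example only: cosine_sketch is a hypothetical helper, a plain cosine similarity is swapped in for whatever similarity function the package is configured with, and the 0.01 m/z tolerance and 0.7 threshold are assumed values.

```r
# Hypothetical helper (not part of MetaboAnnotation): plain cosine
# similarity between two peak lists with columns mz and intensity.
cosine_sketch <- function(x, y, tolerance = 0.01) {
  # For each peak in x, find the closest peak in y by m/z.
  idx <- vapply(x$mz, function(m) which.min(abs(y$mz - m)), integer(1))
  # Accept the pairing only if it lies within the m/z tolerance.
  ok <- abs(y$mz[idx] - x$mz) <= tolerance
  # Intensities of y aligned to the peaks of x; 0 where nothing matched.
  y_int <- ifelse(ok, y$intensity[idx], 0)
  sum(x$intensity * y_int) /
    sqrt(sum(x$intensity^2) * sum(y$intensity^2))
}

query_sp <- data.frame(mz = c(100, 150, 200),
                       intensity = c(10, 50, 100))
ref_sp <- data.frame(mz = c(100.005, 200.002, 250),
                     intensity = c(20, 90, 30))

score <- cosine_sketch(query_sp, ref_sp)
is_match <- score >= 0.7  # assumed threshold, for illustration only
```

For simplicity the sketch does not prevent two query peaks from pairing with the same reference peak; a real implementation would need to handle that case.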

Outlook/TODOs

  • Integration of CompDb databases (and IonDb, which adds retention times) for matchMz.
  • Additional spectra similarity calculation methods? GNPS?
  • Improve handling of Matched and MatchedSpectra objects?

Example

  • The query.
library(MetaboAnnotation)
ms1_features <- read.table(system.file("extdata", "MS1_example.txt",
                                       package = "MetaboAnnotation"),
                           header = TRUE, sep = "\t")
head(ms1_features)
##     feature_id       mz    rtime
## 1 Cluster_0001 102.1281 1.560147
## 2 Cluster_0002 102.1279 2.153590
## 3 Cluster_0003 102.1281 2.925570
## 4 Cluster_0004 102.1281 3.419617
## 5 Cluster_0005 102.1270 5.801039
## 6 Cluster_0006 102.1230 8.137535

Example

  • The target data.
target_df <- read.table(system.file("extdata", "LipidMaps_CompDB.txt",
                                    package = "MetaboAnnotation"),
                        header = TRUE, sep = "\t")
head(target_df)
##   headgroup        name exactmass    formula chain_type
## 1       NAE  NAE 20:4;O  363.2773  C22H37NO3       even
## 2       NAT  NAT 20:4;O  427.2392 C22H37NO5S       even
## 3       NAE NAE 20:3;O2  381.2879  C22H39NO4       even
## 4       NAE    NAE 20:4  347.2824  C22H37NO2       even
## 5       NAE    NAE 18:2  323.2824  C20H37NO2       even
## 6       NAE    NAE 18:3  321.2668  C20H35NO2       even

Example

parm <- Mass2MzParam(adducts = c("[M+H]+", "[M+Na]+"),
                     tolerance = 0.005, ppm = 0)

matched_features <- matchMz(ms1_features, target_df, parm)
matched_features
## Object of class Matched 
## Total number of matches: 9173 
## Number of query objects: 2842 (1969 matched)
## Number of target objects: 57599 (3296 matched)
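
The adduct step behind Mass2MzParam can be illustrated with a small base R sketch: each exact mass is converted into one expected m/z per adduct, and those values are then compared with the query m/z. The helper name mass_to_mz_sketch is hypothetical, and the mass shifts are the commonly used values for a proton and a sodium cation, entered here by hand rather than taken from the package.

```r
# Mass shifts (in Da) for two singly charged adducts; values entered by
# hand for illustration, not read from the package's adduct definitions.
adduct_shift <- c("[M+H]+" = 1.007276, "[M+Na]+" = 22.989218)

# Hypothetical helper: expected m/z for each mass/adduct combination.
mass_to_mz_sketch <- function(mass, adducts = names(adduct_shift)) {
  outer(mass, adduct_shift[adducts], "+")
}

# First two exact masses from the target table shown earlier.
exactmass <- c(363.2773, 427.2392)
mz <- mass_to_mz_sketch(exactmass)
mz
```

Each row of the result corresponds to one target compound and each column to one adduct, so with two adducts every target contributes two candidate m/z values to the matching.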

Example

  • whichQuery and whichTarget return the indices of matched query and target elements.
  • colnames returns the available column names.
colnames(matched_features)
##  [1] "feature_id"        "mz"                "rtime"            
##  [4] "target_headgroup"  "target_name"       "target_exactmass" 
##  [7] "target_formula"    "target_chain_type" "adduct"           
## [10] "score"
  • Prefix "target_" is used for column names of the target.

Example

  • Extract matched elements.
matchedData(matched_features, c("feature_id", "adduct", "target_name"))
## DataFrame with 10046 rows and 3 columns
##        feature_id      adduct     target_name
##       <character> <character>     <character>
## 1    Cluster_0001          NA              NA
## 2    Cluster_0002          NA              NA
## ...           ...         ...             ...
## 2841 Cluster_2841     [M+Na]+    ACer 60:1;O4
## 2842 Cluster_2842      [M+H]+ Hex2Cer 42:2;O2
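
The row count above is consistent with a left join of the query onto the matches: 9173 matches plus 873 unmatched query features (2842 - 1969) gives 10046 rows. A minimal base R sketch of that behaviour, using made-up toy data and assuming a matches table with query_idx and target_idx columns, could look like this:

```r
# Toy data, invented for illustration only.
query <- data.frame(feature_id = c("F1", "F2", "F3"),
                    mz = c(100.1, 200.2, 300.3))
target <- data.frame(name = c("A", "B"),
                     exactmass = c(199.19, 199.19))
# Assumed layout of a matches table: one row per query/target pair.
matches <- data.frame(query_idx = c(2L, 2L),
                      target_idx = c(1L, 2L))

# Left join: keep every query row, repeat it once per match, and fill
# target columns with NA where a query element has no match.
query$query_idx <- seq_len(nrow(query))
res <- merge(query, matches, by = "query_idx", all.x = TRUE)
res$target_name <- target$name[res$target_idx]
res[, c("feature_id", "target_name")]
```

Here F2 matches two targets and therefore appears twice, while F1 and F3 each appear once with NA in the target column.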

Example

  • Reduce the target to only matching elements.
matched_features
## Object of class Matched 
## Total number of matches: 9173 
## Number of query objects: 2842 (1969 matched)
## Number of target objects: 57599 (3296 matched)
matched_features <- pruneTarget(matched_features)
matched_features
## Object of class Matched 
## Total number of matches: 9173 
## Number of query objects: 2842 (1969 matched)
## Number of target objects: 3296 (3296 matched)

Example

  • Reduce the query to contain only matching elements.
matched_features
## Object of class Matched 
## Total number of matches: 9173 
## Number of query objects: 2842 (1969 matched)
## Number of target objects: 3296 (3296 matched)
matched_features <- matched_features[whichQuery(matched_features)]
matched_features
## Object of class Matched 
## Total number of matches: 9173 
## Number of query objects: 1969 (1969 matched)
## Number of target objects: 3296 (3296 matched)