Tutorials and workshops for the use of the Spectra Bioconductor package to analyze mass spectrometry (MS) data.


Seamless Integration of Mass Spectrometry Data from Different Sources


This instructor-led live demo workshop explains the Spectra package and shows how this new infrastructure can be used to represent and analyze mass spectrometry (MS) data. A simple use case, in which experimental MS2 spectra are matched against a public spectral database, illustrates the seamless integration and analysis of MS data from a variety of input formats.


  • Basic familiarity with R and Bioconductor.
  • Basic understanding of Mass Spectrometry (MS) data.


  • During the EuroBioc2020 conference it is possible to run the workshop in the cloud.

  • Get the docker image of this tutorial with docker pull jorainer/spectra_tutorials:latest.

  • Start docker using

    docker run \
        -e PASSWORD=bioc \
        -p 8787:8787 \
        jorainer/spectra_tutorials:latest

  • Enter http://localhost:8787 in a web browser and log in with username rstudio and password bioc.

  • Open this R-markdown file (vignettes/analyzing-MS-data-from-different-sources-with-Spectra.Rmd) in the RStudio server version in the web browser and evaluate the R code blocks.

  • To get the source code, clone this GitHub repository, e.g. with git clone https://github.com/jorainer/SpectraTutorials.

  • Optionally, to also run the code that imports the MS2 spectra from HMDB, download the All Spectra Files (XML) archive from the HMDB downloads page and unzip the contents of the hmdb_all_spectra.zip archive into the folder data/hmdb_all_spectra.
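
The optional HMDB step could be scripted roughly as follows. This is a minimal sketch that assumes the archive was saved as hmdb_all_spectra.zip in the root of the cloned repository and that the unzip tool is installed; the ARCHIVE variable is introduced here only for illustration.

```shell
#!/bin/sh
# Sketch: unpack the HMDB spectra archive into the folder the tutorial expects.
# ARCHIVE is an assumed download location; TARGET is the folder named in the text.
ARCHIVE="${ARCHIVE:-hmdb_all_spectra.zip}"
TARGET="data/hmdb_all_spectra"

if [ -f "$ARCHIVE" ]; then
    mkdir -p "$TARGET"
    # -q: quiet output, -d: extract into the target directory
    unzip -q "$ARCHIVE" -d "$TARGET"
    echo "Extracted $ARCHIVE into $TARGET"
else
    echo "Archive $ARCHIVE not found; download it from the HMDB downloads page first." >&2
fi
```

If the archive wraps its files in a top-level directory, the XML files may need to be moved up one level so that they sit directly in data/hmdb_all_spectra.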

Manual setup

More advanced users can also manually install all the resources required for this tutorial. In addition to R version >= 4, a running MySQL/MariaDB server is required for the examples involving the MassBank database.

The required R packages can be installed with the code below:

install.packages(c("devtools", "rmarkdown", "BiocManager"))

A MySQL database dump of the MassBank database can be downloaded from the official GitHub page. Create a database named MassBank in the local MySQL/MariaDB server, then unzip the downloaded .sql.gz file and import it with mysql MassBank < *.sql.
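
The database setup described above might look like the following sketch. The dump file name MassBank.sql.gz is hypothetical (use the name of the file actually downloaded), and the script assumes the mysql client can connect to the local server without extra options, for instance with credentials stored in ~/.my.cnf.

```shell
#!/bin/sh
# Sketch: create the MassBank database and load the downloaded dump.
# DUMP is a hypothetical file name; replace it with the downloaded file.
DB_NAME="MassBank"
DUMP="${DUMP:-MassBank.sql.gz}"

if [ -f "$DUMP" ]; then
    # Assumes the client can connect as-is; otherwise add -u <user> -p.
    mysql -e "CREATE DATABASE IF NOT EXISTS $DB_NAME;"
    # Decompress to stdout and pipe into the database; equivalent to
    # unzipping first and then running: mysql MassBank < file.sql
    gunzip -c "$DUMP" | mysql "$DB_NAME"
else
    echo "Dump $DUMP not found; download it from the MassBank GitHub page first." >&2
fi
```

Piping the decompressed dump directly into mysql avoids keeping a large uncompressed .sql file on disk.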

The source code for all tutorials in this package can be downloaded with:

git clone https://github.com/jorainer/SpectraTutorials

Then open the R-markdown (Rmd) file of one of the tutorials (located in the vignettes folder) with the editor of choice (e.g. RStudio, emacs, vim, …) and evaluate the R code in the tutorial interactively.

R/Bioconductor packages used

  • Spectra
  • MsCoreUtils

Other R packages not (yet) in Bioconductor:

Time outline

Activity                                    Time
Introduction (LC-MS/MS, Spectra package)    10min
MS data import and handling                 5min
Data processing and manipulation            5min
Spectrum data comparison                    5min
Comparing spectra against MassBank          10min
Data export                                 5min
(Comparing spectra against HMDB)            (5min)

Workshop goals and objectives

Learning goals

  • Understand how to import MS data into R.
  • Understand the basic concept of how different backends can be used in Spectra to work with MS data from various sources.

Learning objectives

  • Import and export MS data with Spectra.
  • Integrate MS data from different resources into an MS data analysis workflow.
  • Apply different data manipulations on MS data represented as a Spectra object.
  • Use Spectra to perform spectra matching in R.