| Title: | Parser for mzML, mzXML, and netCDF Files (Mass Spectrometry Data) |
|---|---|
| Description: | A tiny parser that extracts mass spectra and a metadata table of mass spectrometry acquisition properties from mzML, mzXML, and netCDF files, introduced in <doi:10.1021/acs.jproteome.2c00120>. |
| Authors: | Sadjad Fakouri-Baygi [aut] |
| Maintainer: | Dinesh Barupal <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 2.0 |
| Built: | 2026-05-16 08:47:12 UTC |
| Source: | https://github.com/idslme/idsl.mxp |
This function returns a list of two data objects needed for the mass spectrometry data processing.
getNetCDF(MSfile)

| MSfile | name of the mass spectrometry file with .cdf extension |
|---|---|

| scanTable | a data frame of different scan properties including 'seqNum', 'msLevel', 'polarity', 'peaksCount', 'totIonCurrent', 'retentionTime', 'basePeakMZ', 'basePeakIntensity', 'collisionEnergy', 'lowMZ', 'highMZ', 'precursorScanNum', 'precursorMZ', 'precursorCharge', 'precursorIntensity', 'injectionTime', 'filterString', 'scanType', 'centroided', 'isolationWindowTargetMZ', 'isolationWindowLowerOffset', 'isolationWindowUpperOffset', 'scanWindowLowerLimit', and 'scanWindowUpperLimit'. |
|---|---|
| spectraList | a list of matrices of m/z and intensity values for each chromatogram scan |

The 'retentionTime' column in the 'scanTable' object is reported in minutes.
This function creates a scanTable from chromatogram scans of the mass spectrometry data.
getScanTable(xmlData, msFormat)

| xmlData | structured mass spectrometry data created by the 'read_xml' function. |
|---|---|
| msFormat | format extension of the mass spectrometry file, one of c("mzML", "mzXML") |

A data frame of different scan properties including 'seqNum', 'msLevel', 'polarity', 'peaksCount', 'totIonCurrent', 'retentionTime', 'basePeakMZ', 'basePeakIntensity', 'collisionEnergy', 'lowMZ', 'highMZ', 'precursorScanNum', 'precursorMZ', 'precursorCharge', 'precursorIntensity', 'injectionTime', 'filterString', 'scanType', 'centroided', 'isolationWindowTargetMZ', 'isolationWindowLowerOffset', 'isolationWindowUpperOffset', 'scanWindowLowerLimit', and 'scanWindowUpperLimit'. 'scanType' is only provided for the mzXML data format.
The 'retentionTime' column is reported in minutes.
temp_wd <- tempdir()
temp_wd_zip <- paste0(temp_wd, "/idsl_ipa_test_files.zip")
download.file(paste0("https://github.com/idslme/IDSL.IPA/blob/main/",
                     "IPA_educational_files/idsl_ipa_test_files.zip?raw=true"),
              destfile = temp_wd_zip, mode = "wb")
unzip(temp_wd_zip, exdir = temp_wd)
xmlData <- xml2::read_xml(paste0(path = temp_wd, "/", MSfile = "003.mzML"))
scanTable <- getScanTable(xmlData, msFormat = "mzML")
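Once a scanTable is available, scans are usually selected by column values. The following is a minimal sketch that uses a small hand-made data frame, `mock_scanTable` (a hypothetical stand-in carrying only three of the documented columns), rather than real parser output. It assumes 'msLevel' is numeric; if a real table stores it as character, a call to `as.numeric` may be needed first.

```r
# Hypothetical stand-in for a scanTable; real tables carry many more columns.
mock_scanTable <- data.frame(
  seqNum = 1:5,
  msLevel = c(1, 2, 2, 1, 2),
  retentionTime = c(0.10, 0.11, 0.12, 0.20, 0.21)  # minutes, as documented
)

# Keep only the MS1 scans.
ms1 <- mock_scanTable[mock_scanTable$msLevel == 1, ]

# Convert retention time from minutes to seconds.
ms1$retentionTimeSec <- ms1$retentionTime * 60
```

The same subsetting pattern applies to any other documented column, for example 'polarity' or 'precursorMZ'.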
This function creates a spectraList for the chromatogram scans of the mass spectrometry data.
getSpectra(xmlData, msFormat)

| xmlData | structured mass spectrometry data created by the 'read_xml' function. |
|---|---|
| msFormat | format extension of the mass spectrometry file, one of c("mzML", "mzXML") |

a list of matrices of m/z and intensity values for each chromatogram scan
temp_wd <- tempdir()
temp_wd_zip <- paste0(temp_wd, "/idsl_ipa_test_files.zip")
download.file(paste0("https://github.com/idslme/IDSL.IPA/blob/main/",
                     "IPA_educational_files/idsl_ipa_test_files.zip?raw=true"),
              destfile = temp_wd_zip, mode = "wb")
unzip(temp_wd_zip, exdir = temp_wd)
xmlData <- xml2::read_xml(paste0(path = temp_wd, "/", MSfile = "003.mzML"))
spectraList <- getSpectra(xmlData, msFormat = "mzML")
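Because each element of a spectraList is a matrix of m/z and intensity values, per-scan summaries can be computed with ordinary matrix indexing. The sketch below works on `mock_spectraList`, a hypothetical two-scan stand-in, and assumes that column 1 holds m/z and column 2 holds intensity; check the column order on real output before relying on it.

```r
# Hypothetical stand-in for a spectraList with two scans.
# Assumption: column 1 is m/z and column 2 is intensity.
mock_spectraList <- list(
  matrix(c(100.05, 150.10, 200.20,
           1000,   5000,   250), ncol = 2),
  matrix(c(120.00, 180.30,
           300,    700), ncol = 2)
)

# Total ion current per scan: sum of the intensity column.
tic <- vapply(mock_spectraList, function(sp) sum(sp[, 2]), numeric(1))

# Base peak m/z per scan: the m/z value at the maximum intensity.
basePeakMZ <- vapply(mock_spectraList,
                     function(sp) sp[which.max(sp[, 2]), 1],
                     numeric(1))
```

`vapply` is used instead of `sapply` so that the result is always a numeric vector with one value per scan.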
Locate indices of the pattern in the string
MXP_locate_regex(string, pattern, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)

| string | a string as character |
|---|---|
| pattern | a pattern to screen |
| ignore.case | ignore.case |
| perl | perl |
| fixed | fixed |
| useBytes | useBytes |

This function returns 'NULL' when no matches are detected for the pattern.
A 2-column matrix of location indices. The first and second columns represent start and end positions, respectively.
pattern <- "Cl"
string <- "NaCl.5HCl"
Location_Cl <- MXP_locate_regex(string, pattern)
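To illustrate the documented return shape (a 2-column matrix of start and end positions, or 'NULL' when nothing matches), here is a sketch built on base R's `gregexpr`. The helper name `locate_sketch` is hypothetical, and this is not necessarily how MXP_locate_regex is implemented internally.

```r
# Hypothetical helper that mimics the documented return shape.
# It is an illustration only, not the package's implementation.
locate_sketch <- function(string, pattern, ...) {
  # gregexpr returns the start positions with a "match.length" attribute.
  m <- gregexpr(pattern, string, ...)[[1]]
  if (m[1] == -1L) {
    return(NULL)  # no match for the pattern
  }
  starts <- as.integer(m)
  ends <- starts + attr(m, "match.length") - 1L
  cbind(starts, ends)  # column 1: start, column 2: end
}

loc <- locate_sketch("NaCl.5HCl", "Cl")
no_match <- locate_sketch("NaCl", "Br")
```

Extra arguments such as `ignore.case` or `fixed` are forwarded to `gregexpr` through `...`.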
This function returns a list of two data objects required for the mass spectrometry data processing.
peak2list(path, MSfileName = "")

| path | address (directory path) of the mass spectrometry file |
|---|---|
| MSfileName | name of the mass spectrometry file with .mzML or .mzXML extension |


| scanTable | a data frame of different scan properties including 'seqNum', 'msLevel', 'polarity', 'peaksCount', 'totIonCurrent', 'retentionTime', 'basePeakMZ', 'basePeakIntensity', 'collisionEnergy', 'lowMZ', 'highMZ', 'precursorScanNum', 'precursorMZ', 'precursorCharge', 'precursorIntensity', 'injectionTime', 'filterString', 'scanType', 'centroided', 'isolationWindowTargetMZ', 'isolationWindowLowerOffset', 'isolationWindowUpperOffset', 'scanWindowLowerLimit', and 'scanWindowUpperLimit'. 'scanType' is only provided for the mzXML data format. |
|---|---|
| spectraList | a list of matrices of m/z and intensity values for each chromatogram scan |

The 'retentionTime' column in the 'scanTable' object is reported in minutes.
https://colab.research.google.com/drive/1gXwwuI1zzDHykKfodLSQQt5rwTuFEMpD
temp_wd <- tempdir()
temp_wd_zip <- paste0(temp_wd, "/idsl_ipa_test_files.zip")
download.file(paste0("https://github.com/idslme/IDSL.IPA/blob/main/",
                     "IPA_educational_files/idsl_ipa_test_files.zip?raw=true"),
              destfile = temp_wd_zip, mode = "wb")
unzip(temp_wd_zip, exdir = temp_wd)
p2l <- peak2list(path = temp_wd, MSfileName = "003.mzML")
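The two returned objects are meant to be used together. The sketch below shows one way to pair them, using `mock_p2l` as a hypothetical stand-in for the returned list. It assumes the list elements are named 'scanTable' and 'spectraList', and that row i of the scanTable describes the i-th element of the spectraList; both assumptions should be verified on real output.

```r
# Hypothetical stand-in for the list returned by peak2list.
# Assumptions: elements are named 'scanTable' and 'spectraList', and
# row i of scanTable describes spectraList[[i]].
mock_p2l <- list(
  scanTable = data.frame(seqNum = 1:3,
                         msLevel = c(1, 2, 1),
                         retentionTime = c(0.5, 0.6, 1.5)),  # minutes
  spectraList = list(
    matrix(c(100, 200, 10, 20), ncol = 2),
    matrix(c(150, 5), ncol = 2),
    matrix(c(110, 210, 310, 1, 2, 3), ncol = 2)
  )
)

# Indices of MS1 scans acquired within the first minute.
idx <- which(mock_p2l$scanTable$msLevel == 1 &
             mock_p2l$scanTable$retentionTime <= 1)

# Pull the matching m/z-intensity matrices.
selected <- mock_p2l$spectraList[idx]
```

Selecting by index keeps the metadata rows and the spectra aligned, so `mock_p2l$scanTable[idx, ]` describes exactly the scans in `selected`.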