SpADS: An R Script for Mass Spectrometry Data Preprocessing before Data Mining

Luca Belmonte; Rosanna Spera; Claudio Nicolini

Abstract

The recent application of mass spectrometry (MS) to the Nucleic Acid Programmable Protein Array (NAPPA) technique, for protein identification by non-classical methods, calls for more sophisticated peak-recognition algorithms. NAPPA allows functional proteins to be synthesized in situ directly from printed cDNAs, but the master mix and lysate molecules produce peaks that appear as background in the overall spectra. A wide range of tools is available for analyzing conventional protein mass spectra that correspond to a few molecular species, but none of them is optimized for background subtraction. Moreover, peak identification relies on statistical analysis of characteristic peaks, so background subtraction can alter the outcome by erasing those peaks. We present a first attempt to overcome this problem: SpADS (Spectrum Analyzer and Data Set manager), an R script for MS data preprocessing. Like existing tools, SpADS provides standard preprocessing functions such as binning and peak extraction; in addition, it provides functions for spectral background subtraction and dataset management. It is developed entirely in R and is therefore free of charge. A k-means clustering implementation is used to improve the results of SpADS preprocessing on test datasets and on NAPPA-expressed proteins.
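
The abstract names the preprocessing steps but does not say how SpADS implements them. As a rough illustration only, the following base-R sketch shows one way binning, background subtraction, peak extraction and k-means clustering could be chained together. The function names (`bin_spectrum`, `subtract_background`, `find_peaks`), the fixed-width binning scheme, the local-maximum peak rule and all toy values are assumptions made for this example and are not taken from SpADS.

```r
# Illustrative sketch only: hypothetical helpers, not the SpADS API.

# Sum intensities into fixed-width m/z bins defined by `breaks`.
bin_spectrum <- function(mz, intensity, breaks) {
  bin_id <- cut(mz, breaks = breaks, include.lowest = TRUE, labels = FALSE)
  binned <- numeric(length(breaks) - 1)
  sums <- tapply(intensity, bin_id, sum)   # points outside `breaks` are dropped
  binned[as.integer(names(sums))] <- sums
  binned
}

# Subtract a binned background spectrum (e.g. master mix and lysate alone),
# clipping negative values to zero.
subtract_background <- function(sample_binned, background_binned) {
  pmax(sample_binned - background_binned, 0)
}

# Return indices of interior local maxima that exceed `min_height`.
find_peaks <- function(x, min_height = 0) {
  n <- length(x)
  if (n < 3) return(integer(0))
  i <- 2:(n - 1)
  i[x[i] > x[i - 1] & x[i] >= x[i + 1] & x[i] > min_height]
}

# Toy data (assumed values, for demonstration only).
breaks     <- seq(1000, 1010, by = 1)            # ten 1-Da bins
mz         <- c(1000.5, 1002.2, 1002.7, 1005.1, 1008.9)
sample_int <- c(5, 20, 10, 50, 8)                # sample spot
backgr_int <- c(5, 18,  9,  2, 8)                # background-only spot

sample_binned <- bin_spectrum(mz, sample_int, breaks)
backgr_binned <- bin_spectrum(mz, backgr_int, breaks)
corrected     <- subtract_background(sample_binned, backgr_binned)
peaks         <- find_peaks(corrected, min_height = 5)

# Cluster a small set of background-corrected spectra (one spectrum per row)
# with the k-means implementation from the stats package.
set.seed(1)
spectra  <- rbind(corrected, corrected * 1.1,
                  rev(corrected), rev(corrected) * 0.9)
clusters <- kmeans(spectra, centers = 2, nstart = 5)$cluster
```

In this toy case only the signal at the sixth bin survives background subtraction as a peak above the threshold, which illustrates the concern raised in the abstract: a threshold or background estimate that is too aggressive would erase characteristic peaks as well.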

Journal of Computer Science & Systems Biology