Keywords:
Exhaled breath analysis
Selected-Ion Flow-Tube Mass Spectrometry
Principal Component Analysis
Classification and discrimination
Data preprocessing techniques
Abstract:
Selected-Ion Flow-Tube Mass Spectrometry (SIFT-MS) has been applied in a clinical context as a diagnostic tool for breath samples using target biomarkers. Exhaled breath sampling is non-invasive and therefore much more patient-friendly than bronchoscopy, which is the gold standard for evaluating airway inflammation. The present pilot study included 55 exhaled breath samples from children with asthma or cystic fibrosis and from healthy individuals. Rather than focusing on the analysis of target biomarkers or on the identification of biomarkers, this study evaluates different data analysis strategies, including a variety of pretreatment, classification and discrimination techniques, for their capacity to distinguish the three classes based on subtle differences in their full-scan SIFT-MS spectra. Proper data analysis strategies are required because these full-scan spectra contain much external, i.e. unwanted, variation. Each SIFT-MS analysis generates three spectra resulting from ion-molecule reactions of analyte molecules with H3O+, NO+ and O2+. Models were built with Linear Discriminant Analysis, Quadratic Discriminant Analysis, Soft Independent Modelling by Class Analogy, Partial Least Squares - Discriminant Analysis, K-Nearest Neighbours, and Classification and Regression Trees. Perfect models in terms of overall sensitivity and specificity (100% for both) were found using Direct Orthogonal Signal Correction (DOSC) pretreatment. Given the uncertainty associated with classification models built after DOSC pretreatment (i.e. good classification was also found for random classes), other models were built using other preprocessing approaches. A Partial Least Squares - Discriminant Analysis model with a combined preprocessing method involving single value imputation gave 100% sensitivity and specificity for calibration but was less predictive. Pareto scaling prior to Quadratic Discriminant Analysis resulted in 41/55 correctly classified samples
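
As an illustration of the Pareto scaling pretreatment named in the abstract, the following is a minimal sketch rather than the authors' implementation. It uses the common definition of Pareto scaling (mean-centre each variable, then divide by the square root of its standard deviation). The function name `pareto_scale`, the use of the sample standard deviation, and the example intensities are assumptions made for illustration only.

```python
import math
import statistics


def pareto_scale(rows):
    """Pareto-scale a list of samples (rows), one variable (column) at a time.

    Each variable is mean-centred and divided by the square root of its
    standard deviation. Constant variables are only mean-centred.
    """
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Sample standard deviation (n - 1 denominator). This is an assumption:
    # the abstract does not state which estimator was used.
    stdevs = [statistics.stdev(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            (x - m) / math.sqrt(s) if s > 0 else x - m
            for x, m, s in zip(row, means, stdevs)
        ])
    return scaled


# Hypothetical intensities for three samples at two m/z channels.
spectra = [[100.0, 1.0], [200.0, 2.0], [300.0, 3.0]]
scaled = pareto_scale(spectra)
```

Compared with autoscaling (division by the standard deviation itself), Pareto scaling reduces the dominance of high-intensity channels while keeping more of the original data structure, which is one reason it is often tried before a discriminant model on full-scan spectra.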