Machine learning classifier to study age-related variations from flow cytometry data

Clustering organizes flow cytometry data into subpopulations with substantially homogeneous characteristics, but it does not directly address the important problem of identifying the salient differences in subpopulations between subjects and groups. Here, we address this problem by augmenting SWIFT, a mixture-model-based clustering algorithm reported previously. First, we show that SWIFT clustering using a “template” mixture model, in which all subpopulations are represented, identifies small differences in cell numbers per subpopulation between samples. Second, we demonstrate that the resolution of inter-sample differences is increased by “competition”, in which a joint model is formed by combining the mixture model templates obtained from different groups. Thus, SWIFT template competition is a powerful approach for sharpening comparisons between selected groups in flow cytometry datasets.


In this study, a support vector machine (SVM) was designed to determine whether the template competition approach improves classification between young and elderly subjects. Three strategies were used for classification with the SCR and Joint templates, after normalizing the results for each cluster to z-scores across all samples: 1) ten-fold cross-validation was performed on clustering results from Study 1; 2) ten-fold cross-validation was performed on the aggregate results from Studies 1 and 2; and 3) Study 2 results were used as an independent test set after training on the Study 1 samples. All three strategies showed a classification accuracy of >80%, confirming that the differences detected by SWIFT represented real aging-related changes. In the first two strategies, template competition further increased the classification accuracy; in the independent analysis of Study 2, the accuracy was the same with and without competition.
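The evaluation described above (per-cluster z-score normalization, ten-fold cross-validation, and an independent test set) can be sketched roughly as follows. This is only an illustrative sketch and not the study's implementation: the data are synthetic, the classifier is a minimal linear SVM trained by stochastic subgradient descent on the hinge loss (the kernel and solver actually used are not stated here), and all function names, parameters, and the two mock "studies" are hypothetical.

```python
import random
from statistics import mean, pstdev


def zscore_columns(rows):
    """Scale each column (one cluster) to zero mean and unit variance across all samples."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant columns are left at 0
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]


def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=100, seed=0):
    """Minimal linear SVM (assumption: linear kernel); labels must be -1 or +1."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # L2 shrinkage is applied at every step; the hinge-loss
            # correction only when the sample lies inside the margin.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def kfold_accuracy(X, y, k=10, seed=0):
    """Strategies 1 and 2: k-fold cross-validation on one set of samples."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w, b = train_linear_svm([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(w, b, X[i]) == y[i] for i in held_out)
    return correct / len(X)


def holdout_accuracy(X_train, y_train, X_test, y_test):
    """Strategy 3: train on one study, test on an independent one."""
    w, b = train_linear_svm(X_train, y_train)
    hits = sum(predict(w, b, x) == t for x, t in zip(X_test, y_test))
    return hits / len(X_test)


def make_synthetic(n_per_group=20, n_clusters=5, shift=3.0, seed=1):
    """Hypothetical per-cluster values; label -1 = young, +1 = elderly."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (-1, 1):
        for _ in range(n_per_group):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_clusters)]
            if label == 1:  # only the first two clusters differ between groups
                row[0] += shift
                row[1] += shift
            X.append(row)
            y.append(label)
    return X, y


# Two mock "studies", normalized together across all samples.
X1, y1 = make_synthetic(seed=1)
X2, y2 = make_synthetic(seed=2)
Z = zscore_columns(X1 + X2)
Z1, Z2 = Z[:len(X1)], Z[len(X1):]

cv_acc = kfold_accuracy(Z1, y1, k=10)
holdout_acc = holdout_accuracy(Z1, y1, Z2, y2)
print(f"10-fold CV accuracy (mock Study 1): {cv_acc:.2f}")
print(f"Hold-out accuracy (mock Study 2):   {holdout_acc:.2f}")
```

In this sketch the z-scores are computed over all samples before splitting, mirroring the description above. Computing the normalization from the training folds only would be a stricter alternative, since the held-out samples would then play no part in the scaling.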





