ADD results from the modeling paper
commit 89866515e8 (parent 0c98ecce44)
@ -2265,6 +2265,17 @@ Venn diagram from the venn R package.
% TODO this section header sucks
\subsection{AI modeling reveals highly predictive species}

Because the multivariate data collected were heterogeneous and no single model
structure is ideal for all applications, we implemented an agnostic modeling
approach to better understand these TN+TCM responses. Specifically, we ran a
consensus analysis using seven machine learning (ML) techniques: Random Forest
(RF), Gradient Boosted Machine (GBM), Conditional Inference Forest (CIF), Least
Absolute Shrinkage and Selection Operator (LASSO), Partial Least-Squares
Regression (PLSR), Support Vector Machine (SVM), and DataModeler's Symbolic
Regression (SR). This analysis was used to molecularly characterize TN+TCM
cells and to extract features predictive of quality early in their expansion
process (Fig.~1d--e).

% TODO this table looks like crap, break it up into smaller tables
\begin{table}[!h] \centering
\caption{Results for data-driven modeling}
@ -2272,6 +2283,25 @@ Venn diagram from the venn R package.
\input{../tables/model_results.tex}
\end{table}

SR models achieved the highest predictive performance ($R^2 > 93\%$) when using
multi-omics predictors for all endpoint responses (\cref{tab:mod_results}). SR
achieved $R^2 > 98\%$, while GBM tree-based ensembles showed leave-one-out
cross-validated $R^2$ (LOO-$R^2$) $> 95\%$ for CD4+ and CD4+/CD8+ TN+TCM
responses. Similarly, the LASSO, PLSR, and SVM methods showed consistently high
LOO-$R^2$ (92.9\%, 99.7\%, and 90.5\%, respectively) when predicting CD4+/CD8+
TN+TCM. However, LOO-$R^2$ dropped by about 10\%, to 72.5\%--81.7\%, for CD4+
TN+TCM with these three methods. Lastly, SR and PLSR achieved $R^2 > 90\%$ for
CD8+ TN+TCM cells, while the other ML methods exhibited highly variable
LOO-$R^2$, ranging from 0.3\% (RF) to 51.5\% (LASSO).
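
For reference, the LOO-$R^2$ values above can be read against the usual
definition of leave-one-out cross-validated $R^2$. The expression below is a
sketch that assumes this standard definition applies here; the symbols $n$,
$y_i$, $\bar{y}$, and $\hat{y}_{(-i)}$ are notation introduced for illustration
and are not taken from the modeling paper:
\[
  R^2_{\mathrm{LOO}} = 1 -
  \frac{\sum_{i=1}^{n} \bigl(y_i - \hat{y}_{(-i)}\bigr)^2}
       {\sum_{i=1}^{n} \bigl(y_i - \bar{y}\bigr)^2},
\]
where $y_i$ is the observed response for sample $i$, $\bar{y}$ is the mean
observed response, and $\hat{y}_{(-i)}$ is the prediction for sample $i$ from a
model fit to the remaining $n-1$ samples. Under this definition, a model that
generalizes poorly can score near zero even when its training $R^2$ is high.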

% FIGURE the CD4/CD8 model results using SR

The top-performing technique, SR, showed that the median aggregated predictions
for CD4+ and CD8+ TN+TCM cells increase when IL2 concentration, IL15, and IL2R
increase while IL17a decreases, in conjunction with other features. These
patterns, combined with low values of DMS concentration and GM-CSF, uniquely
characterized maximum CD8+ TN+TCM. Meanwhile, higher glycine but lower IL13, in
combination with other features, yielded the maximum CD4+ TN+TCM predictions
(Fig.~2).

\begin{figure*}[ht!]
\begingroup

@ -2291,6 +2321,14 @@ Venn diagram from the venn R package.
\label{fig:mod_flower}
\end{figure*}

Consistent selection of candidate CPPs and CQAs for T cell memory is desirable.
Here, \gls{tnfa} was selected by all seven ML methods for predicting CD4+/CD8+
TN+TCM when considering the features with the highest importance scores across
models (Fig.~3a; Methods). Other features (IL2R, IL4, IL17a, and DMS
concentration) were selected by at least five ML methods (Fig.~3a,c). Moreover,
SR found IL13 and IL15 to be predictive in combination with these features
(Supp.~Table~S4).
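
One way to make this consensus criterion explicit is to count how many methods
select each feature. This is a sketch under assumptions, not necessarily the
exact rule described in Methods: $S_m$ (the set of top-ranked features for
method $m$) and $c_j$ (the consensus count for feature $j$) are notation
introduced here for illustration.
\[
  c_j = \sum_{m=1}^{7} \mathbf{1}\bigl[\, j \in S_m \,\bigr]
\]
In this notation, \gls{tnfa} corresponds to $c_j = 7$, while IL2R, IL4, IL17a,
and DMS concentration correspond to $c_j \geq 5$.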

\section{discussion}

\chapter{aim 2b}\label{aim2b}