ENH make the modeling section better

This commit is contained in:
Nathan Dwarshuis 2021-08-02 18:45:33 -04:00
parent 9e9d897fae
commit be2bf9fa95
2 changed files with 77 additions and 30 deletions

View File

@ -2425,6 +2425,32 @@ CONCLUSIONS: We developed a simplified, semi-closed system for the initial selec
publisher = {Wiley}, publisher = {Wiley},
} }
@Article{Qiu2011,
author = {Peng Qiu and Erin F Simonds and Sean C Bendall and Kenneth D Gibbs and Robert V Bruggner and Michael D Linderman and Karen Sachs and Garry P Nolan and Sylvia K Plevritis},
journal = {Nature Biotechnology},
title = {Extracting a cellular hierarchy from high-dimensional cytometry data with {SPADE}},
year = {2011},
month = {oct},
number = {10},
pages = {886--891},
volume = {29},
doi = {10.1038/nbt.1991},
publisher = {Springer Science and Business Media {LLC}},
}
@Article{Qiu2017,
author = {Peng Qiu},
journal = {Cytometry Part A},
title = {Toward deterministic and semiautomated {SPADE} analysis},
year = {2017},
month = {feb},
number = {3},
pages = {281--289},
volume = {91},
doi = {10.1002/cyto.a.23068},
publisher = {Wiley},
}
@Comment{jabref-meta: databaseType:bibtex;} @Comment{jabref-meta: databaseType:bibtex;}
@Comment{jabref-meta: grouping: @Comment{jabref-meta: grouping:

View File

@ -123,6 +123,7 @@
\newacronym{colb}{COL-B}{collagenase B} \newacronym{colb}{COL-B}{collagenase B}
\newacronym{cold}{COL-D}{collagenase D} \newacronym{cold}{COL-D}{collagenase D}
\newacronym{tsne}{tSNE}{t-stochastic neighbor embedding} \newacronym{tsne}{tSNE}{t-stochastic neighbor embedding}
\newacronym{umap}{UMAP}{uniform manifold approximation and projection}
\newacronym{anv}{AXV}{Annexin-V} \newacronym{anv}{AXV}{Annexin-V}
\newacronym{pi}{PI}{propidium iodide} \newacronym{pi}{PI}{propidium iodide}
\newacronym{rt}{RT}{room temperature} \newacronym{rt}{RT}{room temperature}
@ -1103,48 +1104,68 @@ directions for future work. To this end, the types of \glspl{doe} we generally
used in this work were fractional factorial designs with three levels, which used in this work were fractional factorial designs with three levels, which
enable the estimation of both main effects and second order quadratic effects. enable the estimation of both main effects and second order quadratic effects.
\subsection*{identification and standardization of CPPs and CQAs}\label{sec:background_cqa} \subsection*{identification and standardization of CPPs and
CQAs}\label{sec:background_cqa}
Ultimately the identification of relevant \glspl{cpp} and \glspl{cqa} is an % BACKGROUND at least attempt to show that there is alot of work in the space
interative process % identifying signaling networks
A number of multiomics strategies exist which can generate rich datasets for T In the context of T cell manufacturing, ideally we would have a set of
cells. We will consider several multiomics strategies within this proposal: non-destructive biomarkers that could both identify functional T cells and
predict when a process is on track to deliver such functional T cells. T cells
secrete numerous cytokines and metabolites in the media, which may reflect the
internal state accurately and thus serve as a potential set of \glspl{cqa}.
The complexity of these pathways dictates that we take a big-data approach to
this problem. To this end, there are several pertinent multi-omic (or simply
`omic') techniques that can be used to collect such datasets, which can then be
mined, modeled, and correlated to relevent responses (such as an endpoint
quantification of memory T cells) to identify pertinent \glspl{cqa}.
An overview of the techniques used in this work are:
\begin{description} \begin{description}
\item[Luminex:] A multiplexed bead-based \gls{elisa} that can measure \item[Luminex --] This is a multiplexed bead-based assay similar to \gls{elisa} that can measure
many bulk (not single cell) cytokine concentrations simultaneously many bulk (not single cell) cytokine concentrations simultaneously
in a media sample. Since this only requires media (as opposed to in a media sample. This is a destructive assay but does not require cells to
destructively measuring cells) we will use this as a longitudinal obtain a measurement.
measurement. \item[\gls{nmr} --] It is well known that T cells of different
\item[Metabolomics:] It is well known that T cells of different
lineages have different metabolic profiles; for instance memory T lineages have different metabolic profiles; for instance memory T
cells have larger aerobic capacity and fatty acid cells have larger aerobic capacity and fatty acid
oxidation\cite{Buck2016, van_der_Windt_2012}. We will interrogate oxidation\cite{Buck2016, van_der_Windt_2012}. \gls{nmr} is a technique that
key metabolic species using \gls{nmr} in collaboration with the can non-destructively quantify small molecules in a media sample, and thus is
Edison Lab at the University of Georgia. This will be both a an attractive method that could be used for inline, real-time monitoring.
longitudinal assay using media samples (since some metabolites may \item[Flow and Mass Cytometry --] Flow cytometry using fluorophores has been
be expelled from cells that are indicative of their phenotype) and used extensively for immune cell analysis, but has a practical limit of
at endpoint where we will lyse the cells and interogate their entire
metabolome.
\item[Flow and Mass Cytometry:] Flow cytometry using fluorophores has been used
extensively for immune cell analysis, but has a practical limit of
approximately 18 colors\cite{Spitzer2016}. Mass cytometry is analogous to approximately 18 colors\cite{Spitzer2016}. Mass cytometry is analogous to
traditional flow cytometry except that it uses heavy-metal \gls{mab} traditional flow cytometry except that it uses heavy-metal \gls{mab}
conjugates, which has a practical limit of over 50 markers. This will be conjugates, which has a practical limit of over 50 markers. While mass
useful in determining precise subpopulations and phenotypes that may be cytometry is less practical than simple flow cytometers such as the BD Accuri,
influencing responses, especially when one considers that many cell types can we may find that only a few markers are required to accurately predict
be defined by more than one marker combination. We will perform this at performance, and thus this could easily translate to industry using relatively
endpoint. While mass cytometry is less practical than simple flow cytometers cost-effective equipment. Both of these destructively analyze the cells
such as the BD Accuri, we may find that only a few markers are required to themselves, but they have the advantage in that they are measuring a direct
accurately predict performance, and thus this could easily translate to property of the cells and not a secreted product.
industry using relatively cost-effective equipment.
\end{description} \end{description}
% TODO add a computational section % BACKGROUND what about ssRNAseq?
% TODO add a section explaining causal inference since this is a big part of Upon collecting these omic datasets, determining the \glspl{cqa} becomes a
% the end of aim 1 computational problem. Predictions of the final product using data collected
earlier in time can be made using any number of supervised learning techniques
(linear and non-linear regression in all its forms) which in turn can be used to
develop process control models. Unsupervised learning and dimensionality
reduction techniques such as \gls{tsne}, \gls{umap}, and
\gls{spade}\cite{Qiu2011, Qiu2017} can be performed to delineate clusters of
interesting cell types and the markers that define them.
Ultimately, identifying \glspl{cqa} will likely be an iterative process, wherein
putative \glspl{cqa} will be identified, the corresponding \glspl{cpp} will be
set and optimized to maximize products with these \glspl{cpp} and then
additional data will be collected in the clinic as the product is tested on
various patients with different indications. Additional \glspl{cqa} may be
identified which better predict specific clinical outcomes, which can be fed
back into the process model and optimized again.
\section{Innovation} \section{Innovation}