diff --git a/tables/doe_ratio.tex b/tables/doe_ratio.tex index d96eaa3..10b62ba 100644 --- a/tables/doe_ratio.tex +++ b/tables/doe_ratio.tex @@ -1,23 +1,23 @@ % Table created by stargazer v.5.2.2 by Marek Hlavac, Harvard University. E-mail: hlavac at fas.harvard.edu -% Date and time: Thu, Jul 29, 2021 - 04:15:57 PM +% Date and time: Thu, Jul 29, 2021 - 05:45:05 PM \begin{tabular}{@{\extracolsep{5pt}}lc} \\[-1.8ex]\hline \hline \\[-1.8ex] \\[-1.8ex] & CD4:CD8 CD62L+CCR7+ Ratio \\ \hline \\[-1.8ex] - Dataset [2] & 893,357.900$^{***}$ \\ - Functional mAb \% & 28,209.730$^{***}$ \\ - IL2 Conc. (IU/ml) & 50,896.490$^{***}$ \\ - DMS Conc. (1/ml) & 926.925$^{***}$ \\ - Intercept & $-$3,368,762.000$^{***}$ \\ + Dataset [2] & 0.020 \\ + Functional mAb \% & 0.002$^{***}$ \\ + IL2 Conc. (IU/ml) & 0.001 \\ + DMS Conc. (1/ml) & 0.0001$^{***}$ \\ + Intercept & $-$0.144$^{*}$ \\ \hline \\[-1.8ex] Observations & 30 \\ -R$^{2}$ & 0.835 \\ -Adjusted R$^{2}$ & 0.808 \\ -Residual Std. Error & 493,168.700 (df = 25) \\ -F Statistic & 31.571$^{***}$ (df = 4; 25) \\ +R$^{2}$ & 0.879 \\ +Adjusted R$^{2}$ & 0.860 \\ +Residual Std. 
Error & 0.039 (df = 25) \\ +F Statistic & 45.554$^{***}$ (df = 4; 25) \\ \hline \hline \\[-1.8ex] \textit{Note:} & \multicolumn{1}{r}{$^{*}$p$<$0.1; $^{**}$p$<$0.05; $^{***}$p$<$0.01} \\ -\end{tabular} \ No newline at end of file +\end{tabular} diff --git a/tex/thesis.tex b/tex/thesis.tex index 8c281e1..d474d48 100644 --- a/tex/thesis.tex +++ b/tex/thesis.tex @@ -50,6 +50,7 @@ \newacronym{cpp}{CPP}{critical process parameter} \newacronym{dms}{DMS}{degradable microscaffold} \newacronym{doe}{DOE}{design of experiments} +\newacronym{adoe}{ADOE}{adaptive design of experiments} \newacronym{gmp}{GMP}{Good Manufacturing Practices} \newacronym{cho}{CHO}{Chinese hamster ovary} \newacronym{all}{ALL}{acute lymphoblastic leukemia} @@ -155,10 +156,12 @@ \end{flushleft} } +% a BME's best friend \newcommand{\invivo}{\textit{in vivo}} \newcommand{\invitro}{\textit{in vitro}} \newcommand{\exvivo}{\textit{ex vivo}} +% various CD-whatever crap \newcommand{\cd}[1]{CD{#1}} \newcommand{\anti}[1]{anti-{#1}} \newcommand{\antih}[1]{anti-human {#1}} @@ -180,17 +183,21 @@ \newcommand{\ptcar}{\gls{car}+} \newcommand{\ptcarp}{\ptcar~\si{\percent}} +% DOE responses I don't feel like typing ad-nauseam +\newcommand{\pilII}{\gls{il2} concentration} +\newcommand{\pdms}{\gls{dms} concentration} +\newcommand{\pmab}{functional \gls{mab} surface density} + +% vendor and product stuff I don't feel like typing \newcommand{\catnum}[2]{(#1, #2)} \newcommand{\product}[3]{#1 \catnum{#2}{#3}} - \newcommand{\thermo}{Thermo Fisher} \newcommand{\miltenyi}{Miltenyi Biotech} \newcommand{\bl}{Biolegend} +% the obligatory misc category \newcommand{\inlinecode}{\texttt} - \newcommand{\subcap}[2]{\subref{#1}) #2} - \newcommand{\sigkey}{Significance test key: *p<0.1; **p < 0.05; ***p<0.01} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -2201,7 +2208,6 @@ between T cells. 
Since \gls{il2} is secreted by activated T cells themselves, T cells in the \gls{dms} system may need less or no \gls{il2} if this hypothesis were true. - % TODO this plots proportions look dumb % TODO explain what the NLS lines are in b % TODO plot the differences in lower IL2 concentrations to better show this @@ -2263,6 +2269,30 @@ at \SI{10}{\IU\per\ml} throughout the remainder of this aim. \input{../tables/doe_runs.tex} \end{table} +% FIGURE first DOE results to show how the second DOE was motivated + +We conducted two consecutive \glspl{doe} to optimize the \pth{} and \ptmem{} +responses for the \gls{dms} system. In the first \gls{doe}, we tested \pilII{} in +the range of \SIrange{10}{30}{\IU\per\ml}, \pdms{} in the range of +\SIrange{500}{2500}{\dms\per\ml}, and \pmab{} in the range of +\SIrange{60}{100}{\percent}. + +% TODO explain why not all runs were used +After performing the first \gls{doe}, we augmented the original design matrix +with an \gls{adoe}, which was built with three goals in mind. Firstly, we wished +to validate the first \gls{doe} by assessing the strength and responses of each +effect. Secondly, we wished to improve our confidence in regions that showed +high complexity, such as the peak in the \gls{dms} concentration for the total +\ptmem{} cell response. Thirdly, we wished to explore additional ranges of each +input parameter. Since \pilII{} and \pdms{} appeared to continue to positively +influence multiple responses beyond our tested range, we were curious if there was an +optimum at some higher setting of either of these values. For this reason, we +increased the \pilII{} to include \SI{40}{\IU\per\ml} and the \pdms{} to +\SI{3500}{\dms\per\ml}. Note that it was impossible to go beyond +\SI{100}{\percent} for the \pmab{}, so runs were positioned for this parameter +with validation and confidence improvements in mind. The runs for each \gls{doe} +are shown in \cref{tab:doe_runs}. + \begin{figure*}[ht!] 
\begingroup @@ -2316,6 +2346,57 @@ at \SI{10}{\IU\per\ml} throughout the remainder of this aim. \input{../tables/doe_ratio.tex} \end{table} +The response plots from both \glspl{doe} are shown in \cref{fig:doe_responses} +for total \ptmem{} cells, total \pth{} cells, total \ptmemh{} cells, and CD4:CD8 +ratio in the \ptmem{} compartment. In general, the responses for the first and +second \gls{doe} seemed to overlap, although not perfectly. Interestingly, only +the \ptmem{} response seemed to have anything more complex than a linear +relationship, particularly in the case of \pilII{} and \pdms{}, which showed +intermediate optima (\cref{fig:doe_responses_mem}). In the case of \pilII{}, +it was not clear whether this optimum was simply due to a batch effect between +the first and second \gls{doe}. The optimum for \pdms{} appeared in the same +location, albeit more pronounced, in the second \gls{doe}, giving more +confidence to the location of this second-order feature. The remainder of the +responses showed mostly linear relationships in all parameter cases +(\cref{fig:doe_responses_cd4,fig:doe_responses_mem4,fig:doe_responses_ratio}). + +% TODO it seems arbitrary that I went straight to a third order model, the real +% reason is because it seemed weird that a second order model didn't find +% anything to be significant +We performed linear regression on the three input parameters as well as a binary +parameter representing whether a given run came from the first or second \gls{doe} +(called `dataset'). Starting with the total \ptmem{} cells response, we fit a +first-order regression model using these four parameters +(\cref{tab:doe_mem1.tex}). While \pilII{} was found to be a significant +predictor, the model fit was extremely poor ($R^2$ of 0.331). This was not +surprising given the apparent complexity of this response +(\cref{fig:doe_responses_mem}). To obtain a better fit, we added second- and +third-degree terms (\cref{tab:doe_mem2.tex}). 
Note that the dataset parameter +was not included in the second-order interactions, as it was treated as a +blocking variable, which is typically not assumed to have interaction effects. +Also note that the response was log-transformed, which yielded a better fit. In +this model, many more parameters emerged as being significant, including the +quadratic terms for \pdms{} and \pilII{}, in agreement with what can be +qualitatively observed in the response plot (\cref{fig:doe_responses_mem}). +Furthermore, the dataset parameter was weakly significant, indicating a possible +batch effect between the \glspl{doe}. We should also note that despite many +parameters being significant, this model was still only mediocre in describing +this response; the $R^2$ was 0.741 but the adjusted $R^2$ was 0.583, indicating +that our data might be underpowered for a model this complex. Further +experiments beyond what was performed here may be needed to fully describe this +response. + +% TODO combine these tables into one +We performed linear regression on the other three responses, all of which +performed much better than the \ptmem{} response, as expected given the much +lower apparent complexity in the response plots +(\cref{fig:doe_responses_cd4,fig:doe_responses_mem4,fig:doe_responses_ratio}). +All these models appeared to fit well, with $R^2$ and adjusted $R^2$ upward of +0.8. In all but the CD4:CD8 \ptmem{} ratio, the dataset parameter emerged as +significant, indicating a batch effect between the \glspl{doe}. All other +parameters except \pilII{} in the case of CD4:CD8 \ptmem{} ratio were +significant predictors. + \begin{figure*}[ht!] \begingroup @@ -2330,9 +2411,17 @@ at \SI{10}{\IU\per\ml} throughout the remainder of this aim. \subcap{fig:doe_sr_contour_ratio}{CD4:CD8 ratio in the \ptmem{} compartment}. 
} - \label{fig:doe_responses} + \label{fig:doe_sr_contour} \end{figure*} +We then visualized the total \ptmemh{} cells and CD4:CD8 \ptmem{} ratio using +the response explorer in DataModeler to create contour plots around the maximum +responses. For both, it appeared that maximizing all three input parameters +resulted in the maximum value for either response (\cref{fig:doe_sr_contour}). +While not all combinations at and around this optimum were tested, the model +nonetheless showed that there were no other optimal values or regions elsewhere +in the model. + % TODO this section header sucks \subsection{AI modeling reveals highly predictive species}