huggingchat/papers-content / 1504 /1504.07642.md

|

59.8 kB

Infinitesimal Perturbation Analysis for Personalized Cancer Therapy Design ^*

Julia L. Fleck ^* Christos G. Cassandras ^*

^* Division of Systems Engineering and Center for Information and Systems Engineering, Boston University, Brookline, MA 02446 USA
(e-mail: jfleck@bu.edu, cgc@bu.edu)

Abstract: We use a Stochastic Hybrid Automaton (SHA) model of prostate cancer evolution under intermittent androgen suppression (IAS) to study a threshold-based policy for therapy design. IAS is currently one of the most widely used treatments for advanced prostate cancer. Patients undergoing IAS are submitted to cycles of treatment (in the form of androgen deprivation) and off-treatment periods in an alternating manner. One of the main challenges in IAS is to optimally design a therapy scheme, i.e., to determine when to discontinue and recommence androgen suppression. The level of prostate specific antigen (PSA) in a patient's serum is frequently monitored to determine when the patient will be taken off therapy and when therapy will resume. The threshold-based policy we propose is parameterized by lower and upper PSA threshold values and is associated with a cost metric that combines clinically relevant measures of therapy success. Using Infinitesimal Perturbation Analysis (IPA), we derive unbiased gradient estimators of this cost metric with respect to the controllable PSA threshold values based on actual data and show how these estimators can be used to adaptively adjust controllable parameters so as to improve therapy outcomes based on the cost metric defined.

Keywords: stochastic hybrid system (SHS), perturbation analysis, personalized cancer therapy.

1. INTRODUCTION

Cancer is widely viewed as a “disease of stages” in which tumors must progress through a series of discrete “states” in order to ultimately become malignant [Hanahan and Weinberg (2011)]. This view is particularly suitable to the development of prostate cancer, which is known to be a multistep process [Longo et al. (2012)]. For instance, a patient diagnosed with localized prostate cancer who has had all the tumor surgically removed is considered to remain in the state of “localized disease” until he progresses to a new state. At each state, distinct therapies can be prescribed, and the time spent by the patient in any given state is a measure of the efficacy of the corresponding intervention.

The standard treatment for advanced prostate cancer patients is hormone therapy in the form of androgen deprivation [Longo et al. (2012)]. The initial response to androgen deprivation therapy (ADT) is frequently positive, leading to a significant decrease in tumor size. However, most patients eventually develop resistance to ADT and relapse. A generally acceptable mechanism for explaining such relapse is the existence of an androgen-independent cancer cell phenotype that is resistant to secondary endocrine therapy and whose outgrowth leads to tumor recurrence [Jackson (2004a)].

Intermittent androgen suppression (IAS) therapy has been recently proposed as a strategy for delaying or even preventing time to relapse. The purpose of IAS is to prevent the existing tumor from progressing into an androgen-independent state. In spite of significant clinical experience with this approach, defining ideal protocols for any given patient remains one of the main challenges associated with effective IAS therapy [Hirata et al. (2010a)]. In fact, recent clinical trials suggest that the success of IAS ultimately translates into the ability to tailor on and

off-treatment schemes to individual patients [Bruchovsky et al. (2006), Bruchovsky et al. (2007)]. The design of optimal personalized IAS treatment schemes remains, therefore, an unsolved problem.

Recent attempts at addressing this problem have led to the development of several mathematical models that explain the evolution of prostate cancer under hormone therapy. The model from Jackson (2004a) describes the growth of prostate tumors formed by two subpopulations of cancer cells, only one of which is sensitive to androgen deprivation, and successfully reproduces the experimentally observed three phases of tumor evolution; however, the issue of IAS therapy design was not explicitly addressed. Ideta et al. (2008) applied a hybrid dynamical system approach to model prostate tumor evolution under IAS and then used it to study the effect of different therapy protocols on tumor growth and time to relapse through numerical and bifurcation analyses. Several extensions of the works by Jackson (2004a) and Ideta et al. (2008) have been proposed and we briefly review some of them. A nonlinear model was developed by Shimada and Aihara (2008) to account for the competition between different cancer cell subpopulations, while Tao et al. (2010) proposed a model based on switched ordinary differential equations. The problem of individualized prostate cancer treatment was formulated as an optimal control problem for which a piecewise affine system model was developed by Suzuki et al. (2010). Hirata et al. (2010a) modeled the prostate tumor under IAS as a feedback control system for the purpose of patient classification, while Hirata et al. (2010b) solved the model from Hirata et al. (2010a) analytically to derive conditions for patient relapse.

Most of the existing models provide insights into the dynamics of prostate cancer evolution under ADT but do not address the issue of therapy design. Moreover, previous work focusing on classifying patients into groups in order to infer optimal treatment schemes have been based on more manageable, albeit less accurate, approaches to nonlinear hybrid dynamical systems. In contrast, a nonlinear

^* The authors' work is supported in part by NSF under Grants CNS-1239021 and IIP-1430145, by AFOSR under grant FA9550-12-1-0113, and by ONR under grant N00014-09-1-1051.hybrid automaton model was recently proposed by Liu et al. (2015) and $\delta$ -reachability analysis was used to identify patient-specific therapy schemes. Although this model was shown to be in good agreement with published clinical data, it did not account for noise and fluctuations that are inherently associated with cell population dynamics and monitoring of clinical data. Stochastic effects were incorporated into a hybrid model of tumor growth under IAS therapy by Tanaka et al. (2010), but the ensuing analysis was performed considering a pre-determined therapy scheme, i.e., no design of personalized therapy was carried out. This paper is motivated by the need to develop optimal personalized IAS therapy based on stochastic models of prostate cancer evolution and represents a first attempt towards this goal using a Stochastic Hybrid Automaton (SHA) model of cancer progression.

In this paper, we draw upon the deterministic hybrid automaton model from Liu et al. (2015) to which we incorporate stochastic effects. We propose a cost metric in terms of the desired outcome of IAS therapy that is parameterized by a controllable parameter vector, and formulate the problem of optimal personalized therapy design as the search for the parameter values which minimize our cost metric. We use Infinitesimal Perturbation Analysis (IPA) for hybrid systems [Cassandras et al. (2010)] to derive gradient estimates of the cost metric with respect to the controllable vector of interest, which can be subsequently incorporated into a standard gradient-based optimization algorithm. Our main focus is on establishing a framework for IPA applications to biologically-inspired problems which is illustrated here with the case of prostate cancer therapy design. The advantages of our approach are twofold: first, we can obtain sensitivity estimates with respect to the various model parameters from actual data so as to differentiate critical ones from others that are not. Moreover, IPA efficiently yields sensitivities with respect to controllable parameters in a therapy (i.e., control policy), which is arguably the ultimate goal of personalized therapy design.

In Section 2 we formulate the problem of personalized therapy design based on an SHA model of prostate cancer evolution. Section 3 presents the general framework of IPA and details the derivation of IPA estimators for therapy evaluation and optimization. We include final remarks in Section 4.

2. PROBLEM FORMULATION

2.1 SHA Model of Prostate Cancer Progression

The system we consider comprises a prostate tumor under IAS therapy, which is modeled as a Stochastic Hybrid Automaton (SHA). We adopt a standard SHA definition [Cassandras and Lafortune (2008)]:

$G_h = (Q, X, E, U, f, \phi, Inv, guard, \rho, q_0, x_0)$

where $Q$ is a set of discrete states; $X$ is a continuous state space; $E$ is a finite set of events; $U$ is a set of admissible controls; $f$ is a vector field, $f : Q \times X \times U \rightarrow X$ ; $\phi$ is a discrete state transition function, $\phi : Q \times X \times E \rightarrow Q$ ; $Inv$ is a set defining an invariant condition (when this condition is violated at some $q \in Q$ , a transition must occur); $guard$ is a set defining a guard condition, $guard \subseteq Q \times Q \times X$ (when this condition is satisfied at some $q \in Q$ , a transition is allowed to occur); $\rho$ is a reset function, $\rho : Q \times Q \times X \times E \rightarrow X$ ; $q_0$ is an initial discrete state; $x_0$ is an initial continuous state.

With this framework in place, we define a SHA model of prostate cancer progression in terms of the following:

Discrete state set $Q$ . Hormone therapy for prostate cancer consists of administering medical agents that cause androgen suppression in an effort to decrease the population of prostate cancer cells and hence the size of the

tumor. A common biomarker used to monitor the efficacy of such treatment is the serum level of Prostate-Specific Antigen (PSA), whose value provides an estimate of the size of the prostate cancer population.

In IAS therapy, medication is suspended when a sufficient reduction in the size of the cancer cell populations is achieved. Since population sizes are not directly observable, this reduction is estimated in terms of the patient's PSA level; hence, the patient goes off therapy once his PSA reaches a lower threshold value. Similarly, medication is reinstated once the cancer cell populations have significantly recovered, which corresponds to when the patient's PSA level reaches an upper threshold value. Thus, we define $Q = {q^{ON}, q^{OFF}}$ , where $q^{ON}$ ( $q^{OFF}$ , respectively) is the on-treatment (off-treatment, respectively) operational mode of the system.

State space $X$ . The continuous state space $X$ is defined in terms of the biomarkers commonly monitored during IAS therapy, namely the PSA level and the androgen concentration in the patient's serum. We assume the co-existence of two subpopulations of cancer cells within the tumor: Hormone Sensitive Cells (HSCs) and Castration Resistant Cells (CRCs). The proliferation of the former is negatively affected by hormone therapy, while the survival rate of the latter decreases in androgen-rich environments.

We thus define a state vector $x(t) = [x_1(t), x_2(t), x_3(t)]$ with $x_i(t) \in \mathbb{R}^+$ , such that $x_1(t)$ is the total population of HSCs, $x_2(t)$ is the total population of CRCs, and $x_3(t)$ is the concentration of androgen in the serum. Since prostate cancer cells secrete high levels of PSA, it is frequently assumed that the serum PSA concentration can be modeled as a linear combination of the cancer cell subpopulations, i.e., $c_1 x_1(t) + c_2 x_2(t)$ . Another common assumption is that both HSCs and CRCs secrete PSA equivalently, so that $c_1 = c_2 = 1$ [Ideta et al. (2008)]. In this work, we adopt these assumptions. We also define a "clock" state variable $z_i(t) \in \mathbb{R}^+$ , $i = 1, 2$ , where $z_1(t)$ ( $z_2(t)$ , respectively) measures the time spent by the system in state $q^{ON}$ ( $q^{OFF}$ , respectively). The clock is reset to zero once a state transition takes place. Setting $z(t) = [z_1(t), z_2(t)]$ , the complete state vector is $[x(t), z(t)]$ .

Event set $E$ . We define the SHA event set as $E = {e_1, e_2}$ , where $e_1$ corresponds to the condition $[x_1(t) + x_2(t) = \theta_1$ from above] and $e_2$ corresponds to $[x_1(t) + x_2(t) = \theta_2$ from below].
Admissible control set $U$ . As described earlier, IAS therapy consists of cycles of androgen deprivation delivered intermittently with off-treatment periods. The cycles of androgen deprivation are suspended when the patient's PSA level reaches a lower threshold value, while therapy recommences once the PSA level reaches an upper threshold value. Hence, an IAS therapy can be viewed as a controlled process characterized by two parameters: $\theta = [\theta_1, \theta_2] \in \Theta$ , where $\theta_1 \in [\theta_1^{\min}, \theta_1^{\max}]$ is the lower threshold value of the patient's PSA level, and $\theta_2 \in [\theta_2^{\min}, \theta_2^{\max}]$ is the upper threshold value of the patient's PSA level, with $\theta_1^{\max} < \theta_2^{\min}$ . At any time $t$ , the feasible control set for the IAS therapy controller is $U = {0, 1}$ and the control is defined as:

$u(x(t), z(t)) \equiv \begin{cases} 0 & \text{if } x_1(t) + x_2(t) < \theta_2, q(t) = q^{OFF} \\ 1 & \text{if } x_1(t) + x_2(t) > \theta_1, q(t) = q^{ON} \end{cases} \quad (1)$

This is a simple form of hysteresis control to ensure that hormone therapy will be suspended whenever a patient's PSA level drops below a minimum threshold value, and that therapy will resume whenever a patient's PSA level reaches a maximum threshold value.5. System dynamics. The SHA system dynamics describe the evolution of continuous state variables over time, as well as the rules for discrete state transitions. First, the continuous (time-driven) dynamics capture the prostate cancer cell population dynamics, which are defined in terms of their proliferation, apoptosis, and conversion rates. Existing studies commonly use Michaelis-Menten-like functions to model the rates of proliferation and apoptosis [Ideta et al. (2008), Jackson (2004a), Jackson (2004b)]. Recently Liu et al. (2015) obtained greater consistency between clinical data and simulated population dynamics by modeling these rates using sigmoid functions. In what follows, we incorporate stochastic effects into the deterministic model from Liu et al. (2015) and obtain:

$\begin{aligned} \dot{x}_1(t) = & \alpha_1 \left[ 1 + e^{-(x_3(t)-k_1)k_2} \right]^{-1} \cdot x_1(t) \\ & - \beta_1 \left[ 1 + e^{-(x_3(t)-k_3)k_4} \right]^{-1} \cdot x_1(t) \\ & - \left[ m_1 \left( 1 - \frac{x_3(t)}{x_{3,0}} \right) + \lambda_1 \right] \cdot x_1(t) \\ & + \mu_1 + \zeta_1(t) \end{aligned} \quad (2)$

$\begin{aligned} \dot{x}_2(t) = & \left[ \alpha_2 \left( 1 - d \frac{x_3(t)}{x_{3,0}} \right) - \beta_2 \right] x_2(t) \\ & + m_1 \left( 1 - \frac{x_3(t)}{x_{3,0}} \right) x_1(t) + \zeta_2(t) \end{aligned} \quad (3)$

$\dot{x}_3(t) = \begin{cases} -\frac{x_3(t)}{\sigma} + \mu_3 + \zeta_3(t) & \text{if } x_1(t) + x_2(t) > \theta_1 \\ \frac{x_{3,0} - x_3(t)}{\sigma} + \mu_3 + \zeta_3(t) & \text{if } x_1(t) + x_2(t) < \theta_2 \end{cases} \quad \text{and } q(t) = q^{OFF} \quad (4)$

$\dot{z}_1(t) = \begin{cases} 1 & \text{if } q(t) = q^{ON} \\ 0 & \text{otherwise} \end{cases} \quad (5)$

$\begin{aligned} z_1(t^+) = & 0 \quad \text{if } x_1(t) + x_2(t) = \theta_1 \\ & \text{and } q(t) = q^{ON} \\ \dot{z}_2(t) = & \begin{cases} 1 & \text{if } q(t) = q^{OFF} \\ 0 & \text{otherwise} \end{cases} \end{aligned} \quad (6)$

$z_2(t^+) = 0 \quad \text{if } x_1(t) + x_2(t) = \theta_2 \quad \text{and } q(t) = q^{OFF}$

where $\alpha_1$ and $\alpha_2$ are the HSC proliferation constant and CRC proliferation constant, respectively; $\beta_1$ and $\beta_2$ are the HSC apoptosis constant and CRC apoptosis constant, respectively; $k_1$ through $k_4$ are HSC proliferation and apoptosis exponential constants; $m_1$ is the HSC to CRC conversion constant; $x_{3,0}$ corresponds to the patient-specific androgen constant; $\sigma$ is the androgen degradation constant; $\lambda_1$ is the HSC basal degradation rate; $\mu_1$ and $\mu_3$ are the HSC basal production rate and androgen basal production rate, respectively. Finally, ${\zeta_i(t)}$ , $i = 1, 2, 3$ , are stochastic processes which we allow to have arbitrary characteristics and only assume them to be piecewise continuous w.p. 1.

Observe that (2) and (3) seem to be independent of the discrete state (mode) $q(t)$ ; however, their dependence on $x_3(t)$ , whose dynamics are affected by mode transitions, implies that $x_1(t)$ , $x_2(t)$ also change due to such transitions. To make this behavior explicit, we can solve (4) for $x_3(t)$ and substitute this solution into (2) and (3), as detailed next.

Consider a sample path of the system over $[0, T]$ and denote the time of occurrence of the $k$ th event (of any type) by $\tau_k(\theta)$ . Since our complete system state vector is

$[x(t), z(t)]$ , we shall denote the state dynamics over any interevent interval $[\tau_k(\theta), \tau_{k+1}(\theta)]$ as follows:

$\dot{x}_n(t) = f_{n,k}^x(t), \quad \dot{z}_i(t) = f_{i,k}^z(t), \quad n = 1, \dots, 3, \quad i = 1, 2$

Although we include $\theta$ as an argument in the expressions above to stress the dependence on the controllable parameter, we will subsequently drop this for ease of notation as long as no confusion arises.

We start our analysis by assuming $q(t) = q^{ON}$ for $t \in [\tau_k, \tau_{k+1})$ . It is clear from (4) that $\dot{x}3(t) = -\frac{x_3(t)}{\sigma} + \mu_3 + \zeta_3(t)$ , which implies that, for $t \in [\tau_k, \tau{k+1})$ ,

$\begin{aligned} x_3(t) = & x_3(\tau_k^+) e^{-(t-\tau_k)/\sigma} \\ & + e^{-t/\sigma} \cdot \int_{\tau_k}^t e^{\varepsilon/\sigma} [\mu_3 + \zeta_3(\varepsilon)] d\varepsilon \end{aligned}$

For notational simplicity, let

$\tilde{\zeta}_3(t) = \int_{\tau_k}^t e^{-(t-\varepsilon)/\sigma} \zeta_3(\varepsilon) d\varepsilon \quad (7)$

and define, for $t \in [\tau_k, \tau_{k+1})$ ,

$h^{ON}(t, \tilde{\zeta}_3(t)) \equiv x_3(\tau_k^+) e^{-(t-\tau_k)/\sigma} + \mu_3 \sigma [1 - e^{-(t-\tau_k)/\sigma}] + \tilde{\zeta}_3(t) \quad (8)$

Now let $q(t) = q^{OFF}$ for $t \in [\tau_k, \tau_{k+1})$ , so that (4) implies that, for $t \in [\tau_k, \tau_{k+1})$ ,

$\begin{aligned} x_3(t) = & x_3(\tau_k^+) e^{-(t-\tau_k)/\sigma} \\ & + (\mu_3 \sigma + x_{3,0}) [1 - e^{-(t-\tau_k)/\sigma}] + \tilde{\zeta}_3(t) \end{aligned}$

Similarly as above, we define, for $t \in [\tau_k, \tau_{k+1})$ ,

$\begin{aligned} h^{OFF}(t, \tilde{\zeta}_3(t)) \equiv & x_3(\tau_k^+) e^{-(t-\tau_k)/\sigma} \\ & + (\mu_3 \sigma + x_{3,0}) [1 - e^{-(t-\tau_k)/\sigma}] + \tilde{\zeta}_3(t) \end{aligned} \quad (9)$

It is thus clear that

$x_3(t) = \begin{cases} h^{ON}(t, \tilde{\zeta}_3(t)) & \text{if } q(t) = q^{ON} \\ h^{OFF}(t, \tilde{\zeta}_3(t)) & \text{if } q(t) = q^{OFF} \end{cases}$

Although we include $\tilde{\zeta}_3(t)$ as an argument in (8)-(9) to stress the dependence on the stochastic process, we will subsequently drop this for ease of notation as long as no confusion arises. It is now clear that, by using (8)-(9) in (2)-(3), we may rewrite the state variable dynamics as

$\dot{x}_1(t) = \begin{cases} \left\{ \alpha_1 [1 + \phi_\alpha^{ON}(t)]^{-1} - \beta_1 [1 + \phi_\beta^{ON}(t)]^{-1} \right. \\ \left. + m_1 \left( \frac{h^{ON}(t)}{x_{3,0}} \right) - (m_1 + \lambda_1) \right\} \cdot x_1(t) \\ + \mu_1 + \zeta_1(t) & \text{if } q(t) = q^{ON} \\ \left\{ \alpha_1 [1 + \phi_\alpha^{OFF}(t)]^{-1} - \beta_1 [1 + \phi_\beta^{OFF}(t)]^{-1} \right. \\ \left. + m_1 \left( \frac{h^{OFF}(t)}{x_{3,0}} \right) - (m_1 + \lambda_1) \right\} \cdot x_1(t) \\ + \mu_1 + \zeta_1(t) & \text{if } q(t) = q^{OFF} \end{cases} \quad (10)$

and$$\dot{x}2(t) = \begin{cases} \left[ \alpha_2 \left( 1 - d \frac{h^{ON}(t)}{x{3,0}} \right) - \beta_2 \right] x_2(t) \ + m_1 \left( 1 - \frac{h^{ON}(t)}{x_{3,0}} \right) x_1(t) + \zeta_2(t) \ \text{if } q(t) = q^{ON} \ \left[ \alpha_2 \left( 1 - d \frac{h^{OFF}(t)}{x_{3,0}} \right) - \beta_2 \right] x_2(t) \ + m_1 \left( 1 - \frac{h^{OFF}(t)}{x_{3,0}} \right) x_1(t) + \zeta_2(t) \ \text{if } q(t) = q^{OFF} \end{cases} \quad (11)$$

with

$\begin{aligned} \phi_\alpha^{ON}(t) &= e^{-(h^{ON}(t)-k_1)k_2} \\ \phi_\beta^{ON}(t) &= e^{-(h^{ON}(t)-k_3)k_4} \\ \phi_\alpha^{OFF}(t) &= e^{-(h^{OFF}(t)-k_1)k_2} \\ \phi_\beta^{OFF}(t) &= e^{-(h^{OFF}(t)-k_3)k_4} \end{aligned}$

The discrete (event-driven) dynamics are dictated by the occurrence of events that cause state transitions. Based on the event set $E = {e_1, e_2}$ we have defined, the occurrence of $e_1$ results in a transition from $q^{ON}$ to $q^{OFF}$ and the occurrence of $e_2$ results in a transition from $q^{OFF}$ to $q^{ON}$ .

2.2 IAS Therapy Evaluation and Optimization

The effect of an IAS therapy $u(\theta, t)$ depends on the controllable parameter vector $\theta$ , as in (1), and can be quantified in terms of performance metrics of the form $J[\mathbf{u}(\theta, t)]$ . Evaluating $J[\mathbf{u}(\theta, t)]$ over all possible values of $\theta$ is clearly infeasible. However, there exist very efficient ways to accomplish this goal for stochastic hybrid systems. In particular, Perturbation Analysis (PA) is a methodology to efficiently estimate the sensitivity of the performance with respect to $\theta$ , i.e., when $\theta$ is a real-valued scalar, the derivative $dJ/d\theta$ . This is accomplished by extracting data from a sample path (simulated or actual) of the observed system based on which an unbiased estimate of $dJ/d\theta$ can indeed be obtained. The attractive feature of PA is that the resulting estimates are extracted from a single sample path in a non-intrusive manner and the computational cost of doing so is, in most cases of interest, minimal [Cassandras and Lafortune (2008)]. This is in contrast to the conventional finite difference estimate of $dJ/d\theta$ obtained through $[J(\theta + \Delta) - J(\theta)]/\Delta$ . Thus, for a vector $\theta$ of dimension $N$ , estimating the gradient $\nabla J(\theta)$ requires a single sample path (with some overhead) instead of $N + 1$ sample paths. The simplest family of PA estimators is Infinitesimal Perturbation Analysis (IPA). For the SHA model we consider here, IPA has been recently shown to provide unbiased gradient estimates [Cassandras et al. (2010)] for virtually arbitrary stochastic hybrid systems. Our goal, therefore, is to adapt IPA estimators of the form $dJ[\mathbf{u}(\theta, t)]/d\theta$ for different therapies $\mathbf{u}(\theta, t)$ , thus estimating their effects, and to ultimately show that optimal therapy schemes can be designed by solving problems of the form $\min_{\theta \in \Theta} J[\mathbf{u}(\theta, t)]$ .

We define a sample function in terms of complementary measures of therapy success. In particular, we take the most adequate treatment schemes to be those that (i) ensure PSA levels are kept as low as possible; (ii) reduce the frequency of on and off-treatment cycles. There is an obvious trade-off between (i) and the cost associated with (ii), which is related to the duration of the therapy and could potentially include fixed set up costs incurred when therapy is reinstated. For simplicity, we disconsider fixed

set up costs and take (ii) to be linearly proportional to the length of the on-treatment cycles. We thus associate weights $W_i$ , $i = 1, 2$ , with (i) and (ii), respectively, and define our sample function as the sum of the average PSA level and the average duration of an on-treatment cycle over a fixed time interval $[0, T]$ . We also normalize our sample function to ensure that the trade-off between (i) and (ii) is captured appropriately: we divide (i) by the value of the patient's PSA level at the start of the first on-treatment cycle ( $PSA_{init}$ ), and note that (ii) is naturally normalized by $T$ . Recall that the total population size of prostate cancer cells is assumed to reflect the serum PSA concentration, and that we have defined clock variables which measure the time elapsed in each of the treatment modes, so that our sample function can be written as

$L(\theta, x(0), z(0), T) = \frac{1}{T} \left[ W_1 \int_0^T \left[ \frac{x_1(\theta, t) + x_2(\theta, t)}{PSA_{init}} \right] dt + W_2 \int_0^T z_1(t) dt \right] \quad (12)$

where $x(0)$ and $z(0)$ are given initial conditions. We can then define the overall performance metric as

$J(\theta, x(0), z(0), T) = E[L(\theta, x(0), z(0), T)] \quad (13)$

Hence, the problem of determining the optimal IAS therapy can be formulated as

$\min_{\theta \in \Theta} E[L(\theta, x(0), z(0), T)] \quad (14)$

3. INFINITESIMAL PERTURBATION ANALYSIS

Consider a sample path generated by the SHA over $[0, T]$ and recall that we have defined $\tau_k(\theta)$ to be the time of occurrence of the $k$ th event (of any type), where $\theta$ is a controllable parameter vector of interest. We denote the state and event time derivatives with respect to parameter $\theta$ as $x'(\theta, t) \equiv \frac{dx(\theta, t)}{d\theta}$ and $\tau'k(\theta) \equiv \frac{d\tau_k(\theta)}{d\theta}$ , respectively, for $k = 1, \dots, N$ . As mentioned earlier, the dynamics of $x(\theta, t)$ are fixed over any interevent interval $[\tau_k(\theta), \tau{k+1}(\theta)]$ and we write $\dot{x}(t) = f_k(\theta, x, t)$ to represent the state dynamics over this interval. Although we include $\theta$ as an argument in the expressions above to stress the dependence on the controllable parameter, we will subsequently drop this for ease of notation as long as no confusion arises. It is shown in Cassandras et al. (2010) that the state derivative satisfies

$\frac{d}{dt} x'(t) = \frac{df_k(t)}{dx} x'(t) + \frac{df_k(t)}{d\theta} \quad (15)$

with the following boundary condition:

$x'(\tau_k^+) = x'(\tau_k^-) + [f_{k-1}(\tau_k^-) - f_k(\tau_k^+)] \cdot \tau'_k \quad (16)$

We note that (16) is valid when $x(\theta, t)$ is continuous in $t$ at $t = \tau_k$ . If this is not the case and the value of $x(\tau_k^+)$ is determined by the reset function $\rho(q, q', x, e)$ , then

$x'(\tau_k^+) = \frac{d\rho(q, q', x, e)}{d\theta} \quad (17)$

Knowledge of $\tau'_k$ is, therefore, needed in order to evaluate (16). Following the framework in Cassandras et al. (2010), there are three types of events for a general stochastic hybrid system: (i) Exogenous Events. These events cause a discrete state transition independent of $\theta$ and satisfy $\tau'k = 0$ . (ii) Endogenous Events. Such an event occurs at time $\tau_k$ if there exists a continuously differentiable function $g_k : \mathbb{R}^n \times \Theta \rightarrow \mathbb{R}$ such that $\tau_k = \min{t > \tau{k-1} : g_k(x(\theta, t), \theta) = 0}$ , where the function $g_k$ usually corresponds to a guard condition in a hybridautomaton. Taking derivatives with respect to $\theta$ , it is straightforward to obtain

$\tau'_k = - \left[ \frac{dg_k}{dx} \cdot f_{k-1}(\tau_k^-) \right]^{-1} \cdot \left( \frac{dg_k}{d\theta} + \frac{dg_k}{dx} \cdot x'(\tau_k^-) \right) \quad (18)$

as long as $\frac{dg_k}{dx} \cdot f_{k-1}(\tau_k^-) \neq 0$ . (iii) Induced Events. Such an event occurs at time $\tau_k$ if it is triggered by the occurrence of another event at time $\tau_m \leq \tau_k$ (details can be found in Cassandras et al. (2010)).

Returning to our problem of personalized prostate cancer therapy design, we define the derivatives of the states $x_n(\theta, t)$ and $z_j(\theta, t)$ and event times $\tau_k(\theta)$ with respect to $\theta_i$ , $i, j = 1, 2, n = 1, \dots, 3$ , as follows:

$x'_{n,i}(t) \equiv \frac{\partial x_n(\theta, t)}{\partial \theta_i}, \quad z'_{j,i}(t) \equiv \frac{\partial z_j(\theta, t)}{\partial \theta_i}, \quad \tau'_{k,i} \equiv \frac{\partial \tau_k(\theta)}{\partial \theta_i} \quad (19)$

Our goal is to obtain an estimate of $\nabla J(\theta)$ by evaluating the sample gradient $\nabla L(\theta)$ , which is computed using data extracted from a sample path of the system (e.g., by simulating a sample path of our SHA model using clinical data). We will assume that the derivatives $dL/d\theta_i$ exist w.p. 1 for all $\theta_i \in \mathbb{R}^+$ . It is also easy to verify that $L(\theta)$ is Lipschitz continuous for $\theta_i \in \mathbb{R}^+$ . We will further assume that no two events can occur at the same time w.p.1. Under these conditions, it has been shown in Cassandras et al. (2010) that $dL/d\theta_i$ is an unbiased estimator of $dJ/d\theta_i$ , $i = 1, 2$ .

In what follows, we derive the IPA state and event time derivatives for the events identified in our SHA model of prostate cancer progression.

3.1 State and Event Time Derivatives

We begin by analyzing the state evolution considering each of the states ( $q^{ON}$ and $q^{OFF}$ ) and events ( $e_1$ and $e_2$ ) defined for our SHA model.

$q(t) = q^{ON}$ for $t \in [\tau_k, \tau_{k+1})$ . Using (15) for $x_1(t)$ , we have, for $i = 1, 2$ ,

$\frac{d}{dt} x'_{1,i}(t) = \frac{\partial f_k^{x_1}(t)}{\partial x_1} x'_1(t) + \frac{\partial f_k^{x_1}(t)}{\partial x_2} x'_2(t) + \frac{\partial f_k^{x_1}(t)}{\partial x_3} x'_3(t) + \frac{\partial f_k^{x_1}(t)}{\partial z_1} z'_1(t) + \frac{\partial f_k^{x_1}(t)}{\partial z_2} z'_2(t) + \frac{\partial f_k^{x_1}(t)}{\partial \theta_i} \tau'_i$

From (10), we have $\frac{\partial f_k^{x_1}(t)}{\partial x_n} = \frac{\partial f_k^{x_1}(t)}{\partial z_i} = \frac{\partial f_k^{x_1}(t)}{\partial \theta_i} = 0$ , $n = 2, 3, i = 1, 2$ , and

$\frac{\partial f_k^{x_1}(t)}{\partial x_1} = \alpha_1 [1 + \phi_\alpha^{ON}(t)]^{-1} - \beta_1 [1 + \phi_\beta^{ON}(t)]^{-1} - m_1 \left( 1 - \frac{h^{ON}(t)}{x_{3,0}} \right) - \lambda_1 \quad (20)$

Combining the last two equations and solving for $x'_{1,i}(t)$ , we obtain

$x'_{1,i}(t) = x'_{1,i}(\tau_k^+) e^{A(t)}, \quad t \in [\tau_k, \tau_{k+1}) \quad (21)$

and, in particular,

$x'_{1,i}(\tau_{k+1}^-) = x'_{1,i}(\tau_k^+) e^{A(\tau_k)} \quad (22)$

with

$A(\tau_k) \equiv \int_{\tau_k}^{\tau_{k+1}} \left[ \frac{\alpha_1}{1 + \phi_\alpha^{ON}(t)} - \frac{\beta_1}{1 + \phi_\beta^{ON}(t)} \right] dt - \int_{\tau_k}^{\tau_{k+1}} \frac{m_1}{x_{3,0}} h^{ON}(t) dt - (m_1 + \lambda_1) (\tau_{k+1} - \tau_k)$

Similarly for $x_2(t)$ , we have from (11) that $\frac{\partial f_k^{x_2}(t)}{\partial x_3} = \frac{\partial f_k^{x_2}(t)}{\partial z_i} = \frac{\partial f_k^{x_2}(t)}{\partial \theta_i} = 0$ , $i = 1, 2$ , and

$\frac{\partial f_k^{x_2}(t)}{\partial x_1} = m_1 \left( 1 - \frac{h^{ON}(t)}{x_{3,0}} \right) \quad (23)$

$\frac{\partial f_k^{x_2}(t)}{\partial x_2} = \alpha_2 \left( 1 - d \frac{h^{ON}(t)}{x_{3,0}} \right) - \beta_2$

Combining the last two equations and solving for $x'{2,i}(t)$ yields, for $t \in [\tau_k, \tau{k+1})$ ,

$x'_{2,i}(t) = x'_{2,i}(\tau_k^+) e^{B_1(t)} + B_2(t, x'_{1,i}(\tau_k^+), A(t)) \quad (24)$

and, in particular,

$x'_{2,i}(\tau_{k+1}^-) = x'_{2,i}(\tau_k^+) e^{B_1(\tau_k)} + B_2(\tau_k, x'_{1,i}(\tau_k^+), A(\tau_k)) \quad (25)$

with

$B_1(\tau_k) \equiv \int_{\tau_k}^{\tau_{k+1}} \left[ \alpha_2 \left( 1 - d \frac{h^{ON}(t)}{x_{3,0}} \right) - \beta_2 \right] dt$

$B_2(\cdot) \equiv e^{B_1(\tau_k)} \int_{\tau_k}^{\cdot} G_1(t, \tau_k) e^{-B_1(\tau_k)} dt$

where $G_1(t, \tau_k) = m_1 \left( 1 - \frac{h^{ON}(t)}{x_{3,0}} \right) x'{1,i}(\tau_k^+) e^{A(t)}$ , $t \in [\tau_k, \tau{k+1})$ .

In the case of $x_3(t)$ , it is clear from (4) that $\frac{\partial f_k^{x_3}(t)}{\partial x_i} = \frac{\partial f_k^{x_3}(t)}{\partial z_i} = \frac{\partial f_k^{x_3}(t)}{\partial \theta_i} = 0$ , $i = 1, 2$ , and $\frac{\partial f_k^{x_3}(t)}{\partial x_3} = -\frac{1}{\sigma}$ . Substituting in (15) and solving for $x'{3,i}(t)$ , for $i = 1, 2$ , yields $x'{3,i}(t) = x'{3,i}(\tau_k^+) e^{-(t-\tau_k)/\sigma}$ , $t \in [\tau_k, \tau{k+1})$ , and, in particular,

$x'_{3,i}(\tau_{k+1}^-) = x'_{3,i}(\tau_k^+) e^{-(t-\tau_k)/\sigma} \quad (26)$

Finally, in the case of the "clock" state variable $z_i(\theta, t)$ , $i = 1, 2$ , based on (5) and (6), we have $\frac{\partial f_k^{z_i}(t)}{\partial x_n} = \frac{\partial f_k^{z_i}(t)}{\partial z_i} = \frac{\partial f_k^{z_i}(t)}{\partial \theta_i} = 0$ , $n = 1, \dots, 3$ , $i = 1, 2$ , so that $\frac{d}{dt} z'i(t) = 0$ for $t \in [\tau_k, \tau{k+1})$ . This means that the value of the state derivative of the "clock" variable remains unaltered while $q(t) = q^{ON}$ , i.e., $z'_i(t) = z'i(\tau_k^+)$ , $t \in [\tau_k, \tau{k+1})$ .

$q(t) = q^{OFF}$ for $t \in [\tau_k, \tau_{k+1})$ . Starting with $x_1(t)$ , a similar reasoning as above applies, but now we have

$\frac{\partial f_k^{x_1}(t)}{\partial x_1} = \alpha_1 [1 + \phi_\alpha^{OFF}(t)]^{-1} - \beta_1 [1 + \phi_\beta^{OFF}(t)]^{-1} - m_1 \left( 1 - \frac{h^{OFF}(t)}{x_{3,0}} \right) - \lambda_1$

As a result, (15) implies that, for $i = 1, 2$ ,

$x'_{1,i}(t) = x'_{1,i}(\tau_k^+) e^{C(t)}, \quad t \in [\tau_k, \tau_{k+1}) \quad (27)$

and, in particular,

$x'_{1,i}(\tau_{k+1}^-) = x'_{1,i}(\tau_k^+) e^{C(\tau_k)} \quad (28)$

with

$C(\tau_k) \equiv \int_{\tau_k}^{\tau_{k+1}} \left[ \frac{\alpha_1}{1 + \phi_\alpha^{OFF}(t)} - \frac{\beta_1}{1 + \phi_\beta^{OFF}(t)} \right] dt - \int_{\tau_k}^{\tau_{k+1}} \frac{m_1}{x_{3,0}} h^{OFF}(t) dt - (m_1 + \lambda_1) (\tau_{k+1} - \tau_k)$

Similarly for $x_2(t)$ , we have

$\frac{\partial f_k^{x_2}(t)}{\partial x_1} = m_1 \left( 1 - \frac{h^{OFF}(t)}{x_{3,0}} \right) \quad (29)$

$\frac{\partial f_k^{x_2}(t)}{\partial x_2} = \alpha_2 \left( 1 - d \frac{h^{OFF}(t)}{x_{3,0}} \right) - \beta_2$

It is thus straightforward to verify that (15) yields, for $t \in [\tau_k, \tau_{k+1})$ ,$$x'{2,i}(t) = x'{2,i}(\tau_k^+) e^{D_1(t)} + D_2(t, x'_{1,i}(\tau_k^+), C(t)) \quad (29)$$

and, in particular,

$x'_{2,i}(\tau_{k+1}^-) = x'_{2,i}(\tau_k^+) e^{D_1(\tau_k)} + D_2(\tau_k, x'_{1,i}(\tau_k^+), C(\tau_k)) \quad (30)$

with

$\begin{aligned} D_1(\tau_k) &\equiv \int_{\tau_k}^{\tau_{k+1}} \left[ \alpha_2 \left( 1 - d \frac{h^{OFF}(t)}{x_{3,0}} \right) - \beta_2 \right] dt \\ D_2(\cdot) &\equiv e^{D_1(\tau_k)} \int_{\tau_k}^{\tau_{k+1}} G_2(t, \tau_k) e^{-D_1(\tau_k)} dt \end{aligned}$

where $G_2(t, \tau_k) = m_1 \left( 1 - \frac{h^{OFF}(t)}{x_{3,0}} \right) x'{1,i}(\tau_k^+) e^{C(t)}$ , $t \in [\tau_k, \tau{k+1}]$ .

In the case of $x_3(t)$ , we will have $\frac{\partial f_k^{x_3}(t)}{\partial x_i} = \frac{\partial f_k^{x_3}(t)}{\partial z_i} = \frac{\partial f_k^{x_3}(t)}{\partial \theta_i} = 0$ , $i = 1, 2$ , and $\frac{\partial f_k^{x_3}(t)}{\partial x_3} = -\frac{1}{\sigma}$ , so that (15) implies $x'{3,i}(t) = x'{3,i}(\tau_k^+) e^{-(t-\tau_k)/\sigma}$ , $t \in [\tau_k, \tau_{k+1}]$ , and, in particular,

$x'_{3,i}(\tau_{k+1}^-) = x'_{3,i}(\tau_k^+) e^{-(t-\tau_k)/\sigma} \quad (31)$

Finally, in the case of the "clock" state variable $z_i(\theta, t)$ , $i = 1, 2$ , based on (5) and (6), we have $\frac{\partial f_k^{z_i}(t)}{\partial x_n} = \frac{\partial f_k^{z_i}(t)}{\partial z_i} = \frac{\partial f_k^{z_i}(t)}{\partial \theta_i} = 0$ , $n = 1, \dots, 3$ , $i = 1, 2$ , so that $\frac{d}{dt} z'(t) = 0$ for $t \in [\tau_k, \tau_{k+1}]$ . This means that the value of the state derivative of the "clock" variable remains unaltered while $q(t) = q^{OFF}$ , i.e., $z'_i(t) = z'i(\tau_k^+)$ , $t \in [\tau_k, \tau{k+1}]$ .

A state transition from $q^{ON}$ to $q^{OFF}$ takes place at time $\tau_k$ . This necessarily implies that event $e_1$ occurred at time $\tau_k$ . From (16) we have, for $i = 1, 2$ ,

$x'_{1,i}(\tau_k^+) = x'_{1,i}(\tau_k^-) + [f_k^{x_1}(\tau_k^-) - f_{k+1}^{x_1}(\tau_k^+)] \cdot \tau'_{k,i} \quad (32)$

Observe that, from (10), $f_k^{x_1}(\tau_k^-)$ and $f_{k+1}^{x_1}(\tau_k^+)$ ultimately depend on $h^{ON}(\tau_k^-)$ and $h^{OFF}(\tau_k^+)$ , respectively. Also, a transition from $q^{ON}$ to $q^{OFF}$ at time $\tau_k$ implies that $q(t) = q^{ON}$ , $t \in [\tau_{k-1}, \tau_k)$ and $q(t) = q^{OFF}$ , $t \in [\tau_k, \tau_{k+1}]$ . Hence, evaluating $h^{ON}(\tau_k^-)$ from (8) over the appropriate time interval results in

$\begin{aligned} h^{ON}(\tau_k^-) &= x_3(\tau_{k-1}^+) e^{-(\tau_k - \tau_{k-1})/\sigma} \\ &\quad + \mu_3 \sigma [1 - e^{-(\tau_k - \tau_{k-1})/\sigma}] + \tilde{\zeta}_3(\tau_k) \end{aligned}$

and it follows directly from (9) that $h^{OFF}(\tau_k^+) = x_3(\tau_k^+)$ .

Furthermore, by continuity of $x_n(t)$ (due to conservation of mass), $x_n(\tau_k^+) = x_n(\tau_k^-)$ , $n = 1, 2, 3$ . Also, since we have assumed that ${\zeta_i(t)}$ , $i = 1, 2, 3$ , is piecewise continuous w.p.1 and that no two events can occur at the same time w.p.1, $\zeta_i(\tau_k^-) = \zeta_i(\tau_k^+)$ , $i = 1, 2, 3$ . Hence, by evaluating $\Delta_f^1(\tau_k) \equiv f_k^{x_1}(\tau_k^-) - f_{k+1}^{x_1}(\tau_k^+)$ we obtain

$\begin{aligned} \Delta_f^1(\tau_k, \zeta_3(\tau_k)) &= \left\{ \alpha_1 [1 + \phi_\alpha^{ON}(\tau_k^-)]^{-1} \right. \\ &\quad - \alpha_1 [1 + \phi_\alpha^{OFF}(\tau_k^+)]^{-1} - \beta_1 [1 + \phi_\beta^{ON}(\tau_k^-)]^{-1} \\ &\quad + \beta_1 [1 + \phi_\beta^{OFF}(\tau_k^+)]^{-1} \\ &\quad \left. + \frac{m_1}{x_{3,0}} [h^{ON}(\tau_k^-) - x_3(\tau_k)] \right\} \cdot x_1(\tau_k) \end{aligned} \quad (33)$

Finally, the term $\tau'_{k,i}$ , which corresponds to the event time derivative with respect to $\theta_i$ at event time $\tau_k$ , is determined using (18), as will be detailed in Lemma 1 later.

A similar analysis applies to $x_2(t)$ , so that, for $i = 1, 2$ ,

$x'_{2,i}(\tau_k^+) = x'_{2,i}(\tau_k^-) + [f_k^{x_2}(\tau_k^-) - f_{k+1}^{x_2}(\tau_k^+)] \cdot \tau'_{k,i} \quad (34)$

where $\tau'{k,i}$ will be derived in Lemma 1, and $f_k^{x_2}(\tau_k^-)$ and $f{k+1}^{x_2}(\tau_k^+)$ ultimately depend on $h^{ON}(\tau_k^-)$ and $h^{OFF}(\tau_k^+)$ , respectively. Hence, evaluating $\Delta_f^2(\tau_k) \equiv f_k^{x_2}(\tau_k^-) - f_{k+1}^{x_2}(\tau_k^+)$ from (11) yields

$\begin{aligned} \Delta_f^2(\tau_k, \zeta_3(\tau_k)) &= \frac{\alpha_2 d}{x_{3,0}} [x_3(\tau_k) - h^{ON}(\tau_k^-)] \cdot x_2(\tau_k) \\ &\quad - \frac{m_1}{x_{3,0}} [h^{ON}(\tau_k^-) - x_3(\tau_k)] \cdot x_1(\tau_k) \end{aligned} \quad (35)$

Finally, for $x_3(t)$ , (16) can be easily seen to yield, for $i = 1, 2$ ,

$x'_{3,i}(\tau_k^+) = x'_{3,i}(\tau_k^-) - \frac{x_{3,0}}{\sigma} \cdot \tau'_{k,i}$

In the case of the "clock" state variable, $z_1(t)$ is discontinuous in $t$ at $t = \tau_k$ , while $z_2(t)$ is continuous. Applying (17) to the reset function defined in (5) yields $z'{1,i}(\tau_k^+) = 0$ , $i = 1, 2$ , i.e., the value of $z'{1,i}(t)$ is reset to zero whenever event $e_1$ takes place. Based on (16) and (6), it is simple to verify that, for $i = 1, 2$ ,

$z'_{2,i}(\tau_k^+) = z'_{2,i}(\tau_k^-) - \tau'_{k,i}$

A state transition from $q^{OFF}$ to $q^{ON}$ takes place at time $\tau_k$ . This necessarily implies that event $e_2$ occurred at time $\tau_k$ . The same reasoning as above applies and from (16) we have, for $i = 1, 2$ ,

$x'_{1,i}(\tau_k^+) = x'_{1,i}(\tau_k^-) + [f_k^1(\tau_k^-) - f_{k+1}^1(\tau_k^+)] \cdot \tau'_{k,i} \quad (36)$

where $f_k^1(\tau_k^-) - f_{k+1}^1(\tau_k^+)$ can be evaluated from (10) and ultimately depends on $h^{OFF}(\tau_k^-)$ and $h^{ON}(\tau_k^+)$ . Recall that a transition from $q^{OFF}$ to $q^{ON}$ at time $\tau_k$ implies that $q(t) = q^{OFF}$ , $t \in [\tau_{k-1}, \tau_k)$ and $q(t) = q^{ON}$ , $t \in [\tau_k, \tau_{k+1}]$ . Hence, evaluating $h^{OFF}(\tau_k^-)$ from (9) over the appropriate time interval results in

$\begin{aligned} h^{OFF}(\tau_k^-) &= x_3(\tau_{k-1}^+) e^{-(\tau_k - \tau_{k-1})/\sigma} \\ &\quad + (\mu_3 \sigma + x_{3,0}) [1 - e^{-(\tau_k - \tau_{k-1})/\sigma}] + \tilde{\zeta}_3(\tau_k) \end{aligned}$

and it follows directly from (8) that $h^{ON}(\tau_k^+) = x_3(\tau_k^+)$ .

As in the previous case, continuity due to conservation of mass applies, so that evaluating $\Delta_f^1(\tau_k) \equiv f_k^1(\tau_k^-) - f_{k+1}^1(\tau_k^+)$ yields

$\begin{aligned} \Delta_f^1(\tau_k, \zeta_3(\tau_k)) &= \left\{ \alpha_1 [1 + \phi_\alpha^{OFF}(\tau_k^-)]^{-1} \right. \\ &\quad - \alpha_1 [1 + \phi_\alpha^{ON}(\tau_k^+)]^{-1} - \beta_1 [1 + \phi_\beta^{OFF}(\tau_k^-)]^{-1} \\ &\quad + \beta_1 [1 + \phi_\beta^{ON}(\tau_k^+)]^{-1} \\ &\quad \left. + \frac{m_1}{x_{3,0}} [h^{OFF}(\tau_k^-) - x_3(\tau_k)] \right\} \cdot x_1(\tau_k) \end{aligned} \quad (37)$

The term $\tau'_{k,i}$ corresponds to the event time derivative with respect to $\theta_i$ at event time $\tau_k$ and its derivation will be detailed in Lemma 1.

Similarly for $x_2(t)$ , we have, for $i = 1, 2$ ,

$x'_{2,i}(\tau_k^+) = x'_{2,i}(\tau_k^-) + [f_k^2(\tau_k^-) - f_{k+1}^2(\tau_k^+)] \cdot \tau'_{k,i} \quad (38)$

where $\tau'{k,i}$ will be derived in Lemma 1. Evaluating $\Delta_f^2(\tau_k) \equiv f_k^2(\tau_k^-) - f{k+1}^2(\tau_k^+)$ from (11), and making the appropriate simplifications due to continuity, we obtain

$\begin{aligned} \Delta_f^2(\tau_k, \zeta_3(\tau_k)) &= \frac{\alpha_2 d}{x_{3,0}} [x_3(\tau_k) - h^{OFF}(\tau_k^-)] \cdot x_2(\tau_k) \\ &\quad - \frac{m_1}{x_{3,0}} [h^{OFF}(\tau_k^-) - x_3(\tau_k)] \cdot x_1(\tau_k) \end{aligned} \quad (39)$ Finally, for $x_3(t)$ , (16) can be easily seen to yield $x'{3,i}(\tau_k^+) = x'{3,i}(\tau_k^-) + \frac{x_{3,0}}{\sigma} \cdot \tau'_{k,i}$ , $i = 1, 2$ .

In the case of the "clock" state variable, $z_1(t)$ is continuous in $t$ at $t = \tau_k$ , while $z_2(t)$ is discontinuous. Based on (16) and (5), it is simple to verify that, for $i = 1, 2$ ,

$z'_{1,i}(\tau_k^+) = z'_{1,i}(\tau_k^-) - \tau'_{k,i}$

Applying (17) to the reset function defined in (6) yields $z'{2,i}(\tau_k^+) = 0$ , $i = 1, 2$ , i.e., the value of $z'{2,i}(t)$ is reset to zero whenever event $e_2$ takes place.

Note that, since $z'{j,i}(t) = z'{j,i}(\tau_k^+)$ , $t \in [\tau_k, \tau_{k+1})$ , we will have that $z'{j,i}(\tau_k^-) = z'{j,i}(\tau_{k-1}^+)$ , $j, i = 1, 2$ . Moreover, the sample path of our SHA consists of a sequence of alternating $e_1$ and $e_2$ events, which implies that $z'{1,i}(\tau_k^-) = 0$ if event $e_1$ occurred at $\tau{k-1}$ , while $z'{2,i}(\tau_k^-) = 0$ if event $e_2$ occurred at $\tau{k-1}$ . As a result,

$z'_{1,i}(\tau_k^+) = \begin{cases} -\tau'_{k,i} & \text{if event } e_2 \text{ occurs at } \tau_k \\ 0 & \text{otherwise} \end{cases} \quad (40)$

and

$z'_{2,i}(\tau_k^+) = \begin{cases} -\tau'_{k,i} & \text{if event } e_1 \text{ occurs at } \tau_k \\ 0 & \text{otherwise} \end{cases} \quad (41)$

We now proceed with a general result which applies to all events defined for our SHA model. Let us denote the time of occurrence of the $j$ th state transition by $\tau_j$ , and define its derivative with respect to the control parameters as $\tau'_{j,i} \equiv \frac{\partial \tau_j}{\partial \theta_i}$ , $i = 1, 2$ . We also define $f_j^{x_n}(\tau_j) \equiv \dot{x}_n(\tau_j)$ , $n = 1, \dots, 3$ and note that at each state transition at time $\tau_j$ an event $e_p$ , $p = 1, 2$ , will take place.

Lemma 1. When an event $e_p$ , $p = 1, 2$ , occurs, the derivative $\tau'_{j,i}$ , $i = 1, 2$ , of state transition times $\tau_j$ , $j = 1, 2, \dots$ with respect to the control parameters $\theta_i$ , $i = 1, 2$ , satisfies:

$\tau'_{j,i} = \frac{\mathbf{1}[p = i] - x'_{1}(\tau_j^-) - x'_{2}(\tau_j^-)}{f_{j-1}^{x_1}(\tau_j^-) + f_{j-1}^{x_2}(\tau_j^-)} \quad (42)$

where $\mathbf{1}[p = i]$ is the usual indicator function.

Proof. We begin with an occurrence of event $e_1$ which causes a transition from state $q^{ON}$ to state $q^{OFF}$ at time $\tau_j$ . This implies that $g_j(x, \theta) = x_1 + x_2 - \theta_1 = 0$ . As a result, $\frac{\partial g_k}{\partial x_1} = \frac{\partial g_k}{\partial x_2} = 1$ , $\frac{\partial g_k}{\partial x_3} = \frac{\partial g_k}{\partial z_i} = \frac{\partial g_k}{\partial \theta_2} = 0$ , $i = 1, 2$ , and $\frac{\partial g_k}{\partial \theta_1} = -1$ , and it is simple to verify that (42) follows from (18).

Next, consider event $e_2$ at time $\tau_j$ , leading to a transition from state $q^{OFF}$ to state $q^{ON}$ . In this case, $g_j(x, \theta) = x_1 + x_2 - \theta_2 = 0$ , so that $\frac{\partial g_k}{\partial x_1} = \frac{\partial g_k}{\partial x_2} = 1$ , $\frac{\partial g_k}{\partial x_3} = \frac{\partial g_k}{\partial z_i} = \frac{\partial g_k}{\partial \theta_1} = 0$ , $i = 1, 2$ , and $\frac{\partial g_k}{\partial \theta_2} = -1$ . Substituting into (18) once again yields (42). ■

We note that the numerator in (42) is determined using (22) and (25) if $q(\tau_j^-) = q^{ON}$ , or (28) and (30) if $q(\tau_j^-) = q^{OFF}$ . Moreover, the denominator in (42) is computed using (10)-(11) and it is simple to verify that, if event $e_1$ takes place at time $\tau_j$ ,

$\begin{aligned} f_{j-1}^{x_1}(\tau_j^-) + f_{j-1}^{x_2}(\tau_j^-) &= \alpha_1 [1 + \phi_\alpha^{ON}(\tau_j^-)]^{-1} \cdot x_1(\tau_j) \\ &\quad - \left\{ \beta_1 [1 + \phi_\beta^{ON}(\tau_j^-)]^{-1} + \lambda_1 \right\} \cdot x_1(\tau_j) + \mu_1 \\ &\quad + \left[ \alpha_2 \left( 1 - d \frac{h^{ON}(\tau_j^-)}{x_{3,0}} \right) - \beta_2 \right] \cdot x_2(\tau_j) \\ &\quad + \zeta_1(\tau_j) + \zeta_2(\tau_j) \end{aligned}$

and, if event $e_2$ takes place at time $\tau_j$ ,

$\begin{aligned} f_{j-1}^{x_1}(\tau_j^-) + f_{j-1}^{x_2}(\tau_j^-) &= \alpha_1 [1 + \phi_\alpha^{OFF}(\tau_j^-)]^{-1} \cdot x_1(\tau_j) \\ &\quad - \left\{ \beta_1 [1 + \phi_\beta^{OFF}(\tau_j^-)]^{-1} + \lambda_1 \right\} \cdot x_1(\tau_j) + \mu_1 \\ &\quad + \left[ \alpha_2 \left( 1 - d \frac{h^{OFF}(\tau_j^-)}{x_{3,0}} \right) - \beta_2 \right] \cdot x_2(\tau_j) \\ &\quad + \zeta_1(\tau_j) + \zeta_2(\tau_j) \end{aligned}$

This completes the derivation of all state and event time derivatives required to apply IPA to the hybrid automaton model of prostate cancer. In what follows, we shall derive the cost derivatives corresponding to the performance metric defined in (12).

3.2 Cost Derivative

Let us denote the total number of on and off-treatment periods (complete or incomplete) in $[0, T]$ by $K_T$ . Also let $\xi_k$ denote the start of the $k^{th}$ period and $\eta_k$ denote the end of the $k^{th}$ period (of either type). Finally, let $M_T = \lfloor \frac{K_T}{2} \rfloor$ be the total number of complete on-treatment periods, and $\Delta_m^{ON}$ denote the duration of the $m^{th}$ complete on-treatment period, where clearly

$\Delta_m^{ON} \equiv \eta_m - \xi_m$

Theorem 1. The derivative of the sample function $L(\theta)$ with respect to the control parameters satisfies:

$\begin{aligned} \frac{dL(\theta)}{d\theta_i} &= \frac{W_1}{T} \sum_{k=1}^{K_T} \int_{\xi_k}^{\eta_k} \left[ \frac{x'_{1,i}(\theta, t) + x'_{2,i}(\theta, t)}{PSA_{init}} \right] dt \\ &\quad + \frac{W_2}{T} \sum_{m=1}^{M_T} \Delta_m^{ON} \cdot (\eta'_{m,i} - \xi'_{m,i}) \\ &\quad - \frac{W_2}{T} \mathbf{1} \left[ \left\lfloor \frac{K_T}{2} \right\rfloor \neq \frac{K_T}{2} \right] \cdot \xi'_{M_T+1,i} \cdot (T - \xi_{M_T+1}) \end{aligned} \quad (43)$

Proof. We assume, without loss of generality, that the start of our sample path will coincide with the start of the first on-treatment period. Note also that we choose to end our sample path at time $T$ , and that this choice is independent of $\theta_i$ , $i = 1, 2$ . Consequently, we will have $[0, T] \equiv [\xi_1, \eta_{K_T}]$ , which implies that $\frac{\partial \xi_1}{\partial \theta_i} = \frac{\partial \eta_{K_T}}{\partial \theta_i} = 0$ , $i = 1, 2$ . Recall, from the definition of an intermittent hormone therapy scheme, that the sample path of our SHA will consist of alternating on and off-treatment periods. Since $z_1(t) = 0$ when $q(t) = q^{OFF}$ , we can rewrite (12) as

$\begin{aligned} L(\theta, x(0), z(0), T) &= \frac{W_1}{T} \sum_{k=1}^{K_T} \int_{\xi_k}^{\eta_k} \left[ \frac{x_1(\theta, t) + x_2(\theta, t)}{PSA_{init}} \right] dt \\ &\quad + \frac{W_2}{T} \left[ \sum_{m=1}^{M_T} \int_{\xi_m}^{\eta_m} z_1(t) dt + \int_{\xi_{M_T+1}}^T z_1(t) dt \right] \end{aligned} \quad (44)$

Note that our sample path can either (a) end with an incomplete on-treatment period, or (b) end with an incomplete off-treatment period. In (44), we assume that (a) holds, since (b) is a special case of (a) for which $\int_0^T z_1(t) dt = \sum_{m=1}^{M_T} \int_{\xi_m}^{\eta_m} z_1(t) dt$ . Observe that the end of an on-treatment period is coupled with the start of the subsequent off-treatment period, i.e., $x_i(\eta_k) = x_i(\xi_{k+1})$ , $i = 1, 2$ , $k = 1, \dots, K_T - 1$ . Using this notation and taking the derivative of (44) yields$$\begin{aligned} \frac{dL(\theta)}{d\theta_i} &= \frac{W_1}{T \cdot PSA_{init}} \sum_{k=1}^{K_T-1} \int_{\xi_k}^{\xi_{k+1}} [x'{1,i}(\theta, t) + x'{2,i}(\theta, t)] dt \ &+ \frac{W_1}{T \cdot PSA_{init}} \sum_{k=1}^{K_T-1} [x_1(\xi_{k+1}) + x_2(\xi_{k+1})] \frac{\partial \xi_{k+1}}{\partial \theta_i} \ &- \frac{W_1}{T \cdot PSA_{init}} \sum_{k=1}^{K_T-1} [x_1(\xi_k) + x_2(\xi_k)] \frac{\partial \xi_k}{\partial \theta_i} \ &+ \frac{W_1}{T \cdot PSA_{init}} \int_{\xi_{K_T}}^T [x'{1,i}(\theta, t) + x'{2,i}(\theta, t)] dt \ &+ \frac{W_1}{T \cdot PSA_{init}} [x_1(T) + x_2(T)] \frac{\partial T}{\partial \theta_i} \ &- \frac{W_1}{T \cdot PSA_{init}} [x_1(\xi_{K_T}) + x_2(\xi_{K_T})] \frac{\partial \xi_{K_T}}{\partial \theta_i} \ &+ \frac{W_2}{T} \sum_{m=1}^{M_T} \left[ \int_{\xi_m}^{\eta_m} z'{1,i}(t) dt + z_1(\eta_m^-) \frac{\partial \eta_m}{\partial \theta_i} - z_1(\xi_m^+) \frac{\partial \xi_m}{\partial \theta_i} \right] \ &+ \frac{W_2}{T} \int{\xi_{M_T+1}}^T z'{1,i}(t) dt + z_1(T^-) \frac{\partial T}{\partial \theta_i} - z_1(\xi{M_T+1}^+) \frac{\partial \xi_{M_T+1}}{\partial \theta_i} \end{aligned} \tag{45}$$

Observe that the first two summation terms in (45) simplify to

$\begin{aligned} &\sum_{k=1}^{K_T-1} [x_1(\xi_{k+1}) + x_2(\xi_{k+1})] \frac{\partial \xi_{k+1}}{\partial \theta_i} \\ &- \sum_{k=1}^{K_T-1} [x_1(\xi_k) + x_2(\xi_k)] \frac{\partial \xi_k}{\partial \theta_i} \\ &= [x_1(\xi_1) + x_2(\xi_1)] \frac{\partial \xi_1}{\partial \theta_i} \\ &+ [x_1(\xi_{K_T}) + x_2(\xi_{K_T})] \frac{\partial \xi_{K_T}}{\partial \theta_i} \end{aligned} \tag{46}$

Further note that the sixth term in (45) cancels out with the second term on the right hand side of (46). Moreover, it is clear from (5) that $z_1(\xi_{M_T+1}^+) = z_1(\xi_m^+) = 0$ and $z_1(\eta_m^-) = \eta_m - \xi_m$ , $m = 1, \dots, M_T$ . Since $z'{j,i}(t) = z'{j,i}(\tau_k^+)$ , $j, i = 1, 2$ , over any interevent interval $[\tau_k, \tau_{k+1})$ , and recalling that $\frac{\partial T}{\partial \theta_i} = \frac{\partial \xi_1}{\partial \theta_i} = 0$ , the last two terms in (45) simplify to

$\begin{aligned} &\frac{W_2}{T} \sum_{m=1}^{M_T} \left[ z'_{1,i}(\xi_m^+) (\eta_m - \xi_m) + (\eta_m - \xi_m) \frac{\partial \eta_m}{\partial \theta_i} \right] \\ &+ \frac{W_2}{T} z'_{1,i}(\xi_{M_T+1}^+) (T - \xi_{M_T+1}) \end{aligned}$

Recall that $\xi_m$ is the start of the $m$ th on-treatment period, which necessarily corresponds to the $m-1$ th occurrence of event $e_2$ . Hence, it follows from (40) that $z'{1,i}(\xi_m^+) = -\xi'{m,i}$ , $m = 1, \dots, M_{T+1}$ . As a result, (45) can be further simplified to

$\begin{aligned} \frac{dL(\theta)}{d\theta_i} &= \frac{W_1}{T \cdot PSA_{init}} \sum_{k=1}^{K_T-1} \int_{\xi_k}^{\xi_{k+1}} [x'_{1,i}(\theta, t) + x'_{2,i}(\theta, t)] dt \\ &+ \frac{W_1}{T \cdot PSA_{init}} \int_{\xi_{K_T}}^T [x'_{1,i}(\theta, t) + x'_{2,i}(\theta, t)] dt \\ &+ \frac{W_2}{T} \left[ \sum_{m=1}^{M_T} -\xi'_{m,i} (\eta_m - \xi_m) + (\eta_m - \xi_m) \eta'_{m,i} \right] \\ &- \frac{W_2}{T} \xi'_{M_T+1} (T - \xi_{M_T+1}) \end{aligned} \tag{47}$

The result in (47) is obtained under the assumption that our sample path ends with an incomplete on-treatment period. This condition is satisfied when $\lfloor \frac{K_T}{2} \rfloor \neq \frac{K_T}{2}$ . If this is not the case, i.e., if the sample path ends with an incomplete off-treatment period and $\lfloor \frac{K_T}{2} \rfloor = \frac{K_T}{2}$ , the last term in (47) can be disregarded. It is then straightforward to verify that (47) can be rewritten as (43). ■

Observe that evaluating (43) requires knowledge of $x'{1,i}(\theta, t)$ and $x'{2,i}(\theta, t)$ over all on and off-treatment periods. Over on-treatment periods, this can be determined using (21) and (24), which ultimately depend on (8), so that it is necessary to evaluate the integral of the noise process $\zeta_3(t)$ . In the case of off-treatment periods, (27) and (29) must be used, so that (9) must be evaluated, for which knowledge of the integral of the noise process $\zeta_3(t)$ is also needed. In the second and third terms in (43), $\Delta_m^{ON}$ can be computed using timers whose start and end times are observable events, while $\eta'{m,i}$ and $\xi'{m,i}$ , $m = 1, \dots, M_T$ , and eventually $\xi'_{M_T+1,i}$ , can be computed through (42), which requires knowledge of the noise processes $\zeta_1(t)$ and $\zeta_2(t)$ evaluated at event times only.

4. CONCLUSION

Biological systems are inherently sensitive to physiologic cues, such as the timing and dosage of drugs and related procedures. Hence, performing sensitivity analysis of the mathematical models that infer patient response to such cues will aid the development of personalized treatment schemes. Such sensitivity analysis should not only allow for evaluating the sensitivities with respect to model parameters, but also, and most importantly, provide sensitivity estimates with respect to controllable parameters in a therapy. The methodology we have laid out in this paper addresses both these needs.

Indeed, this work is the first step towards the development of a methodology for personalized therapy design applicable to stochastic models of cancer progression. We illustrate our analysis with a case study of optimal IAS therapy design for prostate cancer. For such, we propose an SHA model to describe the evolution of prostate cancer under IAS therapy and derive a cost metric in terms of the desired outcome of IAS therapy that is parameterized by an appropriately chosen controllable parameter vector. The problem of optimal personalized therapy design is then formulated as the search for the parameter values which minimize our cost metric. In this context, we apply Infinitesimal Perturbation Analysis (IPA) and derive unbiased gradient estimates of the cost metric with respect to the controllable vector of interest, which can be used for sensitivity analysis of therapy schemes.

More importantly, however, since (43) provides an unbiased estimate of $dJ(\theta)/d\theta_i$ , it can also be ultimately used for therapy estimation and optimization. To this end, it is possible to implement an algorithm for updating the value of $dL(\theta)/d\theta_i$ after each observed event. Such value can then be used to compute an optimal $\theta^*$ through an interactive optimization procedure of the form $\theta_{i,l+1} = \theta_{i,l} - \rho_l H_{i,l}(\theta_l, x(0), T, \omega_l)$ , where $\rho_l$ is the step size at the $l$ th iteration, $l = 0, 1, \dots$ , and $\omega_l$ denotes a sample path from which data are extracted and used to compute $H_{i,l}(\theta_l, x(0), T, \omega_l)$ defined to be an estimate of $dJ(\theta)/d\theta_i$ . The sample paths can be generated through simulation of existing models (e.g., our SHA model), so that, by varying the model parameters, different patient behaviors can be analyzed. Alternatively, it is possible to apply our IPA estimators to real patient data obtained from clinical trials (available e.g., in Bruchovsky et al. (2006) and Bruchovsky et al. (2007)), and ultimately contrast the optimal therapyscheme $\theta^*$ with the prescribed one. Our ongoing work involves implementing the IPA estimators derived in this work for personalized IAS therapy design.

REFERENCES

Bruchovsky, N., Klotz, L., Crook, J., Malone, S., Ludgte, C., Morris, W., Gleave, M., Goldenberg, S., and Rennie, P. (2006). Final results of the canadian prospective phase ii trial of intermittent androgen suppression for men in biochemical recurrence after radiotherapy for locally advanced prostate cancer. Cancer, 107, 389–395.

Bruchovsky, N., Klotz, L., Crook, J., Malone, S., Ludgte, C., Morris, W., Gleave, M., Goldenberg, S., and Rennie, P. (2007). Locally advanced prostate cancer biochemical results from a prospective phase ii study of intermittent androgen suppression for men with evidence of prostate-specific antigen recurrence after radiotherapy. Cancer, 109, 858–867.

Cassandras, C. and Lafortune, S. (2008). Introduction to Discrete Event Systems. Springer, 2nd edition.

Cassandras, C., Wardi, Y., Panayiotou, C., and Yao, C. (2010). Perturbation analysis and optimization of stochastic hybrid systems. European Journal of Control, 6(6), 642–664.

Hanahan, D. and Weinberg, R. (2011). Hallmarks of cancer: The next generation. Cell, 144, 646–674.

Hirata, Y., Bruchovsky, N., and Aihara, K. (2010a). Development of a mathematical model that predicts the outcome of hormone therapy for prostate cancer. J. Theor. Biology, 264, 517–527.

Hirata, Y., di Bernardo, M., Bruchovsky, N., and Aihara, K. (2010b). Hybrid optimal scheduling for intermittent androgen suppression of prostate cancer. Chaos, 20, 045125.

Ideta, A., Tanaka, G., Takeuchi, T., and Aihara, K. (2008). A mathematical model for intermittent androgen suppression for prostate cancer. J. Nonlinear Sci., 18, 593–614.

Jackson, T. (2004a). A mathematical investigation of the multiple pathways to recurrent prostate cancer: comparison with experimental data. Neoplasia, 6, 697–704.

Jackson, T. (2004b). A mathematical model of prostate tumor growth and androgen-independent replace. Discrete Cont. Dyn. Syst. Ser. B, 4, 187–201.

Liu, B., Kong, S., Gao, S., Zuliani, P., and Clarke, E. (2015). Towards personalized cancer therapy using delta-reachability analysis. HSCC2015.

Longo, D., Fauci, A., Kasper, D., Hauser, S., Jameson, J., and Loscalzo, J. (eds.) (2012). Harrison's principles of internal medicine. McGraw-Hill, Medical Pub. Division, New York, 18th edition.

Shimada, T. and Aihara, K. (2008). A nonlinear model with competition between prostate tumor cells and its application to intermittent androgen suppression therapy of prostate cancer. Mathematical Biosciences, 214, 134–139.

Suzuki, T., Bruchovsky, N., and Aihara, K. (2010). Piecewise affine systems modelling for optimizing therapy of prostate cancer. Philos. Trans. R. Soc., 368, 5045–5059.

Tanaka, G., Hirata, Y., Goldenberg, S., Bruchovsky, N., and Aihara, K. (2010). Mathematical modelling of

prostate cancer growth and its application to hormone therapy. Philos. Trans. R. Soc., 368, 5029–5044.

Tao, Y., Guo, Q., and Aihara, K. (2010). A mathematical model of prostate tumor growth under hormone therapy with mutation inhibitor. J. Nonlinear Sci., 20, 219–240.

Buckets:

huggingchat
/

papers-content

Infinitesimal Perturbation Analysis for Personalized Cancer Therapy Design ^*

1. INTRODUCTION

2. PROBLEM FORMULATION

2.1 SHA Model of Prostate Cancer Progression

2.2 IAS Therapy Evaluation and Optimization

3. INFINITESIMAL PERTURBATION ANALYSIS

3.1 State and Event Time Derivatives

3.2 Cost Derivative

4. CONCLUSION

REFERENCES

Xet Storage Details

Infinitesimal Perturbation Analysis for Personalized Cancer Therapy Design *

1. INTRODUCTION

2. PROBLEM FORMULATION

2.1 SHA Model of Prostate Cancer Progression

2.2 IAS Therapy Evaluation and Optimization

3. INFINITESIMAL PERTURBATION ANALYSIS

3.1 State and Event Time Derivatives

3.2 Cost Derivative

4. CONCLUSION

REFERENCES

Xet Storage Details

Infinitesimal Perturbation Analysis for Personalized Cancer Therapy Design ^*