Consider a controlled SDE on \([0,T]\) with scalar noise schedule \(\sigma_t > 0\),
\[ dX_t = \sigma_t \, u(X_t, t) \, dt + \sigma_t \, dW_t, \quad X_0 \sim p_0. \tag{1}\]
The goal: find \(u\) such that \(X_T \sim \pi\), where \(\pi(x) = \rho(x)/\mathcal{Z}\) is a target density known up to normalization. Adjoint sampling solves this with a Dirac prior and memoryless condition; adding a corrector for arbitrary priors requires alternating optimization. The Bridge Matching Sampler (BMS) identifies a single coupling, the independent coupling, that makes the regression target fully tractable and removes the need for alternation.
Nelson’s relation
Let \(\mathbb{P}^u\) denote the path measure of Equation 1 with time marginals \(p_t\). Write the Euler discretization with step \(\delta\):
\[ X_{t+\delta} = X_t + \sigma_t \, u(X_t,t) \, \delta + \sigma_t \sqrt{\delta} \, \mathbf{n}, \qquad \mathbf{n}\sim \mathcal{N}(0,I). \tag{2}\]
The forward conditional mean is \(\mathbb{E}[X_{t+\delta} \mid X_t = x] = x + \sigma_t \, u(x,t) \, \delta\). Now compute the backward conditional mean \(\mathbb{E}[X_t \mid X_{t+\delta} = y]\) using Bayes’ rule, exactly as in the reverse diffusions note. For \(\delta \ll 1\):
\[ \mathbb{P}(X_t \in dx \mid X_{t+\delta} = y) \;\propto\; p_t(x) \, \exp {\left\{ -\frac{\|y - x - \sigma_t \, u(x,t) \, \delta\|^2}{2 \sigma_t^2 \, \delta} \right\}} . \]
Expanding \(p_t(x) \approx p_t(y) \exp(\left< \nabla \log p_t(y), x - y \right>)\) and completing the square, the conditional mean is
\[ \mathbb{E}[X_t \mid X_{t+\delta} = y] = y - \sigma_t \, u(y,t) \, \delta + \sigma_t^2 \, \nabla \log p_t(y) \, \delta + O(\delta^2). \tag{3}\]
(The second-order correction to \(\log p_t\) affects the conditional variance at \(O(\delta)\) but not the conditional mean, which is all we need.)
Completing the square:
Drop multiplicative constants independent of \(x\). The exponent in the posterior is
\[ \left< \nabla \log p_t(y), x - y \right> - \frac{\|y - x - \sigma_t u \delta\|^2}{2\sigma_t^2 \delta}. \]
Write \(z = x - y\). The quadratic piece is \(-\|z + \sigma_t u \delta\|^2/(2\sigma_t^2\delta)\), with mean at \(z = -\sigma_t u \delta\). The linear piece \(\left< \nabla \log p_t(y), z \right>\) shifts the Gaussian mean by \(\sigma_t^2 \delta \, \nabla \log p_t(y)\) (the standard “linear tilt of a Gaussian” identity: if \(f(z) \propto \exp(-\|z - m\|^2/(2s^2) + \left< a,z \right>)\), then the mean shifts from \(m\) to \(m + s^2 a\), with \(s^2 = \sigma_t^2\delta\) and \(a = \nabla \log p_t(y)\)). So \(\mathbb{E}[z] = -\sigma_t u \delta + \sigma_t^2 \nabla \log p_t(y)\delta + O(\delta^2)\), giving \(\mathbb{E}[X_t \mid X_{t+\delta} = y] = y - \sigma_t u \delta + \sigma_t^2 \nabla \log p_t(y)\delta\).

From the reverse diffusions note, the time-reversed process \(\overleftarrow{X}_s = X_{T-s}\) satisfies
\[ d\overleftarrow{X}_s = {\left[ -\sigma_{T-s} \, u(\overleftarrow{X}_s, T-s) + \sigma_{T-s}^2 \, \nabla \log p_{T-s}(\overleftarrow{X}_s) \right]} ds + \sigma_{T-s} \, d\overleftarrow{B}_s. \]
The reversed drift is \(-\sigma_t u + \sigma_t^2 \nabla \log p_t\). Define \(v\) by writing this reversed drift as \(\sigma_t v\), so that the reversed SDE takes the form \(d\overleftarrow{X}_s = \sigma_{T-s} v(\overleftarrow{X}_s, T-s) ds + \sigma_{T-s} d\overleftarrow{B}_s\). Then \(\sigma_t v = -\sigma_t u + \sigma_t^2 \nabla \log p_t\), i.e. \(v = -u + \sigma_t \nabla \log p_t\). Rearranging:
This is Nelson’s relation:
\[ \textcolor{blue}{u(x,t) + v(x,t) = \sigma_t \, \nabla \log p_t(x).} \tag{4}\]
As a sanity check, Equation 3 confirms this: the backward conditional mean \(y - \sigma_t u \delta + \sigma_t^2 \nabla \log p_t \delta\) identifies the reversed drift as \(-\sigma_t u + \sigma_t^2 \nabla \log p_t = \sigma_t v\), consistent with the time-reversal formula.
This holds for any Markov diffusion of the form Equation 1 with scalar noise schedule \(\sigma_t\) and marginals \(p_t\).
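The backward conditional mean of Equation 3 can be checked in closed form for a Gaussian example, since one Euler step from a Gaussian marginal stays jointly Gaussian. A minimal sketch, assuming \(p_t = \mathcal{N}(0,1)\), \(\sigma_t = 1\), and an arbitrary illustrative control \(u(x,t) = -\kappa x\):

```python
import numpy as np

# One Euler step of Eq. (2) from the Gaussian marginal p_t = N(0, 1), with
# sigma_t = 1 and the illustrative control u(x, t) = -kappa * x.  Everything
# stays jointly Gaussian, so E[X_t | X_{t+delta} = y] is available exactly
# and can be compared with the first-order expansion of Eq. (3).
kappa = 2.0

def backward_mean_coeffs(delta):
    a = 1.0 - kappa * delta            # X_{t+d} = a * X_t + sqrt(d) * n
    exact = a / (a**2 + delta)         # exact Gaussian regression coefficient
    # Eq. (3): y - sigma_t u(y) d + sigma_t^2 grad log p_t(y) d
    #        = y + kappa d y - y d     (since grad log p_t(y) = -y)
    approx = 1.0 + (kappa - 1.0) * delta
    return exact, approx

for delta in (1e-2, 1e-3, 1e-4):
    exact, approx = backward_mean_coeffs(delta)
    print(delta, abs(exact - approx))  # error shrinks like O(delta^2)
```

The discrepancy between the exact coefficient and Equation 3's prediction scales as \(O(\delta^2)\), consistent with the claim that only the conditional mean is needed to first order.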
Reciprocal class and Markovian projection
Let \(\mathbb{P}\) denote the reference (uncontrolled) process \(dX_t = \sigma_t \, dW_t\). Write \(\mathbb{P}_{|0,T}(\cdot | x_0, x_T)\) for the law of the reference process conditioned on \(X_0 = x_0, X_T = x_T\). A path measure \(\Pi\) belongs to the reciprocal class \(\mathcal{R}(\mathbb{P})\) if it has the form \(\Pi = \Pi_{0,T} \, \mathbb{P}_{|0,T}\), where \(\Pi_{0,T}\) is an endpoint coupling and \(\mathbb{P}_{|0,T}\) is the reference bridge (Brownian bridge for Brownian \(\mathbb{P}\)); see the Schrodinger bridges note.
A reciprocal measure is generally non-Markovian: the bridge drift depends on \(X_T\). The Markovian projection finds a Markovian drift \(u^\star\) whose time marginals match those of \(\Pi^\star\); existence and uniqueness require the path-dependent drift to satisfy a linear growth condition (Brunick and Shreve 2013). This is an \(L^2\) projection: if \(\xi(X,t)\) is the path-dependent drift of \(\Pi^\star\), then
\[ u^\star(x,t) = \mathbb{E}_{\Pi^\star} {\left[ \xi(X,t) \mid X_t = x \right]} . \tag{5}\]
Why? For any Markovian \(u\), expand \(\mathbb{E}_{\Pi^\star}[\|\xi - u(X_t,t)\|^2]\) and use the tower property:
\[ \mathbb{E}_{\Pi^\star} {\left[ \|\xi - u\|^2 \right]} = \mathbb{E}_{\Pi^\star} {\left[ \|\xi - u^\star\|^2 \right]} + \mathbb{E}_{\Pi^\star} {\left[ \|u^\star - u\|^2 \right]} . \]
The cross-term vanishes because \(\mathbb{E}_{\Pi^\star}[\xi - u^\star \mid X_t] = 0\) by definition of \(u^\star\). So \(u^\star\) minimizes the matching loss
\[ u^\star = \mathop{\mathrm{argmin}}_{u} \; \mathbb{E}_{\Pi^\star} {\left[ \int_0^T \frac{1}{2} \| \xi(X,t) - u(X_t,t) \|^2 \, dt \right]} . \tag{6}\]
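The Pythagorean decomposition behind Equation 6 is easy to see numerically. A toy sketch (my own setup, not from the paper): the "path-dependent drift" is \(\xi = X_t + \eta\) with \(\eta\) independent noise standing in for the bridge's dependence on the endpoints, so \(u^\star(x) = \mathbb{E}[\xi \mid X_t = x] = x\):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Markovian projection: xi = X_t + eta, with eta extra randomness
# (standing in for the bridge's dependence on the endpoints).
# Its conditional expectation given X_t is u_star(x) = x.
n = 200_000
x_t = rng.normal(size=n)
xi = x_t + rng.normal(size=n)

u_star = x_t                  # E[xi | X_t] = X_t for this toy
u_other = 0.5 * x_t           # any other Markovian candidate

lhs = np.mean((xi - u_other) ** 2)
rhs = np.mean((xi - u_star) ** 2) + np.mean((u_star - u_other) ** 2)
print(lhs, rhs)               # Pythagorean identity: the two agree
```

The loss of any candidate exceeds the loss of \(u^\star\) by exactly the squared distance to \(u^\star\), which is why the conditional expectation minimizes Equation 6.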
Fixed-point iteration
All three methods (adjoint sampling with Dirac prior, adjoint sampling with corrector, BMS) follow the same template. Starting from some control \(u_0\):
1. Simulate the current SDE \(\mathbb{P}^{u_i}\) to generate endpoint pairs \((X_0, X_T)\).
2. Reciprocal projection: form a coupling \(\Pi^i_{0,T}\) from the endpoints, define \(\Pi^i = \Pi^i_{0,T} \, \mathbb{P}_{|0,T}\).
3. Markovianize: update \(u_{i+1}\) by regressing onto the bridge drift via Equation 6.
If \(u_i = u^\star\), then \(\Pi^i = \Pi^\star\) and \(u_{i+1} = u^\star\), so \(u^\star\) is a fixed point. Convergence of this iteration is not guaranteed in general; the BMS paper treats it as empirically effective. What distinguishes the methods is the coupling in step 2, which determines the regression target \(\xi\).
Target score identity
To get a tractable \(\xi\), we need the score \(\nabla \log \Pi^\star_t(x)\). Define the cumulative variance \(\nu_t = \int_0^t \sigma_s^2 \, ds\) and \(\gamma_t = \nu_t/\nu_T\). The bridge \(\mathbb{P}_{|0,T}\) is Gaussian: for \(t \in (0,T)\),
\[ X_t \mid (X_0, X_T) \;\sim\; \mathcal{N} {\left( (1-\gamma_t) X_0 + \gamma_t X_T, \;\; \nu_T \gamma_t(1-\gamma_t) \, I \right)} . \tag{7}\]
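Equation 7 can be verified by Monte Carlo for a time-varying schedule. A sketch with the illustrative choice \(\sigma_t = 1 + t\) (any positive schedule works): simulate \((X_t, X_T)\) given \(X_0 = 0\) as two Gaussian increments, then read off the conditional law of \(X_t\) given the endpoints by linear regression:

```python
import numpy as np

rng = np.random.default_rng(1)

# Reference process dX = sigma_t dW with the illustrative schedule
# sigma_t = 1 + t on [0, T].  Then nu_t = t + t^2 + t^3/3.
T, t = 1.0, 0.4
nu = lambda s: s + s**2 + s**3 / 3.0
nu_t, nu_T = nu(t), nu(T)
gamma = nu_t / nu_T

# Simulate (X_t, X_T) given X_0 = 0 by two independent Gaussian increments,
# then recover the conditional law X_t | X_T by linear regression.
n = 400_000
x_t = np.sqrt(nu_t) * rng.normal(size=n)
x_T = x_t + np.sqrt(nu_T - nu_t) * rng.normal(size=n)

slope = np.cov(x_t, x_T)[0, 1] / np.var(x_T)   # should be gamma_t
resid_var = np.var(x_t - slope * x_T)          # should be nu_T gamma (1 - gamma)
print(slope, gamma)
print(resid_var, nu_T * gamma * (1 - gamma))
```

The regression slope matches \(\gamma_t\) and the residual variance matches \(\nu_T \gamma_t(1-\gamma_t)\), exactly the mean and variance in Equation 7.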
The marginal density of \(\Pi^\star\) at time \(t\) is
\[ \Pi^\star_t(x) = \int \mathbb{P}_{t|0,T}(x \mid x_0, x_T) \, \Pi^\star_{0,T}(x_0, x_T) \, dx_0 \, dx_T. \tag{8}\]
Differentiate \(\log \Pi^\star_t(x)\):
\[ \nabla_x \log \Pi^\star_t(x) = \frac{\int \nabla_x \mathbb{P}_{t|0,T}(x \mid x_0, x_T) \, \Pi^\star_{0,T}(x_0, x_T) \, dx_0 \, dx_T}{\Pi^\star_t(x)}. \]
Since \(\mathbb{P}_{t|0,T}\) is Gaussian with mean \((1-\gamma_t)x_0 + \gamma_t \, x_T\),
\[ \nabla_x \log \mathbb{P}_{t|0,T}(x \mid x_0, x_T) = -\frac{x - (1-\gamma_t)x_0 - \gamma_t \, x_T}{\nu_T\gamma_t(1-\gamma_t)}. \]
Integration by parts to swap gradients:
The identity \(\nabla_x \mathbb{P}_{t|0,T} = -\frac{1}{1-\gamma_t} \nabla_{x_0} \mathbb{P}_{t|0,T}\) relies on the Gaussian bridge having a mean that is affine in \((x_0, x_T)\) with endpoint-independent variance. Shifting \(x\) by \(\varepsilon\) is equivalent to shifting \(x_0\) by \(-\varepsilon/(1-\gamma_t)\), which holds because the mean is \((1-\gamma_t)x_0 + \gamma_t x_T\). Beyond Gaussian references, the TSI still holds conceptually but the gradient swap takes a different form.
Integrate by parts in \(x_0\):
\[ \nabla_x \log \Pi^\star_t(x) = \frac{1}{1-\gamma_t} \frac{\int \mathbb{P}_{t|0,T}(x \mid x_0, x_T) \, \nabla_{x_0} \Pi^\star_{0,T}(x_0, x_T) \, dx_0 \, dx_T}{\Pi^\star_t(x)}, \]
where we moved \(\nabla_{x_0}\) from \(\mathbb{P}_{t|0,T}\) onto \(\Pi^\star_{0,T}\) (boundary terms vanish by decay of \(\Pi^\star_{0,T} \cdot \mathbb{P}_{t|0,T}\) at infinity). (The minus from \(\nabla_x \mathbb{P}= -\frac{1}{1-\gamma_t} \nabla_{x_0} \mathbb{P}\) and the minus from integration by parts cancel, giving a positive coefficient \(+\frac{1}{1-\gamma_t}\).) Write \(\Pi^\star_{0,T|t}\) for the conditional distribution of \((X_0, X_T)\) given \(X_t = x\) under \(\Pi^\star\). Recognizing the conditional expectation:
\[ \nabla_x \log \Pi^\star_t(x) = \mathbb{E}_{\Pi^\star_{0,T|t}} {\left[ \frac{1}{1-\gamma_t} \nabla_{X_0} \log \Pi^\star_{0,T}(X_0,X_T) \;\Big|\; X_t = x \right]} . \]
The same argument with integration by parts in \(x_T\) gives
\[ \nabla_x \log \Pi^\star_t(x) = \mathbb{E}_{\Pi^\star_{0,T|t}} {\left[ \frac{1}{\gamma_t} \nabla_{X_T} \log \Pi^\star_{0,T}(X_0,X_T) \;\Big|\; X_t = x \right]} . \]
Since both expressions equal the same score, any convex combination, \((1-c(t))\) times the first plus \(c(t)\) times the second, remains valid. This gives the generalized target score identity (TSI): for any \(c(t) \in (0,1]\),
\[ \nabla \log \Pi^\star_t(x) = \mathbb{E}_{\Pi^\star_{0,T|t}} {\left[ \frac{1-c(t)}{1-\gamma_t} \nabla_{X_0} \log \Pi^\star_{0,T} + \frac{c(t)}{\gamma_t} \nabla_{X_T} \log \Pi^\star_{0,T} \;\Big|\; X_t = x \right]} . \tag{9}\]
(The constant choice \(c = 0\) divides by \(1-\gamma_t\), which vanishes at \(t = T\), and \(c = 1\) divides by \(\gamma_t\), which vanishes at \(t = 0\); time-dependent choices such as \(c(t) = \gamma_t\) keep both coefficients bounded over all of \([0,T]\).)
(The gradient-integral interchange and vanishing boundary terms require regularity of \(\Pi^\star_{0,T}\) and decay of the integrand at infinity; these hold for sub-Gaussian couplings.)
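For a 1-D Gaussian example the TSI can be checked in closed form, with no Monte Carlo. A sketch assuming the product coupling \(\Pi^\star_{0,T} = p_0 \otimes \pi\) with \(p_0 = \mathcal{N}(0, a)\), \(\pi = \mathcal{N}(m, b)\) (parameter values are illustrative):

```python
import numpy as np

# Closed-form check of the generalized TSI (Eq. 9) for a 1-D Gaussian case:
# p_0 = N(0, a), pi = N(m, b), product coupling, bridge as in Eq. (7).
a, b, m = 1.0, 0.25, 2.0
nu_T, gamma = 1.0, 0.3
s2 = nu_T * gamma * (1 - gamma)          # bridge variance at time t

# Marginal Pi*_t is Gaussian with these moments:
mu_t = gamma * m
var_t = (1 - gamma) ** 2 * a + gamma ** 2 * b + s2

def score(x):                            # grad log Pi*_t(x)
    return -(x - mu_t) / var_t

def tsi_rhs(x, c):
    # Gaussian conditional means of the endpoints given X_t = x:
    e_x0 = (1 - gamma) * a / var_t * (x - mu_t)        # E[X_0 | X_t = x]
    e_xT = m + gamma * b / var_t * (x - mu_t)          # E[X_T | X_t = x]
    # E[(1-c)/(1-g) grad log p_0(X_0) + c/g grad log pi(X_T) | X_t = x]
    return (1 - c) / (1 - gamma) * (-e_x0 / a) + c / gamma * (-(e_xT - m) / b)

x = 0.7
for c in (0.1, 0.5, gamma, 1.0):
    print(c, tsi_rhs(x, c), score(x))    # every c gives the same score
```

Every choice of \(c\) reproduces \(\nabla \log \Pi^\star_t(x)\) exactly, as Equation 9 asserts; the expectations are exact here because the endpoint scores are linear and the conditionals are Gaussian.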
General regression target
Now combine everything. For a reciprocal process \(\Pi^\star = \Pi^\star_{0,T} \mathbb{P}_{|0,T}\), the bridge drift decomposes into forward and backward pieces: \(\nabla_{X_t} \log \mathbb{P}_{t|0,T}(X_t | X_0, X_T) = \nabla_{X_t} \log \mathbb{P}_{t|0}(X_t | X_0) + \nabla_{X_t} \log \mathbb{P}_{T|t}(X_T | X_t)\). The first piece points back toward \(X_0\); the second points forward toward \(X_T\). The backward Markovian drift \(v^\star\) extracts the backward piece and averages it over \(X_0 | X_t\):
\[ v^\star(x,t) = \mathbb{E}_{\Pi^\star_{0|t}} {\left[ \sigma_t \, \nabla_{X_t} \log \mathbb{P}_{t|0}(X_t \mid X_0) \;\Big|\; X_t = x \right]} . \tag{10}\]
Since \(\mathbb{P}_{t|0}\) is Gaussian \(\mathcal{N}(X_0, \nu_t I)\), this is \(-\sigma_t(X_t - X_0)/\nu_t\) averaged over \(X_0 \mid X_t\). This follows from the bridge drift decomposition above: the \(\nabla \log \mathbb{P}_{T|t}\) piece, when averaged over \(X_T | X_t\), gives the forward drift \(u^\star\) by the Doob h-transform. So \(v^\star\) is the remaining backward piece, averaged over \(X_0 | X_t\).
Nelson’s relation Equation 4 was derived for \(\mathbb{P}^u\) with marginals \(p_t\). Here we apply it to \(\mathbb{P}^{u^\star}\), which has marginals \(\Pi^\star_t\) (the Markovian projection preserves marginals). This gives the Markovian forward drift: \(u^\star = \sigma_t \nabla \log \Pi^\star_t - v^\star\). Both the score \(\nabla \log \Pi^\star_t\) and the backward drift \(v^\star\) are conditional expectations (from Equation 9 and Equation 10 respectively). To get a regression target for the matching loss Equation 6, we need a non-Markovian drift \(\xi(X,t)\) whose conditional expectation \(\mathbb{E}[\xi \mid X_t]\) equals \(u^\star\). Substituting the integrands (before conditioning) from Equation 9 and Equation 10 into Nelson gives such a \(\xi\):
\[ \sigma_t^{-1} \, \xi(X,t) = \frac{1-c(t)}{1-\gamma_t} \nabla_{X_0} \log \Pi^\star_{0,T}(X_0,X_T) + \frac{c(t)}{\gamma_t} \nabla_{X_T} \log \Pi^\star_{0,T}(X_0,X_T) - \nabla_{X_t} \log \mathbb{P}_{t|0}(X_t \mid X_0). \tag{11}\]
The first two terms are the TSI integrand (conditioning on \(X_t\) gives \(\sigma_t \nabla \log \Pi^\star_t\), by Equation 9); the third is the backward drift integrand (conditioning gives \(v^\star\), by Equation 10). The first depends on \((X_0, X_T)\) and the second on \((X_0, X_t)\), so both live in the same conditional probability space and can be combined before taking \(\mathbb{E}[\cdot \mid X_t]\). By linearity of conditional expectation, their combination is a valid \(\xi\) with \(\mathbb{E}_{\Pi^\star}[\xi \mid X_t] = u^\star\).
Here \((X_0, X_t, X_T)\) are all determined by the bridge path: \(X_0, X_T\) from the coupling, \(X_t\) from Equation 7. The Markovianization step fits \(u(X_t,t)\) to \(\mathbb{E}[\xi(X,t) \mid X_t]\) via Equation 6.
The tractability of Equation 11 depends entirely on the coupling scores \(\nabla \log \Pi^\star_{0,T}\).
Three couplings, three algorithms
Half-bridge / adjoint sampling. Set \(\Pi^\star_{0,T} = \delta_{x_0} \otimes \pi\) (Dirac prior, memoryless condition). With \(x_0 = 0\), Equation 11 reduces to Equation 12. The paper derives this in Prop 4 (Appendix C.2); the key steps are below.
Reduction from Equation 11 to Equation 12:
Start from Equation 11 with \(\Pi^\star_{0,T} = \delta_0 \otimes \pi\). Since \(X_0\) is deterministic, the \(X_0\) integration-by-parts is unavailable; use instead the pure \(X_T\) branch of the TSI (the \(c = 1\) version from above, valid at every \(t \in (0,T)\)):
\[ \nabla \log \Pi^\star_t(x) = \mathbb{E} {\left[ \frac{1}{\gamma_t} \nabla \log \pi(X_T) \;\Big|\; X_t = x \right]} . \]
Since \(\mathbb{P}_{t|0}(\cdot \mid 0) = \mathcal{N}(0, \nu_t I)\), the backward-drift term is \(-\nabla_{X_t} \log \mathbb{P}_{t|0}(X_t \mid 0) = X_t/\nu_t\), and Equation 11 gives a valid but \(X_t\)-dependent target:
\[ \sigma_t^{-1} \xi = \frac{1}{\gamma_t} \nabla \log \pi(X_T) + \frac{X_t}{\nu_t}. \]
This can be exchanged for a target depending on \(X_T\) alone. Under the half-bridge coupling, \(X_t \mid X_T \sim \mathcal{N}(\gamma_t X_T, \nu_T\gamma_t(1-\gamma_t)I)\), so the score of the mixture marginal satisfies \(\nabla \log \Pi^\star_t(x) = \big(\gamma_t \, \mathbb{E}[X_T \mid X_t = x] - x\big)/\big(\nu_T\gamma_t(1-\gamma_t)\big)\), which rearranges to
\[ \frac{\mathbb{E}[X_T \mid X_t = x]}{\nu_T} = \frac{x}{\nu_t} + (1-\gamma_t) \, \nabla \log \Pi^\star_t(x). \]
Together with \(\mathbb{E}[\nabla \log \pi(X_T) \mid X_t = x] = \gamma_t \, \nabla \log \Pi^\star_t(x)\) (the \(c=1\) branch again), this yields
\[ \mathbb{E} {\left[ \nabla \log \pi(X_T) + \frac{X_T}{\nu_T} \;\Big|\; X_t = x \right]} = \nabla \log \Pi^\star_t(x) + \frac{x}{\nu_t} = \sigma_t^{-1} u^\star(x,t), \]
the same Markovian projection as before. Since \(\mathbb{P}_T = \mathcal{N}(0, \nu_T I)\) has score \(\nabla \log \mathbb{P}_T(x) = -x/\nu_T\), the non-Markovian target that yields \(u^\star\) after conditioning is simply \(\sigma_t \nabla_{X_T} \log [\pi/\mathbb{P}_T](X_T)\), which depends only on \(X_T\).
Alternatively: the paper shows directly that setting \(c(t) = \gamma_t\) in the general SHB formula (Lemma C.2) causes all \(X_0\)-dependent terms to cancel, leaving \(\sigma_t^{-1}\xi = \nabla_{X_T}\log[\pi/\mathbb{P}_T](X_T)\).

\[ \sigma_t^{-1} \, \xi(X,t) = \nabla_{X_T} \log \frac{\pi(X_T)}{\mathbb{P}_T(X_T)}, \tag{12}\]
where \(\mathbb{P}_T = \mathcal{N}(0, \nu_T I)\) (the terminal marginal of the reference). Simple, but requires Dirac prior and large \(\sigma_t\) for exploration.
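Equation 12's consistency with Nelson's relation can be checked in closed form for a Gaussian target. A sketch with the illustrative choice \(\pi = \mathcal{N}(m, b)\), Dirac prior at 0: both sides of \(\mathbb{E}[\nabla \log (\pi/\mathbb{P}_T)(X_T) \mid X_t = x] = \nabla \log \Pi^\star_t(x) + x/\nu_t\) are computable exactly.

```python
import numpy as np

# Closed-form check of the half-bridge target (Eq. 12) for a Gaussian
# example: Dirac prior at 0, target pi = N(m, b).  Under the coupling
# delta_0 (x) pi, the time-t marginal and all conditionals are Gaussian.
m, b = 2.0, 0.25
nu_T, gamma = 1.0, 0.3
nu_t = gamma * nu_T
s2 = nu_T * gamma * (1 - gamma)

mu_t = gamma * m
var_t = gamma ** 2 * b + s2            # Var of X_t = gamma X_T + bridge noise

x = 0.7
# Left side: sigma^{-1} u*(x,t) = grad log Pi*_t(x) - sigma^{-1} v*(x,t)
#          = grad log Pi*_t(x) + x / nu_t     (v*/sigma = -x/nu_t for X_0 = 0)
lhs = -(x - mu_t) / var_t + x / nu_t

# Right side: E[grad log (pi / P_T)(X_T) | X_t = x] with P_T = N(0, nu_T)
e_xT = m + gamma * b / var_t * (x - mu_t)    # E[X_T | X_t = x]
rhs = -(e_xT - m) / b + e_xT / nu_T
print(lhs, rhs)                              # identical
```

The two sides agree exactly, confirming that the \(X_T\)-only target of Equation 12 Markovianizes to the same \(u^\star\) as the \(X_t\)-dependent one.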
Full Schrodinger bridge / adjoint sampling with corrector. Set \(\Pi^\star_{0,T} = \hat\varphi_0(x_0) \, \mathbb{P}_{T|0}(x_T \mid x_0) \, \varphi_T(x_T)\), the Schrodinger bridge coupling. The drift becomes
\[ \sigma_t^{-1} \, \xi(X,t) = \nabla_{X_T} \log \frac{\pi(X_T)}{\hat\varphi_T(X_T)}, \tag{13}\]
where \(\hat\varphi_T\) is the backward Schrodinger potential. This allows arbitrary priors, but \(\hat\varphi_T\) is unknown and must be learned alongside \(u\), requiring alternating IPF-style updates.
Independent coupling / BMS. Set
\[ \Pi^\star_{0,T} = p_0 \otimes \pi. \tag{14}\]
Plug into Equation 11. The coupling scores factor trivially: \(\nabla_{X_0} \log \Pi^\star_{0,T} = \nabla \log p_0(X_0)\) and \(\nabla_{X_T} \log \Pi^\star_{0,T} = \nabla \log \pi(X_T)\). The regression target becomes
\[ \sigma_t^{-1} \, \xi(X,t) = \frac{1-c(t)}{1-\gamma_t} \nabla \log p_0(X_0) + \frac{c(t)}{\gamma_t} \nabla \log \pi(X_T) + \frac{X_t - X_0}{\nu_t}. \tag{15}\]
Every term on the right is known: \(\nabla \log p_0\) is the prior score (assumed known, e.g. Gaussian), \(\nabla \log \pi = \nabla \log \rho\) is the target score (computable from the unnormalized density), and \((X_t - X_0)/\nu_t = -\nabla_{X_t} \log \mathbb{P}_{t|0}(X_t \mid X_0)\) is minus the Gaussian transition score, exactly as Equation 11 prescribes. No unknown potentials, no alternation.
The independent coupling: why it works
The independent coupling \(p_0 \otimes \pi\) satisfies the boundary constraints by construction: marginalizing over \(X_T\) gives \(p_0\), marginalizing over \(X_0\) gives \(\pi\). The terminal marginal of \(\Pi^\star = (p_0 \otimes \pi) \mathbb{P}_{|0,T}\) is \(\pi\): since \(\mathbb{P}_{T|0,T}(x \mid x_0, x_T) = \delta(x - x_T)\) (the bridge is pinned at its endpoint),
\[ \Pi^\star_T(x) = \int \mathbb{P}_{T|0,T}(x \mid x_0, x_T) \, p_0(x_0) \, \pi(x_T) \, dx_0 \, dx_T = \int \delta(x - x_T) \, \pi(x_T) \, dx_T = \pi(x). \]
The Markovian projection preserves time marginals, so \(\mathbb{P}^{u^\star}_T = \pi\): the controlled SDE with drift \(u^\star\) hits the target at time \(T\). Combined with \(\mathbb{P}^{u^\star}_0 = p_0\) (from the initial condition), \(u^\star\) is a fixed point.
The Schrodinger bridge coupling minimizes path-space KL (as shown in the SB notes). The independent coupling sacrifices this optimality for a fully tractable regression target.
At each iteration, the coupling is \(\Pi^i_{0,T} = \mathbb{P}^{u_i}_0 \otimes \mathbb{P}^{u_i}_T\): independently resample \(X_0\) and \(X_T\) from their marginals under the current SDE. In practice, simulate trajectories, then randomly pair the initial and terminal samples.
Sampling the bridge is cheap: given \((x_0, x_T)\), draw \(X_t\) from Equation 7 and evaluate Equation 15. No full trajectory simulation needed during regression.
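The whole pipeline can be sanity-checked end to end in 1-D. A sketch under toy assumptions (mine, not the paper's experiment): \(\sigma_t = 1\), \(T = 1\), Gaussian prior \(\mathcal{N}(0,1)\), a hypothetical Gaussian target \(\mathcal{N}(2, 0.25)\), and the oracle coupling \(p_0 \otimes \pi\), so a single Markovianization suffices and no fixed-point iteration is needed. The target uses \(c(t) = \gamma_t\) and the transition term with the \(+\) sign that Equation 11's \(-\nabla_{X_t}\log\mathbb{P}_{t|0}\) prescribes; per-time-bin linear regression is exact in class for Gaussian endpoints.

```python
import numpy as np

rng = np.random.default_rng(2)

# One-shot check of Eqs. (7) and (15): oracle coupling p_0 (x) pi, 1-D,
# sigma_t = 1, T = 1, p_0 = N(0,1), pi = N(2, 0.25), c(t) = gamma_t
# (so both TSI coefficients equal 1).
n, K = 100_000, 100
m, b = 2.0, 0.25
nu_T = 1.0
ts = (np.arange(K) + 0.5) / K                # time-bin midpoints

x0 = rng.normal(size=n)
xT = m + np.sqrt(b) * rng.normal(size=n)     # oracle target samples

coef = np.zeros((K, 2))                      # u(x, t_k) = coef[k,0] + coef[k,1] x
for k, t in enumerate(ts):
    nu_t = t * nu_T
    g = nu_t / nu_T
    # Sample the bridge point (Eq. 7):
    xt = (1 - g) * x0 + g * xT + np.sqrt(nu_T * g * (1 - g)) * rng.normal(size=n)
    # Regression target (Eq. 15 with c = gamma; transition term with + sign,
    # matching Eq. 11): grad log p_0(X_0) + grad log pi(X_T) + (X_t - X_0)/nu_t
    xi = -x0 - (xT - m) / b + (xt - x0) / nu_t
    A = np.stack([np.ones(n), xt], axis=1)
    coef[k] = np.linalg.lstsq(A, xi, rcond=None)[0]

# Simulate the controlled SDE (Eq. 1) with the fitted drift; X_T should be ~ pi.
dt = 1.0 / K
x = rng.normal(size=n)
for k in range(K):
    u = coef[k, 0] + coef[k, 1] * x
    x = x + u * dt + np.sqrt(dt) * rng.normal(size=n)

print(x.mean(), x.std())                     # should approach (2.0, 0.5)
```

Up to Euler-discretization and Monte Carlo error, the terminal samples match the target's mean and standard deviation, illustrating that the Markovian projection of Equation 15 transports \(p_0\) to \(\pi\).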
Damped iteration
The undamped iteration \(u_{i+1} = \Phi(u_i)\) can overshoot in high dimensions. The damped version uses step size \(\alpha \in (0,1]\):
\[ u_{i+1} = \alpha \, \Phi(u_i) + (1-\alpha) \, u_i. \tag{16}\]
Setting \(\eta = (1-\alpha)/\alpha\), this solves
\[ u_{i+1} = \mathop{\mathrm{argmin}}_u \; {\left\{ \mathbb{E}_{\Pi^i} {\left[ \int_0^T \frac{1}{2} \| \xi - u(X_t,t) \|^2 \, dt \right]} \;+\; \textcolor{blue}{\eta \, \mathbb{E}_{\Pi^i} {\left[ \int_0^T \frac{1}{2} \| u_i(X_t,t) - u(X_t,t) \|^2 \, dt \right]} } \right\}} . \tag{17}\]
Deriving Equation 16 from Equation 17:
Apply the bias-variance decomposition (Pythagorean identity from the Markovian projection) to the first term:
\[ \mathbb{E}_{\Pi^i} {\left[ \|\xi - u\|^2 \right]} = \mathbb{E}_{\Pi^i} {\left[ \|\xi - \Phi(u_i)\|^2 \right]} + \mathbb{E}_{\Pi^i} {\left[ \|\Phi(u_i) - u\|^2 \right]} , \]
where \(\Phi(u_i) = \mathbb{E}_{\Pi^i}[\xi \mid X_t]\). The first piece is independent of \(u\) (irreducible noise from the non-Markovian \(\xi\)). Dropping it, Equation 17 reduces to
\[ u_{i+1} = \mathop{\mathrm{argmin}}_u \; \mathbb{E}_{\Pi^i} {\left[ \tfrac{1}{2}\|\Phi(u_i) - u\|^2 + \tfrac{\eta}{2}\|u_i - u\|^2 \right]} . \]
Pointwise first-order condition: \(-(\Phi(u_i) - u) - \eta(u_i - u) = 0\), giving \((1+\eta)u = \Phi(u_i) + \eta \, u_i\). So \(u = \frac{1}{1+\eta}\Phi(u_i) + \frac{\eta}{1+\eta}u_i\). With \(\eta = (1-\alpha)/\alpha\): \(\frac{1}{1+\eta} = \alpha\) and \(\frac{\eta}{1+\eta} = 1-\alpha\), recovering Equation 16.

The \( \textcolor{blue}{\text{second term}}\) penalizes deviation from the previous iterate. Each step balances fitting new bridge data against staying close to \(u_i\), preventing mode collapse from aggressive updates.
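The stabilizing effect of Equation 16 shows up already for a scalar toy map (my own illustration, not from the paper): a projection \(\Phi\) that overshoots its fixed point diverges undamped but converges once damped.

```python
import numpy as np

# Scalar illustration of Eq. (16): a toy map Phi that overshoots the fixed
# point u_star (error multiplied by rho with |rho| > 1, so the undamped
# iteration diverges), and the damped iterate that still converges.
u_star, rho = 2.0, -1.5
phi = lambda u: u_star + rho * (u - u_star)      # Phi(u_star) = u_star

def iterate(alpha, steps=40, u=0.0):
    for _ in range(steps):
        u = alpha * phi(u) + (1 - alpha) * u     # Eq. (16)
    return u

print(iterate(alpha=1.0, steps=10))   # undamped: error grows like |rho|^i
print(iterate(alpha=0.5))             # damped: converges to u_star = 2
```

With \(\alpha = 0.5\), the error is multiplied by \(\alpha\rho + (1-\alpha) = -0.25\) per step, so the damped iteration contracts even though \(|\rho| > 1\); by the derivation above, the same update is the minimizer of Equation 17 with \(\eta = (1-\alpha)/\alpha\).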
Summary
| Method | Coupling \(\Pi^i_{0,T}\) | Regression target \(\sigma_t^{-1} \xi\) | Limitation |
|---|---|---|---|
| AS | \(\delta_{x_0} \otimes \mathbb{P}^{u_i}_T\) | \(\nabla \log [\pi/\mathbb{P}_T](X_T)\) | Dirac prior |
| AS + corrector | \(\mathbb{P}^{u_i}_{0,T}\) | \(\nabla \log [\pi/\hat\varphi_T](X_T)\) | Alternating opt. |
| BMS | \(\mathbb{P}^{u_i}_0 \otimes \mathbb{P}^{u_i}_T\) | Equation 15 | None (single obj.) |
All three converge to a fixed point \(u^\star\) transporting \(p_0\) to \(\pi\). The matching loss Equation 6 is a forward KL objective: \(u^\star = \mathop{\mathrm{argmin}}_u D_{\text{KL}}(\Pi^\star \,\|\, \mathbb{P}^u)\). This follows from the Girsanov KL decomposition: \(D_{\text{KL}}(\Pi^\star \| \mathbb{P}^u) = \text{(irreducible variance)} + \mathbb{E}_{\Pi^\star}[\int \frac{1}{2}\|\xi - u\|^2 dt]\), so minimizing the matching loss over \(u\) is equivalent to minimizing \(D_{\text{KL}}(\Pi^\star \| \mathbb{P}^u)\). Forward KL is mode-covering (it penalizes placing zero mass where \(\Pi^\star\) has mass); since the Markovian projection preserves time marginals, mode coverage at the path level implies mode coverage at the terminal marginal level, which drives mode diversity in practice.
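The mode-covering property of forward KL can be illustrated on a toy two-mode target (a sketch of my own, restricted to Gaussian approximating families, where forward KL reduces to moment matching):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy two-mode target: 0.5 N(-3, 1) + 0.5 N(3, 1).
xs = np.where(rng.random(200_000) < 0.5, -3.0, 3.0) + rng.normal(size=200_000)

# Forward KL over Gaussians = moment matching (mode-covering): the fitted
# Gaussian inflates its variance to cover both modes.
mu, s = xs.mean(), xs.std()

def logq(x, m, sd):
    return -0.5 * ((x - m) / sd) ** 2 - np.log(sd) - 0.5 * np.log(2 * np.pi)

# Density at the left mode: the forward-KL fit keeps real mass there,
# while a mode-seeking fit pinned to the right mode (N(3, 1)) does not.
print(np.exp(logq(-3.0, mu, s)), np.exp(logq(-3.0, 3.0, 1.0)))
```

The forward-KL fit assigns non-negligible density to both modes, while the single-mode fit starves the left one by many orders of magnitude; this is the behavior the summary attributes to the forward KL objective.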