\begin{abstract}
The Kalman-Bucy filter has become an important application of stochastic differential equations, since the problem of estimating a hidden state arises frequently in stochastic systems. In this report we present the essential mathematical theory of the linear Kalman-Bucy filter and conclude with a simple example.
\end{abstract}
\section{Introduction}
The problem of filtering arises when we want to estimate an unknown state based on noisy observations. Filtering is therefore very useful in general state estimation and calibration problems, which leads to countless applications, for example the estimation of the inflation rate from noisy price fluctuations in finance. Solving such problems is generally challenging and requires a deep underlying mathematical framework. In this report we focus on the basic theory of the linear Kalman-Bucy filter (KBF) and provide an example.
\section{Filtered Probability Space}
It is important to be able to organize the total information structure of the system. First, we define a probability space $(\Omega, \mathcal{A}, P)$, consisting of the sample space $\Omega$, the $\sigma$-algebra $\mathcal{A}$ generated by all events and the probability measure $P$.
Let us introduce the family $\underbar{A}=\{ \mathcal{A}_t, t \ge 0 \}$,
where the $\sigma$-algebra $\mathcal{A}_t = \sigma \{ X_s: \ s \in [ 0, t ] \}$ is generated by all events up to time $t$. We then define the filtered probability space
$(\Omega, \mathcal{A}, \underbar{A}, P)$.
The notation in this report is based on Platen \cite{PlatBrut10}.
\section{The Filtering Problem}
Let the filtered probability space have the form $(\Omega, \mathcal{A}, \underbar{A}, P)$ and suppose that the unknown hidden state $X_t$ at time $t$ follows the linear stochastic differential equation (SDE),
%Let the filtered probability space be $(\Omega, \mathcal{A}_T, \underbar{A}. P)$.
%Assume the linear SDE of a hidden state $X_t$ given by
\begin{equation*}
dX_t = A X_t dt + B dW_t,
\end{equation*}
where $t \in [0,T]$, $A \in \mathbb{R}^{d \times d}$, $X_t \in \mathbb{R}^{d}$ and $B \in \mathbb{R}^{d \times m}$. The driving noise $W$ is defined as
\begin{equation}
W = \{ W_t = (W_t^1, \dots, W^m_t), t \in [ 0, T ] \}
\end{equation}
which is an $m$-dimensional Wiener process. In addition, we assume that the initial value $X_0$ is a Gaussian random variable. The general idea is to estimate the hidden state $X_t$ given the set of observations,
$$
Y = \{Y_t = (Y_t^1, \dots , Y_t^r), t \in [0, T] \},
$$
which is an $r$-dimensional process. Assuming a linear relationship between the observations $Y_t$ and the hidden state $X_t$, we define the linear observation model,
$$
dY_t = HX_t dt + \Gamma dW^*_t,
$$
where $H \in \mathbb{R}^{r \times d}$, $\Gamma \in \mathbb{R}^{r \times n}$ and the $n$-dimensional Wiener process,
\begin{equation}
W^* = \{ W^*_t = (W^{*,1}_t, \dots, W^{*,n}_t), t \in [ 0, T ] \}.
\end{equation}
We also assume that the observation noise $W^*$ and the hidden state noise $W$ are independent. For each $t \in [0,T]$, let the $\sigma$-algebra
$$
\mathcal{A}_t = \sigma \{ X_0, Y_s, W_s: \ s \in [0,t] \}
$$
be generated by $X_0$, $Y_s$ and $W_s$ for $s \in [0,t]$, expressing the total information up to time $t$. Furthermore, let the $\sigma$-algebra,
$$
\mathcal{Y}_t = \sigma \{ Y_s: \ s \in [0,t]\},
$$
provide the observational information. Thus we have $\mathcal{Y}_t \subset \mathcal{A}_t$ for $t \in [0, T]$, see Platen \cite[p. 420]{PlatBrut10}.
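As a numerical sketch, the state and observation SDEs above can be simulated with a simple Euler--Maruyama discretisation; the matrices $A$, $B$, $H$, $\Gamma$ and the dimensions below are illustrative placeholder values, not part of the model specification.
\begin{verbatim}
import numpy as np

# Euler-Maruyama simulation of
#   dX_t = A X_t dt + B dW_t
#   dY_t = H X_t dt + Gamma dW*_t
# All matrices are illustrative placeholders.
rng = np.random.default_rng(0)
d, m, r, n = 2, 2, 1, 1
A = np.array([[0.0, 1.0],
              [-1.0, -0.5]])
B = 0.1 * np.eye(d)             # d x m (here m = d)
H = np.array([[1.0, 0.0]])      # r x d
Gamma = 0.2 * np.ones((r, n))   # r x n

T, N = 5.0, 1000
dt = T / N
X = np.zeros((N + 1, d))
Y = np.zeros((N + 1, r))
X[0] = rng.normal(size=d)       # Gaussian initial state

for k in range(N):
    dW = rng.normal(scale=np.sqrt(dt), size=m)
    dWs = rng.normal(scale=np.sqrt(dt), size=n)
    X[k + 1] = X[k] + A @ X[k] * dt + B @ dW
    Y[k + 1] = Y[k] + H @ X[k] * dt + Gamma @ dWs
\end{verbatim}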
\section{Kalman-Bucy Filter}
Using the previous setup, we define the Kalman-Bucy filter (KBF) as the optimal estimate,
$$
\hat{X}_t = E(X_t \mid \mathcal{Y}_t).
$$
It is the least-squares estimate of $X_t$ in the sense that
$$
E(\|X_t - \hat{X}_t \|^2) \le E(\| X_t - Z_t \| ^2),
$$
for all $\mathcal{Y}_t$-measurable random variables $Z_t$. Since $\hat{X}_t$ is $\mathcal{Y}_t$-measurable, it is also $\mathcal{A}_t$-measurable. Let us further define the error covariance matrix,
$$
C_t = E((X_t - \hat{X}_t)(X_t - \hat{X}_t)^T),
$$
Then it can be shown that, in the linear case, $C_t$ satisfies the Riccati equation \cite[p. 421]{PlatBrut10},
\begin{align*}
\frac{d C_t}{dt} =& A C_t + C_t A^T + B B^T \\
& - C_t H^T (\Gamma \Gamma ^T)^{-1} H C_t .
\end{align*}
The initial value is assumed to be $C_0 = E(X_0 X_0 ^T)$.
Using the results given in Kallianpur \cite{kallianpur1980}, we observe that the linear KBF estimate $\hat{X}_t$ satisfies the SDE,
\begin{align*}
\label{eq:1}
d \hat{X}_t =& (A - C_t H^T (\Gamma \Gamma^T)^{-1} H) \hat{X}_t dt \\
&+ C_t H^T (\Gamma \Gamma^T)^{-1} dY_t ,
\end{align*}
for $ t\in [0,T]$. Note that the observation process $Y_t$ acts as the driving process of this SDE. This formulation is equivalent to the multi-dimensional linear KBF given in Øksendal \cite[p. 104]{øksendal2010stochastic}.
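Continuing the numerical sketch above (with the same placeholder matrices and the simulated observation path $Y$), the Riccati equation and the filter SDE can be discretised jointly; $\hat{X}_0 = 0$ and $C_0 = I$ are placeholder initial values.
\begin{verbatim}
# Euler discretisation of the Riccati equation for C_t
# and of the filter SDE for the estimate Xhat.
# Uses A, B, H, Gamma, dt, N, Y from the sketch above.
Sinv = np.linalg.inv(Gamma @ Gamma.T)  # (Gamma Gamma^T)^{-1}
C = np.eye(d)                          # placeholder C_0
Xhat = np.zeros((N + 1, d))            # placeholder Xhat_0 = 0

for k in range(N):
    gain = C @ H.T @ Sinv              # C_t H^T (Gamma Gamma^T)^{-1}
    dY = Y[k + 1] - Y[k]
    Xhat[k + 1] = (Xhat[k]
                   + (A - gain @ H) @ Xhat[k] * dt
                   + gain @ dY)
    C = C + (A @ C + C @ A.T + B @ B.T
             - gain @ H @ C) * dt
\end{verbatim}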
\section{Orthogonal Projection in Hilbert Spaces}
A deeper intuition for the KBF comes from the fact that the least-squares estimate in a Hilbert space with respect to a closed subspace is the orthogonal projection onto that subspace.
Let a square integrable process,
$$
f = \{ f_s , s \in [0,T]\},
$$
be adapted to the total filtration
$
\underbar{A} = (\mathcal{A}_t)_{t \in [0,T]},
$
and let the observation filtration be $ \underbar{Y} = (\mathcal{Y}_t)_{t \in [0,T]} $.
Additionally, define the inner product of two such processes $f$ and $g$ by
$$
(f,g) = E\Big(\int_0^t f_s^T g_s \, ds \Big),
$$
and call $f$ and $g$ orthogonal if $(f,g) = 0$.
Furthermore, for the filter error $\varepsilon_t = X_t - \hat{X}_t$ it can be shown that the KBF satisfies $(\varepsilon, \hat{X}) = 0$, see Platen \cite[p. 421]{PlatBrut10}.
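As a simple check of this orthogonality, note that for any $\mathcal{Y}_s$-measurable, square-integrable random variable $Z_s$ the tower property of conditional expectation gives
$$
E\big( (X_s - \hat{X}_s)^T Z_s \big)
= E\Big( E\big( X_s - \hat{X}_s \mid \mathcal{Y}_s \big)^T Z_s \Big) = 0,
$$
since $E( X_s - \hat{X}_s \mid \mathcal{Y}_s ) = \hat{X}_s - \hat{X}_s = 0$. Choosing $Z_s = \hat{X}_s$ and integrating over $s \in [0,t]$ recovers $(\varepsilon, \hat{X}) = 0$.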
%\section{Notes related to Multivariate Normalized Variables}
%\begin{theorem}
%\label{eq:12}
%In the case of the linear filter the normalized conditional distribution $\pi _t$ of the state $X_t$ conditional upon the $ \sigma$-algebra $\mathcal{Y}_t$ is a multivariate normal distribution.
%\end{theorem}
%
%\begin{proof}
%Proof can be found in Bain \cite[pp. 149]{Bain2008}.
% Let the orthogonal projection of the state $X_t^i$ , $i= 1, \dots, d$ onto the Hilbert space
%
% $$
% \mathcal{H}_t^Y = \left \{ \sum_{i=1}^m \int_0^t a_i dY_s^i : \quad a_i \in L_2 ([0,t]), \quad i=1, \dots, m \right\}
% $$
% Then it exists a $K: [0,t] \mapsto \mathbb{R}^{d \times m}$ and
% $$
% \hat{X}_t = (\hat{X})_{i = d}^d
% $$
% is orthogonal on $\mathcal{H}_t^Y$ such that
% $$
% X_t = \hat{X}_t + \int_0^t K_s dY_s
% $$
% It can be shown that $\hat{X}_t$ has a Gaussian distribution \cite{Bain2008}. Moreover, for any
% $$
% (Y_{t_1}, \dots , Y_{t_{n-1}}, \hat{X}_t )
% $$
% where $0 \le t_1 \le \dots \le t_{n-1} \le t$ has a multidimensional variate normal distribution.
%
% Since $\hat{X}_t$ is orthogonal to $\mathcal{H}_t^Y$ it follows that $\hat{X}_t$ is independent of
% $$
% (Y_{t_1}, \dots , Y_{t_{n-1}})
% $$
% and since $\{ t_i \}$ is arbitaly chosen it follows that $\hat{X}_t$ is independent of $\mathcal{Y}_t$. Hence, $X_t$ conditioned to $\mathcal{y}_t$ is equivalent to the distribution of $\hat{X}_t$ shifted by $\int_0^t K_s dY_s$.
%\end{proof}
%As described in theorem \eqref{eq:12} can we obtain a multivariate normal distribution assuming a linear SDE for both state and observational models. This is very useful, since we can potentially reduce the complexity by working with covariances and mean.
\section{Example}
To demonstrate the simplest case of the KBF, we consider a linear one-dimensional filtering problem,
\begin{align*}
dX_t &= F X_t dt + C dW_t \\
dY_t &= G X_t dt + D dW_t^*,
\end{align*}
where the coefficients are constant, $F,C,G,D \in \mathbb{R} \setminus \{ 0\}$. The Riccati equation then reduces to
\begin{equation*}
\frac{d S}{dt} = 2FS - \frac{G^2}{D^2} S^2 + C^2, \ \text{assuming } S(0) = a^2.
\end{equation*}
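For completeness, this scalar equation follows from the matrix Riccati equation above by the identifications $A = F$, $B = C$, $H = G$, $\Gamma = D$ and $C_t = S(t)$:
\begin{align*}
\frac{dS}{dt} &= F S + S F + C\,C - S\,G\,(D\,D)^{-1}\,G\,S \\
&= 2FS + C^2 - \frac{G^2}{D^2} S^2 .
\end{align*}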
This can be solved analytically, see Øksendal \cite[p. 103]{øksendal2010stochastic} for more details. The solution is
$$
S(t)=\frac{\alpha_{1}-K \alpha_{2} \exp \left(\frac{\left(\alpha_{2}-\alpha_{1}\right) G^{2} t}{D^{2}}\right)}{1-K \exp \left(\frac{\left(\alpha_{2}-\alpha_{1}\right) G^{2} t}{D^{2}}\right)},
$$
where the constants are defined such that
\begin{align*}
\alpha_{1}&=G^{-2}\left(F D^{2}-D \sqrt{F^{2} D^{2}+G^{2} C^{2}}\right), \\
\alpha_{2}&=G^{-2}\left(F D^{2}+D \sqrt{F^{2} D^{2}+G^{2} C^{2}}\right), \\
K &=\frac{a^{2}-\alpha_{1}}{a^{2}-\alpha_{2}}.
\end{align*}
The estimate of the hidden state then takes the form
\begin{align*}
\widehat{X}_{t}=&\exp \left(\int_{0}^{t} H(s) d s\right) \widehat{X}_{0} \\
&+\frac{G}{D^{2}} \int_{0}^{t} \exp \left(\int_{s}^{t} H(u) d u\right) S(s) d Y_{s},
\end{align*}
where we define the function
$$
H(s)=F-\frac{G^{2}}{D^{2}} S(s).
$$
For large $s$, the approximation $S(s) \approx \alpha_{2}$ holds. The optimal estimate then takes the form
\begin{align*}
\widehat{X}_{t} \approx & \widehat{X}_{0} \exp \left(\left(F-\frac{G^{2} \alpha_{2}}{D^{2}}\right) t\right)+ \\
& \frac{G \alpha_{2}}{D^{2}} \int_{0}^{t} \exp \left(\left(F-\frac{G^{2} \alpha_{2}}{D^{2}}\right)(t-s)\right) d Y_{s} \\
=&\widehat{X}_{0} \exp (-\beta t)+\frac{G \alpha_{2}}{D^{2}} \exp (-\beta t) \int_{0}^{t} \exp (\beta s) d Y_{s},
\end{align*}
where $\beta=D^{-1} \sqrt{F^{2} D^{2}+G^{2} C^{2}}$.
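As a final numerical sketch, the closed-form quantities of this scalar example can be evaluated directly; the constants below ($F$, $C$, $G$, $D$ and the initial variance $a^2$) are illustrative placeholder values only.
\begin{verbatim}
import numpy as np

# Scalar example: analytic Riccati solution S(t) and the
# large-time approximation of the optimal estimate.
# All constants are illustrative placeholders.
F, Cc, G, D = -0.5, 0.3, 1.0, 0.2   # Cc plays the role of C
a2init = 1.0                         # S(0) = a^2

root = np.sqrt(F**2 * D**2 + G**2 * Cc**2)
alpha1 = (F * D**2 - D * root) / G**2
alpha2 = (F * D**2 + D * root) / G**2
K = (a2init - alpha1) / (a2init - alpha2)
beta = root / D

def S(t):
    """Analytic solution of the scalar Riccati equation."""
    e = np.exp((alpha2 - alpha1) * G**2 * t / D**2)
    return (alpha1 - K * alpha2 * e) / (1.0 - K * e)

def xhat_approx(t, dY, xhat0=0.0):
    """Large-time approximation of the optimal estimate,
    given observation increments dY on the time grid t."""
    w = np.exp(beta * t[:-1])        # exp(beta * s) weights
    return (np.exp(-beta * t[-1]) * xhat0
            + G * alpha2 / D**2 * np.exp(-beta * t[-1])
              * np.sum(w * dY))
\end{verbatim}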
%So we get approximately the same behaviour as in the previous example.