Introduction to Time Series Analysis - 01
This note is for course MATH 545 at McGill University.
Lecture 1 - Lecture 3
Reference Book
- Introduction to Time Series and Forecasting (by Brockwell and Davis)
- The Analysis of Time Series: an Introduction with R (by Chatfield and Xing)
Time series
{Xt} is a collection of random variables, where t is the index of time.
The process of dealing with time series
- Describe: plot the data to obtain a concise summary.
- Explain: build probabilistic models (joint distributions) for the data.
- Predict: forecast future values together with a measure of their uncertainty.
Note that if the Xt were mutually independent, the joint distribution would factorize as Pr(X1≤x1, X2≤x2, ..., Xn≤xn) = ∏_{i=1}^{n} Pr(Xi≤xi). But for most time series the variables are dependent, so instead we work with the general factorization Pr(X1≤x1, X2≤x2, ..., Xn≤xn) = Pr(X1≤x1) Pr(X2≤x2 | X1≤x1) ... Pr(Xn≤xn | X1≤x1, ..., Xn−1≤xn−1).
Semi-parametric model
In semi-parametric models, we do not specify the pdf or cdf of the random variables; instead we specify only E(Xt) and Cov(Xt, Xt+j).
Examples
- iid noise: let E(Xt)=0, ∀t and Pr(X1≤x1, X2≤x2, ..., Xn≤xn) = ∏_{i=1}^{n} Pr(Xi≤xi) = ∏_{i=1}^{n} F(xi), where F(⋅) is the common cumulative distribution function.
- random walk: let {Xt} be iid noise and St = X1 + X2 + ... + Xt. Then {St} is a random walk. (Note that the St are not independent, but E(St)=0.)
Models with structures
Let {Yt} be a time series with E(Yt)=0, ∀t. Let Xt = mt + Yt, where mt is a slowly changing (deterministic) function of time. (Note that Yt is the mean-zero random component, while mt is the trend, so E(Xt) = mt.)
Common choices for mt include a linear function of t or, more generally, a polynomial in t.
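As a concrete sketch (not from the lecture), the trend mt can be estimated by ordinary least squares on a polynomial in t and then subtracted off. The Python/NumPy example below uses made-up trend coefficients and noise level purely for illustration.

```python
import numpy as np

# Simulate a series with a linear trend m_t = 2 + 0.5*t plus mean-zero noise Y_t.
# The trend coefficients and the noise level are made up for this illustration.
rng = np.random.default_rng(0)
t = np.arange(100)
x = 2 + 0.5 * t + rng.normal(0, 3, size=t.size)

# Estimate m_t by least squares with a degree-1 polynomial in t,
# then form the residuals X_t - m_hat_t.
coeffs = np.polyfit(t, x, deg=1)     # highest-degree coefficient first
m_hat = np.polyval(coeffs, t)
residuals = x - m_hat

print("estimated trend coefficients:", coeffs)
print("residual mean (should be near 0):", residuals.mean())
```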
Models with seasonal variation (periodicity)
Let Xt=St+Yt, where E(Yt)=0,∀t and St is a periodic function with period d (i.e. St−d=St).
Common choices for St include a sum of harmonic functions, St = a0 + ∑_{j=1}^{k} (aj cos(λj t) + bj sin(λj t)), where the aj and bj are estimated and the λj are fixed frequencies.
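Since St is linear in the aj and bj once the λj are fixed, these coefficients can be estimated by ordinary least squares. Below is a minimal Python/NumPy sketch; the period d = 12 (hence λ1 = 2π/12), the simulated coefficients, and the noise level are illustrative assumptions rather than values from the notes.

```python
import numpy as np

# Fit S_t = a0 + a1*cos(lam1*t) + b1*sin(lam1*t) by ordinary least squares.
# The period d = 12 (so lam1 = 2*pi/12) and the simulated coefficients are
# illustrative assumptions.
rng = np.random.default_rng(1)
t = np.arange(120)
lam1 = 2 * np.pi / 12
x = 5 + 3 * np.cos(lam1 * t) + 1.5 * np.sin(lam1 * t) + rng.normal(0, 1, size=t.size)

# S_t is linear in (a0, a1, b1) once lam1 is fixed, so least squares applies directly.
design = np.column_stack([np.ones(t.size),
                          np.cos(lam1 * t),
                          np.sin(lam1 * t)])
coef, *_ = np.linalg.lstsq(design, x, rcond=None)
print("estimated (a0, a1, b1):", coef)

residuals = x - design @ coef   # de-seasonalized series X_t - S_hat_t
print("residual std:", residuals.std())
```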
General strategy for analysis
- Plot the data to
- identify potential signal (trend, seasonal)
- identify possible models for the residual process
- identify outliers and other weird things
- Remove the signal
- Choose a model to fit the residuals and estimate the dependence structure
- Forecast by projecting the residuals forward and adding back the estimated signal
Why we focus on the residuals (i.e. Xt − m̂t or Xt − Ŝt)
Let Wi ∼ iid N(μ, σ²); then Wi − μ ∼ N(0, σ²). We can estimate μ to remove the signal, and we can also estimate σ².
Stationary process (series)
A series is stationary if {Xs}_{s=0,1,...,n} has the same statistical properties as the shifted series {Xt+s}_{s=0,1,...,n}, for every t. (Note that we will focus on first and second order moments.) iid noise is a special case of a stationary process.
Def. Xt is weakly stationary if
- E(Xt)=μX(t) is independent of t
- Cov(Xr,Xs)=E((Xr−μX(r))(Xs−μX(s)))=γX(r,s), where γX is the covariance function of Xt
We require that γX(t+h,t) is independent of t (i.e. γX(t+h,t)=γX(h,0)=Cov(Xh,X0))
Def. {Xt} is strongly stationary if the joint distribution of {Xs}_{s=0,1,...,n} is the same as that of {Xt+s}_{s=0,1,...,n} for all t and n.
We define γX(h) := γX(h, 0); it is the auto-covariance function of a stationary series at lag h.
We define ρX(h) = γX(h)/γX(0) = Cor(Xt+h, Xt); it is the auto-correlation function at lag h.
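In practice γX(h) and ρX(h) are estimated from data by their sample versions, γ̂(h) = (1/n) ∑_{t=1}^{n−h} (xt+h − x̄)(xt − x̄) and ρ̂(h) = γ̂(h)/γ̂(0). The Python/NumPy sketch below implements these standard estimators (the helper names sample_acvf and sample_acf are our own).

```python
import numpy as np

def sample_acvf(x, h):
    """Sample autocovariance at lag h: (1/n) * sum_t (x_{t+h} - xbar)(x_t - xbar)."""
    x = np.asarray(x, dtype=float)
    n, xbar, h = len(x), x.mean(), abs(h)
    return np.sum((x[h:] - xbar) * (x[: n - h] - xbar)) / n

def sample_acf(x, h):
    """Sample autocorrelation at lag h: gamma_hat(h) / gamma_hat(0)."""
    return sample_acvf(x, h) / sample_acvf(x, 0)

# Quick check on iid noise: the sample ACF at nonzero lags should be close to 0.
rng = np.random.default_rng(2)
z = rng.normal(0, 1, size=1000)
print([round(sample_acf(z, h), 3) for h in range(4)])
```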
Useful identity
If E(X²) < ∞, E(Y²) < ∞, E(Z²) < ∞ and a, b, c are real constants, then Cov(aX + bY + c, Z) = a Cov(X, Z) + b Cov(Y, Z).
Example 1: iid noise
Xt ∼ iid N(0, σ²)
By definition we have E(Xt) = 0. If E(Xt²) = σ² < ∞, then γX(h) = Cov(Xt+h, Xt) = σ² if h = 0, and 0 for all h ≠ 0 by independence.
Therefore the iid noise process is weakly stationary.
Example 2: White Noise Process
If {Xt} is a sequence of uncorrelated random variables with E(Xt) = 0, Var(Xt) = σ² < ∞, and γX(h) = 0 for all h ≠ 0, then we refer to it as white noise.
Note that iid noise is white noise, but white noise is not necessarily iid noise.
Example 3
Suppose {Wt} and {Zt} are iid sequences, and {Wt}⊥{Zt}.
Let {Wt} follow a Bernoulli distribution, with Pr(Wt = 0) = Pr(Wt = 1) = 1/2.
Let {Zt} follow a transformed Bernoulli distribution, with Pr(Zt = −1) = Pr(Zt = 1) = 1/2.
Set Xt = Wt(1 − Wt−1)Zt. The value table of Xt is as follows:

| Wt−1 | Wt | Xt |
| --- | --- | --- |
| 1 | 0 | 0 |
| 1 | 1 | 0 |
| 0 | 0 | 0 |
| 0 | 1 | Zt |
E(Xt) = E(Wt) E(1 − Wt−1) E(Zt) = 1/2 × 1/2 × 0 = 0
When calculating covariance, there are two cases:
- h=0
Cov(Xt, Xt+h) = E(XtXt+h) = E(Wt²(1 − Wt−1)²Zt²) = E(Wt²) E((1 − Wt−1)²) E(Zt²) = 1/2 × 1/2 × 1 = 1/4
- h ≠ 0
Cov(Xt, Xt+h) = E(XtXt+h) = E(Wt(1 − Wt−1)Zt Wt+h(1 − Wt+h−1)Zt+h) = 0, since Zt+h is independent of all the other factors and E(Zt+h) = 0.
Therefore, Xt is a white noise process.
Note that Xt and Xt−1 are dependent but not correlated.
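To make this concrete, here is a short Python/NumPy simulation of Example 3 (the sample size and seed are arbitrary choices). It checks that the sample mean and lag-1 correlation are near 0, even though Xt and Xt−1 are clearly dependent.

```python
import numpy as np

# Simulate Example 3: X_t = W_t * (1 - W_{t-1}) * Z_t, with W_t ~ Bernoulli(1/2)
# and Z_t = +/-1 each with probability 1/2 (sample size and seed are arbitrary).
rng = np.random.default_rng(3)
n = 100_000
w = rng.integers(0, 2, size=n)      # W_t in {0, 1}
z = rng.choice([-1, 1], size=n)     # Z_t in {-1, +1}
x = w[1:] * (1 - w[:-1]) * z[1:]    # X_t for t = 2, ..., n

print("sample mean :", x.mean())                            # ~ 0
print("sample var  :", x.var())                             # ~ 1/4
print("lag-1 corr  :", np.corrcoef(x[1:], x[:-1])[0, 1])    # ~ 0

# Dependence: whenever X_{t-1} != 0 we must have W_{t-1} = 1, which forces X_t = 0.
print("P(X_t != 0 | X_{t-1} != 0):", np.mean(x[1:][x[:-1] != 0] != 0))   # exactly 0
```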
Example 4: Random walk
Let {Xt} be iid noise, and St=X1+...+Xt=∑i=1tXi.
We have E(St) = 0 and Var(St) = tσ².
For h ≥ 0, Cov(St+h, St) = Cov(St + [Xt+1 + ... + Xt+h], St) = Cov(St, St) + Cov(Xt+1 + ... + Xt+h, St) = tσ² + 0 = tσ²
Therefore the random walk is not stationary: its variance and autocovariance depend on t.
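A quick Python/NumPy simulation illustrates the non-stationarity: the sample variance of St grows roughly linearly in t. The number of paths and the path length are arbitrary choices for this sketch.

```python
import numpy as np

# Simulate many independent random-walk paths S_t = X_1 + ... + X_t with
# X_i ~ N(0, 1), and check that Var(S_t) grows like t * sigma^2.
rng = np.random.default_rng(4)
n_paths, n_steps = 5000, 200
steps = rng.normal(0, 1, size=(n_paths, n_steps))
paths = steps.cumsum(axis=1)        # paths[:, t-1] holds S_t

for t in (10, 50, 200):
    print(f"Var(S_{t}) ~ {paths[:, t - 1].var():.1f}   (theory: {t})")
```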
Example 5: First order moving average process (MA(1))
Let Zt ∼ WN(0, σ²). Let Xt = Zt + θZt−1, t = 0, ±1, ±2, ..., where θ is a real-valued constant.
(Graphical representation will be added later)
We have E(Xt)=E(Zt)+θE(Zt−1)=0.
Var(Xt) = E(Xt²) = E((Zt + θZt−1)²) = E(Zt²) + 2θE(ZtZt−1) + θ²E(Zt−1²) = (1 + θ²)σ²
When calculating covariance, there are three cases:
- h=0
γX(t+h, t) = E(Xt+hXt) = E(Xt²) = (1 + θ²)σ²
- h=±1
γX(t+h, t) = E(Xt+hXt) = E((Zt+1 + θZt)(Zt + θZt−1)) = E(Zt+1Zt) + θE(Zt²) + θE(Zt+1Zt−1) + θ²E(ZtZt−1) = θσ²
- |h| > 1
γX(t+h, t) = E(Xt+hXt) = E((Zt+h + θZt+h−1)(Zt + θZt−1)) = 0, because the indices t, t−1, t+h, t+h−1 are all distinct when |h| > 1, so every cross term has expectation 0.
Therefore, MA(1) is stationary, and ρX(h) = 1 if h = 0; θ/(1 + θ²) if h = ±1; 0 if |h| > 1.
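As a sanity check, the Python/NumPy sketch below simulates an MA(1) with Gaussian white noise (θ = 0.6, σ = 1, and the sample size are illustrative choices) and compares the sample variance and autocorrelations with the formulas above.

```python
import numpy as np

# Simulate an MA(1) process X_t = Z_t + theta * Z_{t-1} with Gaussian white noise
# (theta, sigma, and the sample size are illustrative choices) and compare the
# sample variance and autocorrelations with the theoretical values.
rng = np.random.default_rng(5)
theta, sigma, n = 0.6, 1.0, 100_000
z = rng.normal(0, sigma, size=n + 1)
x = z[1:] + theta * z[:-1]

print("sample var   :", round(x.var(), 4))
print("theory var   :", (1 + theta**2) * sigma**2)
print("sample rho(1):", round(np.corrcoef(x[1:], x[:-1])[0, 1], 4))
print("theory rho(1):", theta / (1 + theta**2))
print("sample rho(2):", round(np.corrcoef(x[2:], x[:-2])[0, 1], 4))   # should be near 0
```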