# X11 decomposition {.unnumbered #m-X11}
X11 is the decomposition part of the X13-ARIMA seasonal adjustment algorithm, originally developed by the US Census Bureau. It is a non-parametric algorithm based on [moving averages](#m-X11-ma).
The original X11 algorithm was meant to decompose monthly ($p=12$) and quarterly series ($p=4$). In JDemetra+, half-yearly ($p=2$), quadri-monthly ($p=3$) and bi-monthly data ($p=6$) can also be handled ($p=3$ and $p=6$ only in version 3.x).
In the following chapter, explanations and examples are based on monthly and quarterly data, but the principles can be extended to other supra-monthly frequencies.
In recent years, the X11 implementation in JDemetra+ v 3.x has been extended to infra-monthly data (weekly, daily, hourly...). Handling this kind of data gave rise to a tailored X11 algorithm whose peculiarities are described in [this chapter](#m-sa-hf).
## In this chapter {#m-X11-intro}
This chapter provides details on
- [algorithm steps](#X11-steps)
- [computation stages and detailed output series](#m-X11-stages-output)
- [quality measures](#m-X11-qm)
- [filter length choices](#m-X11-filters-length-choice)
- [extreme values correction](#X11-I-correc)
The practical implementation, as well as all the options available through the graphical user interface or R packages, is described in [this chapter](#sa-X11).
<!-- ref to add: the X11 method -->
## Moving averages in X11 {#m-X11-ma}
Moving averages (MA) are the building blocks of X11. They will be used successively to accomplish three goals:
- removing seasonality
- extracting seasonality
- estimating the trend on non-seasonal series
The type of MA used for each of these tasks and the computation steps are described in the sections below.
### Definitions
A moving average of *order* $p+f+1$ and coefficients $(\theta_i)$ is the operator $M$ defined as:
$$
MX_t = \sum_{i=-p}^{f}\theta_i X_{t+i}
$$
The series value in $t$ is replaced by a weighted average of $p$ past values, the current value and the $f$ future values. If $p=f$, the moving average is *centered* and if $\theta_{-i} = \theta_i$, it is *symmetrical*.
Example of simple moving average of order 3:
$$
MX_t = \frac{1}{3}(X_{t-1}+X_{t}+ X_{t+1})
$$
A moving average is a **linear operator**, $M(X_t+Y_t) = M(X_t)+M(Y_t)$.
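
The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `moving_average` and the sample values are assumptions, and points where the window does not fit are simply left as `NaN` (X11 itself uses asymmetric filters at the ends).

```python
import numpy as np

def moving_average(x, theta, p):
    """Apply M X_t = sum_{i=-p}^{f} theta_i X_{t+i}.

    theta lists the coefficients from theta_{-p} to theta_{f};
    points where the window does not fit are returned as NaN.
    """
    x = np.asarray(x, dtype=float)
    theta = np.asarray(theta, dtype=float)
    f = len(theta) - p - 1
    out = np.full(len(x), np.nan)
    for t in range(p, len(x) - f):
        out[t] = np.dot(theta, x[t - p : t + f + 1])
    return out

# Hypothetical sample values, for illustration only
x = np.array([1.0, 4.0, 7.0, 3.0, 5.0])
y = np.array([2.0, 0.0, 1.0, 6.0, 2.0])
simple3 = [1 / 3, 1 / 3, 1 / 3]   # centred, symmetrical, order 3

mx = moving_average(x, simple3, p=1)
# Linearity: M(X + Y) equals M(X) + M(Y) wherever both are defined
lhs = moving_average(x + y, simple3, p=1)
rhs = mx + moving_average(y, simple3, p=1)
```

Here `mx[1]` is the plain average of the first three values, and `lhs` and `rhs` agree on all interior points.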
### Combined moving averages
Centred and symmetrical moving averages preserve linear trends, which is a desirable property. A simple centred moving average cannot have an even order; for even orders, a centred and symmetrical average is obtained by combining simple moving averages, as the arithmetic mean of $p$ moving averages of the same order (i.e. length): $M_{p\times order}$
Combination example for order 12:
There are two intuitive ways to create a Moving Average of order 12:
$$
\begin{aligned}
M1X_t = \frac{1}{12}(&X_{t-6}+X_{t-5}+ X_{t-4}+ X_{t-3}+ X_{t-2}+ X_{t-1} \\
&+ X_{t}+ X_{t+1}+ X_{t+2}+ X_{t+3}+ X_{t+4}+X_{t+5})
\end{aligned}
$$
The other being:
$$
\begin{aligned}
M2X_t = \frac{1}{12}(&X_{t-5}+ X_{t-4}+ X_{t-3}+ X_{t-2}+ X_{t-1}+ X_{t} \\
&+ X_{t+1}+ X_{t+2}+ X_{t+3}+ X_{t+4}+X_{t+5}+X_{t+6})
\end{aligned}
$$
A centred and symmetrical MA with an **even order** (here 12) can be created:
$$
M_{2\times 12}X_t= \frac{1}{2}(M1X_t+M2X_t)
$$
which is:
$$
\begin{aligned}
M_{2\times 12}X_t = \frac{1}{24}X_{t-6} +\frac{1}{12}(&X_{t-5}+ X_{t-4}+X_{t-3}+ X_{t-2}+ X_{t-1}+ X_{t} \\
&+X_{t+1}+ X_{t+2}+ X_{t+3}+ X_{t+4}+X_{t+5})+\frac{1}{24}X_{t+6}
\end{aligned}
$$
### Suppressing locally constant seasonality
Applying a moving average of an order equal to the periodicity of the raw series removes a locally stable seasonality ($\sum_{i=1}^{12}S_{t+i} = 0$).
A moving average of order 12 will remove a locally stable monthly seasonality: $M_{1\times 12}(S)=0$ and also $M_{2\times 12}(S)=0$, with linear trend preservation.
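
Both properties can be checked numerically. The sketch below builds the $2\times 12$ weights and applies them to a purely linear trend and to a stable seasonal pattern; the test series and the helper name `centred_ma` are illustrative assumptions.

```python
import numpy as np

# Weights of the 2x12 moving average: 1/24 at both ends, 1/12 elsewhere (13 terms)
w_2x12 = np.r_[1 / 24, np.full(11, 1 / 12), 1 / 24]

def centred_ma(x, weights):
    """Apply a centred moving average; only points with a full window are returned."""
    return np.convolve(x, weights[::-1], mode="valid")

t = np.arange(72)                       # six years of monthly data
trend = 100.0 + 0.5 * t                 # linear trend
pattern = np.array([3, -1, 2, -4, 0, 1, -2, 5, -3, 1, -1, -1], dtype=float)
seasonal = np.tile(pattern, 6)          # stable seasonality, sums to 0 over a year

smoothed_trend = centred_ma(trend, w_2x12)        # equals trend[6:-6]
smoothed_seasonal = centred_ma(seasonal, w_2x12)  # equals 0 everywhere
```

Six points are lost at each end because the window needs six past and six future values.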
## X11 algorithm steps {#X11-steps}
The X11 decomposition algorithm has eight main steps, outlined below for a monthly time series. (For a quarterly time series a $2\times 4$
moving average would be used instead of $2\times 12$.)
Step 1: Estimation of the **trend-cycle** with a $2\times 12$ MA
$$
TC_t^{(1)}=M_{2\times 12}(X_t)
$$
Step 2: Estimation of the **seasonal+irregular** component
$$
(S_t+I_t)^{(1)}= X_t - TC_t^{(1)}
$$
Step 3: Estimation of the **seasonal** component by applying a $3\times 3$ MA to **each month**
$$
S_t^{(1)}= M_{3\times 3}\left[(S_t+I_t)^{(1)}\right]
\text{ and normalisation }
Snorm_t^{(1)}=S_t^{(1)} - M_{2\times 12}\left(S_t^{(1)}\right)
$$
Step 4: First estimation of the seasonally adjusted series
$$
Xsa_t^{(1)}= (TC_t+I_t)^{(1)} = X_t-Snorm_t^{(1)}
$$
Step 5: Refined estimation of the **trend-cycle** with a Henderson filter, which yields a better approximation of trends than the $2\times 12$ MA, but cannot be applied to a seasonal series
$$
TC_t^{(2)}=H_{13}(Xsa_t^{(1)})
$$
Step 6: Refined estimation of the **seasonal+irregular** part
$$
(S_t+I_t)^{(2)}= X_t - TC_t^{(2)}
$$
Step 7: Refined estimation of the **seasonal** component by applying a $3\times 5$ MA (generally) to **each month/quarter**
$$
S_t^{(2)}= M_{3\times 5}\left[(S_t+I_t)^{(2)}\right]
\text{ and normalisation }
Snorm_t^{(2)}=S_t^{(2)} - M_{2\times 12}\left(S_t^{(2)}\right)
$$
Step 8: Final estimation of the seasonally adjusted series
$$
Xsa_t^{(2)}= X_t - Snorm_t^{(2)}
$$
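
The eight steps can be sketched as follows for a monthly series and an additive model. This is a simplified illustration, not the JDemetra+ implementation: there is no extreme value correction, the Henderson weights come from the usual closed-form expression (not given in this chapter), and the end points are handled by truncating and rescaling the weights instead of using X11's asymmetric filters. All function names are assumptions.

```python
import numpy as np

W_2X12 = np.r_[1 / 24, np.full(11, 1 / 12), 1 / 24]   # 2x12 MA (13 terms)
W_3X3 = np.array([1, 2, 3, 2, 1]) / 9                  # 3x3 MA (5 terms)
W_3X5 = np.array([1, 2, 3, 3, 3, 2, 1]) / 15           # 3x5 MA (7 terms)

def henderson(length):
    """Symmetric Henderson weights from the usual closed-form expression."""
    m = length // 2
    n = m + 2
    j = np.arange(-m, m + 1, dtype=float)
    num = (315 * ((n - 1) ** 2 - j**2) * (n**2 - j**2)
           * ((n + 1) ** 2 - j**2) * (3 * n**2 - 16 - 11 * j**2))
    den = 8 * n * (n**2 - 1) * (4 * n**2 - 1) * (4 * n**2 - 9) * (4 * n**2 - 25)
    return num / den

def smooth(x, w):
    """Centred MA; near the ends the weights are truncated and rescaled
    (a crude stand-in for X11's asymmetric filters)."""
    h = len(w) // 2
    out = np.empty(len(x))
    for t in range(len(x)):
        lo, hi = max(0, t - h), min(len(x), t + h + 1)
        ww = w[lo - t + h : hi - t + h]
        out[t] = np.dot(ww, x[lo:hi]) / ww.sum()
    return out

def smooth_by_period(x, w, period=12):
    """Apply a seasonal MA separately to each month (same month, successive years)."""
    out = np.empty(len(x))
    for m in range(period):
        out[m::period] = smooth(x[m::period], w)
    return out

def x11_additive_sketch(x):
    x = np.asarray(x, dtype=float)
    tc1 = smooth(x, W_2X12)                       # step 1: trend-cycle
    si1 = x - tc1                                 # step 2: seasonal + irregular
    s1 = smooth_by_period(si1, W_3X3)             # step 3: seasonal ...
    s1 = s1 - smooth(s1, W_2X12)                  # ... and normalisation
    sa1 = x - s1                                  # step 4: first SA series
    tc2 = smooth(sa1, henderson(13))              # step 5: Henderson trend-cycle
    si2 = x - tc2                                 # step 6
    s2 = smooth_by_period(si2, W_3X5)             # step 7: seasonal ...
    s2 = s2 - smooth(s2, W_2X12)                  # ... and normalisation
    sa2 = x - s2                                  # step 8: final SA series
    return tc2, s2, sa2

# Noise-free test series: linear trend plus stable seasonality (20 years)
t = np.arange(240)
trend = 100.0 + 0.5 * t
pattern = np.array([3, -1, 2, -4, 0, 1, -2, 5, -3, 1, -1, -1], dtype=float)
seasonal = np.tile(pattern, 20)
tc, s, sa = x11_additive_sketch(trend + seasonal)
```

On this noise-free series the seasonal pattern and the trend are recovered exactly in the middle of the sample; the values close to both ends are distorted by the simplified end-point treatment.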
## Processing stages and output {#m-X11-stages-output}
To evaluate the different components of a series, while taking into
account the possible presence of extreme observations, X11 will proceed
iteratively:
- estimation of components
- search for disruptive effects in the irregular component
- estimation of components over a corrected series
- search for disruptive effects in the irregular component
- ...
The steps described above will be run at least six times.
The algorithm is split into
- four processing stages called A, B, C and D
- two diagnostics parts: E tables and Quality Measures (Summary and Detailed)
![](All_images/ESTP_bbk_X11_basic_principle.png)
![](All_images/ESTP_bbk_X11_detailed_computation.png)
### Stage A: Pre-adjustment
If a [pre-treatment](#sa-pre-t) is performed using the X-13-ARIMA algorithm, this A step is not used: the A tables appearing in JDemetra+ output are a copy of the pre-treatment results.
Detailed series produced at the end of stage A, including estimated effects from the [Reg-Arima part](#sa-pre-t):
(to be checked: v2 vs v3 display, in GUI and R, here info based on v2)
- Table A1: Original raw series;
- Table A1a: Forecast of Original Series;
- Table A2: Leap year effect;
- Table A6: Trading Day effect (1 or 6 variables);
- Table A7: The Easter effect;
- Table A8: Total Outlier Effect;
- Table A8i: Additive outlier effect;
- Table A8t: Level shift effect;
- Table A8s: Transitory effect;
- Table A9: Effect of user-defined regression variables assigned to
the seasonally adjusted series or for which the component has not
been defined;
- Table A9sa: Effect of user-defined regression variables assigned to
the seasonally adjusted series;
- Table A9u: Effect of user-defined regression variables for which the
component has not been defined.
### Stage B: First automatic correction of the series
This stage consists of a first estimation and [down-weighting of
extreme observations](#X11-I-correc). It is performed by applying the algorithm [steps](#X11-steps) outlined above twice. It starts from the linearised or raw series copied in Table B1 and leads to Table B20, which contains adjustment values for extreme observations. Correcting B1 with the weights from B20 gives the series C1, which starts the next stage.
Detailed series produced at the end of stage B:
- Table B1: Original series after adjustment by the Reg-ARIMA model;
- Table B2: Unmodified Trend (preliminary estimation using composite
moving average);
- Table B3: Unmodified Seasonal: Irregular Component (preliminary
estimation);
- Table B4: Replacement Values for Extreme SI Values;
- Table B5: Seasonal Component;
- Table B6: Seasonally Adjusted Series;
- Table B7: Trend (estimation using Henderson moving average);
- Table B8: Unmodified Seasonal: Irregular Component;
- Table B9: Replacement Values for Extreme SI Values;
- Table B10: Seasonal Component;
- Table B11: Seasonally Adjusted Series;
- Table B13: Irregular Component;
- Table B17: Preliminary Weights for the Irregular;
- Table B20: Adjustment Values for the original series B1, used to compute C1.
(Up coming here: computation steps from B1 to B20)
### Stage C: Second automatic correction of the series
Following the same steps as Stage B, this stage leads to Table C20, from which the final "cleaned up" series shown in D1 is computed.
Detailed series produced at the end of stage C:
- Table C1: Modified Raw Series;
- Table C2: Trend (preliminary estimation using composite moving
average);
- Table C4: Modified Seasonal: Irregular Component;
- Table C5: Seasonal Component;
- Table C6: Seasonally Adjusted Series;
- Table C7: Trend (estimation using Henderson moving average);
- Table C9: Seasonal: Irregular Component;
- Table C10: Seasonal Component;
- Table C11: Seasonally Adjusted Series;
- Table C13: Irregular Component;
- Table C20: Adjustment Values for the original series B1, used to compute D1.
(To be checked: C20 weights will be applied to B1 not C1)
(Up coming here: computation steps from C1 to C20)
### Stage D: Final decomposition and seasonal adjustment
In this part, all algorithm steps are applied one last time, leading to the computation of the final components.
Detailed series produced at the end of stage D:
- Table D1: Modified Raw Series;
- Table D2: Trend (preliminary estimation using composite moving
average);
- Table D4: Modified Seasonal: Irregular Component;
- Table D5: Seasonal Component;
- Table D6: Seasonally Adjusted Series;
- Table D7: Trend (estimation using Henderson moving average);
- Table D8: Unmodified Seasonal: Irregular Component;
- Table D9: Replacement Values for Extreme SI Values;
- Table D10: Final Seasonal Factors;
- Table D10A: Forecast of Final Seasonal Factors;
- Table D11: Final Seasonally Adjusted Series;
- Table D11A: Forecast of Final Seasonally Adjusted Series;
- Table D12: Final Trend (estimation using Henderson moving
average);
- Table D12A: Forecast of Final Trend Component;
- Table D13: Final Irregular Component;
- Table D16: Seasonal and Calendar Effects;
- Table D16A: Forecast of Seasonal and Calendar Component;
- Table D18: Combined Calendar Effects Factors.
All final components include pre-adjustment effects stemming from outliers or regressors used in the pre-treatment step. (to be checked: v2 vs v3, here info based on v2)
D10 doesn't contain calendar effects which are added in D16.
(Up coming here: computation steps from D1 to D20)
### Stage E: Components modified for large extreme values
In this part, additional series are computed and used in the Quality Measures part.
Detailed series produced at the end of stage E:
- Table E1: Raw Series Modified for Large Extreme Values
- Table E2: SA Series Modified for Large Extreme Values
- Table E3: Final Irregular Component Adjusted for Large Extreme
Values
- Table E11: Robust Estimation of the Final SA Series
(Up coming here: computation steps for E Tables)
## Quality Measures {#m-X11-qm}
All the diagnostics below can be displayed in the GUI by expanding the nodes:
Decomposition(X11) \> Quality Measures \> Summary
Decomposition(X11) \> Quality Measures \> Details
### M-statistics
M statistics are specific quality measures (ref: Lothian and Mory (1979))
- $0<M_x<3$, acceptance region $M_x \leq 1$
- 11 statistics of the decomposition quality (M1 to M11) and 2 summary indicators (Q and Q-M2)
Detailed description:
- $M1$ measures the contribution of the irregular component to the
total variance. When it is above 1 some changes in outlier
correction should be considered.
- $M2$, which is very similar to $M1$, is calculated on the basis of
the contribution of the irregular component to the stationary
portion of the variance. When it is above 1, some changes in
outlier correction should be considered.
- $M3$ compares the irregular to the trend taken from a preliminary
estimate of the seasonally adjusted series. If this ratio is too
large, it is difficult to separate the two components from each
other. When it is above 1 some changes in outlier correction should
be considered.
- $M4$ tests the randomness of the irregular component. A value above
1 denotes a correlation in the irregular component. In such a case a
shorter seasonal moving average filter should be considered.
- $M5$ is used to compare the significance of changes in the trend
with that in the irregular. When it is above 1 some changes in
outlier correction should be considered.
- $M6$ checks the $\text{SI}$ (seasonal: irregular components
ratio). If annual changes in the irregular component are too small
in relation to the annual changes in the seasonal component, the
$3 \times 5$ seasonal filter used for the estimation of the seasonal
component is not flexible enough to follow the seasonal movement. In
such a case a shorter seasonal moving average filter should be
considered. It should be stressed that $M6$ is calculated only if
the $3 \times 5$ filter has been applied in the model.
- $M7$ is the combined test for the presence of an identifiable
seasonality. The test compares the relative contribution of stable
and moving seasonality[^m-X11-decomposition-2].
- $M8$ to $M11$ measure if the movements due to the short-term
quasi-random variations and movements due to the long-term changes
are not changing too much over the years. If the changes are too
strong then the seasonal factors could be erroneous. In such case a
correction for a seasonal break or the change of the seasonal filter
should be considered.
The $Q$ statistic is a composite indicator calculated from the $M$
statistics.
$$
\small
Q = \frac{10M1 + 11M2 + 10M3 + 8M4 + 11M5 + 10M6 + 18M7 + 7M8 + 7M9 + 4M10 + 4M11}{100}
$$
$Q - M2$ (also called $Q2$) is the $Q$ statistic for which the $M2$
statistic was excluded from the formula, i.e.:
$$
\small
Q - M2 = \frac{10M1 + 10M3 + 8M4 + 11M5 + 10M6 + 18M7 + 7M8 + 7M9 + 4M10 + 4M11}{89}
$$
If a time series does not cover at least 6 years, the $M8$, $M9$, $M10$
and $M11$ statistics cannot be calculated. In this case the $Q$
statistic is computed as:
$$
\small
Q = \frac{14M1 + 15M2 + 10M3 + 8M4 + 11M5 + 10M6 + 32M7}{100}
$$
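
The three formulas above are weighted averages of the $M$ statistics. A minimal sketch, with the weights copied from the formulas and hypothetical function and variable names:

```python
import numpy as np

# Weights of M1..M11 in Q (they sum to 100)
W_FULL = np.array([10, 11, 10, 8, 11, 10, 18, 7, 7, 4, 4], dtype=float)
# Weights of M1..M7 when M8..M11 cannot be calculated (they also sum to 100)
W_SHORT = np.array([14, 15, 10, 8, 11, 10, 32], dtype=float)

def q_statistic(m_stats):
    """Q from 11 statistics (M1..M11) or from 7 statistics (M1..M7)."""
    m = np.asarray(m_stats, dtype=float)
    if len(m) == 11:
        w = W_FULL
    elif len(m) == 7:
        w = W_SHORT
    else:
        raise ValueError("expected 7 or 11 M statistics")
    return float(np.dot(w, m) / w.sum())

def q_without_m2(m_stats):
    """Q - M2: same weights as Q with M2 dropped, so the divisor becomes 89."""
    m = np.asarray(m_stats, dtype=float)
    if len(m) != 11:
        raise ValueError("expected 11 M statistics")
    w = W_FULL.copy()
    w[1] = 0.0
    return float(np.dot(w, m) / w.sum())

# Hypothetical values: every statistic at 0.5 except M2 at 2.0
m_example = [0.5, 2.0] + [0.5] * 9
q = q_statistic(m_example)      # (0.5 * 89 + 2.0 * 11) / 100
q2 = q_without_m2(m_example)    # 0.5, since M2 no longer contributes
```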
### Detailed Quality measures
#### Average percent change (or Average differences) without regard to sign over the indicated span
The first table presents the average percent changes (multiplicative model) or average differences (additive model), without regard to sign, over several spans (from 1 to 12 periods for a monthly series, from 1 to 4 for a quarterly series) for the following series:
- $O$: Original series (Table A1);
- $\text{CI}$: Final seasonally adjusted series (Table D11);
- $I$: Final irregular component (Table D13);
- $C$: Final trend (Table D12);
- $S$: Final seasonal factors (Table D10);
- $P$: Preliminary adjustment coefficients, i.e. regressors estimated by the Reg-Arima model (Table A2);
- $TD\& H$: Final calendar component (Tables A6 and A7);
- $\text{Mod.O}$: Original series adjusted for extreme values (Table E1);
- $\text{Mod.CI}$: Final seasonally adjusted series corrected for extreme values (Table E2);
- $\text{Mod.I}$: Final irregular component adjusted for extreme values (Table E3).
In the case of an additive decomposition, for each component the average absolute changes over several periods are calculated as:
$$
\text{Component}_{d} = \frac{1}{n - d}\sum_{t = d + 1}^{n}|Table_{t} - Table_{t - d}|
$$
where:
$d$: time lag in periods ($d$ varies from 1 to 4 for a quarterly time series and from 1 to 12 for a monthly time series);
$n$: total number of observations per period;
$\text{Component}$: the name of the component;
$\text{Table}$: the name of the table that corresponds to the component.
![](All_images/RDimage48.png)
For the multiplicative decomposition the following formula is used:
$$
\text{Component}_{d} = \frac{1}{n - d}\sum_{t = d+1}^{n}\left|\frac{\text{Table}_{t}}{\text{Table}_{t - d}} - 1\right|
$$
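
Both formulas translate directly into code. In the sketch below the function name `average_change` and the sample values are illustrative assumptions; the result is returned as a fraction, exactly as in the formulas.

```python
import numpy as np

def average_change(table, d, multiplicative=False):
    """Average change without regard to sign over a span of d periods."""
    x = np.asarray(table, dtype=float)
    if multiplicative:
        changes = np.abs(x[d:] / x[:-d] - 1.0)   # |Table_t / Table_{t-d} - 1|
    else:
        changes = np.abs(x[d:] - x[:-d])         # |Table_t - Table_{t-d}|
    return float(changes.mean())                 # mean over the n - d available terms

# Hypothetical values of one table
series = np.array([100.0, 102.0, 101.0, 105.0])
add_1 = average_change(series, 1)                        # (2 + 1 + 4) / 3
add_2 = average_change(series, 2)                        # (1 + 3) / 2
mult_1 = average_change(series, 1, multiplicative=True)
```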
#### Relative contribution to the variance of the differences in the components of the original series
The relative contributions of the different components to the differences (additive model) or percent changes (multiplicative model) in the original series express the relative importance of the changes in each component. Assuming that the components are independent, the following relation is valid:
$$
O_{d}^{2} \approx C_{d}^{2} + S_{d}^{2} + I_{d}^{2} + P_{d}^{2} + {TD\& H}_{d}^{2}
$$
In order to simplify the analysis, the approximation can be replaced by the following equation:
$$
O_{d}^{*2} = C_{d}^{2} + S_{d}^{2} + I_{d}^{2} + P_{d}^{2} + {TD\& H}_{d}^{2}
$$
The column $\text{Total}$ denotes total changes in the raw time series.
Data presented in Table F2B indicate the relative contribution of each component to the percent changes (differences) in the original series over each span, and are calculated as:
$\frac{I_{d}^{2}}{O_{d}^{*2}}$, $\frac{C_{d}^{2}}{O_{d}^{*2}}$, $\frac{S_{d}^{2}}{O_{d}^{*2}}$, $\frac{P_{d}^{2}}{O_{d}^{*2}}$ and $\frac{{TD\& H}_{d}^{2}}{O_{d}^{*2}}$ where: $O_{d}^{*2} = I_{d}^{2} + C_{d}^{2} + S_{d}^{2} + P_{d}^{2} + {TD\& H}_{d}^{2}$.
The last column presents the *Ratio* calculated as: $100 \times\frac{O_{d}^{*2}}{O_{d}^{2}}$, which is an indicator of how well the approximation ${(O_{d}^{*})}^{2} \approx O_{d}^{2}$ holds.
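
As an illustration, the shares and the *Ratio* for one span $d$ could be computed as below. The dictionary layout, the function name and the numbers are assumptions made for the example; the inputs are the average changes of the first table for that span.

```python
def relative_contributions(changes):
    """changes maps component names to their average change over one span d.

    The key "O" holds the original series; all other keys are components.
    Returns the share of each component in O*^2 and the ratio 100 * O*^2 / O^2.
    """
    squares = {name: value**2 for name, value in changes.items() if name != "O"}
    o_star_sq = sum(squares.values())
    shares = {name: sq / o_star_sq for name, sq in squares.items()}
    ratio = 100.0 * o_star_sq / changes["O"] ** 2
    return shares, ratio

# Hypothetical average changes for one span
example = {"O": 13.0, "I": 3.0, "C": 4.0, "S": 12.0, "P": 0.0, "TD&H": 0.0}
shares, ratio = relative_contributions(example)
```

With these values $O_{d}^{*2} = 9 + 16 + 144 = 169 = O_{d}^{2}$, so the ratio is 100 and the seasonal component accounts for most of the changes.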
![](All_images/RDimage49.png)
#### Average differences with regard to sign and standard deviation over indicated span
When an additive decomposition is used, Table F2C presents the average and standard deviation of changes calculated for each time lag $d$, taking into consideration the sign of the changes of the raw series and its components. In case of a multiplicative decomposition the respective table shows the average percent differences and related standard deviations.
![](All_images/RDimage50.png)
#### Average duration of run
The average duration of run is the average number of consecutive monthly (or quarterly) changes in the same direction (no change is counted as a change in the same direction as the preceding change). JDemetra+ displays this indicator for the seasonally adjusted series, for the trend and for the irregular component.
![**Average duration of run**](All_images/RDimage51.png)
#### I/C ratio over indicated span and global
The $\frac{I}{C}$ ratios for each value of time lag $d$, presented in Table F2E, are computed on the basis of the data in Table F2A. The global I/C ratio is displayed below the table.
![**I/C ratio**](All_images/RDimage52.png)
#### Relative contribution to the stationary part of the variance in the original series
The relative contribution of components to the variance of the stationary part of the original series is calculated for the irregular component ($I$), trend made stationary ($C$), seasonal component ($S$) and calendar effects (TD&H).
The trend is made stationary by extracting a linear trend from the trend component presented in Table D12.
![**Relative contribution to the stationary part of the variance in the original series**](All_images/RDimage53.png)
#### Autocorrelations in the irregular
The last table shows the autocorrelogram of the irregular component from Table D13. In the case of multiplicative decomposition it is calculated for time lags between 1 and the number of periods per year +2 using the formula:
$$
\text{Corr}_{k}I = \frac{\sum_{t = k + 1}^{N}{(I_{t} - 1)(I_{t - k} - 1)}}{\sum_{t = 1}^{N}{(I_{t} - 1)}^{2}}
$$
where $N$ is number of observations in the time series and $k$ the lag.
For the additive decomposition the formula is:
$$
\text{Corr}_{k}I = \frac{\sum_{t = k + 1}^{N}{I_{t} I_{t - k}}}{\sum_{t = 1}^{N}{I_{t}^{2}}}
$$
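
The two formulas differ only in the value around which the irregular is centred (1 or 0). A small sketch, with a hypothetical function name and toy values:

```python
import numpy as np

def irregular_autocorrelation(irregular, k, multiplicative=False):
    """Autocorrelation of the irregular at lag k (k >= 1), as in the formulas above."""
    i = np.asarray(irregular, dtype=float)
    centred = i - 1.0 if multiplicative else i    # theoretical mean: 1 or 0
    return float(np.sum(centred[k:] * centred[:-k]) / np.sum(centred**2))

# Toy irregular that alternates in sign
irr = np.array([1.0, -1.0, 1.0, -1.0])
corr_1 = irregular_autocorrelation(irr, 1)   # -3 / 4
corr_2 = irregular_autocorrelation(irr, 2)   #  2 / 4
# Same pattern around 1 for a multiplicative decomposition
corr_1_mult = irregular_autocorrelation(irr + 1.0, 1, multiplicative=True)
```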
![**Autocorrelations in the irregular**](All_images/RDimage54.png)
#### Heteroskedasticity{#m-X11-hetero}
A Cochran test on equal variances within each period is performed in the extreme value detection procedure to check if the irregular component is heteroskedastic. In this procedure the standard errors of the irregular component are used to identify extreme values. If the null hypothesis (the variances of the irregular component are identical for all periods, i.e. months or quarters) is rejected, the standard errors are computed separately for each period. This happens only if the option *Calendarsigma*=**signif** has been selected.
![**Heteroskedasticity**](All_images/RDimage55.png)
#### Moving seasonality ratios (MSR) {#m-X11-msr}
For each $i^{\text{th}}$ month we will be looking at the mean annual changes for each component by calculating:
$$
{\overline{S}}_{i} = \frac{1}{N_{i} - 1}\sum_{t = 2}^{N_{i}}|S_{i,t} - S_{i,t - 1}|
$$
and
$$
{\overline{I}}_{i} = \frac{1}{N_{i} - 1}\sum_{t = 2}^{N_{i}}| I_{i,t} - I_{i,t - 1}|
$$
where $N_{i}$ refers to the number of months $\text{i}$ in the data, and the moving seasonality ratio of month $i$:
$$
MSR_{i} = \frac{\overline{I}_{i}}{\overline{S}_{i}}
$$
The Moving Seasonality Ratio (MSR) is used to measure the amount of noise in the Seasonal-Irregular component. By studying these values, the user can [select for each period the seasonal filter](#a-X11-seaso-filters) that is the most suitable given the noisiness of the series.
![**Moving Seasonality Ratio (MSR)**](All_images/RDimage56.png)
## Filter length choice {#m-X11-filters-length-choice}
### Trend estimation with Henderson Moving average {#m-X11-trend-filters}
In iteration B (Table B7), iteration C (Table C7) and iteration D (Table
D7 and Table D12) the trend component is extracted from an estimate of
the seasonally adjusted series using Henderson moving averages.
The algorithm chooses automatically between different filter lengths according to the $I/C$ ratio; the user can override this choice (the first step is computed with $H_{13}$).
![**Calculation of I-C ratios**](All_images/ESTP_bbk_X11_Henderson_choice_formula.png)
### Seasonality extraction filters {#m-X11-seaso-filters}
In iteration D, Table D10 shows an estimate of the seasonal factors
computed on the basis of the modified SI (Seasonal: Irregular)
factors estimated in Tables D4 and D9bis. This component has to be
smoothed to estimate the seasonal component; the length of the moving
average used depends on the importance of the irregular in the SI component.
#### Step 1: Estimating the irregular and seasonal components
An estimate of the seasonal component is obtained by smoothing, month by
month and therefore column by column, Table D9bis using a simple 7-term
moving average, i.e. of coefficients
$\frac{1}{7} \left\{1,\ 1,\ 1,\ 1,\ 1,\ 1,\ 1\right\}$. In order not to
lose three points at the beginning and end of each column, all columns
are completed as follows. Let us assume that the column that corresponds
to the month is composed of $N$ values
$\left\{ x_{1},\ x_{2},\ x_{3},\ \ldots x_{N - 1},\ x_{N} \right\}$. It
will be transformed into a series
$\left\{ x_{- 2},\ x_{- 1},\ x_{0},\ x_{1},\ x_{2},\ x_{3},\ \ldots x_{N - 1},\ x_{N},\ x_{N + 1},\ x_{N + 2},\ x_{N + 3} \right\}$
with $x_{- 2} = x_{- 1} = x_{0} = \frac{x_{1} + x_{2} + x_{3}}{3}$ and
$x_{N + 1} = x_{N + 2} = x_{N + 3} = \frac{x_{N} + x_{N - 1} + x_{N - 2}}{3}$.
We then have the required estimates: $S = M_{7}(D9bis)$ and
$I = D9bis - S$.
#### Step 2: Calculating the Moving Seasonality Ratios
For each $i^{\text{th}}$ month the mean annual changes for each
component is obtained by calculating
$$
{\overline{S}}_{i} = \frac{1}{N_{i} - 1}\sum_{t = 2}^{N_{i}}\left| S_{i,t} - S_{i,t - 1} \right|
$$
and
$$
{\overline{I}}_{i} = \frac{1}{N_{i} - 1}\sum_{t = 2}^{N_{i}}\left| I_{i,t} - I_{i,t - 1} \right|
$$
where $N_{i}$ refers to the number of months $\text{i}$ in the data,
and the moving seasonality ratio of month $i$:
$$
MSR_{i} = \frac{\ {\overline{I}}_{i}}{ {\overline{S}}_{i}}
$$
These ratios are presented in [Detailed Quality Measures](#m-X11-msr).
#### Step 3: Calculating the overall Moving Seasonality Ratio
The overall Moving Seasonality Ratio is calculated as follows:
$$
\text{MSR} = \frac{\sum_{i}N_{i}\,{\overline{I}}_{i}}{\sum_{i}N_{i}\,{\overline{S}}_{i}}
$$
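
Steps 1 to 3 can be sketched as below. The sketch assumes that the SI values are stored in a two-dimensional array with one row per year and one column per period, and that all years are complete (so every $N_i$ is the same); the function name and the simulated data are illustrative.

```python
import numpy as np

def moving_seasonality_ratios(si):
    """Return (MSR per period, overall MSR) from an SI table (years x periods)."""
    si = np.asarray(si, dtype=float)
    n_years, n_periods = si.shape
    # Step 1: extend each column by three values at both ends (mean of the
    # first / last three observations), then apply the simple 7-term MA
    head = np.repeat(si[:3].mean(axis=0, keepdims=True), 3, axis=0)
    tail = np.repeat(si[-3:].mean(axis=0, keepdims=True), 3, axis=0)
    padded = np.vstack([head, si, tail])
    s = np.stack([padded[k : k + n_years] for k in range(7)]).mean(axis=0)
    i = si - s
    # Step 2: mean absolute annual change of each component, period by period
    s_bar = np.abs(np.diff(s, axis=0)).mean(axis=0)
    i_bar = np.abs(np.diff(i, axis=0)).mean(axis=0)
    msr_by_period = i_bar / s_bar
    # Step 3: overall ratio, weighted by the number of observations N_i
    n_i = np.full(n_periods, n_years)
    overall = np.sum(n_i * i_bar) / np.sum(n_i * s_bar)
    return msr_by_period, float(overall)

# Simulated SI table: slowly moving seasonality plus noise (12 years, monthly)
rng = np.random.default_rng(0)
years = np.arange(12)[:, None]
si_demo = 0.2 * years * np.linspace(-1, 1, 12) + rng.normal(0, 1, (12, 12))
msr_i, msr = moving_seasonality_ratios(si_demo)
```

Because the overall ratio is a ratio of weighted sums, it always lies between the smallest and the largest of the period-by-period ratios.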
#### Step 4: Selecting a moving average and estimating the seasonal component
Depending on the value of the ratio, the program automatically selects a
moving average that is applied, column by column (i.e. month by month)
to the Seasonal/Irregular component in Table D8 modified, for extreme
values, using values in Table D9.
The default selection procedure of a moving average is based on the
Moving Seasonality Ratio in the following way:
- If this ratio occurs within zone A (MSR \< 2.5), a $3 \times 3$ moving average is used; if it occurs within zone C (3.5 \< MSR \< 5.5), a $3 \times 5$ moving average is selected; if it occurs within zone E (MSR \> 6.5), a $3 \times 9$ moving average is used;
- If the MSR occurs within zone B ($2.5 \leq \text{MSR} \leq 3.5$) or D ($5.5 \leq \text{MSR} \leq 6.5$), one year of observations is removed from the end of the series, and the MSR is recalculated. If the ratio again occurs within zones B or D, we start over again, removing a maximum of five years of observations. If this does not work, i.e. if we are again within zones B or D, a $3 \times 5$ moving average is selected.
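
The default selection rule can be written as a short loop. In this sketch `msr_for_span` is a hypothetical callback that returns the overall MSR after removing a given number of years from the end of the series; it is not part of any JDemetra+ API.

```python
def select_seasonal_filter(msr_for_span, max_years_removed=5):
    """Pick "3x3", "3x5" or "3x9" from the overall MSR, following the zones above."""
    for removed in range(max_years_removed + 1):
        msr = msr_for_span(removed)
        if msr < 2.5:              # zone A
            return "3x3"
        if 3.5 < msr < 5.5:        # zone C
            return "3x5"
        if msr > 6.5:              # zone E
            return "3x9"
        # zone B or D: drop one more year and try again
    return "3x5"                   # still in zone B or D after five removals

# Hypothetical MSR values obtained after removing 0, 1, 2, ... years
values = [3.0, 6.0, 7.0]
choice = select_seasonal_filter(lambda k: values[min(k, len(values) - 1)])
```

In the example the first two values fall in zones B and D, so two years are removed before the third value (zone E) selects the $3 \times 9$ filter.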
The chosen symmetric moving average has, as the case may be, 5
($3 \times 3$), 7 ($3 \times 5$) or 11 ($3 \times 9$)
terms, and therefore does not provide an estimate for the values of
seasonal factors in the first 2 (or 3 or 5) and the last 2 (or 3 or 5)
years. These are then calculated using associated asymmetric moving
averages.
## Extreme values: identification and replacement {#X11-I-correc}
Though it is recommended to rely on the pre-adjustment stage to correct for outliers (transparent method with explicit modelling), X11 has its own (historical) module for identification and treatment of extreme values, based on a comparison between the actual and the theoretical value of $I$.
Stages B and C aim only at correcting for extreme values and contain several iterations of the following steps.
If the irregular is heteroskedastic the standard deviations used for identifying outliers will be computed separately period by period or for distinct groups of several periods.
**Step 1:** $I$ is estimated once $S$ has been extracted from $S+I$
- for each year the standard deviation $\sigma$ is computed on the 5 neighbouring years
- $I$ has a theoretical value $m$, for multiplicative model $m=1$, $m=0$ for an additive model
- for a given year $y$: any point such that $\lvert I_t - m\rvert >2.5 \sigma_y$ is considered an extreme value and set aside
- all the yearly sigmas ($\sigma_y$) are then recomputed without those points (more robust sigmas)
**Step 2:** The distance $\lvert I_t - m\rvert$ is computed for each point and evaluated with $\sigma_{y}$ as a benchmark, a weight $w_t$ is then assigned to each point, 3 cases:
1) value unchanged
$$
\lvert I_t - m\rvert <1.5 \sigma_y \Rightarrow w_t=1
$$
2) value downsized
$$
1.5 \sigma_y<\lvert I_t - m\rvert <2.5 \sigma_y \Rightarrow w_t=\frac{2.5 \sigma_y -\lvert I_t - m\rvert}{2.5 \sigma_y-1.5 \sigma_y}
$$
3) value removed and replaced
$$
\lvert I_t - m\rvert >2.5 \sigma_y \Rightarrow w_t=0
$$
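
The three cases amount to a linear weight clipped to the interval $[0, 1]$. A minimal sketch, where the function name and sample values are assumptions and `sigma` may be a single value or one value per observation (the $\sigma_y$ of the year each point belongs to):

```python
import numpy as np

def extreme_value_weights(irregular, sigma, m=0.0):
    """Weights w_t from the distance |I_t - m|, with limits at 1.5 and 2.5 sigma.

    m is the theoretical value of the irregular (0 additive, 1 multiplicative).
    """
    dist = np.abs(np.asarray(irregular, dtype=float) - m)
    w = (2.5 * sigma - dist) / (2.5 * sigma - 1.5 * sigma)
    return np.clip(w, 0.0, 1.0)   # above 1 -> unchanged, below 0 -> removed

# Hypothetical irregular values with sigma = 1 (additive model)
weights = extreme_value_weights([0.5, -2.0, 3.0], sigma=1.0)
# distances 0.5, 2.0 and 3.0 give weights 1, 0.5 and 0
```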
**Step 3:**
Using these weights, a new value of $S+I$ is computed:
- if $w_{t}=1$, $S+I$ remains unchanged for point $t$
- if $w_{t}<1$, the new value of $S+I$ is an average of $w_{t}*(S+I)_{t}$ and the values of $(S+I)$ of the two closest neighbours in the future and in the past with $w=1$
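
One possible reading of this replacement rule is sketched below. It rests on two assumptions that the text does not spell out: the neighbours are searched within the sub-series passed to the function (for instance the values of one given month), and the average is a weighted one, dividing by $w_t$ plus the number of neighbours used. The function name and data are hypothetical.

```python
import numpy as np

def replace_extreme_si(si, w):
    """Replace S+I values with weight below 1 by a weighted average (assumed form).

    Assumption: at least one full-weight neighbour exists for each replaced point.
    """
    si = np.asarray(si, dtype=float)
    w = np.asarray(w, dtype=float)
    out = si.copy()
    full = np.flatnonzero(w == 1.0)              # positions with full weight
    for t in np.flatnonzero(w < 1.0):
        past = full[full < t][-2:]               # two closest in the past
        future = full[full > t][:2]              # two closest in the future
        neighbours = np.concatenate([past, future])
        total = w[t] * si[t] + si[neighbours].sum()
        out[t] = total / (w[t] + len(neighbours))
    return out

# Hypothetical S+I values for one month, third value flagged as extreme
si_month = np.array([1.0, 1.0, 10.0, 1.0, 1.0])
removed = replace_extreme_si(si_month, [1, 1, 0.0, 1, 1])      # third value -> 1.0
downweighted = replace_extreme_si(si_month, [1, 1, 0.5, 1, 1]) # third value -> 2.0
```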