Table of Contents
 Introduction
 Wavelet Transforms
 Variance Decomposition
 Wavelet Thresholding
 Outlier Detection
 Conclusion
 Files
 References
Introduction to Wavelets
The new EViews 12 release has introduced several new statistical and econometric procedures. Among them is an engine for wavelet analysis. This is a complement to the existing battery of techniques in EViews used to analyze and isolate features which characterize a time series. While there are undoubtedly numerous applications to wavelets such as regression, unit root testing, fractional integration order estimation, and bootstrapping (wavestrapping), here we highlight the new EViews wavelet engine. In particular, we focuses on four popular and most often used areas of wavelet analysis: Transforms
 Variance decomposition
 Thresholding
 Outlier detection
Wavelet Transforms
The first step in wavelet analysis is usually a wavelet transform of a time series of interest. This is similar in spirit to a Fourier transform. The time series is decomposed into its constituent spectral (frequency) features on a scalebyscale basis. Recall that the idea of scale in wavelet analysis is akin to frequency in Fourier analysis. This is nothing more than a reexpression of time series observations in time, to their behaviour in the frequency domain. This allows us to see which scales (frequencies) dominate in terms of activity.Example 1: Wavelet Transforms as Informal Tests for (Non)Stationarity
Many important and routine tasks in time series analysis require classifying data as stationary or nonstationary. Any of the unit root tests available in EViews are designed to formally address such classifications. Nevertheless, wavelet transforms such as the discrete wavelet transform (DWT) or the maximum overlap discrete wavelet transform (MODWT) can also be used for a similar purpose. While formal waveletbased unit root tests are available in the literature, here we focus on demonstrating how wavelets can be used as an exploratory tool for stationarity determination in lieu of a formal test.Recall from the theoretical discussion of Mallat's algorithm in Part I that discrete wavelet transforms partition the frequency range into finer and finer blocks. For instance, at the first scale, the frequency range is split into two equal parts. The first, lower frequency part, is captured by the scaling coefficients and corresponds to the traditional (Fourier) frequency range $ \sbrace{0,\, \pi} $. The second, higher frequency part, is captured by the wavelet coefficients and corresponds to the traditional frequency range $ \sbrace{\pi,\, 2\pi} $. At the second stage, the lower frequency from the previous scale, namely the frequency region roughly corresponding to $ \sbrace{0,\, \pi} $ in the traditional Fourier context, is again split into two equal portions. Accordingly, the wavelet coefficients at scale 2 would roughly correspond to the traditional frequency region $ \sbrace{\frac{\pi}{2},\, \pi} $, whereas the scaling coefficients would roughly correspond to the traditional frequency region $ \sbrace{0,\, \frac{\pi}{2}} $, and so on.
This decomposition affords the ability to identify which features of the original time series data are dominant at which scale. In particular, if the spectra (read wavelet/scaling coefficient magnitudes) at a given scale are high, this would indicate that those coefficients are registering behaviours in the underlying data which dominate at said scale and frequency region. For instance, in the traditional Fourier context, if a series has very pronounced spectra near the frequency zero, this indicates that observations of that time series are very persistent (die off slowly). Naturally, one would classify such a series as nonstationary, possibly exhibiting a unit root. Alternatively, if a series has very pronounced spectra at higher frequencies, this indicates that the time series is driven by dynamics that frequently appear and disappear. In other words, the time series is driven by transient features and one would classify the time series as stationary. The analogue of this analysis in the context of wavelet analysis would proceed as follows.
At the first scale, if wavelet spectra dominate scaling spectra, the underlying series is dominated by higher frequency (transitory) forces and the series is most likely stationary. At scale two, if the scaling spectra dominate the wavelet spectra from the first and second scales, this indicates that lower frequency forces dominate higher frequency dynamics, providing evidence of nonstationarity. Naturally, this scalebased analysis carries on until the final decomposition scale.
To demonstrate the dynamics outlined above, we'll consider Canadian real exchange rate data extracted from the dataset in Pesaran (2007). This is a quarterly time series running from 1973Q1 to 1998Q4. The data can be found in WAVELETS.WF1. The series we're interested in is CANADA_RER. We'll demonstrate with a discrete wavelet transform (DWT) and the Haar wavelet filter. To facilitate the discussion to follow, we will consider the transformation only up to the first scale.
To perform the transform, proceed in the following steps:
 Double click on CANADA_RER to open the series window.
 Click on View/Wavelet Analysis/Transforms...
 From the Max scale dropdown, select 1.
 Click on OK.




The first plot in the output is a plot of the original series, in addition to the padded values in case a dyadic adjustment was applied. The last two plots are respectively the wavelet and scaling coefficients. Recall that at the first scale, the wavelet decomposition effectively splits the frequency spectrum into two equal portions: the low and high frequency portions, respectively. Recall further that the low frequency portion is associated with the scaling coefficients $ \mathbf{V} $ whereas the high frequency portion is associated with the wavelet coefficients $ \mathbf{W} $.
Evidently, the spectra characterizing the wavelet coefficients are significantly less pronounced than those characterizing the scaling coefficients. This is an indication that the Canadian real exchange series is possibly nonstationary. Furthermore, observe that the wavelet plot has two dashed red lines. These represent the $ \pm 1 $ standard deviation of the coefficients at that scale. This is particularly useful in visualizing which wavelet coefficients should be shrunk to zero (are insignificant) in wavelet shrinkage applications. (We will return to this later when we discuss wavelet thresholding outright.) Recall that coefficients exceeding some threshold bound (in this case the standard deviation) ought to be retained, while the remaining coefficients are shrunk to zero. From this we see that the majority of wavelet coefficients at scale 1 can be discarded. This is further evidence that high frequency forces in the CANADA_RER series are not very pronounced.
To justify the intuition, we can perform a quick ADF unit root test on CANADA_RER. To do so, from the open CANADA_RER series window, proceed as follows:
 Click on View/Unit Root Tests/Standard Unit Root Test...
 Click on OK.


While the wavelet decomposition is not a formal test, it is certainly a great way of identifying which scales (read frequencies) dominate the underlying series behaviour. Naturally, this analysis is not limited to the first scale. To see this, we will repeat the exercise above using the maximum overlap discrete wavelet transform (MODWT) with the Daubechies (daublet) filter of length 6. We will also perform the transform upto the maximum scale possible, and also indicate which and how many wavelet coefficient are affected by the boundary. (See Part I for a discussion of boundary conditions.)
From the open CANADA_RER series window, we proceed in the following steps:
 Click on View/Wavelet Analysis/Transforms...
 Change the Decomposition dropdown to Overlap transform  MODWT.
 Change the Class dropdown to Daubechies.
 From the Length dropdown select 6.
 Click on OK.






Analogous to the DWT, the MODWT partitions the frequency range into finer and finer blocks. At the first scale, we see that only a few wavelet coefficients exhibit significant spikes (ie. exceed the threshold bounds). At scales two and three, it is evident that transient features persist, but after that, don't seem to contribute much. Alternatively, the scaling coefficients at the final scale (scale 6) are roughly twice as large (0.20) as the largest wavelet spectrum (0.10) which manifests at scales 1 and 2. These are all indications that lower frequency forces dominate those at higher frequencies and that the underlying series is most likely nonstationary.
Finally, notice that for each scale, those coefficients affected by the boundary are displayed in red, and their count reported in the legends. A vertical dashed black line shows the region upto which the boundary conditions persist. Boundary coefficients are an important consequence of longer filters and higher scales. Evidently, as the scale is increased, boundary coefficients consume the entire set of coefficients. Moreover, since the MODWT is a redundant transform, the number of boundary coefficients will always be greater than those in the orthonormal DWT. As before the $ \pm 1 $ standard deviation bounds are available for reference.
Example 2: MRA as Seasonal Adjustment
It's worth noting that multiresolution analysis (MRA) is often used as an intermediate step toward some final inferential procedure. For instance, if the objective is to run a unit root test on some series, we may we wish to do so on the true signal, having discarded the noise, in order to get a more reliable test. Similarly, we may wish to run regressions on series which have been smoothed. Discarding noise from regressors may prevent clouding of inferential conclusions. This is the idea behind most existing smoothing techniques in the literature.In fact, wavelets are very well adapted to isolating many different kinds of trends and patterns, whether seasonal, nonstationary, nonlinear, etc. Here we demonstrate their potential using an artificial dataset with a quarterly seasonality. In particular, we generate 128 random normal variates and excite every first quarter with a shock. These modified normal variates are then fed as innovations into a stationary autoregressive (AR) process. This is achieved with a few commands in the command window or an EViews program as follows:
rndseed 128 'set the random seed
wfcreate q 1989 2020 'make quarterly workfile with 128 quarter
series eps = 8*(@quarter=1) + @rnorm 'create random normal innovations with each first quarter having mean 8
series x 'create a series x
x(1) = @rnorm 'set the first observation to a random normal value
smpl 1989q2 @last 'start the sample at the 2nd quarter
x = 0.75*x(1) + eps 'generate an AR process using eps as innovations
smpl @all 'reset the sample to the full workfile range
To truly appreciate the idea behind MRA, one ought to set the maximum decomposition level to a lower value. This is because the smooth series extracts the ''signal'' from the original series for all scales beyond the maximum decomposition level, whereas the ''noise'' portion of the original series is decomposed on a scalebyscale basis for all scales upto the maximum decomposition level. We now perform a MODWT MRA on the X series using a Daubechies filter of length 4 and maximum decomposition level 2, as follows:
 Double click on X to open the series.
 Click on View/Wavelet Analysis/Transforms...
 Change the Decomposition dropdown to Overlap multires.  MODWT MRA.
 Set the Max scale textbox to 2.
 Change the Class dropdown to Daubechies.
 Click on OK.




It is clear from the smooth series that seasonal patterns have been dropped from the underlying trend approximation of the original data. This is precisely what we want and the idea behind other well known seasonal adjustments techniques such as TRAMO/SEATS, X12, X13, STL Decompositions, etc., all of which can also be performed in EViews for comparison. In fact, the figure below plots our MRA smooth series against the STL decomposition trend series performed on the same data.


This figure above also suggests that the STL seasonal series should be very similar to the details from our MODWT MRA decomposition. Before demonstrating this, we remind readers that whereas the STL decomposition produces a single series estimate of the seasonal pattern, wavelet MRA procedures decompose noise (in this case seasonal patterns) on a scale by scale basis. Accordingly, at scale 1, the the MRA detail series captures all movements on a scale of 0 to 2 quarters. At scale 2, the MRA detail series captures movements on a scale of 2 to 4 quarters, and so on. In general, for each scale $ j $, the detail series capture patterns on a scale $ 2^{j1} $ to $ 2^{j} $ units, whereas the smooth series captures patterns on a scale of $ 2^{j} $ units.
Finally, turning to the comparison of seasonal variation estimates between the MRA and STL, we need to sum all detail series to compound their effect and produce a single series estimate of noise. We can then compare this with single series estimate of seasonality from the STL decomposition.


To demonstrate this in the context of nonartificial data, we'll run a MODWT MRA on the Canadian real exchange rate data using a Least Asymmetric filter of length 12 and a maximum decomposition scale 3.




Example 3: DWT vs. MODWT
We have already mentioned that the primary difference between the DWT and MODWT is redundancy. The DWT is an orthonormal decomposition whereas the MODWT is not. This is certainly an advantage of the DWT over its MODWT counterpart since it guarantees that at each scale, the decomposition captures only those features which characterize that scale, and that scale alone. Nevertheless, the DWT requires input series to be of dyadic length, whereas the MODWT does not. This is an advantage of the MODWT since information is never dropped or added to derive the transform. Nevertheless, the MODWT has an additional advantage over the DWT and it has to do with spectraltime alignment  any pronounced observations in the time domain register as spikes in the wavelet domain at the same time spot. This is unlike the DWT where this alignment fails to hold. Formally, it is said that the MODWT is associated with a zerophase filter, whereas the DWT does not. In practice, this means that outlying characteristics (spikes) in the DWT MRA will not align with outlying features of the original time series, whereas they will in the case of the MODWT MRA.To demonstrate this difference we will generate a time series of length 128 and fill it with random normal observations. We will then introduce a large outlying observation at observation 64. We will then perform a DWT MRA and a MODWT MRA decomposition of the same data using a Daubechies filter of length 4 and study the differences. We will also only consider the first scale since the remaining scales do little to further the intuition.
We can begin by creating our artificial data by typing in the following set of commands in the command window:
wfcreate u 128
series x = @rnorm
x(64) = 40
These commands create a workfile of length 128, and a series X filled with random normal variates. The 64th observation is then set to 40  roughly 10 times as large as observations in the top 1\% of the Gaussian distribution.We then generate a DWT MRA and a MODWT MRA transform of the same series. The output is summarized in the plots below.




Variance Decomposition
Another traditional application of wavelets is to variance decomposition. Just as wavelet transforms can decompose a series signal across scales, they can also decompose a series variance across scales. In particular, this is a decomposition of the amount of original variation attributed to a given scale. Naturally, the conclusions derived above on transience would hold here as well. For instance, if the contribution to overall variation is largest at scale 1, this would indicate that it is transitory forces which contribute most to overall variation. The opposite is true if higher scales are associated with larger contributions to overall variation.Example: MODWT Unbiased Variance Decomposition
To demonstrate the procedure, we will use Japanese real exchange rate data from 1973Q1 to 1988Q4, again extracted from the Pesaran (2007) dataset. The series of interest is called JAPAN_RER. We will produce a scalebyscale decomposition of variance contributions using the MODWT with a Daubechies filter of length 4. Furthermore, we'll produce a 95% confidence intervals using the asymptotic Chisquared distribution with a bandpass estimate for the EDOF. The bandpass EDOF is preferred here since the sample size is less than 128 and the asymptotic approximation to the EDOF requires a sample size of at least 128 observations for decent results.From the open series window, proceed in the following steps:
 Click on View/Wavelet Analysis/Variance Decomposition...
 Change the CI type dropdown to Asymp. BandLimited.
 From the Decomposition dropdown select Overlap transform  MODWT.
 Set the Class dropdown to Daubechies.
 Click on OK.




The first plot is a histogram of variances at each given scale. It is clear that the majority of variation in the JAPAN_RER series comes from higher scales, or lower frequencies. This is indicative of persistent behaviour in the original data, and possibly evidence of a unit root. A quick unit root test on the series will confirm this intuition. The plot below summarizes the output of a unit root test on JAPAN_RER.


Wavelet Thresholding
A particularly important aspect of empirical work is discerning useful data from noise. In other words, if an observed time series is obscured by the presence of unwanted noise, it is critical to obtain an estimate of this noise and filter it from the observed data in order to retain the useful information, or the signal. Traditionally, this filtration and signal extraction was achieved using Fourier transforms or a number of previously mentioned routines such as the STL decomposition. While the former is typically better suited to stationary data, the latter can accommodate nonstationarities, nonlinearities, and seasonalities of arbitrary type. This makes STL an attractive tool in this space and similar (but ultimately different) in function to wavelet thresholding. The following examples explores these nuances.Example: Thresholding as Signal Extraction
Given a series of observed data, recall that STL decomposition produces three curves: Trend
 Seasonality
 Remainder
The last of these is obtained by subtracting from the original data the first two curves. As an additional byproduct, STL also produces a seasonally adjusted version of the original data which derives by subtracting from the original data the seasonality curve.
In contrast, recall from the theoretical discussion in Part I of this series that the principle governing waveletbased signal extraction, otherwise known as wavelet thresholding or wavelet shrinkage, is to shrink any wavelet coefficients not exceeding some threshold to zero and then exploit the MRA to synthesize the signal of interest using the modified wavelet coefficients. This produces two curves:
 Signal
 Residual
Because wavelet thresholding treats any insignificant transient features as noise, it is very likely that any reticent cylclicality would be treated as noise and driven to zero. In this regard, the extracted signal, while perhaps free of cyclical dynamics, would really be so only by technicality, and not by intention. This is in contrast to STL which derives an explicit estimate of seasonal features, and then removes those from the original data to derive the seasonally adjusted curve. Nevertheless, in many instances, the STL seasonally adjusted curve may behave quite similarly to the signal extracted via wavelet thresholding. To demonstrate this, we'll use French real exchange rate data from 1973Q1 to 1988Q4 extracted from the Pesaran (2007) dataset. The series of interest is called FRANCE_RER. We'll also start with performing a MODWT threshold using a Least Asymmetric filter of length 12, and maximum decomposition level 1.
Double click on the FRANCE_RER series to open its window and proceed as follows:
 Click on View/Wavelet Analysis/Thresholding (Denoising)...
 Change the Decomposition dropdown to Overlap transform  MODWT.
 Set the Max scale to 1.
 Change the Class dropdown to Least Asymmetric.
 Set the Length dropdown to 12.
 Click on OK.


Next, let's derive the STL decomposition of the same data. The plots below superimpose the wavelet signal estimate on top of the STL seasonally adjusted curve, as well as the wavelet thresholded noise on top of the STL remainder series.




Outlier Detection
A particularly important and useful application of wavelets is outlier detection. While the subject matter has received some attention over the years starting with Greenblatt (1996), we focus here on a rather simple and appealing contribution by Bilen and Huzurbazar (2002). The appeal of their approach is that it doesn't require model estimation, is not restricted to processes generated via ARIMA, and works in the presence of both additive and innovational outliers. The approach does assume that wavelet coefficients are approximately independent and identically normal variates. This is a rather weak assumption since the independence assumption (the more difficult to satisfy) is typically guaranteed using the DWT. While EVIews offers the ability to perform this procedure using a MODWT, it's generally better suited to the orthonormal transform.Bilen and Huzurbazar (2002) also suggest that Haar is the preferred filter here. This is because the latter yields coefficients large in magnitude in the presence of jumps or outliers. They also suggest that the transformation be carried out only at the first scale. Nevertheless, EViews does offer the ability to stray away from these suggestions.
The overall procedure works on the principle of thresholding and the authors suggest the use of the universal threshold. The idea here is that extreme (outlying) values will register as noticeable spikes in the spectrum. As such, those values would be candidates for outlying observations. In particular, if $ m_{j} $ denotes the number of wavelet coefficients at scale $ \lambda_{j} $, the entire algorithm is summarized (and generalized) as follows:
 Apply a wavelet transform to the original data up to some scale $ J \leq M $.
 Specify a threshold value $ \eta $.
 For each $ j = 1, \ldots, J $:
 Find the set of indices $ S = \cbrace{s_{1}, \ldots, s_{m_{j}}} $ such that $ W_{i, j} > \eta $ for $ i = 1, \ldots, m_{j} $.
 Find the exact location of the outlier among original observations. For instance, if $ s_{i} $ is an index associated with an outlier:
 If the wavelet transform is the DWT, the original observation associated with that outlier is either $ 2^{j}s_{i} $ or $ (2^{j}s_{i}  1) $. To discern between the two, let $ \tilde{\mu} $ denote the mean of the original observations with observations located at $ 2s_{i} $ and $ (2s_{i}  1) $. That is: $$ \tilde{\mu} = \frac{1}{T2}\sum_{t \neq 2^{j}s_{i}\, ,\, (2^{j}s_{i}  1)}{y_{t}} $$ If $ y_{2^{j}s_{i}}  \tilde{\mu} > y_{2^{j}s_{i}  1}  \tilde{\mu} $, the location of the outlier is $ 2^{j}s_{i} $, otherwise, the location of the outlier is $ (2^{j}s_{i}  1) $.
 If the wavelet transform is the MODWT, the outlier is associated with observation $ i $.
Example: Bilen and Huzurbazar (2002) Outlier Detection
To demonstrate outlier detection, data is obtained from the US Geological Survey website https://www.usgs.gov/. As discussed in Bilen and Huzurbazar (2002), data collected in this database comes from many different sources and is generally notorious for input errors. Here we focus on a monthly dataset, collected at irregular intervals from May 19876 to June 2020, measuring water conductance at the Green River near Greendale, UT. The dataset is identified by site number 09234500.A quick summary of the series indicates that there is a large drop from typical values (500 to 800 units) in September 1999. The value recorded at this date is roughly 7.4 units. This is an unusually large drop and is almost certainly an outlying observation.
In an attempt to identify the aforementioned outlier, and perhaps uncover others, we use aforementioned wavelet outlier detection method. We stick with the defaults suggested in the paper and use a DWT transform with a Haar filter, universal threshold, a mean median absolute deviation estimator for wavelet coefficient variance, and a maximum decomposition scale set to unity.
To proceed, either download the data from the source, or open the tab Outliers in the workfile provided. The series we're interested in is WATER_CONDUCTANCE. Next, open the series window and proceed as follows:
 Click on View/Wavelet Analysis/Outlier Detection...
 Set the Max scale dropdown to 1.
 Under the Threshold group, set the Method dropdown to Hard.
 Under the Wavelet coefficient variance group, set the Method dropdown to Mean Med. Abs. Dev..
 Click on OK.


Evidently, the large outlying observation in September 1999 is accurately identified. In addition there are three other possible outlying observations identified in September 1988, January 1992, and June 2020.
Conclusion
In this first entry of our series on wavelets, we provided a theoretical overview of the most important aspects in wavelet analysis. Here we demonstrated how these principles are applied to real and artificial data using the new EViews 12 wavelet engine.Files
References
 Bilen C and Huzurbazar S (2002), "Waveletbased detection of outliers in time series", Journal of Computational and Graphical Statistics. Vol. 11(2), pp. 311327. Taylor & Francis.
 Greenblatt SA (1996), "Wavelets in econometrics", In Computational Economic Systems. , pp. 139160. Springer.
 Pesaran MH (2007), "A simple panel unit root test in the presence of crosssection dependence", Journal of applied econometrics. Vol. 22(2), pp. 265312. Wiley Online Library.
No comments:
Post a Comment