I am trying to implement the WSAE-LSTM model from the paper *A deep learning framework for financial time series using stacked autoencoders and long-short term memory*. In the first step, a Wavelet Transform is applied to the time series, although the exact implementation is not outlined in the paper.
The paper hints at applying the Wavelet Transform to the whole dataset. I was wondering whether this leaks information from the test set into the training set. This article also identifies this problem.
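As far as I can tell, the preprocessing would look something like the sketch below: a minimal PyWavelets version assuming a Haar basis, a 2-level decomposition, and soft-thresholding with the universal threshold (none of which the paper spells out, so these are my guesses):

```python
import numpy as np
import pywt

def denoise_whole_series(series, wavelet="haar", level=2):
    """Wavelet-denoise the FULL series in one go (the variant I suspect the paper uses)."""
    coeffs = pywt.wavedec(series, wavelet, level=level)
    # Universal threshold with a robust, MAD-based noise estimate from the finest detail coefficients.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thresh = sigma * np.sqrt(2 * np.log(len(series)))
    coeffs[1:] = [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(series)]

prices = np.cumsum(np.random.randn(1000)) + 100.0   # toy price series
denoised = denoise_whole_series(prices)
train, test = denoised[:800], denoised[800:]        # split AFTER transforming
```

Because the decomposition sees the entire series, every denoised point, including those in the training window, mixes in values from the future.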
From the article:
> I'm sure you've heard many times that whenever you're normalizing a time series for an ML model, you should fit your normalizer on the train set first and then apply it to the test set. The reason is quite simple: our ML model behaves like a mean reverter, so if we normalize our entire dataset in one go, we're basically giving our model the mean value it needs to revert to. I'll give you a little clue: if we knew the future mean value for a time series, we wouldn't need machine learning to tell us what trades to do ;)
>
> It's basically the exact same issue as normalising your train and test set in one go. You're leaking future information into each time step, and not even in a small way. In fact, you can run a little experiment yourself: the higher the level of wavelet transform you apply, the more miraculously "accurate" your ML model's output becomes.
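To make the normalisation analogy concrete, this is the fit-on-train-only pattern the article describes, as a minimal scikit-learn sketch (the split point and variable names are mine):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

prices = np.cumsum(np.random.randn(1000)).reshape(-1, 1)  # toy series
split = 800

# Correct: fit on the training window only, then reuse its statistics on the test window.
scaler = StandardScaler().fit(prices[:split])
train_scaled = scaler.transform(prices[:split])
test_scaled = scaler.transform(prices[split:])

# Leaky: fitting on the full series bakes the future mean/std into every sample.
# leaky = StandardScaler().fit_transform(prices)
```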
Can someone tell me whether the Wavelet Transform "normalizes" the dataset in a way that leads to data leakage when forecasting? Should it be applied to the whole dataset, or only to the training set?
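For comparison, the leak-free alternative I can think of is to re-run the transform on an expanding window, so that each denoised point only ever sees past data. This is my own workaround, not something from the paper:

```python
import numpy as np
import pywt

def denoise_causal(series, t, wavelet="haar", level=2):
    """Denoise series[:t+1] and return the last value; depends only on data up to time t."""
    coeffs = pywt.wavedec(series[: t + 1], wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745   # MAD-based noise estimate
    thresh = sigma * np.sqrt(2 * np.log(t + 1))      # universal threshold
    coeffs[1:] = [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[t]

prices = np.cumsum(np.random.randn(1000)) + 100.0
split = 800
# One transform per test step: slow, but no future information enters any input.
test_denoised = np.array([denoise_causal(prices, t) for t in range(split, len(prices))])
```

The obvious downside is cost, since it runs one full decomposition per test step, which is why I would like to know whether this is actually necessary.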