[Mllab] Questions about the Rhine level dataset

16 Jul 2018

Good morning,

I have been working with the Rhine level dataset for the past few days
and, unfortunately, the results have been relatively bad. I have tried
several DNN-based prediction methods: standard models (Dense layers),
convolutional models over the data of each station, locally connected
models (applied in the same way as the convolutional ones), and LSTMs
(which have a surprisingly large margin of error, around 9 or 10% after
some 150 training epochs). So I am unsure whether DNNs are really a good
option for the task.

Moreover, I have some issues with the size of the data. My approach so far
has been to divide the data into large but manageable chunks (of, say,
50 000 points) and to iterate for a few epochs (say, 5) over each of them.
I expected this to work reasonably well (50 000 points cover several
months, which should be enough to predict some 12 hours ahead), but in
practice the validation loss stops decreasing after the first or second
epoch and stays at the same level for however long I keep training the
model. In raw numbers, this is a mean squared error of around 65 when the
data is not normalized, and similar values for the normalized case. In
practical terms, this means that when plotted, the prediction is usually
worse than just predicting the current value.
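
A minimal sketch of the "predict the current value" baseline mentioned
above, so that a model's validation error can be compared against it on
the same data. The function names, the toy series and the horizon of 2
samples are illustrative assumptions; the number of samples that
corresponds to 12 hours depends on the dataset's sampling interval, which
is not stated here.

```python
def mse(predicted, actual):
    """Mean squared error between two equally long sequences."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def persistence_forecast(levels, horizon):
    """Forecast levels[t + horizon] as levels[t].

    Returns (predictions, targets), aligned element by element.
    """
    if horizon <= 0 or horizon >= len(levels):
        raise ValueError("horizon must be between 1 and len(levels) - 1")
    predictions = levels[:-horizon]
    targets = levels[horizon:]
    return predictions, targets


# Hypothetical toy series, not the Rhine data.
levels = [100.0, 102.0, 105.0, 104.0, 101.0, 99.0]
preds, targets = persistence_forecast(levels, horizon=2)
baseline_mse = mse(preds, targets)
```

A model is only useful for this task if its error on the validation set is
below `baseline_mse` computed on that same set and in the same units.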

So, I don't have many ideas left. I tried "MinMax" normalization instead
of (X - μ) / σ, but the results didn't seem too good. I have also tried
several activations without much success (only linear and ReLU work well;
the others produce huge errors).
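
For concreteness, the two normalizations can be sketched as below. This is
only an illustration: the function names and the toy values are
assumptions, and computing the statistics on the training part only (then
reusing them for validation data) is a common convention rather than
something prescribed by the sheet.

```python
def zscore_params(train):
    """Mean and (population) standard deviation of the training values."""
    mean = sum(train) / len(train)
    variance = sum((x - mean) ** 2 for x in train) / len(train)
    return mean, variance ** 0.5


def zscore(values, mean, std):
    """Standardize: (x - mean) / std."""
    if std == 0:
        raise ValueError("standard deviation is zero")
    return [(x - mean) / std for x in values]


def minmax(values, lo, hi):
    """Rescale so that lo maps to 0 and hi maps to 1."""
    if hi == lo:
        raise ValueError("lo and hi must differ")
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical toy values, not the Rhine data.
train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
validation = [3.0, 11.0]

mean, std = zscore_params(train)
val_z = zscore(validation, mean, std)
val_mm = minmax(validation, min(train), max(train))
```

Note that an MSE computed on z-scored targets is expressed in units of σ²,
so it has to be multiplied by σ² before it can be compared with an MSE
computed on the raw levels.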

Also, regarding the approaches suggested in the sheet, I have been unable
to find much information on any of them. In particular, I would appreciate
some sources on the wavelet approach. My main idea would be to decompose
the level into a series of superposed wavelets and keep the most
"relevant" ones (i.e. discard what can be thought of as noise), but I am
not sure whether this makes much sense theoretically, or even in
practice.
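
The idea described above (decompose, discard small coefficients,
reconstruct) can be sketched with the simplest wavelet, the Haar wavelet.
This is a dependency-free illustration under stated assumptions, not
necessarily the method the sheet has in mind: the choice of Haar, the hard
threshold, the threshold value and the toy signal are all assumptions, and
the input length must be a power of two.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """Full Haar decomposition of a signal whose length is a power of two.

    Returns (approx, details); details[0] is the coarsest level.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.insert(0, [(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details


def haar_inverse(approx, details):
    """Invert haar_forward, going from the coarsest level to the finest."""
    current = list(approx)
    for level in details:
        rebuilt = []
        for a, d in zip(current, level):
            rebuilt.append((a + d) / SQRT2)
            rebuilt.append((a - d) / SQRT2)
        current = rebuilt
    return current


def hard_threshold(details, threshold):
    """Set detail coefficients with magnitude below threshold to zero."""
    return [[d if abs(d) >= threshold else 0.0 for d in level]
            for level in details]


# Hypothetical toy signal, not the Rhine data.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_forward(signal)
denoised = haar_inverse(approx, hard_threshold(details, threshold=2.0))
```

With a threshold of zero the signal is reconstructed exactly; with a very
large threshold every detail is dropped and only the mean of the signal
remains, so the threshold controls how much fine structure is treated as
noise.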

So, in summary, I would appreciate some guidance on what I should do, or
an opinion on whether what I have been doing is a bad idea or just a bad
implementation (of course, there is a chance that the code is not doing
what is intended, so that the problem is at the implementation level and
not at the more theoretical one).

Thank you very much,

Olmo

Olmo Chiara
