Ryan Yocum, published on 2018-01-11 22:36:49Z

I'm trying to fit a model to time series data using the Kalman filter from the dlm package. I have multiple predictors I want to test (e.g. y ~ b1x1+b2x2), but the Kalman filter doesn't automatically estimate the coefficients (b1, b2) the way an OLS regression does. I get around this by using the DEoptim package: DEoptim tries many different coefficient values, each of which gives me a Kalman output that I then feed into a glm model (glm(y~kalmanoutput)). The objective DEoptim minimizes is an out-of-sample error rate.
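
A minimal sketch of the kind of loop described above, to make the setup concrete. Everything specific in it is an assumption made for illustration and does not come from the original code: the simulated data, the local-level model built with dlmModPoly, the variance values dV and dW, the 150/50 train/test split, the Gaussian glm, and the use of mean squared error as the out-of-sample error. The helper name oos_error is hypothetical.

```r
library(dlm)      # Kalman filter (dlmModPoly, dlmFilter, dropFirst)
library(DEoptim)  # differential evolution optimizer

set.seed(1)

# Simulated stand-in data (assumption: the real series are not shown)
n  <- 200
x1 <- cumsum(rnorm(n))
x2 <- cumsum(rnorm(n))
y  <- 0.8 * x1 - 0.5 * x2 + rnorm(n)
train <- 1:150  # assumed split: first 150 points fit, last 50 held out

# Hypothetical objective: out-of-sample error for one candidate (b1, b2)
oos_error <- function(b, x1, x2, y, train) {
  # Combine the predictors with the candidate coefficients
  signal <- b[1] * x1 + b[2] * x2

  # Assumed model: local level with fixed variances
  mod <- dlmModPoly(order = 1, dV = 1, dW = 0.1)

  # Filtered state means; dropFirst removes the time-0 prior value
  k <- as.numeric(dropFirst(dlmFilter(signal, mod)$m))

  # Feed the Kalman output into glm, fitted on the training rows only
  d    <- data.frame(y = y, k = k)
  fit  <- glm(y ~ k, data = d[train, ])
  pred <- predict(fit, newdata = d[-train, ])

  # Assumed error measure: mean squared error on the held-out rows
  mean((d$y[-train] - pred)^2)
}

# DEoptim passes the extra named arguments through to oos_error
res <- DEoptim(oos_error, lower = c(-2, -2), upper = c(2, 2),
               control = DEoptim.control(NP = 20, itermax = 20, trace = FALSE),
               x1 = x1, x2 = x2, y = y, train = train)

res$optim$bestmem  # best (b1, b2) found
res$optim$bestval  # corresponding out-of-sample error
```

In this sketch every candidate evaluated by DEoptim triggers one dlmFilter call and one glm fit, so the total cost grows roughly with NP times itermax times the cost of a single evaluation.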

Even using the built-in parallel processing in DEoptim (parallelType=1), this "loop" takes hours on my computer (i5-8600K CPU @ 3.60GHz, 16 GB RAM).

I'd like advice on how I can speed up this process:

  • Should I look into GPU computing?
  • Are there in-line optimizations I can make? For example, I heard that data.tables are more efficient than data.frames, so I switched to them in my code.
  • Is there anything else I'm not considering?
