Aeon: A unified framework for machine learning with time series (github.com)
It strikes me as a bit weird that these time series packages tend to discard the time component of the data and just... not do anything with it.
Prophet, for example, uses dates to create Fourier terms and holiday indicator features, and that just seems like a saner approach.
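In case it's useful, those Fourier terms are easy to build by hand from the dates with pandas/numpy; a rough sketch (the yearly period, feature order, and column names are my own choices):

```python
import numpy as np
import pandas as pd

def fourier_terms(dates: pd.DatetimeIndex, period: float = 365.25, order: int = 3) -> pd.DataFrame:
    """Sine/cosine features capturing yearly seasonality from calendar dates."""
    t = dates.dayofyear.to_numpy()  # position within the yearly cycle
    feats = {}
    for k in range(1, order + 1):
        feats[f"sin_{k}"] = np.sin(2 * np.pi * k * t / period)
        feats[f"cos_{k}"] = np.cos(2 * np.pi * k * t / period)
    return pd.DataFrame(feats, index=dates)

dates = pd.date_range("2022-01-01", periods=365, freq="D")
X = fourier_terms(dates)
# simple holiday indicator column (dates chosen just for illustration)
X["is_holiday"] = dates.isin(pd.to_datetime(["2022-01-01", "2022-12-25"])).astype(int)
```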
It depends on your time series, really. If the samples are evenly spaced, e.g., your sensor gives you a reading every millisecond, your measured sequences aren’t partially overlapping, and you don’t have structured discrete events, then time isn’t very useful. You can always rescale time so that it is just the same as the index.
For your calendar example, date information is very useful because patterns tend to exhibit a cyclic nature across years and there are discrete special events (holidays). With enough data, you probably don't need to include the date, but it's informative for smaller data sets.
Yeah, most of the time series data I've had to work with I end up spending a huge amount of time interpolating (so all time slices have some data, even if it isn't real) or aggregating to some common denominator (e.g. taking sporadic sales and summing them up to daily sales). I get why most packages expect nicely spaced or evenly dense data, but boy, I would love it if I had more options there.
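For the record, pandas covers the basic version of both workarounds; a quick sketch (the toy data and column name are made up):

```python
import pandas as pd

# sporadic sales events with irregular timestamps (made-up data)
sales = pd.DataFrame(
    {"amount": [12.0, 7.5, 30.0]},
    index=pd.to_datetime(["2023-01-01 09:13", "2023-01-01 17:40", "2023-01-03 11:02"]),
)

# aggregate to a common denominator: daily totals (empty days become 0)
daily = sales["amount"].resample("D").sum()

# or keep a finer grid and interpolate so every slice has a value
hourly = sales["amount"].resample("H").mean().interpolate(method="time")
```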
Most time series models assume you've already deseasonalized your data in advance. Typically, seasonality is obvious to the human doing the modeling (e.g. sales being up near Christmas), so it's usually preferable for the human to deseasonalize the data in advance using a separate model that bakes in some of their human knowledge of how the world works. Forcing the model to learn seasonal trends fully on its own adds another layer of estimation error.
Prophet is popular because it works off the shelf with non-deseasonalized data and mixed frequency data, which makes it great for quick forecasting exercises. But IMO it is never the ideal model if you have a lot of time and expertise to work with.
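On the "deseasonalize it in advance" point, here is a simplistic sketch of that workflow using classical decomposition from statsmodels (toy data; whatever model you fit on the remainder is up to you):

```python
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# toy example: monthly sales with a December bump (made-up data)
idx = pd.date_range("2015-01-01", periods=96, freq="MS")
y = pd.Series(100 + 20 * (idx.month == 12), index=idx, dtype=float)

# estimate the seasonal component separately, then model the remainder
decomp = seasonal_decompose(y, model="additive", period=12)
deseasonalized = y - decomp.seasonal

# fit your forecasting model of choice on `deseasonalized`,
# then add decomp.seasonal back onto the forecasts
```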
Prophet has worked so well for us, especially since we have a TON of custom events and holidays to consider. None of the other approaches have really come close.
Hi, I don't want to enter a public discussion about the split of sktime; I fear the application of Godwin's law. A summary of the key points behind the split from my perspective is here: https://github.com/aeon-toolkit/aeon/issues/456 and the other side's view will no doubt be forthcoming. If you want to chat about it, join our Slack and message me; I'm more than happy to help. How are we different? Well, I think we can all live together, it's open source, but from my perspective the priorities are:
1. Align as closely as we can with sklearn, so that it is completely intuitive how to use aeon if you know sklearn.
2. Focus on implementations of state-of-the-art algorithms for time series machine learners, and less on just wrapping other code. The goal is to reduce the lead time from publication of new ideas to widespread adoption.
3. Documentation: make it good.
My interests primarily lie in classification, clustering, and regression, but next year we are going into the forecasting world; plenty of exciting collaborations are brewing.
It's refreshing to see that classification is mentioned before forecasting. It has been a frustrating journey embarking on time series classification, as it seems overlooked compared to forecasting. Will follow this project closely and use it in my next project!
time series classification is my primary research area :)
Wondering how it compares to the rest of the lot: sktime, tslearn, darts, pyts, and cesium.
> Wondering how it compares to the rest of the lot: sktime
It's a fork of sktime. Last common commit before the fork is on Jan 30, 2023.
aeon is based on sktime==0.16.0
Aeon is an sktime fork which happened after one of the sktime core developers (Franz K.) took the sktime project hostage by kicking other core devs out of the GitHub organisation. It's info you can collect from some GH issues.
Looking forward to checking this out! How does this compare with darts[1]?
Darts is commercially backed FOSS, while aeon is community driven. Also, aeon follows scikit-learn more closely.
Aeon has the advantage of including a friendly deep learning framework: all of the models discussed in "Deep Learning for Time Series Classification: a review" are included in aeon, with a variety of choices for changing the parameters of each architecture. More recent state-of-the-art models such as InceptionTime are also included, not only for classification but for regression as well, and soon forecasting and clustering.
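If anyone wants to see what that looks like, usage follows the usual sklearn fit/predict idiom; a minimal sketch (treat the exact import path and constructor arguments as assumptions, since they may differ between aeon versions):

```python
import numpy as np
from aeon.classification.deep_learning import InceptionTimeClassifier  # path may vary by version

# toy dataset: 20 univariate series of length 50, two classes
X = np.random.default_rng(0).normal(size=(20, 1, 50))
y = np.array([0, 1] * 10)

clf = InceptionTimeClassifier(n_epochs=5)  # tiny run just to show the interface
clf.fit(X, y)
pred = clf.predict(X)
```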
I wonder why aeon split from sktime.
https://twitter.com/sktime_toolbox/status/164721412371161907...
Here is a little about it: https://astrojuanlu.substack.com/p/episodio-70
See my comment above; most of the active core devs from sktime at that time left, or had to leave, for the aeon project.
Obfuscating the choice of algorithm behind kwargs (as opposed to creating separate classes) has always seemed a suspect choice to me, in sklearn as well as here. It seems to make development of the package more complex while also giving the user less readable code, with less flexibility for differences in hyperparameter specifications, etc.
There are of course exceptions: something like `TrendPredictor(order=1, interp="polynomial")`, for example, can be adapted up or down the hierarchy of model complexity much more easily than commenting out different lines.
I have taught machine learning in Java using Weka for a long time, and when we moved over to sklearn this also annoyed me. It made a good teaching point, with, for example, decision trees having a dozen separate classes for different algorithms in Weka and sklearn having one configurable one. I guess it's just a design preference in the end. With aeon we are leaning more towards one class per algorithm or algorithm family, but it's not a hard-and-fast rule. One issue is when a change in algorithm means a change in class. So, for example, we have separate transformers for ROCKET, MiniRocket and MultiRocket (convolution transforms), but a single configurable RocketClassifier. Ultimately I think it comes down to how comprehensible it is to a new user.
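To make the decision-tree comparison concrete, here's a runnable sketch of the sklearn-style single configurable class; the Weka-style per-algorithm class names in the comments are hypothetical:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# sklearn style: one class, the algorithmic variant picked via kwargs
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3)
clf.fit(X, y)

# A Weka-style design would instead expose separate classes per algorithm,
# e.g. Id3Classifier() / C45Classifier() (hypothetical names), and you would
# switch variants by swapping the class rather than editing keyword arguments.
```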
Recently aeon added new implementations of both the PAA and SAX transformations that are much more efficient and faster!
All of aeon's functionality can easily be mastered thanks to its documentation and example notebooks.
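For anyone unfamiliar with those transforms, here is roughly what PAA and SAX compute, written out in plain numpy (aeon's implementations are vectorised and more general; the breakpoints below are the standard ones for a four-symbol alphabet):

```python
import numpy as np

def paa(x: np.ndarray, n_segments: int) -> np.ndarray:
    """Piecewise Aggregate Approximation: mean of each of n_segments (roughly) equal chunks."""
    return np.array([seg.mean() for seg in np.array_split(x, n_segments)])

def sax(x: np.ndarray, n_segments: int = 8) -> np.ndarray:
    """Symbolic Aggregate approXimation with a 4-symbol alphabet."""
    z = (x - x.mean()) / x.std()                     # z-normalise the series
    approx = paa(z, n_segments)                      # reduce dimensionality with PAA
    breakpoints = np.array([-0.6745, 0.0, 0.6745])   # standard normal quartile breakpoints
    return np.searchsorted(breakpoints, approx)      # map each segment mean to a symbol 0..3

x = np.sin(np.linspace(0, 4 * np.pi, 128))
print(paa(x, 8))
print(sax(x, 8))
```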