🔍 Behind the Scenes: Methodology & Innovation
1. The Helformer Architecture
Helformer integrates three key components to outperform existing models:
- Series Decomposition: Using Holt-Winters smoothing, we break price data into level, trend, and seasonality components (Fig. 1). This step isolates patterns that traditional Transformers might miss.
- Multi-Head Attention: Unlike sequential models (e.g., LSTM), Helformer processes all time steps simultaneously, capturing long-range dependencies efficiently.
- LSTM-Enhanced Encoder: Replacing the standard Feed-Forward Network with an LSTM layer improves temporal feature extraction (see the sketch after Fig. 1).
Fig. 1: Helformer architecture.
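To make these components concrete, here is a minimal sketch, not the paper's exact implementation: an additive Holt-Winters decomposition in NumPy, and a PyTorch encoder block in which an LSTM takes the place of the standard feed-forward sub-layer. The smoothing constants, seasonal period, layer sizes, and layer-norm placement are all illustrative assumptions.

```python
import numpy as np
import torch
import torch.nn as nn

def holt_winters_decompose(y, alpha=0.3, beta=0.1, gamma=0.2, m=7):
    """Additive Holt-Winters recursions splitting a series into level,
    trend, and seasonal components. Assumes len(y) >= 2*m; the smoothing
    constants and period are illustrative, not the paper's tuned values."""
    n = len(y)
    level, trend, seas = np.empty(n), np.empty(n), np.empty(n)
    level[0] = y[:m].mean()
    trend[0] = (y[m:2 * m].mean() - y[:m].mean()) / m
    seas[:m] = y[:m] - level[0]                      # initial seasonal indices
    for t in range(1, n):
        s_lag = seas[t - m] if t >= m else seas[t]   # seasonal estimate one period back
        level[t] = alpha * (y[t] - s_lag) + (1 - alpha) * (level[t - 1] + trend[t - 1])
        trend[t] = beta * (level[t] - level[t - 1]) + (1 - beta) * trend[t - 1]
        if t >= m:
            seas[t] = gamma * (y[t] - level[t]) + (1 - gamma) * s_lag
    return level, trend, seas

class HelformerBlock(nn.Module):
    """Encoder block in the spirit of Fig. 1: multi-head self-attention,
    with an LSTM replacing the usual feed-forward sub-layer."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.lstm = nn.LSTM(d_model, d_model, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):                            # x: (batch, seq_len, d_model)
        attn_out, _ = self.attn(x, x, x)             # all time steps attended at once
        x = self.norm1(x + attn_out)                 # residual connection + norm
        lstm_out, _ = self.lstm(x)                   # temporal feature extraction
        return self.norm2(x + lstm_out)              # residual connection + norm
```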
2. Data & Hyperparameter Tuning
We trained Helformer on Bitcoin (BTC) daily closing prices (2017–2024) and tested its generalization on 15 other cryptocurrencies (e.g., ETH, SOL). To optimize performance, we used Bayesian optimization via Optuna, automating hyperparameter selection (e.g., learning rate, dropout) and pruning underperforming trials early.
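For readers curious about the tuning loop, the sketch below shows the Optuna pattern described above: a search space over hyperparameters such as learning rate and dropout, with median-rule pruning of weak trials. The search ranges and the stand-in training function are assumptions for illustration, not the study's actual configuration.

```python
import random
import optuna

def train_and_validate(lr, dropout, num_heads, epoch):
    # Stand-in for one epoch of Helformer training + validation; in the
    # real study these hyperparameters would configure the model itself.
    return random.random() / (epoch + 1) + 10 * lr + 0.01 * dropout

def objective(trial):
    # Hypothetical search space; the paper's exact ranges may differ.
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-2, log=True)
    dropout = trial.suggest_float("dropout", 0.0, 0.5)
    num_heads = trial.suggest_categorical("num_heads", [2, 4, 8])
    val_loss = float("inf")
    for epoch in range(20):
        val_loss = train_and_validate(lr, dropout, num_heads, epoch)
        trial.report(val_loss, epoch)        # expose progress to the pruner
        if trial.should_prune():             # stop underperforming trials early
            raise optuna.TrialPruned()
    return val_loss

study = optuna.create_study(direction="minimize",
                            pruner=optuna.pruners.MedianPruner())
study.optimize(objective, n_trials=50)
print(study.best_params)
```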
3. Evaluation Metrics
Helformer was benchmarked against RNN, LSTM, BiLSTM, GRU, and vanilla Transformer models using the following metrics (minimal implementations are sketched below):
- Similarity metrics: R², Kling-Gupta Efficiency (KGE), Explained Variance Score (EVS)
- Error metrics: RMSE, MAPE, MAE
- Trading metrics: Sharpe Ratio, Maximum Drawdown, Volatility, Cumulative Returns
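As a reference, here are minimal NumPy versions of the similarity and error metrics, assuming their standard definitions; the KGE follows the original Gupta et al. (2009) formulation, though variants exist.

```python
import numpy as np

def rmse(y, yhat): return np.sqrt(np.mean((y - yhat) ** 2))
def mae(y, yhat):  return np.mean(np.abs(y - yhat))
def mape(y, yhat): return 100 * np.mean(np.abs((y - yhat) / y))   # in percent
def r2(y, yhat):   return 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)
def evs(y, yhat):  return 1 - np.var(y - yhat) / np.var(y)        # explained variance score

def kge(y, yhat):
    # Kling-Gupta Efficiency: combines correlation, variability ratio,
    # and bias ratio; 1.0 indicates a perfect match.
    r = np.corrcoef(y, yhat)[0, 1]
    alpha = yhat.std() / y.std()     # variability ratio
    beta = yhat.mean() / y.mean()    # bias ratio
    return 1 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```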
💡 Key Findings & Practical Impact
1. Superior Predictive Accuracy
Helformer achieved near-perfect R² (1.0) and MAPE (0.0148%) on BTC test data, outperforming all baseline models (Table 1). Its decomposition step cut RMSE from 1218.56 to 7.75 versus the vanilla Transformer, an error reduction of over 99%.
Table 1: Model Performance Comparison
| Model | RMSE | MAPE | MAE | R² | EVS | KGE |
|---|---|---|---|---|---|---|
| RNN | 1153.1877 | 1.9122% | 765.7482 | 0.9950 | 0.9951 | 0.9905 |
| LSTM | 1171.6701 | 1.7681% | 737.1088 | 0.9948 | 0.9949 | 0.9815 |
| BiLSTM | 1140.4627 | 1.9514% | 766.7234 | 0.9951 | 0.9952 | 0.9901 |
| GRU | 1151.1653 | 1.7500% | 724.5279 | 0.9950 | 0.9950 | 0.9878 |
| Transformer | 1218.5600 | 1.9631% | 799.6003 | 0.9944 | 0.9946 | 0.9902 |
| Helformer | 7.7534 | 0.0148% | 5.9252 | 1.0000 | 1.0000 | 0.9998 |
2. Profitable Trading Strategies
In backtests, a Helformer-based trading strategy yielded 925% excess returns for BTC, more than triple the Buy & Hold strategy’s 277%, with lower volatility and a far higher Sharpe Ratio (18.06 vs. 1.85), as shown in Fig. 2.
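To illustrate how such trading metrics are computed from model predictions, here is a toy long/flat backtest. The entry rule, the 365-day annualization factor, and the absence of transaction costs are simplifying assumptions, not the paper's exact strategy.

```python
import numpy as np

def backtest_long_flat(prices, predictions):
    """Toy rule: hold the asset on days the model predicts a price rise,
    stay in cash otherwise. Assumes daily prices and next-day predictions."""
    returns = np.diff(prices) / prices[:-1]                 # daily asset returns
    signal = (predictions[1:] > prices[:-1]).astype(float)  # 1 = long, 0 = flat
    strat = signal * returns
    equity = np.cumprod(1 + strat)
    cumulative = equity[-1] - 1
    sharpe = np.sqrt(365) * strat.mean() / strat.std()      # crypto trades year-round
    max_drawdown = np.max(1 - equity / np.maximum.accumulate(equity))
    return cumulative, sharpe, max_drawdown
```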
3. Cross-Currency Generalization
Helformer’s pre-trained BTC weights transferred seamlessly to other cryptocurrencies, achieving R² > 0.99 for XRP and TRX. This suggests broad applicability without retraining—a boon for investors managing diverse portfolios.
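The transfer step itself can be as simple as reusing saved weights. In this PyTorch sketch the stand-in model, file name, and window size are purely hypothetical; only the save-then-reload pattern is the point.

```python
import torch
import torch.nn as nn

# Stand-in model; the real Helformer stacks decomposition, attention,
# and LSTM layers rather than a plain MLP.
model = nn.Sequential(nn.Linear(30, 64), nn.ReLU(), nn.Linear(64, 1))
torch.save(model.state_dict(), "helformer_btc.pt")   # saved after BTC training

# Later: reuse the BTC-trained weights on another coin's scaled windows.
model.load_state_dict(torch.load("helformer_btc.pt"))
model.eval()
eth_windows = torch.randn(8, 30)                     # placeholder 30-day ETH windows
with torch.no_grad():
    eth_preds = model(eth_windows)
```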
🌍 Relevance to the Community
- For Researchers: Helformer’s architecture opens avenues for hybrid time-series models in finance, healthcare, and climate forecasting.
- For Practitioners: The model’s interpretable components (decomposition + attention) make it adaptable to volatile markets beyond crypto.
- For Policymakers: Reliable price forecasts could inform regulations to stabilize crypto markets and protect investors.
🤝 Acknowledgments & Open Questions
This work wouldn’t have been possible without my brilliant co-authors Oluyinka Adedokun, Joseph Akpan, Morenikeji Kareem, Hammed Akano, and Oludolapo Olanrewaju, or the support of The Hong Kong Polytechnic University.
We’d love to hear your thoughts!
- How might Helformer adapt to non-financial time-series data?
- Could integrating sentiment analysis further improve accuracy?
- What ethical considerations arise with AI-driven trading?
🔗 Access the full paper: SpringerLink | ReadCube