Helformer: an attention-based deep learning model for cryptocurrency price forecasting

Traditional forecasting models often fail to capture the nonlinear, nonstationary nature of crypto markets, where prices swing dramatically based on factors like market sentiment, regulatory news, and macroeconomic trends. This work develops the Helformer, a new variant of the Transformer architecture that addresses these challenges.

Helformer: an attention-based deep learning model for cryptocurrency price forecasting - Journal of Big Data

Cryptocurrencies have become a significant asset class, attracting considerable attention from investors and researchers due to their potential for high returns despite inherent price volatility. Traditional forecasting methods often fail to accurately predict price movements as they do not account for the non-linear and non-stationary nature of cryptocurrency data. In response to these challenges, this study introduces the Helformer model, a novel deep learning approach that integrates Holt-Winters exponential smoothing with Transformer-based deep learning architecture. This integration allows for a robust decomposition of time series data into level, trend, and seasonality components, enhancing the model’s ability to capture complex patterns in cryptocurrency markets. To optimize the model’s performance, Bayesian hyperparameter tuning via Optuna, including a pruner callback, was utilized to efficiently find optimal model parameters while reducing training time by early termination of suboptimal training runs. Empirical results from testing the Helformer model against other advanced deep learning models across various cryptocurrencies demonstrate its superior predictive accuracy and robustness. The model not only achieves lower prediction errors but also shows remarkable generalization capabilities across different types of cryptocurrencies. Additionally, the practical applicability of the Helformer model is validated through a trading strategy that significantly outperforms traditional strategies, confirming its potential to provide actionable insights for traders and financial analysts. The findings of this study are particularly beneficial for investors, policymakers, and researchers, offering a reliable tool for navigating the complexities of cryptocurrency markets and making informed decisions.

🔍 Behind the Scenes: Methodology & Innovation

1. The Helformer Architecture
Helformer integrates three key components to outperform existing models:

  • Series Decomposition: Using Holt-Winters smoothing, we break price data into level, trend, and seasonality components (Fig. 1). This step isolates patterns that traditional Transformers might miss.

  • Multi-Head Attention: Unlike sequential models (e.g., LSTM), Helformer processes all time steps simultaneously, capturing long-range dependencies efficiently.

  • LSTM-Enhanced Encoder: Replacing the standard Feed-Forward Network with an LSTM layer improves temporal feature extraction.

    Fig. 1: Helformer architecture.
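To make the decomposition step concrete, here is a minimal plain-Python sketch of additive Holt-Winters recursions; the paper's exact smoothing variant and initialisation may differ, and the seasonal period `m` and the 0.5 smoothing weights are illustrative assumptions:

```python
def holt_winters_decompose(y, m, alpha=0.5, beta=0.5, gamma=0.5):
    """Additive Holt-Winters recursions: split a series into level, trend,
    and seasonal components. alpha/beta/gamma are the smoothing weights."""
    # Crude initialisation: level = first value, trend = first difference,
    # seasonal indices start at zero.
    l_prev, b_prev = y[0], y[1] - y[0]
    s = [0.0] * m                      # one seasonal index per position in the cycle
    levels, trends, seasons = [], [], []
    for t, obs in enumerate(y):
        s_tm = s[t % m]                # seasonal estimate from one full period ago
        l = alpha * (obs - s_tm) + (1 - alpha) * (l_prev + b_prev)
        b = beta * (l - l_prev) + (1 - beta) * b_prev
        s[t % m] = gamma * (obs - l) + (1 - gamma) * s_tm
        levels.append(l)
        trends.append(b)
        seasons.append(s[t % m])
        l_prev, b_prev = l, b
    return levels, trends, seasons
```

In the Helformer, Holt-Winters is used for decomposition rather than direct forecasting: the smoothed components become the inputs that the attention layers then model.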

2. Data & Hyperparameter Tuning
We trained Helformer on Bitcoin (BTC) daily closing prices (2017–2024) and tested its generalization on 15 other cryptocurrencies (e.g., ETH, SOL). To optimize performance, we used Bayesian optimization via Optuna, automating hyperparameter selection (e.g., learning rate, dropout) and pruning underperforming trials early.
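The pruning idea can be illustrated without Optuna itself: a toy random search that reports a loss after every step and stops a trial as soon as it falls behind the median of completed trials at the same step (the rule Optuna's `MedianPruner` applies). The training curve below is synthetic, not the paper's objective:

```python
import random

random.seed(0)

def toy_val_loss(lr, step):
    # Synthetic validation-loss curve: decays over steps, with a floor
    # that depends on how far lr is from a "good" value of 0.01.
    return 1.0 / (step + 1) + abs(lr - 0.01) * 10 + random.uniform(0, 0.01)

def run_study(n_trials=12, n_steps=15):
    completed = []                          # per-step loss curves of finished trials
    best_loss, best_lr = float("inf"), None
    for _ in range(n_trials):
        lr = 10 ** random.uniform(-4, -1)   # log-uniform sample over [1e-4, 1e-1]
        curve, pruned = [], False
        for step in range(n_steps):
            loss = toy_val_loss(lr, step)
            curve.append(loss)
            # Median rule: prune if worse than the median completed trial here.
            prior = sorted(c[step] for c in completed)
            if prior and loss > prior[len(prior) // 2]:
                pruned = True
                break
        if not pruned:
            completed.append(curve)
            if curve[-1] < best_loss:
                best_loss, best_lr = curve[-1], lr
    return best_loss, best_lr

best_loss, best_lr = run_study()
```

Pruning pays off because most compute in hyperparameter search is otherwise spent finishing trials that were visibly losing after a few epochs.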

3. Evaluation Metrics
Helformer was benchmarked against RNN, LSTM, BiLSTM, GRU, and vanilla Transformer models using:

  • Similarity metrics: R², Kling-Gupta Efficiency (KGE), Explained Variance Score (EVS)

  • Error metrics: RMSE, MAPE, MAE

  • Trading metrics: Sharpe Ratio, Maximum Drawdown, Volatility, Cumulative returns
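The similarity and error metrics above follow standard formulas; a plain-Python sketch (for illustration, not the paper's code):

```python
import math

def rmse(y, p):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, p)) / len(y))

def mae(y, p):
    return sum(abs(a - b) for a, b in zip(y, p)) / len(y)

def mape(y, p):
    # Percentage error; assumes no true value is zero.
    return 100 * sum(abs((a - b) / a) for a, b in zip(y, p)) / len(y)

def r2(y, p):
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, p))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot

def kge(y, p):
    # Kling-Gupta Efficiency: 1 minus the distance of (correlation,
    # variability ratio, bias ratio) from the ideal point (1, 1, 1).
    n = len(y)
    my, mp = sum(y) / n, sum(p) / n
    sy = math.sqrt(sum((a - my) ** 2 for a in y) / n)
    sp = math.sqrt(sum((b - mp) ** 2 for b in p) / n)
    r = sum((a - my) * (b - mp) for a, b in zip(y, p)) / (n * sy * sp)
    return 1 - math.sqrt((r - 1) ** 2 + (sp / sy - 1) ** 2 + (mp / my - 1) ** 2)
```

KGE is worth the extra lines: unlike R², it separately penalises poor correlation, wrong variability, and systematic bias, so a model cannot score well by getting only one of the three right.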


💡 Key Findings & Practical Impact

1. Superior Predictive Accuracy
Helformer achieved near-perfect R² (1.0) and MAPE (0.0148%) on BTC test data, outperforming all baseline models (Table 1). Its decomposition step cut RMSE by more than 99% relative to the vanilla Transformer (7.75 vs. 1218.56).

Table 1: Model Performance Comparison

| Model       | RMSE      | MAPE    | MAE      | R²     | EVS    | KGE    |
|-------------|-----------|---------|----------|--------|--------|--------|
| RNN         | 1153.1877 | 1.9122% | 765.7482 | 0.9950 | 0.9951 | 0.9905 |
| LSTM        | 1171.6701 | 1.7681% | 737.1088 | 0.9948 | 0.9949 | 0.9815 |
| BiLSTM      | 1140.4627 | 1.9514% | 766.7234 | 0.9951 | 0.9952 | 0.9901 |
| GRU         | 1151.1653 | 1.7500% | 724.5279 | 0.9950 | 0.9950 | 0.9878 |
| Transformer | 1218.5600 | 1.9631% | 799.6003 | 0.9944 | 0.9946 | 0.9902 |
| Helformer   | 7.7534    | 0.0148% | 5.9252   | 1      | 1      | 0.9998 |

2. Profitable Trading Strategies
In backtests, a Helformer-based trading strategy yielded a 925% excess return on BTC, more than tripling the Buy & Hold return of 277%, with far stronger risk-adjusted performance (Sharpe Ratio: 18.06 vs. 1.85), as shown in Fig. 2.

Fig. 2: Trading results.
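The backtest metrics are standard; a minimal sketch of the annualised Sharpe ratio and maximum drawdown over daily returns (risk-free rate assumed zero, 365 periods per year since crypto trades every day):

```python
import math

def sharpe_ratio(returns, periods_per_year=365):
    # Annualised Sharpe ratio of per-period returns; risk-free rate taken as 0.
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def max_drawdown(equity):
    # Largest peak-to-trough decline of an equity curve, as a fraction of the peak.
    peak, worst = equity[0], 0.0
    for v in equity:
        peak = max(peak, v)
        worst = max(worst, (peak - v) / peak)
    return worst
```

Reporting both matters: a high cumulative return with a deep drawdown can be unholdable in practice, which is why the paper pairs return figures with Sharpe Ratio, Maximum Drawdown, and volatility.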

3. Cross-Currency Generalization
Helformer’s pre-trained BTC weights transferred seamlessly to other cryptocurrencies, achieving R² > 0.99 for XRP and TRX. This suggests broad applicability without retraining—a boon for investors managing diverse portfolios.


🌍 Relevance to the Community

  1. For Researchers: Helformer’s architecture opens avenues for hybrid time-series models in finance, healthcare, and climate forecasting.

  2. For Practitioners: The model’s interpretable components (decomposition + attention) make it adaptable to volatile markets beyond crypto.

  3. For Policymakers: Reliable price forecasts could inform regulations to stabilize crypto markets and protect investors.


🤝 Acknowledgments & Open Questions

This work wouldn’t have been possible without my brilliant co-authors Oluyinka Adedokun, Joseph Akpan, Morenikeji Kareem, Hammed Akano, and Oludolapo Olanrewaju, or the support of The Hong Kong Polytechnic University.

We’d love to hear your thoughts!

  • How might Helformer adapt to non-financial time-series data?

  • Could integrating sentiment analysis further improve accuracy?

  • What ethical considerations arise with AI-driven trading?

🔗 Access the full paper: SpringerLink | ReadCube
