Helformer: an attention-based deep learning model for cryptocurrency price forecasting

Traditional forecasting models often fail to capture the nonlinear, nonstationary nature of crypto markets, where prices swing dramatically based on factors like market sentiment, regulatory news, and macroeconomic trends. This work develops a Helformer, a new variant of Transformer architecture.
Like

Share this post

Choose a social network to share with, or copy the URL to share elsewhere

This is a representation of how your post may appear on social media. The actual post will vary between social networks

Explore the Research

SpringerOpen
SpringerOpen SpringerOpen

Helformer: an attention-based deep learning model for cryptocurrency price forecasting - Journal of Big Data

Cryptocurrencies have become a significant asset class, attracting considerable attention from investors and researchers due to their potential for high returns despite inherent price volatility. Traditional forecasting methods often fail to accurately predict price movements as they do not account for the non-linear and non-stationary nature of cryptocurrency data. In response to these challenges, this study introduces the Helformer model, a novel deep learning approach that integrates Holt-Winters exponential smoothing with Transformer-based deep learning architecture. This integration allows for a robust decomposition of time series data into level, trend, and seasonality components, enhancing the model’s ability to capture complex patterns in cryptocurrency markets. To optimize the model’s performance, Bayesian hyperparameter tuning via Optuna, including a pruner callback, was utilized to efficiently find optimal model parameters while reducing training time by early termination of suboptimal training runs. Empirical results from testing the Helformer model against other advanced deep learning models across various cryptocurrencies demonstrate its superior predictive accuracy and robustness. The model not only achieves lower prediction errors but also shows remarkable generalization capabilities across different types of cryptocurrencies. Additionally, the practical applicability of the Helformer model is validated through a trading strategy that significantly outperforms traditional strategies, confirming its potential to provide actionable insights for traders and financial analysts. The findings of this study are particularly beneficial for investors, policymakers, and researchers, offering a reliable tool for navigating the complexities of cryptocurrency markets and making informed decisions.

🔍 Behind the Scenes: Methodology & Innovation

1. The Helformer Architecture
Helformer integrates three key components to outperform existing models:

  • Series Decomposition: Using Holt-Winters smoothing, we break price data into level, trend, and seasonality components (Fig. 1). This step isolates patterns that traditional Transformers might miss.

  • Multi-Head Attention: Unlike sequential models (e.g., LSTM), Helformer processes all time steps simultaneously, capturing long-range dependencies efficiently.

  • LSTM-Enhanced Encoder: Replacing the standard Feed-Forward Network with an LSTM layer improves temporal feature extraction.

    Fig. 1: Helformer architecture.

2. Data & Hyperparameter Tuning
We trained Helformer on Bitcoin (BTC) daily closing prices (2017–2024) and tested its generalization on 15 other cryptocurrencies (e.g., ETH, SOL). To optimize performance, we used Bayesian optimization via Optuna, automating hyperparameter selection (e.g., learning rate, dropout) and pruning underperforming trials early.

3. Evaluation Metrics
Helformer was benchmarked against RNN, LSTM, GRU, and vanilla Transformer models using:

  • Similarity metrics: R², Kling-Gupta Efficiency (KGE), EVS

  • Error metrics: RMSE, MAPE, MAE

  • Trading metrics: Sharpe Ratio, Maximum Drawdown, Volatility, Cumulative returns


💡 Key Findings & Practical Impact

1. Superior Predictive Accuracy
Helformer achieved near-perfect R² (1.0) and MAPE (0.0148%) on BTC test data, outperforming all baseline models (Table 1). Its decomposition step reduced errors by 98% compared to vanilla Transformers.

Table 1: Model Performance Comparison

Model

RMSE

MAPE

MAE

EVS

KGE

RNN

1153.1877

1.9122%

765.7482

0.9950

0.9951

0.9905

LSTM

1171.6701

1.7681%

737.1088

0.9948

0.9949

0.9815

BiLSTM

1140.4627

1.9514%

766.7234

0.9951

0.9952

0.9901

GRU

1151.1653

1.7500%

724.5279

0.9950

0.9950

0.9878

Transformer

1218.5600

1.9631%

799.6003

0.9944

0.9946

0.9902

Helformer

7.7534

0.0148%

5.9252

1

1

0.9998

2. Profitable Trading Strategies
In backtests, a Helformer-based trading strategy yielded 925% excess returns for BTC—tripling the Buy & Hold strategy’s returns (277%)—with lower volatility (Sharpe Ratio: 18.06 vs. 1.85), as shown in Fig. 2.

Fig. 2: Trading results.

3. Cross-Currency Generalization
Helformer’s pre-trained BTC weights transferred seamlessly to other cryptocurrencies, achieving R² > 0.99 for XRP and TRX. This suggests broad applicability without retraining—a boon for investors managing diverse portfolios.


🌍 Relevance to the Community

  1. For Researchers: Helformer’s architecture opens avenues for hybrid time-series models in finance, healthcare, and climate forecasting.

  2. For Practitioners: The model’s interpretable components (decomposition + attention) make it adaptable to volatile markets beyond crypto.

  3. For Policymakers: Reliable price forecasts could inform regulations to stabilize crypto markets and protect investors.


🤝 Acknowledgments & Open Questions

This work wouldn’t have been possible without my brilliant co-authors Oluyinka Adedokun, Joseph Akpan, Morenikeji Kareem, Hammed Akano, and Oludolapo Olanrewaju, or the support of The Hong Kong Polytechnic University.

We’d love to hear your thoughts!

  • How might Helformer adapt to non-financial time-series data?

  • Could integrating sentiment analysis further improve accuracy?

  • What ethical considerations arise with AI-driven trading?

🔗 Access the full paper: SpringerLink | ReadCube

Follow the Topic

Industries
Humanities and Social Sciences > Business and Management > Industries
IT in Business
Humanities and Social Sciences > Business and Management > IT in Business
Computational Intelligence
Technology and Engineering > Mathematical and Computational Engineering Applications > Computational Intelligence
Financial Economics
Humanities and Social Sciences > Economics > Financial Economics
Business Informatics
Mathematics and Computing > Computer Science > Business Informatics
Applied Statistics
Mathematics and Computing > Statistics > Applied Statistics

Related Collections

With Collections, you can get published faster and increase your visibility.

2025 Australasian Data Science and Machine Learning (AusDM25) Special Issue

The Australasian Data Science and Machine Learning (AusDM) Conference has firmly established itself as the premier Australasian meeting point for both practitioners and researchers in Data Science, Data Analytics, Data Mining, and Machine Learning—including cutting-edge areas such as Deep Learning and Generative AI. Since its inception in 2002, the AusDM conference series has provided a leading forum for presenting, disseminating, and discussing the latest research advances spanning algorithms, software, systems, and practical applications across diverse industries.

Continuing this tradition, AusDM 2025 aims to foster cross-disciplinary knowledge exchange, highlight emerging research directions, and promote collaboration across academia, government, and industry.

This Special Issue cordially invites authors of accepted AusDM 2025 conference papers to submit extended versions of their work. Topics of interest include, but are not limited to, the following areas of Data Science and Machine Learning:

Foundational Techniques in Machine Learning and AI

Supervised, unsupervised, semi-supervised, and self-supervised learning Deep learning and representation learning Reinforcement learning and federated learning Transfer learning, meta-learning, few-shot and continual learning Multitask and multimodal learning Generative models (GANs, diffusion models) Large Language Models (LLMs) and Large Multimodal Models (LMMs) Zero-shot learning and prompt-based learning

Learning from Diverse and Complex Data

Analytics over structured, semi-structured, and unstructured data Text, time-series, graph, spatial, spatio-temporal, and network data Web, social media, multimedia, IoT, and sensor data Sequential, temporal, and dynamic data modelling

Data-Centric AI and Data Engineering

Data preprocessing, cleaning, integration, matching, and linkage Privacy-preserving and secure data mining Data-centric AI pipelines and dataset curation Computational aspects of data mining and large-scale data management

Scalable and Real-Time Data Analytics

Big data analytics and scalable Machine Learning Parallel and distributed learning algorithms Data stream mining and real-time analytics Edge, cloud, and IoT-enabled Machine Learning systems

Interactive and Visual Analytics

Visual analytics and explainability through visualisation Human-in-the-loop machine learning Interactive data exploration and decision support

Responsible, Causal, and Explainable AI

Explainable and interpretable Machine Learning Fairness, accountability, transparency, and ethics in AI Causal inference and causal machine learning Robustness, generalization, and uncertainty quantification

Applied Data Science and Machine Learning Across Domains

Applications in business, finance, education, agriculture, urban planning, healthcare, sports, social sciences, cybersecurity, arts, and humanities Domain-specific AI systems in biomedical informatics, environmental science, astronomy, engineering, and beyond Industrial case studies and data-driven product innovation

Submission Requirements

Submitted journal manuscripts must include at least 30% new content beyond the conference version. All submissions will undergo a rigorous peer-review process in accordance with the journal’s standards.

SDG 9: Industry, Innovation & Infrastructure

Publishing Model: Open Access

Deadline: Jul 31, 2026

Intelligent Data Engineering for FAIR and Reusable Earth & Space Science and Applications

We invite researchers, practitioners, and industry experts to submit original contributions to this special issue/track focused on AI‑driven data engineering for Earth and Space Sciences. This collection addresses the rising need for intelligent techniques that support scalable, interoperable, and FAIR (Findable, Accessible, Interoperable, Reusable) data ecosystems across environmental, geospatial, planetary, and space science domains.

Scope and Themes

This track welcomes submissions that integrate Artificial Intelligence, data engineering, and scientific workflows to tackle challenges in managing, harmonizing, analyzing, and reusing complex Earth and Space Science data.

Topics of interest include, but are not limited to:

AI‑enabled data pipelines for Earth observation, satellite systems, and space mission data. Semantic, ontology‑driven, and knowledge‑based frameworks for metadata enrichment and interoperability. Intelligent workflow orchestration, hybrid AI systems, and agent‑based automation. Representation learning, self‑supervised learning, and foundation models for geospatial and environmental data. FAIR principles, reproducible workflows, Open Science frameworks, and transparent data governance. Scientific applications in climate analytics, disaster prediction, planetary exploration, environmental sustainability, and more.

Key Focus Areas

The special issue/track also highlights emerging AI‑driven paradigms for next‑generation scientific data engineering, including:

Foundation models and self‑supervised learning for Earth observation.

Knowledge graphs, semantic mapping, and retrieval‑augmented generation (RAG).

Agentic workflow orchestration and enhanced digital twin accessibility.

AI‑based multimodal data fusion, explainability, and domain adaptation.

Synthetic data generation and quality assessment.

Multivariate time‑series modeling and forecasting for environmental and planetary monitoring.

Why Submit?

This track aims to showcase innovations, architectures, frameworks, and real‑world case studies that demonstrate how AI‑driven data engineering accelerates scientific discovery, strengthens data interoperability, and supports sustainability across Earth and Space Science disciplines. Contributions that advance FAIR and Open Science principles are especially encouraged.

Work building upon discussions or early results presented at the DARES’25 ECAI Workshop is also welcome.

SDG 9: Industry, Innovation & Infrastructure

Publishing Model: Open Access

Deadline: Sep 05, 2026