Exploring future smart grid residential electricity usage with a synthetic dataset of Danish electricity prosumers

As residential consumers increasingly adopt renewable energy sources (RES), battery electric vehicles (BEVs), and energy storage systems (ESS), they are changing from conventional consumers into prosumers, who both consume and produce electricity. This transformation has made power systems increasingly dynamic, with power now flowing in both directions.

To plan effectively for the future of residential electricity consumption, grid operators, policymakers, utilities, and other stakeholders need a clear understanding of how prosumers will behave. However, limited or missing data is a major obstacle, for several reasons. Firstly, the slow adoption of new technologies (particularly BEVs) and automation in this field means that real prosumer data is scarce. Secondly, individual electricity consumption data for thousands of prosumers is not available to practitioners and researchers because of consumers' privacy concerns. Thirdly, even in countries with widespread smart meter rollouts, where interval consumption data is available, metered data shows only the energy imported from and exported to the grid. That is not enough to determine the type of prosumer by their behind-the-meter (BTM) equipment, such as BEVs, stationary batteries, or solar PV systems. Lastly, the dynamic nature of prosumers' behaviour and frequent changes in household electrical appliances make reliable data even harder to obtain.

To address this, we synthesized a benchmark dataset based on real-world consumer data collected in Denmark, incorporating three interval datasets that cover automated energy storage systems, rooftop solar PV systems, and BEVs. By reformatting the data and applying a data synthesizer based on a conditional tabular generative adversarial network (CTGAN), we sidestep the privacy concerns of using real-world consumer data. In this way, we created a synthetic dataset of 600,000 days of energy imported from and exported to the grid. The dataset contains hourly-resolution profiles labelled by BTM equipment, type of day, season, and daily temperature, providing a comprehensive representation of residential prosumers' consumption patterns.
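As an illustration of the kind of reformatting a tabular synthesizer needs, the sketch below pivots hourly meter readings into one labelled row per day. It is a minimal sketch under our own assumptions, not the code shared with the paper: the function names, the column names (`h00` to `h23`), the single net-energy value per hour (import positive, export negative), and the season and day-type conventions are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def season_of(month: int) -> str:
    """Map a month to a meteorological season (an assumed convention)."""
    return {12: "winter", 1: "winter", 2: "winter",
            3: "spring", 4: "spring", 5: "spring",
            6: "summer", 7: "summer", 8: "summer"}.get(month, "autumn")


def to_daily_rows(readings, equipment, daily_temp):
    """Pivot (timestamp, net_kwh) readings into one labelled row per day.

    readings   : iterable of (datetime, net kWh); import > 0, export < 0
    equipment  : BTM label for the household, e.g. "PV" (hypothetical label)
    daily_temp : dict mapping date -> mean daily temperature in deg C
    """
    by_day = defaultdict(dict)
    for ts, kwh in readings:
        by_day[ts.date()][ts.hour] = kwh

    rows = []
    for day, hours in sorted(by_day.items()):
        if len(hours) != 24:  # skip days with missing hours
            continue
        # 24 numeric columns, one per hour of the day
        row = {f"h{h:02d}": hours[h] for h in range(24)}
        # categorical and continuous labels attached to the daily profile
        row.update(
            equipment=equipment,
            day_type="weekend" if day.weekday() >= 5 else "weekday",
            season=season_of(day.month),
            temperature=daily_temp.get(day),
        )
        rows.append(row)
    return rows


# Made-up example: a PV household importing at night and exporting by day.
start = datetime(2023, 7, 1)  # a Saturday
readings = [(datetime(2023, 7, 1, h), 0.5 if h < 8 or h >= 18 else -1.0)
            for h in range(24)]
rows = to_daily_rows(readings, "PV", {start.date(): 18.5})
```

Each resulting row is a flat record with 24 numeric columns and four label columns, which is the general shape a tabular generator such as CTGAN works with; the label columns would be declared as discrete when fitting the model.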

To validate the dataset, we applied a set of analysis methods, including qualitative inspection, empirical statistics, machine learning (ML)-based evaluation metrics, and information theory. We demonstrated that the synthetic dataset shows statistical features similar to those of our benchmark dataset and of other research using real-world electricity users' data. An ML model trained on the synthetic dataset performs reasonably well when tested on the real dataset, which means the synthetic dataset can provide insights both for humans and for ML models. Across the key performance indicators (KPIs) we explored, one limitation stands out: the synthetic dataset has a higher complexity than the real dataset. This shows that the CTGAN generally overestimates the complexity of the real data when the stochastic nature of residential users and their varying consumption patterns are taken into account. On the other hand, the models successfully capture the features and relative complexity of each type of user.
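The idea of training on synthetic data and testing on real data can be sketched in a few lines. The example below is only an illustration: it uses a nearest-centroid classifier as a stand-in (the model used in the paper is not described in this post), four made-up values per profile instead of 24 hourly readings, and hypothetical function names.

```python
from math import dist


def fit_centroids(profiles, labels):
    """Average the profiles of each label into one centroid per label."""
    sums, counts = {}, {}
    for profile, label in zip(profiles, labels):
        acc = sums.setdefault(label, [0.0] * len(profile))
        for i, value in enumerate(profile):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}


def predict(centroids, profile):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], profile))


def tstr_accuracy(synth_x, synth_y, real_x, real_y):
    """Train on synthetic profiles, then score on real profiles."""
    centroids = fit_centroids(synth_x, synth_y)
    hits = sum(predict(centroids, x) == y for x, y in zip(real_x, real_y))
    return hits / len(real_y)


# Made-up profiles: [night, morning, midday, evening] net energy in kWh.
# PV households export at midday; BEV households charge at night.
synth_x = [[0.4, 0.5, -1.2, 0.8], [0.5, 0.4, -0.9, 0.9],
           [2.5, 0.5, 0.3, 0.8], [2.9, 0.6, 0.4, 0.7]]
synth_y = ["PV", "PV", "BEV", "BEV"]
real_x = [[0.3, 0.6, -1.0, 1.0], [2.7, 0.4, 0.5, 0.6], [0.6, 0.5, -0.2, 0.7]]
real_y = ["PV", "BEV", "PV"]

accuracy = tstr_accuracy(synth_x, synth_y, real_x, real_y)
```

If a model that has only ever seen synthetic profiles still classifies real profiles well, the synthetic data has preserved the features that distinguish one prosumer type from another. Comparing this score with that of a model trained on real data gives a rough measure of how much is lost in synthesis.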

We believe this synthetic dataset offers several advantages. It allows users to gain insight into possible future scenarios based on different levels of technology adoption. The publicly shared code makes this possible, allowing researchers and other stakeholders to apply different plausible future scenarios for planning, operation, investment, the development of new business models, and more. Additionally, users can examine how external factors such as temperature and season affect prosumers' electricity usage. Lastly, while our synthetic dataset focuses on Danish residential prosumers, our methodology can be applied to other datasets, including higher-resolution data that could provide more information on future system requirements.
