Efficient Prediction of Water Quality Index (WQI) Using Machine Learning Algorithms

Published in Civil Engineering

Every impactful research project has a story, and for our team, this story began with the question: how can technology improve water quality monitoring to ensure better health and environmental outcomes? Our paper, "Efficient Prediction of Water Quality Index (WQI) Using Machine Learning Algorithms," addressed this question and earned the 2022 Best Paper Award from Human-Centric Intelligent Systems.

The Process and Methodology

The foundation of our research was built on a comprehensive analysis of water quality data sourced from India's diverse water bodies. The dataset included essential parameters such as dissolved oxygen (DO), biological oxygen demand (BOD), pH, and total coliform (TC). To ensure a reliable and replicable process, we designed a robust workflow for data preparation and modeling, as depicted in Figure 1.

Working diagram of proposed model.

Figure 1: Research Workflow
This figure illustrates the sequence of steps followed in the study:

  1. Data Collection: Acquiring datasets from Kaggle, focusing on key water quality parameters.
  2. Data Preprocessing: Addressing missing data using Random Forest imputation and applying Min-Max normalization for scaling.
  3. Feature Selection: Identifying critical variables using a correlation matrix.
  4. Machine Learning Models: Training and testing five algorithms (Neural Network, Random Forest, Multinomial Logistic Regression, Support Vector Machine, and Bagged Tree Model).
  5. Performance Evaluation: Comparing model accuracies and identifying the best performer.

This structured approach not only streamlined our study but also ensured replicability, a cornerstone of rigorous research.
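To make the preprocessing and feature-selection steps concrete, here is a minimal sketch in plain Python of Min-Max normalization and a correlation-based feature filter. It is an illustration under stated assumptions, not the code used in the paper: the function names, the toy values, and the 0.5 correlation threshold are all hypothetical, and the Random Forest imputation step is not shown.

```python
from math import sqrt


def min_max_scale(values):
    """Rescale a list of numbers to the [0, 1] range (Min-Max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def select_features(columns, target, threshold=0.5):
    """Keep features whose absolute correlation with the target meets the threshold.

    The threshold is an assumed value for illustration only.
    """
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Toy, made-up readings (not taken from the study's dataset).
toy_columns = {
    "do": [6.0, 7.0, 8.0, 9.0],
    "ph": [7.0, 7.2, 7.0, 7.2],
}
toy_wqi = [50.0, 60.0, 70.0, 80.0]

scaled = {name: min_max_scale(vals) for name, vals in toy_columns.items()}
kept = select_features(scaled, toy_wqi)
```

Because Min-Max scaling is a linear transformation, it does not change the Pearson correlation between a feature and the target, so the filter gives the same result on raw and scaled columns.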

Key Findings and Insights

The performance of the machine learning algorithms was assessed using metrics such as accuracy and kappa values. 

  • The Multinomial Logistic Regression (MLR) model achieved the highest accuracy of 99.83%, setting a benchmark for water quality prediction systems.
  • Random Forest (RF) followed closely with an accuracy of 98.99%, demonstrating its strength in handling complex datasets.
  • The remaining models, Neural Network (98.65%), Bagged Tree Model (98.99%), and Support Vector Machine (96.98%), also performed well, though each scored below MLR. The Bagged Tree Model matched Random Forest's accuracy.

These results underscore the reliability of MLR for WQI prediction on this dataset, making it a strong candidate for real-world applications.
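For readers unfamiliar with the two metrics named above, the following sketch shows how accuracy and a kappa value can be computed from true and predicted class labels. It assumes the kappa in question is Cohen's kappa, which the post does not state explicitly; the function names and the toy labels are hypothetical and are not results from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels, corrected for chance agreement."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Expected agreement if predictions were independent of the true labels.
    expected = sum(
        (true_counts[label] / n) * (pred_counts[label] / n)
        for label in set(y_true) | set(y_pred)
    )
    if expected == 1.0:
        # Only one class appears in both lists; kappa is undefined here,
        # and this sketch treats that case as perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy water-quality classes, made up for illustration.
toy_true = ["good", "good", "poor", "poor"]
toy_pred = ["good", "good", "poor", "good"]

toy_accuracy = accuracy(toy_true, toy_pred)   # 0.75
toy_kappa = cohen_kappa(toy_true, toy_pred)   # 0.5
```

Kappa is useful alongside accuracy because a model can score high accuracy on imbalanced classes simply by favoring the majority class, whereas kappa discounts the agreement expected by chance.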

Practical Implications

Our study's results provide a roadmap for developing efficient, data-driven systems for water quality monitoring. The insights gained can support policymakers, environmental agencies, and researchers in implementing proactive measures to ensure safe water access.

Looking forward, we aim to build a software application using our proposed model, enabling real-time water quality predictions. Such a tool could revolutionize water resource management, particularly in regions facing acute water quality challenges.

Final Thoughts

Winning the Best Paper Award has been a tremendous honor, motivating us to continue exploring the potential of machine learning in solving critical environmental problems. We extend our heartfelt thanks to the editorial board of Human-Centric Intelligent Systems for this recognition and to our research team at VRD Research Lab for their dedication and collaboration.
