
Assessment and prediction of surface water quality in relation to land use change in Quang Ninh Province using machine learning


DOCUMENT INFORMATION

Basic information

Title: Assessment and prediction of surface water quality in relation to land use change in Quang Ninh Province using machine learning
Author: Christian Ray Palses Delos Santos
Supervisors: Associate Prof. Dr. Pham Quy Giang, Dr. Tran Thi Viet Ha
University: Vietnam National University, Hanoi
Major: Environmental Engineering
Type: Master's thesis
Year of publication: 2025
City: Hanoi
Pages: 149
File size: 8.12 MB


Structure

  • CHAPTER 1: LITERATURE REVIEW
    • 1.1. Surface Water
      • 1.1.1. Background
    • 1.2. Surface Water Quality
      • 1.2.1. Indicators of Surface Water Quality
      • 1.2.2. International Standards
      • 1.2.3. National Standards
    • 1.3. Factors Affecting Surface Water Quality
      • 1.3.1. Anthropogenic Factors
      • 1.3.2. Natural Factors
      • 1.3.3. Land Use: Interface Between People and Nature
    • 1.4. Overview of Quang Ninh Province
      • 1.4.1. General Background
      • 1.4.2. Water Quality and Land Use
    • 1.5. Machine Learning in Water Quality Prediction
      • 1.5.1. Application of Machine Learning in Predicting Water Quality
      • 1.5.2. Tree-based Machine Learning
  • CHAPTER 2: METHODOLOGY
    • 2.1. Secondary Data Collection
      • 2.1.1. Water Quality Monitoring Stations
      • 2.1.2. Water Quality Parameters
      • 2.1.3. Land Use Land Cover (LULC) Data
      • 2.1.4. Terrain
      • 2.1.5. Weather and Climate Data
      • 2.1.6. Soil Map
      • 2.1.7. Road Map and Ports
    • 2.2. Scope of the Study
      • 2.2.1. Monitoring Stations
      • 2.2.2. Parameters
      • 2.2.3. Duration
    • 2.3. Historical Land Use Change Analysis
      • 2.3.1. Land Use Data Pre-processing
      • 2.3.2. Cross-Validation and Correction of Inconsistent Land Use Categories
      • 2.3.3. Analysis of the Change of Land Use Area
    • 2.4. Assessment of Water Quality and Land Use Change Relationship
      • 2.4.1. Data Compilation and Digitization
      • 2.4.2. Data Handling and Imputation
      • 2.4.3. Water Quality Index Calculation
      • 2.4.4. Catchment Delineation of Monitoring Stations
      • 2.4.5. Analysis of Land Use and Water Quality Relationship
    • 2.5. Forecasting Land Use Dynamics in Quang Ninh
      • 2.5.1. MOLUSCE
      • 2.5.2. Raster Alignment
      • 2.5.3. Model Evaluation Metrics
    • 2.6. Modeling Future Water Quality Trends
      • 2.6.1. Machine Learning Models
      • 2.6.2. Feature Engineering
      • 2.6.3. Model Performance Metrics
  • CHAPTER 3: RESULTS AND DISCUSSION
    • 3.1. Historical Land Use Change Analysis
      • 3.1.1. Temporal Analysis of Land Use Change
      • 3.1.2. Land Use Correlation
    • 3.2. Assessment of Water Quality and Land Use Change Relationship
      • 3.2.1. Boundary Setting for Inland Water Monitoring Stations
      • 3.2.2. WQI Calculation and Parameter Assessment
      • 3.2.3. Temporal Relationship of Water Quality Status and Land Use
      • 3.2.4. Water Quality and Land Use Correlation
    • 3.3. Prediction of Future Land Use
      • 3.3.1. Model Training Performance
      • 3.3.2. Model Validation Performance
      • 3.3.3. Predicted Land Use in the Beginning of 2030
    • 3.4. Prediction of Future Water Quality
  • Appendix 1. Land Use Map of Quang Ninh Province for the Year 2017 from Esri
  • Appendix 2. Land Use Map of Quang Ninh Province for the Year 2018 from Esri
  • Appendix 3. Land Use Map of Quang Ninh Province for the Year 2019 from Esri
  • Appendix 4. Land Use Map of Quang Ninh Province for the Year 2020 from Esri
  • Appendix 5. Land Use Map of Quang Ninh Province for the Year 2021 from Esri
  • Appendix 6. Land Use Map of Quang Ninh Province for the Year 2022 from Esri
  • Appendix 7. Land Use Map of Quang Ninh Province for the Year 2023 from Esri
  • Appendix 8. Land Use Map of Quang Ninh Province for the Year 2024 from Esri
  • Appendix 9. Land Use Transition of Quang Ninh Province from 2017 to 2024
  • Appendix 10. Ammonium Validation Result (A) Measured Values (B) WQI
  • Appendix 11. Arsenic (As) Validation Result (A) Measured Values (B) WQI
  • Appendix 12. BOD5 Validation Result (A) Measured Values (B) WQI
  • Appendix 13. Cadmium (Cd) Validation Result (A) Measured Values (B) WQI
  • Appendix 14. COD Validation Result (A) Measured Values (B) WQI
  • Appendix 15. Total Coliform Validation Result (A) Measured Values (B) WQI
  • Appendix 16. DO Validation Result (A) Measured Values (B) WQI
  • Appendix 17. Iron (Fe) Validation Result (A) Measured Values (B) WQI
  • Appendix 18. Mercury (Hg) Validation Result (A) Measured Values (B) WQI
  • Appendix 19. Nitrate Validation Result (A) Measured Values (B) WQI
  • Appendix 20. Lead (Pb) Validation Result (A) Measured Values (B) WQI
  • Appendix 21. pH Validation Result (A) Measured Values (B) WQI
  • Appendix 22. Phosphate Validation Result (A) Measured Values (B) WQI
  • Appendix 23. Ammonium Validation Result (A) Measured Values (B) WQI
  • Appendix 24. Arsenic (As) Validation Result (A) Measured Values (B) WQI
  • Appendix 25. Cadmium (Cd) Validation Result (A) Measured Values (B) WQI
  • Appendix 26. Total Coliform Validation Result (A) Measured Values (B) WQI
  • Appendix 27. Copper (Cu) Validation Result (A) Measured Values (B) WQI
  • Appendix 28. DO Validation Result (A) Measured Values (B) WQI
  • Appendix 29. Iron (Fe) Validation Result (A) Measured Values (B) WQI
  • Appendix 30. Mercury (Hg) Validation Result (A) Measured Values (B) WQI
  • Appendix 31. Manganese (Mn) Validation Result (A) Measured Values (B) WQI
  • Appendix 32. Oil and Grease Validation Result (A) Measured Values (B) WQI
  • Appendix 33. Lead (Pb) Validation Result (A) Measured Values (B) WQI
  • Appendix 34. pH Validation Result (A) Measured Values (B) WQI
  • Appendix 35. Phosphate Validation Result (A) Measured Values (B) WQI
  • Appendix 36. Zinc (Zn) Validation Result (A) Measured Values (B) WQI

Content

VIETNAM NATIONAL UNIVERSITY, HANOI
VIETNAM JAPAN UNIVERSITY

CHRISTIAN RAY PALSES DELOS SANTOS

ASSESSMENT AND PREDICTION OF SURFACE WATER QUALITY IN RELATION TO LAND USE CHANGE IN QUAN

CHAPTER 1: LITERATURE REVIEW

1.1 Surface Water

1.1.1 Background

Surface water encompasses all water on the Earth's surface and serves as a vital resource for ecosystems, human activities, and global climate regulation (UN-Water, n.d.). It exists in diverse forms, ranging from freshwater systems such as rivers, lakes, and wetlands to saline bodies like seas and oceans.

Water resources are broadly categorized into inland and marine sources, each essential to ecosystems, human needs, and climate regulation. Inland waters provide freshwater for drinking, agriculture, and industry, while wetlands act as natural filters and habitats for diverse species. Marine waters, spanning vast saline environments, regulate global temperatures, absorb carbon dioxide, and support marine biodiversity and fisheries. Together, inland and marine water systems sustain life, maintain ecological balance, and secure access to vital resources.

Inland surface water comprises naturally occurring freshwater features such as rivers, lakes, streams, and wetlands, which are essential to hydrological cycles and provide diverse habitats for biological species. Rivers transport nutrients and sediments, linking terrestrial and aquatic ecosystems; wetlands offer water filtration, flood mitigation, and valuable habitat services; and lakes function as reservoirs for freshwater storage. Some streams and channels are also engineered for human needs, giving rise to engineered surface waters that include reservoirs, dams, artificial lakes, channels, and levees. Collectively, these systems support water resource management by supplying water for human consumption, agriculture, and energy production, while helping to mitigate the impacts of flooding and drought.

Engineered water systems, which store water for dry periods, generate hydropower, and redirect flow through controlled channels, are indispensable for urban infrastructure, agricultural sustainability, and disaster resilience (Müller et al., 2022). However, while these surface water infrastructures provide significant benefits, their construction and operation can affect natural water flow and ecosystems (Koschorreck et al., 2020). Careful management is needed to balance human needs with environmental sustainability. The interplay between natural inland waters and artificial water systems highlights the need for integrated, sustainable management to ensure that these interconnected resources continue to support ecosystems, environmental stability, and human needs.

Marine surface waters, spanning seas, oceans, and coastal estuaries, are expansive saline environments that regulate the Earth's climate by absorbing and distributing solar heat, helping to stabilize temperatures worldwide. They drive ocean currents that shape weather patterns and the global circulation of heat, nutrients, and gases, supporting both terrestrial and marine ecosystems and underpinning livelihoods such as fishing and maritime trade. These waters also absorb atmospheric carbon dioxide, contributing to climate-change mitigation. Yet rising sea temperatures, acidification, and plastic pollution threaten marine biodiversity and the balance of oceanic systems. Protecting marine surface waters requires coordinated global action to reduce pollution, manage resources sustainably, and mitigate climate-change impacts.

Demand for sustainable surface water management is rising as climate change, urbanization, and population growth strain the water resources communities rely on for drinking water, agricultural irrigation, and industrial use. To ensure long-term availability and quality, it is essential to advance sustainable surface water management through innovative engineering solutions, effective policies, and collective stewardship that balance human needs with environmental conservation. Protecting surface water quality supports ecosystems and biodiversity, safeguards human health, and underpins economic stability by reducing drinking water treatment costs and ensuring reliable supply. Polluted or poorly managed surface water can harm ecosystems, limit water availability, and drive up treatment expenses, making sustainable water management a critical priority for both the environment and society. By integrating engineering innovation, governance, and shared responsibility, we can build resilience to climate pressures and secure the quality and availability of this vital resource for future generations, fostering a harmonious coexistence between humanity and the natural world.

1.2 Surface Water Quality

1.2.1 Indicators of Surface Water Quality

Assessing surface water quality is essential for protecting both aquatic ecosystems and the human communities that rely on them. Over the years, measurement and classification methods have advanced, incorporating physical, chemical, biological, and even radiological indicators to monitor a wide range of signals and trends (Cuffney et al., 2000; Olomukoro et al., 2022; Rocha et al., 2014; Buzynnyi & Mykhailova, 2024; Odumu et al., 2025). Each indicator is evaluated using specific, quantifiable parameters, the technical variables that define water quality in precise terms, and these data underpin broader assessments. The conditions these indicators and parameters describe influence key ecological processes such as photosynthesis, nutrient cycling, and overall productivity, so monitoring them enables assessment of ecosystem integrity and identification of threats like pollution or invasive species (Vossler et al., 2023; J. Wang et al., 2023). Ultimately, robust water quality indicators support sustainable water-resource management by informing decisions on treatment, agricultural practices, and environmental conservation.

Table 1.1 Types of Water Quality Indicators and Parameters

Indicator: Physical
Description: Reflects the physical characteristics of water, which influence its appearance, temperature, and ability to support life. These are directly observable or measurable without chemical reactions.
Common/example parameters: Temperature, turbidity, total suspended solids (TSS), color, odor, taste, and electrical conductivity (EC)

Indicator: Chemical
Description: Measures the chemical composition of water, including the presence of nutrients, minerals, contaminants, and dissolved gases.
Common/example parameters:
• General properties: pH, dissolved oxygen (DO), biochemical oxygen demand (BOD), chemical oxygen demand (COD), and total organic carbon (TOC)
• Nutrients: nitrate (NO₃⁻), nitrite (NO₂⁻), ammonia (NH₃), and phosphate (PO₄³⁻)
• Contaminants: heavy metals (e.g., lead, mercury), pesticides, hydrocarbons, cyanide

Indicator: Biological
Description: Measures the presence, abundance, and diversity of living organisms in the water, which reflects its ecological health.
Common/example parameters: Coliform bacteria (e.g., E. coli), algal biomass (e.g., chlorophyll-a), macroinvertebrates, and pathogenic microorganisms (e.g., viruses, protozoa)

Indicator: Radiological*
Description: Measures the presence of radioactive elements or radiation in the water, which could be due to natural sources or contamination from industrial or nuclear activities.
Common/example parameters: Radon (Rn), uranium (U), radium (Ra), cesium (Cs)

*Radiological parameters, while important in specific contexts, were not within the scope of this study.

1.2.2 International Standards

International surface water quality standards provide a global framework for assessing and managing water systems. Leading organizations, namely the World Health Organization (WHO), United States Environmental Protection Agency (US EPA), Food and Agriculture Organization (FAO), United Nations Environment Programme (UNEP), Standard Methods for the Examination of Water and Wastewater (SMEWW), International Organization for Standardization (ISO), and ASTM International, have established specific parameters and values tailored to different water uses and evaluation purposes. These standards guide water quality management and are essential for ensuring safe and sustainable water use across sectors such as drinking water, agriculture, industry, and recreation.

The World Health Organization (WHO) treats water quality, and drinking-water safety in particular, as a core public-health concern; its guidelines identify critical parameters such as microbial contaminants, heavy metals, and chemical pollutants that can affect human health, with the aim of ensuring safe water supplies (World Health Organization, 2017). The U.S. Environmental Protection Agency (EPA) protects human health and the environment by developing and enforcing water-quality standards under the Clean Water Act (CWA) and Safe Drinking Water Act (SDWA) for pollutants in surface water, safeguarding ecosystems and public health. The Food and Agriculture Organization (FAO) provides water-quality guidelines tailored to agricultural applications (irrigation, livestock watering, and aquaculture), highlighting parameters that influence crop growth, soil health, and aquatic life to prevent harm to the environment and food-production systems (Ayers & Westcot, 1985). Meanwhile, the United Nations Environment Programme (UNEP) is a global authority on environmental issues, working to promote sustainable development and protect natural resources. UNEP develops international guidelines, such as the Global Environment Monitoring System (GEMS/Water), which provides tools for assessing and managing surface water quality worldwide (United Nations Environment Programme, 1992). The Standard Methods for the Examination of Water and Wastewater (SMEWW) is a widely used set of protocols for analyzing the quality of water and wastewater. It is developed and published collaboratively by three leading organizations, namely the American Public Health Association (APHA), the American Water Works Association (AWWA), and the Water Environment Federation (WEF). The International Organization for Standardization (ISO) develops globally recognized standards for water quality, including sampling and testing methodologies, ensuring consistency and reliability in surface water assessments. The American Society for Testing and Materials (ASTM International) develops voluntary consensus standards for testing and assessing water quality, including methods for analyzing physical, chemical, and biological parameters. Widely adopted in industries and regulatory frameworks, ASTM standards ensure reliability and uniformity in surface water quality evaluations. These standards support environmental protection, policy development, and industry compliance, enabling sustainable water management and international collaboration.

International water quality standards provide a reliable baseline for national regulations and help establish a unified global understanding of the key parameters that must be monitored (World Health Organization, 2021). By adopting these guidelines, countries ensure consistent water quality assessment and implement measures to protect both human health and ecosystems. Moreover, these standards foster international collaboration by enabling cross-border data comparisons and facilitating coordinated management of transboundary water resources.

1.2.3 National Standards

Vietnam protects its water resources through laws and regulations designed to ensure the safety and sustainability of surface water and marine water quality. These regulations specify the parameters and limit values that must be assessed to evaluate water quality, tailored to the water's intended use and location. The Ministry of Natural Resources and Environment (MONRE, Vietnamese: Bộ Tài nguyên và Môi trường/BTNMT) plays a central role in formulating these laws, which are essential for protecting public health and ecosystems and for promoting the sustainable use of water resources.

MONRE issued Circular No. 01/2023/TT-BTNMT in March 2023, introducing five National Technical Regulations for water quality monitoring. The regulations include the updated QCVN 08:2023/BTNMT for Surface Water Quality and QCVN 10:2023/BTNMT for Marine Water Quality. They establish clear standards for physical, chemical, biological, and microbiological parameters that must be monitored in both surface waters (such as rivers, lakes, and reservoirs) and marine waters (including coastal and nearshore areas).

QCVN 08:2023/BTNMT provides the regulatory framework for assessing and maintaining surface water quality across Vietnam, ensuring rivers, lakes, and other water bodies meet environmental quality criteria that protect public health, support ecosystem sustainability, and enable essential water-dependent activities such as irrigation, fishing, and industry. The standard specifies limit values for key water quality parameters, including pH, dissolved oxygen, heavy metals, and microbial contamination, guiding monitoring and compliance to prevent pollution. By defining these thresholds, the regulation aims to minimize environmental pollution, safeguard biodiversity, and ensure safe water for both human consumption and economic use.
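As a rough illustration of how limit values can guide monitoring and compliance, the sketch below compares a single sample against a small table of thresholds. The parameter keys and every numeric limit are hypothetical placeholders chosen for illustration; they are not the values in QCVN 08:2023/BTNMT, and the function name `check_sample` is likewise an assumption rather than anything defined in the thesis.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical limits for illustration only; NOT the QCVN 08:2023/BTNMT values.
# Each entry is (minimum allowed, maximum allowed); None means no bound on that side.
Limit = Tuple[Optional[float], Optional[float]]
EXAMPLE_LIMITS: Dict[str, Limit] = {
    "pH": (6.5, 8.5),
    "DO_mg_L": (5.0, None),   # dissolved oxygen must not fall below the minimum
    "Pb_mg_L": (None, 0.02),  # heavy metals must not exceed the maximum
}


def check_sample(sample: Dict[str, float],
                 limits: Dict[str, Limit] = EXAMPLE_LIMITS) -> List[str]:
    """Return the names of parameters in `sample` that fall outside `limits`."""
    exceedances = []
    for name, value in sample.items():
        if name not in limits:
            continue  # parameter is not covered by this illustrative table
        low, high = limits[name]
        if (low is not None and value < low) or (high is not None and value > high):
            exceedances.append(name)
    return exceedances


sample = {"pH": 7.2, "DO_mg_L": 4.1, "Pb_mg_L": 0.01}
print(check_sample(sample))  # ['DO_mg_L']
```

A real compliance check would load the limit table from the regulation itself and select limits according to the classification of the water body being assessed.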

QCVN 10:2023/BTNMT sets the standards for the quality of Vietnam's marine and coastal waters, aiming to protect ecosystems and support sustainable coastal development. Recognizing the critical role of coastal environments in aquaculture, tourism, and fisheries, the regulation seeks to safeguard marine life and public health from waterborne diseases and contaminants. It establishes enforceable limits for key water quality parameters (salinity, temperature, phosphates, heavy metals, and bacterial counts) that directly affect aquatic organisms and human activities. By maintaining clean, healthy coastal waters, Vietnam's shores can remain vibrant, pollution-free, and capable of supporting a diverse range of marine species and coastal industries.

Additionally, Vietnamese National Standards (Tiêu chuẩn Việt Nam, or TCVN) play a crucial role in setting water quality benchmarks across various categories. For example, TCVN 5987:1995 establishes the national standard for drinking water quality, outlining the safety and potability criteria that govern public water supplies and influence compliance, monitoring, and public health protection.

Water quality regulations specify key parameters including pH, turbidity, microbial content, and heavy metals to safeguard drinking water and other uses. The TCVN framework complements these regulatory requirements by ensuring water meets essential health and safety criteria for human consumption and other applications. Together, these regulations and TCVN standards form a comprehensive, enforceable nationwide framework for managing water quality across the country.

Table 1.2 International and National Standards for Water Quality

Standards/Organization: World Health Organization (WHO)
Coverage: International
Number of parameters/methods: 73 parameters
Description/purpose: Primarily provides guidelines for drinking water quality and health-related aspects of water
Reference: Guidelines for Drinking Water Quality: Fourth Edition Incorporating the First Addendum

Standards/Organization: United States Environmental Protection Agency (US EPA)
Number of parameters/methods: Drinking water: 93 parameters; surface water: 40-50 parameters*
Description/purpose: Develops standards and guidelines to protect ecosystems, biodiversity, and communities from harmful pollutants
Reference: (Baird et al., 2017; United States Environmental Protection
*Depending on water bodies and monitoring needs

Standards/Organization: Food and Agriculture Organization (FAO)
Number of parameters/methods: Ecosystem protection: 12 parameters
Description/purpose: Establishes guidelines for water quality in agriculture, particularly irrigation and aquaculture
Reference: FAO Irrigation and Drainage Paper 29: Water Quality for Agriculture (Ayers & Westcot, 1985)

Standards/Organization: United Nations Environment Programme (UNEP)
Coverage: International
Number of parameters/methods: No exact number of parameters; rather, it helps in monitoring critical indicators
Description/purpose: Addresses water quality in the context of environmental health and ecosystem sustainability

Standards/Organization: Standard Methods for the Examination of Water and Wastewater (SMEWW)
Description/purpose: Provides standardized and validated methods for analyzing water and wastewater, ensuring accuracy, reliability, and comparability of results globally
Reference: Standard Methods for the Examination of Water and Wastewater (SMEWW) (Baird et al., 2017)

Standards/Organization: International Organization for Standardization (ISO)
Number of parameters/methods: Surface water: 80-100 methods; groundwater: 60-80 methods
Description/purpose: Provides technical standards for water sampling, analysis, and classification
Reference: ISO 5667-6:2014, Water Quality, Sampling, Part 6: Guidance on sampling of rivers and streams

Standards/Organization: ASTM International
Number of parameters/methods: (sampling and measurement only for surface water bodies and qualities)
Description/purpose: Provides reliable methods for analyzing surface water, ensuring accurate results, and supporting environmental monitoring and ecosystem protection
Reference: Various ASTM standards for different sampling and measurement

Standards/Organization: Ministry of Natural Resources and Environment (MONRE)
Number of parameters/methods: 53 parameters
Description/purpose: Used for environmental monitoring, to assess the condition of inland water bodies, and to ensure compliance with environmental protection laws
Reference: QCVN 08:2023/BTNMT (Ministry of Natural Resources and Environment, 2023a)

Standards/Organization: Ministry of Natural Resources and Environment (MONRE)
Number of parameters/methods: 46 parameters
Description/purpose: Protects marine ecosystems and ensures that coastal waters support aquatic life, tourism, aquaculture, and human health
Reference: (Ministry of Natural Resources and Environment, 2023b)

Standards/Organization: Ministry of Science and Technology
Description/purpose: Developed to ensure the safety, quality, and compatibility of products, processes, and services in Vietnam, often aligning with international standards to facilitate trade and cooperation
Reference: National Standards of Vietnam (Vietnamese: Tiêu chuẩn Việt Nam, TCVN) under the Law on Standards and Technical 68/2006/QH11) (National Assembly of Vietnam,

1.3 Factors Affecting Surface Water Quality

Surface water quality is shaped by a range of factors, both anthropogenic and natural, that can markedly affect the health of aquatic ecosystems and the availability of safe water resources (Khatri & Tyagi, 2015). Human activities such as industrial discharges, agricultural runoff, and urbanization frequently introduce pollutants that degrade water quality, while natural factors, including climate change, geological processes, and seasonal variations, also contribute to fluctuations in surface water conditions. A deeper exploration of these drivers offers a thorough understanding of their collective impact on surface water quality and its implications for water resource management.

1.3.1 Anthropogenic Factors

Human activities markedly degrade surface water quality by introducing a spectrum of pollutants into rivers, lakes, and streams. Industrial discharge is a leading source of contamination, with factories releasing chemicals, heavy metals, and other toxic substances directly into aquatic environments. Contaminants such as mercury, cadmium, and lead pose risks to both aquatic life and humans who rely on water for drinking, recreation, or nourishment. These pollutants can bioaccumulate in organisms, building up through the food chain and disrupting ecosystem function. In addition to chemical pollutants, thermal pollution from power plants raises water temperatures, reduces dissolved oxygen levels, and stresses many aquatic species, further compromising the health and resilience of aquatic ecosystems.

Runoff from agricultural practices carries excess nutrients, pesticides, and sediments into surface water, making agricultural runoff a major driver of nutrient pollution. Fertilizers high in nitrogen and phosphorus fuel eutrophication, triggering dense algal blooms that degrade water clarity and deplete dissolved oxygen. These hypoxic conditions create dead zones where aquatic life cannot survive. In addition, pesticides used to protect crops can wash into nearby waterways, with toxic effects on non-target species, including aquatic organisms and beneficial insects, that disrupt overall ecosystem health.

Livestock runoff poses a significant threat to water quality by carrying pathogens and organic waste. In regions with intensive farming, this pollution is especially troubling because it degrades ecosystems and reduces the availability of safe water for drinking and irrigation. Mawdsley et al. (2000) document how runoff from livestock operations contributes to water contamination, underscoring the need for effective management practices to protect environmental health and water security for agricultural communities.

Urbanization is a major contributor to the degradation of surface water quality. As cities expand, impervious surfaces such as roads, buildings, and parking lots increase the volume of surface runoff during rainfall events (Horrox & Rumpler, 2024). These impermeable surfaces prevent water from soaking into the ground, leading to higher volumes of water flowing into drainage systems, rivers, and lakes. This runoff often carries pollutants, including oil, grease, heavy metals, and trash, which degrade water quality (Lee & Jones-Lee, 2005).
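The effect of impervious cover on runoff can be sketched with the rational method, a standard hydrology estimate of peak flow, Q = C·i·A/360, with Q in m³/s, rainfall intensity i in mm/h, and catchment area A in hectares. The thesis does not use this formula; it is brought in here only for illustration, and the runoff coefficients, storm intensity, and catchment size below are assumed values rather than data from the study.

```python
def peak_runoff_m3s(runoff_coeff: float, intensity_mm_h: float, area_ha: float) -> float:
    """Rational method: Q = C * i * A / 360, with Q in m^3/s, i in mm/h, A in ha."""
    if not 0.0 <= runoff_coeff <= 1.0:
        raise ValueError("runoff coefficient must lie between 0 and 1")
    return runoff_coeff * intensity_mm_h * area_ha / 360.0


# Illustrative coefficients (assumed for this sketch, not taken from the study):
C_FOREST = 0.2  # vegetated, permeable cover lets most rainfall infiltrate
C_PAVED = 0.9   # roads, roofs, and parking lots shed most rainfall

storm_mm_h = 50.0    # assumed rainfall intensity
catchment_ha = 10.0  # assumed catchment area

before = peak_runoff_m3s(C_FOREST, storm_mm_h, catchment_ha)
after = peak_runoff_m3s(C_PAVED, storm_mm_h, catchment_ha)
print(f"forested: {before:.2f} m^3/s, paved: {after:.2f} m^3/s")
```

With these assumed coefficients, paving the same catchment multiplies the estimated peak flow by 4.5 for the same storm, which is the mechanism by which more pollutants are flushed into drains, rivers, and lakes.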

Urban development often involves removing natural vegetation that would filter pollutants and reduce runoff. The loss of wetlands and forests, which act as natural buffers to water bodies, exacerbates the problem by allowing more pollutants to reach the water. As population density rises, demands for wastewater treatment increase, and many urban areas struggle to manage sewage and wastewater effectively (Teklehaimanot et al., 2015). When treatment facilities are overwhelmed or poorly maintained, untreated wastewater can contaminate nearby water sources, introducing harmful bacteria, pathogens, and organic matter into the ecosystem. These urban pressures create complex challenges for maintaining surface water quality, underscoring the need for effective urban planning and pollution control measures to preserve aquatic ecosystems and ensure access to clean water.

Anthropogenic factors are a key determinant of surface water quality, as human activities introduce pollutants that degrade water quality, harm aquatic ecosystems, and threaten human health. Addressing these impacts requires coordinated action across industries, governments, and communities to implement effective pollution control measures, promote sustainable agricultural practices, and guide urban planning toward water conservation and protection of water resources (Qiu et al., 2023). By understanding the scope of human influence on water quality, these effects can be mitigated, ensuring the long-term sustainability of water resources.

1.3.2 Natural Factors

Natural factors play a significant role in shaping surface water quality by influencing its chemical, physical, and biological characteristics. Climate variability drives rainfall patterns, temperature, and evaporation rates; periods of heavy rainfall or rapid snowmelt increase runoff, washing soil, nutrients, and organic matter into nearby water bodies (Schneider et al., 2019). These precipitation fluctuations alter water turbidity and sediment load, complicating efforts to maintain water quality. Temperature changes affect dissolved oxygen levels, which are essential for aquatic life; warmer waters reduce oxygen concentrations and stress aquatic organisms, especially in shallow or stagnant systems (Rajesh & Rehana, 2022).

Geological factors influence surface water quality through natural processes such as weathering and erosion. The mineral content of surrounding soil and bedrock can leach minerals, metals, and other elements into water sources, altering water chemistry and quality. In regions with mineral-rich bedrock, water often exhibits higher levels of dissolved calcium, magnesium, and iron, affecting hardness, taste, and overall water quality, as reported by Mosquera et al.

Although minerals are usually harmless at moderate levels, they can contribute to hard water, staining, and infrastructure wear, and can affect aquatic ecosystems. Natural erosion adds sediments to rivers and lakes, clouding water and reducing sunlight, which impairs photosynthesis in aquatic plants. These natural processes commonly occur in regions with steep terrain or unstable soils, underscoring the complex link between environmental factors and water quality (Koralay & Kara, 2018).

1.3.3 Land Use: Interface Between People and Nature

Land use is a critical interface between human activity and the natural environment, shaping ecosystems and guiding societal development. It also plays a pivotal role in determining surface water quality by influencing the transport, fate, and concentration of pollutants across the landscape. Decisions about urban expansion, agriculture, forestry, and industry affect hydrological processes, sediment loads, and nutrient cycles, with direct implications for freshwater availability and aquatic health. Effective land-use planning and integrated watershed management are essential to minimize pollutant runoff, protect water resources, and build resilient communities.

Land use choices, whether for agriculture, urban development, forestry, mining, or conservation, directly shape water bodies by controlling sediment and nutrient loads, and unsustainable practices disrupt ecological balance. Deforestation and urban sprawl reduce wildlife habitats and increase nutrient runoff from fertilizers, driving eutrophication in lakes and rivers and harming aquatic ecosystems. By contrast, well-planned land-use strategies that prioritize soil health, water conservation, and biodiversity can support sustainable development while safeguarding ecological balance. Although forestry, mining, and industrial activities can disturb soil and vegetation and elevate sedimentation and contaminant loads in surface water, adopting practices such as green infrastructure, reforestation, and sustainable agriculture shows that land management can harmonize human needs with nature, preserving healthy aquatic ecosystems and ensuring access to clean water.

Land use decisions go beyond resource allocation; they foster a sustainable coexistence between people and the natural world and directly influence surface water quality. Governments, businesses, and communities must collaborate to implement policies that promote sustainable land management, such as reducing agricultural runoff, preserving wetlands, and minimizing urban pollution, to protect water bodies (Scherzinger et al., 2024). Striking this balance ensures the land continues to meet human and ecological needs while safeguarding surface water as a vital resource for ecosystems, drinking water, and economic activities. This coordinated approach supports a healthier, more resilient planet for future generations.

1.4 Overview of Quang Ninh Province

1.4.1 General Background

Figure 1.1 Map of the Location of Quang Ninh Province

Quang Ninh Province, in northeastern Vietnam, stands as a major economic hub due to its abundant coal reserves and burgeoning tourism industry. The province holds about 8.8 billion tons of coal, accounting for over 90% of Vietnam's total coal output and making Quang Ninh the country's leading coal producer. This vast resource base has long underpinned the local economy, with the mining sector contributing as much as 59% of the province's economic structure before 2011. Recognizing the environmental challenges of intensive mining, Quang Ninh has embarked on a strategic shift toward sustainable development, aiming to balance ongoing economic growth with environmental conservation.

Quang Ninh has accelerated its transformation into a premier international tourism destination, anchored by natural wonders like Ha Long Bay, a UNESCO World Heritage Site, and Bai Tu Long Bay. In 2024, the province welcomed 19 million visitors, including 3.8 million international tourists, delivering significant revenue and underscoring the sector's growth potential. Building on this momentum, the provincial government is targeting 25 million visitors by 2030, including 8 million international visitors. This strategy encompasses the development of high-quality tourism products, infrastructure enhancements, and the promotion of sustainable tourism practices to ensure long-term growth.

1.4.2 Water Quality and Land Use

Quang Ninh Province features a network of lakes, reservoirs, and dikes that serve as essential freshwater storage systems. These water bodies provide vital ecosystem services, including irrigation support for agriculture and domestic water supply for communities (T D L Nguyen et al., 2021). By maintaining hydrological balance and sustaining regional biodiversity, these freshwater systems play a crucial role in the area's environmental health. They also act as natural buffers against seasonal variations, regulating water availability throughout the year and mitigating drought impacts. As a result, lakes and reservoirs are fundamental to sustaining rural livelihoods and local economies that depend on reliable water resources.

Rapid urbanization and tourism development in Quang Ninh Province, especially Ha Long City, are driving declines in coastal water quality. Studies indicate that increasing human activities elevate nutrient loading, heavy metals, and microbial contamination in coastal waters (Trang et al., 2022). Tourism-related wastewater discharge, industrial effluents, and surface runoff are the primary drivers of poorer water quality, degrading water clarity and dissolved oxygen and threatening marine biodiversity. The spatial and temporal variability of these impacts reflects land-use change and other anthropogenic stressors typical of rapidly developing coastal areas (Dinh et al., 2023).

Seasonal fluctuations strongly influence water quality in Quang Ninh's coastal zone by altering rainfall-driven pollutant transport and concentration. During the rainy season, increased runoff carries sediments and terrestrial contaminants into aquatic environments, elevating nutrient and pollutant levels. This leads to episodic eutrophication events, characterized by algal blooms and subsequent hypoxia, which have adverse effects on aquatic organisms (Lan, 2009). By contrast, the dry season often sees pollutant concentrations increase due to reduced dilution capacity. These seasonal dynamics underscore the need for integrated coastal management, including pollution control and protection of marine ecosystems in the Quang Ninh coastal region, and for continuous water quality monitoring to capture temporal variability and inform management strategies that address pollution sources effectively.

1.5 Machine Learning in Water Quality Prediction

1.5.1 Application of Machine Learning in Predicting Water Quality

Machine learning (ML) techniques are increasingly used to assess and predict water quality, driven by the growing complexity of water systems and the need for accurate, timely monitoring to support sustainable water resource management. ML models excel at handling large, multivariate datasets and can capture nonlinear relationships among water quality parameters, offering clear advantages over traditional statistical and mechanistic models. These data-driven approaches enable faster detection of trends and anomalies, improve decision-making for monitoring programs and risk assessment, and support proactive management of drinking water, surface water, and groundwater to safeguard public health and the environment.

Recent studies demonstrate that machine learning (ML) algorithms are highly effective at predicting water quality indices (WQI) and classifying water quality status. Advanced models, notably XGBoost and Long Short-Term Memory (LSTM) networks, offer superior performance when handling complex, multivariate water quality data. In a cross-method evaluation, XGBoost achieved peak accuracy up to 99.83%, while LSTM models yielded an R² value of 0.9999, indicating near-perfect predictive capability for WQI prediction and quality classification (Elmotawakkil et al., 2025). These models excel at capturing temporal dependencies and nonlinear interactions among key parameters such as temperature, pH, dissolved oxygen, turbidity, and salinity.

Recently, machine learning has been successfully deployed in Vietnam to streamline water quality monitoring and achieve high predictive accuracy with fewer parameter measurements. For example, a 2025 study by Nguyen Thi Diem et al. compared XGBoost, GBM, SVR, and RBF to predict the Water Quality Index (WQI) of the Sai Gon River over 2015–2019, revealing that XGBoost delivered the strongest performance (R² = 0.96, NSE = 0.953) and outperformed the other models. Importantly, the researchers showed that narrowing the input feature set to key indicators such as pH, dissolved oxygen (DO), BOD5, COD, nutrients, and coliforms produced only a minimal drop in accuracy, indicating that parsimonious machine learning models can provide reliable, cost-effective water quality assessment without extensive parameter measurements.

Such reduced-input models make effective and robust Water Quality Index (WQI) prediction feasible under limited-resource conditions. In parallel, machine learning methods were applied to estimate estuarine salinity in the Hậu estuary of the Mekong Delta, demonstrating that models trained on hydrological variables can reliably forecast key water-quality changes driven by seasonal and tidal dynamics (Nguyen et al., 2025). Together, these studies demonstrate how ML enhances water quality assessment in Vietnam, from point-source river monitoring to large-scale urban and coastal systems, highlighting its adaptability across data types and geographic contexts and its potential to complement conventional hydrological modeling.

ML models for water quality prediction achieve higher accuracy when driven by advanced data preprocessing, including data denoising, missing-value imputation, feature scaling, and categorical encoding. These steps produce balanced, high-quality input data that improves model training and generalization to unseen data. Feature engineering, employing correlation analysis and entropy weighting, helps identify the most influential water quality parameters, enhancing both interpretability and performance (Wang et al., 2023).

1.5.2 Tree-based Machine Learning

Tree-based machine learning models have become prominent in water quality prediction due to their robustness, flexibility, and strong performance across diverse environmental datasets. They inherently model nonlinear relationships and interactions among variables without intensive preprocessing or assumptions about data distribution. Unlike traditional regression or neural networks, tree-based methods tolerate outliers and missing values better and offer built-in feature selection, making them especially effective when environmental parameters are noisy or multicollinear.

Water quality research shows that gradient-boosting models consistently outperform alternatives in predicting key indicators such as pH, BOD, DO, and turbidity, owing to their ability to capture complex spatial-temporal dynamics with relatively interpretable structures. Notably, a pH-forecasting study found that LightGBM achieved the highest precision, even surpassing spatial-temporal models that explicitly incorporate time and space dependencies (Li et al., 2023). This counterintuitive finding suggests that non-spatial models can rival or exceed models that embed explicit spatiotemporal relationships, and it underscores the power of data-driven ML approaches in capturing complex patterns without requiring explicit domain-specific temporal or spatial modeling.

Tree-based ensemble methods, which combine multiple decision trees, provide strong generalization and reduce overfitting, maintaining accuracy even on limited or imbalanced datasets. These advantages make tree-based algorithms especially suitable for environmental systems like water quality, where variability is high, data can be fragmented, and model explainability is essential for informing policy and management decisions.

Despite progress in applying machine learning to hydrology, challenges remain in model interpretability, data scarcity or imbalance, and integrating ML predictions with traditional hydrological and environmental models. Future research should emphasize hybrid ML–physics approaches that fuse data-driven insight with physical process understanding, leverage transfer learning to extend models to data-scarce regions, and develop user-friendly platforms that enable stakeholders to translate ML outputs into actionable decisions (Ahmed et al., 2019).

CHAPTER 2: METHODOLOGY

2.1 Secondary Data Collection

2.1.1 Water Quality Monitoring Stations

This study utilized and evaluated all water quality monitoring points available in inland water and surface water within Quang Ninh, sourced from the Department of Natural Resources and Environment of Quang Ninh. To ensure comprehensive coverage and accurate representation of water quality conditions, all officially recorded monitoring points in the study area were included, spanning inland water bodies (rivers, streams, lakes, and reservoirs) and marine water bodies (estuaries and coastal zones). The locations of the monitoring stations are shown in Figure 2.1, with the general details presented in Table 2.1.

Figure 2.1 Location of Monitoring Station in Quang Ninh Province

Table 2.1 Details of the Monitoring Station Coverage

* Only 36 points were observed before 2020

** Only 51 points were observed before 2020

2.1.2 Water Quality Parameters

Parameters observed at monitoring points are aligned with the National Technical Regulation on Surface Water Quality (QCVN). These parameters are chosen to deliver a comprehensive, standards-based assessment of surface water quality, covering physical, chemical, and biological indicators crucial for evaluating water suitability for different uses. A detailed listing of the specific parameters measured at each monitoring point is provided in Table 2.2.

Table 2.2 List of Parameters Observed by the Monitoring Stations

No Inland Water Parameters Marine Water Parameters

33 Total phenol Heptachlor & Heptachlor epoxide

2.1.3 Land Use Land Cover (LULC) Data

Land use/land cover data for this study come from the ESRI Sentinel-2 Land Cover Explorer, providing annual global land cover classifications derived from Sentinel-2 imagery and converted into 10 m × 10 m gridded raster files. The dataset covers 2017–2024, enabling consistent temporal analysis of land use changes over recent years.

The dataset covers diverse land cover types, including urban areas, forests, croplands, and water bodies, and is accessible through the ArcGIS Living Atlas platform by ESRI. Detailed land-use maps for Quang Ninh Province spanning 2017–2024 are provided in the Appendix, while Table 2.3 lists the classification values, their corresponding land cover types, and detailed descriptions for each category.

Table 2.3 ESRI Sentinel-2 Land Use/Land Cover Classes with Descriptions

Value Class Description

1 Water: Areas where water is present throughout the year, including rivers, lakes, ponds, oceans, and flooded salt flats. These areas contain little to no vegetation, no rock outcrop, and no man-made structures such as docks. Regions with only seasonal or sporadic water presence are not included.

2 Trees: Areas with dense clusters of tall vegetation, typically 15 feet or taller, forming a mostly closed canopy that masks what lies beneath. Examples include forests, plantations, swamps, and mangroves, where thick foliage can obscure underlying water or seasonal flooding.

4 Flooded vegetation: Regions characterized by frequent water presence combined with vegetation throughout most of the year. These seasonally inundated landscapes often include a blend of grass, shrubs, trees, and bare soil. Typical examples are rice paddies, emergent wetlands, flooded mangroves, and other heavily irrigated farmlands.

5 Crops: Areas used for agriculture, featuring low-growing crops like corn, wheat, and soybeans, as well as temporarily unused (fallow) farmland.

7 Built area: Built-up areas consisting of human-made structures such as houses, office buildings, and extensive road and rail networks, characterized by dense construction and large impervious surfaces like parking lots and paved roads.

8 Bare ground: Areas of exposed rock or bare soil with little to no vegetation year-round. Examples include sandy deserts, dunes, dry salt flats, dried lake beds, and mining areas.

11 Rangeland: Natural grasslands and open spaces with minimal tall vegetation, dominated by wild grasses and wild cereals that show no obvious cultivation. These landscapes often appear sparsely vegetated, with scattered bushes, shrubs, or small plants interspersed among exposed soil or rock. Examples include meadows, pastures, savannas with limited tree cover, as well as parks and forest clearings that reveal open ground.

Esri's Sentinel-2 Land Use/Land Cover (LU/LC) product proves to be a viable data source for land use classification, despite its hybrid approach that combines land cover categories (trees, water, bare ground) with land use categories (built areas, croplands). The dataset offers consistent global coverage, regular temporal updates, and high spatial resolution derived from Sentinel-2 imagery, making it particularly suitable for regional and catchment-scale land use analyses where up-to-date and spatially detailed classifications are essential. Its unified schema, integrating land use and land cover, aligns with practical environmental assessments that focus on land surface interactions and human-environment dynamics.

Esri's Sentinel-2 LU/LC product offers classification categories that align with water quality-relevant land-use types, including agriculture, urban development, and forested areas. Although the dataset is technically hybrid, many classes adequately reflect human land use and are suitable for land-use impact studies. To maintain clarity, this study uses the term "land use" for the LU/LC product, recognizing that it includes both anthropogenic and natural surface classifications. This simplification is common in environmental modeling literature, especially when focusing on how land-type functions influence ecological or hydrological processes rather than strict semantic distinctions (Verburg et al., 2011).

2.1.4 Terrain

This study relies on terrain data comprising two elevation datasets: a Digital Elevation Model (DEM) and a Digital Surface Model (DSM). The DEM was obtained from the Alaska Satellite Facility Distributed Active Archive Center (ASF DAAC, 2014), which provides data products from JAXA's ALOS PALSAR mission.

The Radiometrically Terrain-Corrected (RTC) dataset provides a DEM as a GeoTIFF raster with a 12.5-meter spatial resolution representing bare-earth elevation. In addition, a Digital Surface Model (DSM) derived from the ALOS Research and Application Project (Takaku et al., 2021) captures the Earth's surface, including natural and built features, at a 30-meter resolution.

Figure 2.2 Elevation Map of Quang Ninh Province from JAXA ALOS DSM

Using the ALOS Research and Application Project digital surface model (DSM) in QGIS, we generated a slope profile to analyze spatial variation in terrain steepness across the study area. The DSM provides high-resolution elevation data, which was imported into QGIS and processed to derive a slope raster. Each cell of the slope raster represents the rate of elevation change, indicating whether the terrain is steep or flat. The slope values were computed across the entire area of interest, producing a continuous surface that reflects the terrain's morphological characteristics. This slope profile then informs further spatial analysis and interpretation, including identifying areas with high erosion potential, evaluating land suitability, and understanding watershed characteristics within the study context.

Figure 2.3: Slope Map of Quang Ninh Province generated from JAXA ALOS DSM
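To illustrate what each cell of a slope raster represents, the following is a minimal sketch that derives slope in degrees from a small elevation grid. It uses plain central differences on interior cells, which is a simplification and not necessarily the algorithm applied by QGIS; the function name `slope_degrees` and the 3 × 3 tile are hypothetical.

```python
import math

def slope_degrees(dem, cell_size):
    """Slope in degrees for interior cells, from central differences.

    dem is a list of rows of elevations in metres; cell_size is the pixel
    size in metres. Border cells are left as None in this simplified sketch.
    """
    rows, cols = len(dem), len(dem[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Rate of elevation change along the x and y directions.
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            out[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return out

# Hypothetical tile rising 30 m per 30 m cell towards the east.
tile = [[0, 30, 60],
        [0, 30, 60],
        [0, 30, 60]]
slope = slope_degrees(tile, cell_size=30)
print(round(slope[1][1], 1))  # 45.0
```

A rise of 30 m over a 30 m cell corresponds to a gradient of 1, that is, a 45° slope at the centre cell.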

2.2 Scope of the Study

This study conducts a comprehensive water quality analysis using data collected from all available monitoring stations across inland and marine surface waters within the study area. Inland waters encompass natural systems such as rivers and streams, as well as artificial features including canals, reservoirs, and dams, while marine data originate from stations along the coast and in near-shore environments. By integrating these diverse surface water types and monitoring sites, the research captures the spatial and hydrological variability across freshwater and marine environments, ensuring robust representation of water quality dynamics throughout the study area.

All available water quality parameters from the monitoring stations were retained for further analysis without any early exclusions during data collection. Subsequent filtering and selection were then applied based on data availability, completeness, and relevance to the research objectives, ensuring that no potentially useful information was prematurely discarded in the analysis process.

To ensure consistency across all datasets, the observation period was standardized to 2017–2024. This alignment was essential because data originated from multiple sources with varying collection periods. By adopting a common timeframe, the study improved comparability and enabled seamless integration of diverse data types in the analysis.

2.3 Historical Land Use Change Analysis

2.3.1 Land Use Data Pre-processing

Land use data for this study were sourced from ESRI Sentinel-2 via the ArcGIS Living Atlas platform, providing land-cover classifications derived from spectral signatures and temporal observations. After downloading, the data were imported into QGIS 3.34.13 Prizren, an open-source GIS software used to view, edit, analyze, and visualize geospatial data. In preprocessing, raster tiles were merged and clipped to the boundary of Quang Ninh Province (the study area). The files automatically adopted the project projection, with the Coordinate Reference System set to EPSG:3405 – VN 2000 / UTM zone 48N. The clipped raster data were then converted to vector format (shapefiles).

2.3.2 Cross-Validation and Correction of Inconsistent Land Use Categories

During the preprocessing of land use/land cover (LULC) data, some inconsistencies were identified, notably the presence of classes such as "Snow/Ice" and "Clouds". Because Quang Ninh Province lies in a humid subtropical region, the "Snow/Ice" classification is geographically implausible and likely erroneous. In addition, the label "Clouds" identifies areas where persistent cloud cover blocked satellite imagery during data acquisition, leading to missing or misclassified land use information.

To address anomalies in the dataset, cross-validation was performed using high-resolution temporal satellite imagery from Google Earth. Each questionable area was manually inspected to determine its most probable land-cover type. Pixels initially classified as Snow/Ice were re-assigned to Forest or Barren based on visual indicators such as vegetation density and surface reflectance. Cloud pixels were corrected with a majority filter, assigning the dominant neighboring class to maintain spatial coherence and prevent the propagation of classification errors into subsequent analyses.
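The majority-filter step can be sketched as follows. This is an illustrative version only: it assumes a 3 × 3 (8-neighbour) window and the class value 10 for "Clouds", neither of which is stated in the text, and the function name `fill_by_majority` is hypothetical.

```python
from collections import Counter

CLOUDS = 10  # assumed raster value of the "Clouds" class

def fill_by_majority(grid, target=CLOUDS):
    """Replace each `target` cell with the most common class among its
    8 neighbours, ignoring neighbours that are themselves `target`."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != target:
                continue
            neighbours = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c) and grid[rr][cc] != target
            ]
            if neighbours:  # leave the cell unchanged if no valid neighbour exists
                out[r][c] = Counter(neighbours).most_common(1)[0][0]
    return out

# Hypothetical patch: one cloud pixel surrounded mostly by class 2.
patch = [[2, 2, 5],
         [2, 10, 5],
         [2, 2, 7]]
filled = fill_by_majority(patch)
print(filled[1][1])  # 2
```

Reading neighbours from the original grid rather than the partially corrected one keeps the result independent of the scan order.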

Figure 2.5: 2018 ESRI Land Use Cloud Category (A) compared with Google Earth

Figure 2.6 2022 ESRI Land Use Snow/Ice Category (A) compared with Google Earth

2.3.3 Analysis of the Change of Land Use Area

Using QGIS, the area in hectares for each land use category in the corrected 2017–2024 datasets was calculated and summarized through the attribute table and Field Calculator tools. This approach quantified the spatial extent of every land cover class by computing and tabulating its area across all years, enabling a robust comparative analysis of land use changes over time.
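As a cross-check on polygon-based areas, the same quantity can be approximated directly from the raster by counting pixels, since each 10 m × 10 m cell covers 0.01 ha. The sketch below is an alternative illustration, not the QGIS Field Calculator procedure used in the study; the grid and the function name `class_area_ha` are hypothetical.

```python
from collections import Counter

PIXEL_AREA_HA = 10 * 10 / 10_000  # one 10 m x 10 m pixel = 100 m2 = 0.01 ha

def class_area_ha(grid):
    """Return {class value: area in hectares} from pixel counts."""
    counts = Counter(value for row in grid for value in row)
    return {cls: n * PIXEL_AREA_HA for cls, n in counts.items()}

# Hypothetical 2 x 3 raster of land use class values.
raster = [[2, 2, 5],
          [2, 7, 5]]
areas = class_area_ha(raster)
```

Repeating this for each year yields one column per year, which is the tabular form needed for change and correlation analysis.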

The tabulated areas (in hectares) of each land use category from 2017 to 2024 were used to evaluate the relationships among land use types through correlation analysis in RStudio, an integrated development environment for R. Both Pearson and Spearman correlation coefficients were computed to capture different types of relationships between variables: Pearson r assesses the strength of linear relationships between continuous variables, while Spearman ρ is a non-parametric measure that evaluates the direction and strength of monotonic relationships based on ranked data. Applying both methods gives a more robust analysis, capturing both linear and monotonic associations among land use categories.
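The difference between the two coefficients can be seen in a small worked example. The study performed this step in R; the sketch below uses Python only for illustration, and the yearly areas are invented values chosen to be monotonic but not perfectly linear.

```python
import math

def pearson(x, y):
    """Pearson r: strength of the linear relationship between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rho: Pearson r computed on the ranks of x and y."""
    return pearson(ranks(x), ranks(y))

# Hypothetical areas (ha) for 2017-2024: built area grows while crops shrink.
built = [100, 110, 125, 150, 190, 240, 300, 380]
crops = [900, 880, 850, 800, 730, 640, 520, 400]
r = pearson(built, crops)
rho = spearman(built, crops)
```

Here ρ equals −1 because the ordering is perfectly inverse, while r is slightly weaker than −1 because the relationship is not exactly linear.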

Table 2.4 Interpretation of Pearson Correlation Coefficient (r) Strength

Table 2.5 Interpretation of Spearman Correlation Coefficient (ρ) Strength

(Cohen, 2013) ρ Value Range Strength of Correlation

Different interpretation scales were used for Pearson and Spearman correlations because they rest on distinct statistical assumptions and measurement approaches. Pearson correlation measures the strength of linear relationships between continuous variables, assuming normal distribution and equal intervals between values. In contrast, Spearman correlation evaluates monotonic relationships from ranked data and does not depend on distributional assumptions, making it more robust to non-linear patterns and outliers. Given these fundamental differences, a single interpretation scale would not accurately capture the nature of associations each method detects. Therefore, separate interpretation scales are retained to evaluate each correlation coefficient within the context of its specific methodological framework.

Assessment of Water Quality and Land Use Change Relationship

Data were primarily compiled in Microsoft Excel by converting PDF files into structured, tabulated formats, using Excel's PDF data extraction features to automatically transfer and organize information into the appropriate columns and rows for analysis. Following extraction, data cleaning and verification ensured that the digitized values accurately reflected the original information, with any inconsistencies or formatting issues identified and corrected to maintain the dataset's reliability for further analysis.

To handle censored data (values below or above detection limits), standard imputation methods were used. Specifically, values below the limit of detection (LOD) were imputed as LOD/2, while those above the detection limit were assigned the maximum detectable value, in accordance with the guidelines of Helsel (2015) and the U.S. EPA (2009). Null or missing entries were marked as {null} for further imputation.

Table 2.6 Handling of Censored Data

Null Data {null} KCM/x/+/- {null}
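The substitution rules above can be expressed as a short function. This is a sketch under stated assumptions: it presumes that censored cells are written with a leading "<" or ">" sign, which the source does not specify, and the function name `impute_censored` and the example limits are hypothetical.

```python
def impute_censored(raw, lod, max_detectable):
    """Convert one raw cell to a number, or None when the value is missing.

    Assumed notation: a leading '<' marks a value below the limit of
    detection (replaced by lod / 2); a leading '>' marks a value above the
    upper limit (replaced by max_detectable).
    """
    text = str(raw).strip()
    if text in {"", "{null}"}:
        return None  # left for the later imputation step
    if text.startswith("<"):
        return lod / 2
    if text.startswith(">"):
        return max_detectable
    return float(text)

# Hypothetical column with LOD = 0.02 and an upper limit of 50.
column = ["0.15", "<0.02", ">50", "{null}"]
cleaned = [impute_censored(v, lod=0.02, max_detectable=50.0) for v in column]
print(cleaned)  # [0.15, 0.01, 50.0, None]
```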

Assessing the extent of missing data determines the suitability of applying imputation techniques. Missingness is classified into four levels: low (below 10%), moderate (10–20%), caution (20–30%), and high (above 30%). This classification guides the selection of appropriate imputation methods and highlights potential risks to result reliability. These thresholds, summarized in Table 2.7, serve as criteria to judge whether imputation is warranted and which method to apply.

Imputation is advisable for handling missing data, and the dependability of imputed values is assessed following established guidance (Cox et al., 1997; Little & Rubin, 2002). Variables with less than 30% missing values were utilized, while variables with more than 30% missing values were included when their inclusion was deemed necessary for the analyses.

Table 2.7 Imputation Risk Levels Based on the Proportion of Missing Data

10-20% Moderate Imputation is possible given strong predictors

20-30% Caution Imputation is possible, but the assumptions must be treated with caution

>30% High Might be unreliable unless advanced methods are utilized
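The screening logic can be sketched as below. The level names follow Table 2.7, but the 10% lower cut-off for "Low" and the treatment of values falling exactly on a boundary are assumptions, since the table rows overlap at 20%; the function names are hypothetical.

```python
def missing_fraction(values):
    """Share of entries that are missing (None) in one variable."""
    return sum(v is None for v in values) / len(values)

def imputation_risk(fraction):
    """Map a missing fraction (0-1) to a risk level.

    Boundary handling is an assumption of this sketch: exactly 20% is
    treated as Moderate and exactly 30% as Caution.
    """
    pct = fraction * 100
    if pct < 10:
        return "Low"
    if pct <= 20:
        return "Moderate"
    if pct <= 30:
        return "Caution"
    return "High"

# Hypothetical variable with 2 missing values out of 8 observations (25%).
observations = [6.9, None, 7.2, 7.0, None, 7.4, 7.1, 6.8]
level = imputation_risk(missing_fraction(observations))
print(level)  # Caution
```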

To address missing data, we applied a suite of imputation methods selected according to data type and missingness pattern. Multivariate Imputation by Chained Equations (MICE) was used to estimate missing values by modeling each variable with missing data as a function of the other variables. For time series data with trends or autocorrelation, Seasonal/AutoRegressive Integrated Moving Average (S/ARIMA) models were employed to forecast the missing points. A Random Forest regression model (RandomForestRegressor) was trained to impute missing values based on the observed features. For simpler cases or sequential datasets, forward fill and backward fill methods (ffill/bfill) were applied to propagate the last known non-missing values forward or backward.
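Of these methods, forward and backward fill are the simplest to show in full. The sketch below re-implements them on plain lists to make the behaviour explicit; it is not the library routine used in the study, and the sample series is invented.

```python
def ffill(values):
    """Propagate the last non-missing value forward; leading gaps stay None."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out

def bfill(values):
    """Propagate the next non-missing value backward; trailing gaps stay None."""
    return ffill(values[::-1])[::-1]

# Hypothetical pH series with gaps at the start, middle, and end.
series = [None, 7.1, None, None, 6.8, None]
filled = bfill(ffill(series))
print(filled)  # [7.1, 7.1, 7.1, 7.1, 6.8, 6.8]
```

Applying the forward fill first and the backward fill second means interior and trailing gaps take the preceding observation, and only the leading gap takes the following one.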

Under Decision No. 1460/QĐ-TCMT issued by the Vietnam Environment Administration (VEA) in 2019, the Water Quality Index (WQI) is calculated for each individual water-quality parameter as well as for the overall index. This standardized methodology involves normalizing the measured values of the parameters, assigning weights based on their relative importance, and aggregating them into a single index value that reflects the overall water quality status of a given site, enabling assessment and comparison of water quality across sites and over time.

VN_WQI calculation parameters are organized into five groups, each representing a distinct category of water quality indicators, enabling a structured and comprehensive assessment of contamination sources and environmental conditions that influence water quality. The five parameter groups include physicochemical indicators, pesticides, heavy metals, organic and nutrient compounds, and microbiological parameters. Each group contains specific variables that contribute to the overall evaluation, allowing for both general and pollution-specific assessments tailored to the characteristics of the water body under investigation.

Table 2.8 Grouped Water Quality Parameters according to VEA Guidelines

Number Category Parameters

Group I Physicochemical indicators

This group is specifically for the pH parameter, which reflects the acidity or alkalinity of the water and indicates its chemical balance.

Group II Pesticides

This set of parameters comprises organochlorine pesticides, including Aldrin, BHC, Dieldrin, Heptachlor, Heptachlor epoxide, and DDT-related compounds, namely DDT, DDD, and DDE.

Group III Heavy metals

This group of parameters targets hazardous heavy metals, including arsenic (As), cadmium (Cd), lead (Pb), hexavalent chromium (Cr(VI)), copper (Cu), zinc (Zn), and mercury (Hg). Monitoring these toxic metals is essential for assessing environmental and public health risks, informing regulatory standards, and guiding risk management in water, soil, and food safety. Each metal presents unique toxic effects and exposure pathways, making precise detection and quantification critical for effective environmental monitoring and safety assurance. By tracking these metals, researchers and policymakers can identify contamination sources, evaluate remediation needs, and protect ecosystems and human health.

Group IV Nutrient and organic load

This parameter set encompasses key indicators of organic pollution and nutrient load, such as DO, BOD₅, COD, TOC, N-NH₄⁺, N-NO₃⁻, N-NO₂⁻, and P-PO₄³⁻.

Group V Microbiological parameters

Included parameters are coliform and E. coli, used to assess the presence of biological contamination in water.

Under the Technical Guidelines, the VN_WQI calculation must include at least three of the five parameter groups, with Group IV being mandatory and containing no fewer than three parameters. In this study, a general approach was adopted, so additional parameters were included only when data were missing for others, and parameters not listed in the TCMT but measured according to QCVN 08:2023/BTNMT and QCVN 10:2023/BTNMT were also utilized. The basis for adding these extra parameters depended on their suitability to the relevant group and on which parameter reaches the maximum value according to QCVN, resulting in a modified WQI calculation approach.

Table 2.9 List of Additional Parameters used for WQI Calculation and their Basis

Fe adapted from Zinc Group III

Mineral Grease adapted from BOD5 Group IV

Mn adapted from Zinc Group III

The final Water Quality Index (WQI) is obtained by calculating an individual WQI for each parameter, referred to as WQISI. Parameters in Group III, Group IV, and Group V, except for dissolved oxygen (DO), follow the formula given in Equation 2.1 as stated in the guidelines.

BPi : The lower bound of the monitoring parameter value for level i, defined in Table 2.12 and Table 2.13

BPi+1 : The upper bound of the monitoring parameter value for level i+1, defined in Table 2.12 and Table 2.13

qi : The WQI value at level i, corresponding to BPi, given in Table 2.12 and Table 2.13

qi+1 : The WQI value at level i+1, corresponding to BPi+1, given in Table 2.12 and Table 2.13

Cp : Measured value of the parameter to be included in the calculation
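The symbols defined above describe a linear interpolation of the sub-index between two adjacent breakpoints. A minimal sketch follows, assuming that form. The breakpoint pairs in the example are illustrative placeholders and not the values of Tables 2.12 and 2.13; clamping outside the table range is also an assumption, and the function name `wqi_si` is hypothetical.

```python
def wqi_si(cp, breakpoints):
    """Interpolate a sub-index for measured value cp.

    breakpoints is a list of (BP, q) pairs sorted by increasing BP.
    Values outside the table range are clamped to the end q values,
    which is an assumption of this sketch.
    """
    if cp <= breakpoints[0][0]:
        return breakpoints[0][1]
    if cp >= breakpoints[-1][0]:
        return breakpoints[-1][1]
    for (bp_i, q_i), (bp_next, q_next) in zip(breakpoints, breakpoints[1:]):
        if bp_i <= cp <= bp_next:
            # Linear interpolation between (BPi, qi) and (BPi+1, qi+1).
            return (q_i - q_next) / (bp_next - bp_i) * (bp_next - cp) + q_next

# Placeholder table for a pollutant whose score falls as concentration rises.
example_table = [(4, 100), (6, 75), (15, 50), (25, 25), (50, 10)]
score = wqi_si(5, example_table)
print(score)  # 87.5
```

A measured value of 5 lies halfway between the breakpoints 4 and 6, so the score lies halfway between 100 and 75.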

DO and pH calculations are performed according to Equation 2.2. For WQIDO, the saturated DO concentration (DOsat) is first calculated with Equation 2.3, followed by the DO percent saturation (DO%sat) calculated with Equation 2.4. The resulting DO%sat value is then input into Equation 2.2. The BPi and qi values used for pH are those defined in Table 2.15, while the values used for DO%sat are those defined in Table 2.14.

T : Measured water temperature during observation in °C
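The two DO steps can be sketched as follows. The polynomial coefficients are those commonly quoted for the VEA guideline and are reproduced here as an assumption, so they should be checked against Equation 2.3 of the guideline before use; the function names are hypothetical.

```python
def do_saturation(temp_c):
    """Saturated DO concentration (mg/L) at water temperature temp_c (deg C).

    Coefficients are assumed, not taken from the text of this section.
    """
    return (14.652
            - 0.41022 * temp_c
            + 0.0079910 * temp_c ** 2
            - 0.000077774 * temp_c ** 3)

def do_percent_saturation(do_mg_l, temp_c):
    """Measured DO expressed as a percentage of the saturated value."""
    return do_mg_l / do_saturation(temp_c) * 100

# Hypothetical observation: 6.5 mg/L of dissolved oxygen at 25 deg C.
sat_25 = do_saturation(25)
pct = do_percent_saturation(6.5, 25)
```

Under these coefficients the saturated concentration at 25 °C is about 8.18 mg/L, so 6.5 mg/L corresponds to roughly 80% saturation.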

Group II, or the pesticide compound group, does not follow a fixed formula. Instead, the Water Quality Index (WQI) for Group II is based on the maximum concentration limit, and the WQISI values for Group II parameters are shown in Table 2.10.

Table 2.10 WQISI Values for Group II Parameters

Total Dichloro diphenyl trichloroethane (DDTs)

After calculating the WQISI for each parameter, the resulting values were integrated into the final Water Quality Index (WQI) using Equation 2.5. This equation was chosen because guidelines require the WQI to highlight pollution from organic pollutants when assessing the environmental quality of water bodies. The formula assigns additional weight to organic pollutants to reflect their greater impact. The final WQI value was rounded to the nearest integer.
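One way such an aggregation can look is sketched below. It assumes the organic-weighted form in which Groups II and III enter as geometric means, Groups IV and V as arithmetic means, and the Group IV mean is squared inside a cube root; this form is an assumption and should be verified against Equation 2.5 and the Decision. The function name and sub-index values are hypothetical, and the sketch requires all five groups to be present.

```python
import math

def final_wqi(wqi_i, group_ii, group_iii, group_iv, group_v):
    """Combine group sub-indices into one rounded index (assumed form)."""
    def geometric_mean(xs):
        return math.prod(xs) ** (1 / len(xs))

    def mean(xs):
        return sum(xs) / len(xs)

    # Group IV (organic and nutrient load) is squared, giving it extra weight.
    organic_term = (mean(group_iv) ** 2 * mean(group_v)) ** (1 / 3)
    value = (wqi_i / 100
             * geometric_mean(group_ii) / 100
             * geometric_mean(group_iii) / 100
             * organic_term)
    return round(value)

# Hypothetical sub-indices for one monitoring site.
site_wqi = final_wqi(
    wqi_i=100,
    group_ii=[100, 100],
    group_iii=[100, 81],
    group_iv=[80, 60, 70],
    group_v=[50],
)
print(site_wqi)  # 56
```

In this example the heavy-metal term scales the result by 0.9, and the low microbiological score pulls the organic term down to about 62.6, giving a final index of 56.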

The Inland Water Quality Index (VN_WQI) used in this study follows the national Vietnamese classification system for evaluating water quality status based on calculated index values, which gives a concise and comparable assessment across inland water bodies. According to this guideline, a WQI score of 91 to 100 indicates "Excellent" water quality, while scores from 76 to below 90 are considered "Good." Values between 51 and 75 are categorized as "Moderate," and scores from 26 to below

Forecasting Land Use Dynamics in Quang Ninh

MOLUSCE (Modelling Land Use Change Evaluation) is a QGIS plugin that facilitates modeling and simulation of land use and land cover changes. It supports spatial analysis by enabling users to assess historical land use transformations and predict future scenarios based on observed trends. By comparing classified land use maps from different time periods, MOLUSCE uncovers spatial patterns and transition dynamics that form the foundation for generating change probability maps. The plugin integrates a range of machine learning algorithms, including Artificial Neural Networks (ANN), which are used to estimate the likelihood of transitions between different land use categories.
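The idea of deriving transition dynamics from two classified maps can be illustrated with a simple cross-tabulation. This sketch is not MOLUSCE's internal implementation. It assumes the two maps are already aligned and flattened into equal-length lists of class codes, and the function name and sample class codes are hypothetical.

```python
from collections import Counter


def transition_matrix(before, after, classes):
    """Row-normalised transition proportions between two classified maps.

    before, after: equal-length sequences of class codes for the same cells
    classes:       ordered list of class codes defining rows and columns
    Row a, column b holds the share of cells in class a at the first date
    that are in class b at the second date.
    """
    if len(before) != len(after):
        raise ValueError("maps must cover the same cells")
    counts = Counter(zip(before, after))
    matrix = []
    for a in classes:
        total = sum(counts[(a, b)] for b in classes)
        # A class absent from the first map gets a row of zeros.
        matrix.append([counts[(a, b)] / total if total else 0.0
                       for b in classes])
    return matrix


# Hypothetical six-cell example: class 1 = forest, class 2 = built-up.
example = transition_matrix([1, 1, 1, 1, 2, 2], [1, 1, 2, 2, 2, 2], [1, 2])
```

In this toy example half of the class 1 cells change to class 2 and all class 2 cells persist, so the rows are `[0.5, 0.5]` and `[0.0, 1.0]`.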

MOLUSCE enhances land-use modeling by not only estimating transition probabilities but also simulating future land-use distributions with spatial models such as Cellular Automata. In this research, four influencing factors (elevation, slope, soil maps, and distance to roads) were integrated to improve prediction accuracy. The MOLUSCE plugin provides validation tools, including confusion matrices and Cohen's Kappa coefficient, to evaluate the reliability of the predicted land-use outputs. Owing to its accessibility, flexibility, and strong integration with QGIS, MOLUSCE is widely used in research and decision-making processes related to land resource management, environmental planning, and sustainable development (Huu et al., 2022; Muhammad et al., 2022).

To ensure the plugin recognizes all features of the raster data, the raster layers were resampled and aligned to the resolution and coordinate system used by the land-use raster. After alignment, the raster layers were cropped to the boundary of Quang Ninh Province.

To evaluate the classification accuracy of the land-use predictions, Cohen's Kappa coefficient (κ) was employed. Cohen's Kappa is a statistical metric that measures the degree of agreement between two independent classifications of the same items into a set of mutually exclusive categories, adjusted for the agreement expected by chance (Cohen, 1960). In the context of land use classification, this typically involves comparing predicted land use categories against observed or ground-truth data.

Cohen's Kappa differs from simple overall accuracy by accounting for chance agreement, providing a more reliable assessment of model performance, especially in datasets with class imbalance that can bias raw accuracy. By adjusting for the expected agreement by chance, Kappa offers a fairer evaluation of categorical predictions and helps prevent misleading conclusions about a model's true performance. For this reason, Cohen's Kappa is frequently used in remote sensing, land cover change studies, and machine learning-based classification tasks (Iskandar et al., 2024).

The Kappa statistic is calculated using the following formula:

$$\kappa = \frac{P_o - P_e}{1 - P_e}$$

Po : observed agreement

Pe : expected agreement by chance

Cohen's Kappa measures inter-rater reliability and ranges from -1 to 1. Values close to 1 indicate strong agreement beyond what would be expected by chance, values near 0 suggest agreement no better than chance, and negative values reflect systematic disagreement between raters. To aid interpretation, kappa values are commonly categorized as follows: less than 0.00, poor agreement; 0.00 to 0.20, slight; 0.21 to 0.40, fair; 0.41 to 0.60, moderate; 0.61 to 0.80, substantial; and 0.81 to 1.00, almost perfect agreement.
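As a worked illustration, Kappa can be computed directly from a confusion matrix as the observed agreement minus the chance agreement, divided by one minus the chance agreement. The sketch below is not the MOLUSCE validation code. The function names and the sample matrix are hypothetical, and treating each band's upper limit as inclusive (so that 0.205, for instance, counts as "fair") is an assumption about how the gaps between the published bands are handled.

```python
def cohens_kappa(matrix):
    """Cohen's Kappa from a square confusion matrix of counts.

    matrix[i][j] = number of cells assigned class i by the first
    classification and class j by the second.
    """
    n = sum(sum(row) for row in matrix)
    if n == 0:
        raise ValueError("confusion matrix is empty")
    k = len(matrix)
    # Observed agreement: share of cells on the diagonal.
    po = sum(matrix[i][i] for i in range(k)) / n
    # Chance agreement: product of the marginal shares, summed over classes.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    pe = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    if pe == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (po - pe) / (1.0 - pe)


def interpret_kappa(kappa):
    """Map a Kappa value to the agreement bands listed above."""
    if kappa < 0.0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical two-class example: 35 of 50 cells agree.
sample = [[20, 5], [10, 15]]
sample_kappa = cohens_kappa(sample)
```

For the sample matrix the observed agreement is 0.7 and the chance agreement is 0.5, giving a Kappa of about 0.4, which falls in the "fair" band even though the raw accuracy is 70%.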

Date posted: 16/09/2025, 13:18
