Korean J. Remote Sens. 2024; 40(5): 507-523
Published online: October 31, 2024
https://doi.org/10.7780/kjrs.2024.40.5.1.8
© Korean Society of Remote Sensing
Correspondence to : Hyung-Sup Jung
E-mail: hsjung@uos.ac.kr
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
As greenhouse gas emissions accelerate global warming, the frequency and severity of abnormal weather events such as floods and droughts are increasing, complicating disaster management and amplifying socio-economic damage. In response, effective strategies for mitigating water-related disasters and proactively addressing climate change are essential, which can be achieved through the use of satellite imagery. This study aims to compare the water body detection performance of Sentinel-1 Synthetic Aperture Radar (SAR) and Sentinel-2 optical imagery using the Attention U-Net model. Through this comparison, the study seeks to identify the strengths and limitations of each satellite imagery type for water body detection. A 256 × 256-pixel patch dataset was developed using multi-temporal imagery from the Han River and Nakdong River basins to reflect seasonal variations in water bodies, including conditions during wet, dry, and flood seasons. Additionally, the study evaluates the impact of data augmentation techniques on model performance, emphasizing the need to select augmentation methods that align with the specific characteristics of SAR and optical data. The results demonstrate that Sentinel-1 SAR imagery exhibited stable performance in detecting large water bodies, achieving high precision in defining water boundaries (Intersection over Union [IoU]: 0.964, F1-score: 0.982). In contrast, Sentinel-2 optical imagery achieved slightly lower accuracy (IoU: 0.880, F1-score: 0.936) but performed well in detecting complex water boundaries, such as those found in wetlands and riverbanks. While data augmentation techniques improved the performance of the Sentinel-1 SAR dataset, they had only a marginal effect on Sentinel-2 optical imagery, aside from slight improvements in boundary detection under new environmental conditions. Overall, this study underscores the importance of threshold and satellite imagery integration for water body monitoring.
It further emphasizes the value of selecting appropriate data augmentation techniques tailored to the characteristics of each dataset. The insights from this study offer guidance for developing enhanced water resource management strategies to mitigate the impacts of climate change.
Keywords Sentinel-1, Sentinel-2, Attention U-Net, Water body detection, Deep learning
As global warming intensifies with rising greenhouse gas emissions, the frequency of extreme weather events is increasing. These climate changes are making hydrological hazards, such as floods and droughts, more common and severe worldwide. In South Korea, specifically, floods and droughts are becoming more frequent and intense, highlighting the need for stronger flood prevention measures. Currently, flood risk management, such as monitoring river flooding, relies on ground-based water level sensors. However, areas where these sensors are not installed cannot be effectively monitored. Thus, there is a need for technologies that can efficiently monitor and analyze river systems, even without water level sensors.
To monitor the condition of and changes in river systems, various remote sensing technologies can be used, including LiDAR, Synthetic Aperture Radar (SAR), and multispectral sensors (Jung et al., 2019). These technologies provide valuable data for detecting and assessing water bodies. Optical satellites can collect multispectral data, enabling index-based analyses, such as the Normalized Difference Water Index (NDWI). However, image acquisition is limited during bad weather conditions or at night, making continuous monitoring challenging (Huang et al., 2018).
In contrast, active microwave sensors like SAR have the advantage of stable data collection regardless of weather conditions or time of day, making them highly useful for real-time monitoring (Pradhan et al., 2017; Li et al., 2021a; Liu et al., 2015). SAR imagery captures the amount of energy reflected from the Earth’s surface after transmitting microwave signals, where water appears dark and land appears bright, creating a clear contrast. Due to its unique ability to effectively distinguish surface types, SAR imagery is increasingly being used in water body detection research (Dong et al., 2021; Guo et al., 2022; Lee et al., 2023).
However, SAR imagery often suffers from speckle noise caused by electromagnetic wave interference, which leads to extreme pixel brightness variations. In cases of mixed land and water surfaces, brightness fluctuations become more complex. This presents challenges for traditional detection algorithms, such as edge detection (Liu and Jezek, 2004) and clustering methods (Wu et al., 2018), which face difficulties in accuracy and efficiency when classifying images. As a result, these methods show limitations in practical applications (Kim et al., 1998; 2022; Guo et al., 2022; Dong et al., 2021).
To address these limitations, deep learning algorithms capable of training deep, nonlinear neural networks are increasingly being applied. In particular, Convolutional Neural Networks (CNNs) have shown significant potential for image analysis. Unlike traditional algorithmic methods, CNNs are less affected by speckle noise because their learned weights capture the spatial characteristics of the entire image, which improves classification accuracy (Dong et al., 2021; Guo et al., 2022; Kim et al., 2022; Yu et al., 2022).
The U-Net model, introduced by Ronneberger et al. (2015), along with various other CNN models, has been widely used for water body detection. The U-Net model features a “U”-shaped architecture with skip connections between the encoder (contracting path) and decoder (expanding path), allowing information from the encoder to be directly transferred to the decoder. This structure enables faster training and produces effective results even with relatively small amounts of data. However, it also tends to emphasize irrelevant features outside the focus area, which can affect performance (Amer et al., 2022).
This study aims to compare and analyze the performance of water body detection using Sentinel-1 SAR and Sentinel-2 optical imagery by employing the Attention U-Net model. The Attention U-Net leverages an attention mechanism to emphasize salient features while suppressing irrelevant background information, thereby enhancing detection accuracy and addressing the challenges posed by complex environmental conditions (Mou and Zhu, 2019; Jonnala and Gupta, 2024). Additionally, this study involves the application of data augmentation techniques to overcome the limitations of limited training data.
This research not only aims to expand the dataset artificially but also seeks to analyze the impact of these augmentation techniques on model performance through quantitative and qualitative assessments (Kim and Han, 2020). The study used Sentinel-1 and Sentinel-2 images captured at the same locations along the Han River and Nakdong River basins, ensuring consistency in environmental conditions across datasets to create the training data for the Artificial Intelligence (AI) model. Multi-temporal datasets were also used so that the model could learn the seasonal characteristics of water bodies.
The training and test datasets were then split 8:2 by region to train the AI model, and the final accuracy of the model was compared. We also applied data augmentation techniques to the training dataset only to compare the results before and after augmentation. In Chapter 2, the research methodology is explained. Chapter 3 introduces the data used and the study area. Chapter 4 presents the research results, and finally, Chapter 5 provides the conclusions of the study.
This study followed the process outlined in Fig. 1 to select satellite SAR and optical images during different river change periods, including the wet season, dry season, and flood season. Based on the selected data, a training dataset was constructed and applied to a deep learning model to quantitatively compare and analyze the accuracy of water body detection.
To compare and analyze the performance of water body detection models using SAR and optical images, the study areas were selected from the Han River basin, specifically from Yondam Bridge to Ipo Bridge in the upper Namhan River, and the Nakdong River basin, starting from Sangju Weir to Nakdan Bridge (Fig. 2).
The upper region of the Han River is characterized by mountainous terrain, posing a risk of flash flooding due to rapid currents. Additionally, frequent summer heavy rains in this area result in regular flood damage (Yoon et al., 2014; Lee et al., 2023). The mid-upper region of the Nakdong River also experiences frequent flood damage, such as debris flows, due to the steep slope of the riverbed (Jee, 2003; Lee and Jung, 2023). Continuous monitoring in such regions is crucial for detecting potential risks and preventing water-related disasters. In South Korea, water bodies are classified into flowing rivers and stagnant lakes, each with distinct characteristics. Therefore, this study selected rivers and lakes in the Han River and Nakdong River regions as the target areas for analysis (Cole and Weihe, 2015; Lee and Jung, 2023).
In this study, water body detection was performed using Sentinel-1 Ground Range Detected (GRD) and Sentinel-2 Level 2A (L2A) satellite data, and their detection accuracies were compared. These two satellites have different characteristics, and only the data suited for this research were selected and utilized.
As noted above, Sentinel-1 SAR imagery has the advantage of providing stable images under all weather conditions and at any time of day or night. Operating in the C-band, Sentinel-1 is particularly useful for acquiring data even in cloudy or rainy conditions, making it highly effective for water body detection. It provides data with a 10-meter spatial resolution using the Interferometric Wide (IW) mode and collects data in two polarizations: VV and VH.
The VV polarization, which transmits and receives vertically polarized signals, is effective for detecting water bodies as it shows low backscatter from smooth surfaces like water. On the other hand, the VH polarization, which transmits vertically and receives horizontally polarized signals, is better suited for detecting vegetation or structures. However, VH polarization tends to be more sensitive to wind, vegetation, and minor water changes, increasing noise levels and causing ambiguity in distinguishing between water and non-water areas. For this reason, this study used only the VV polarization, which provides higher stability and accuracy.
Sentinel-2 is equipped with a multispectral optical sensor, providing 13 spectral bands ranging from visible light to shortwave infrared. Sentinel-2 primarily collects high-resolution images of land and water bodies under cloud-free conditions, with spatial resolutions ranging from 10 to 60 meters across its various bands. In this study, the Blue, Green, and Red bands, along with the Near InfraRed (NIR) and the Short Wavelength InfraRed (SWIR1 and SWIR2) bands, were selected for water body detection. The Blue, Green, Red, and NIR bands offer a 10-meter resolution, providing high-resolution imagery capable of delineating water boundaries in detail (Lee et al., 2024).
NIR, in particular, plays a crucial role in water body detection as it clearly distinguishes between the reflectance properties of vegetation and water. Additionally, while the SWIR1 (1.6 μm) and SWIR2 (2.2 μm) bands have a lower 20-meter resolution, they are known to be highly effective for water body detection. Since the SWIR bands are sensitive to moisture content, they can sharply differentiate between water bodies and surrounding areas based on moisture levels. Despite their lower resolution, the SWIR bands provide valuable information for distinguishing between water and non-water areas and are less affected by clouds or haze, enabling more reliable water body detection. For these reasons, the SWIR1 and SWIR2 bands were additionally used in this study.
Additionally, by utilizing multi-temporal data, we conducted a more precise analysis of the variability of water bodies in South Korea. Domestic water bodies exhibit significant changes in characteristics due to seasonal variations, and water body detection studies that take these changes into account can yield more reliable results. In particular, it is crucial to analyze data that reflects seasonal characteristics such as the dry season, flood season, and wet season. During the dry season, river and reservoir water levels decrease significantly, exposing large areas of land. In contrast, during the flood season, heavy rainfall can cause water levels to rise rapidly, potentially inundating surrounding areas. Furthermore, in the wet season, changes in wind and rainfall can alter the shape or area of water bodies. Considering these factors, we collected a total of 12 time-series raw datasets of Sentinel-1 SAR and Sentinel-2 optical images (Table 1).
Table 1 Acquisition dates and areas for Sentinel-1 SAR and Sentinel-2 optical images used in this study
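Satellite | Product type | Area | Date |
---|---|---|---|
Sentinel-1 | IW_GRD | Han River Basin | 2022.04.06 |
Sentinel-1 | IW_GRD | Han River Basin | 2022.04.06 |
Sentinel-1 | IW_GRD | Nakdong River Basin | 2022.07.23 |
Sentinel-1 | IW_GRD | Nakdong River Basin | 2022.10.15 |
Sentinel-2 | Sentinel-2 Multispectral Instrument Level 2A | Han River Basin | 2022.04.06 |
Sentinel-2 | Sentinel-2 Multispectral Instrument Level 2A | Han River Basin | 2022.07.23 |
Sentinel-2 | Sentinel-2 Multispectral Instrument Level 2A | Nakdong River Basin | 2022.10.15 |
Sentinel-2 | Sentinel-2 Multispectral Instrument Level 2A | Nakdong River Basin | 2022.04.07 |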
The overall workflow of this study is presented in Fig. 3. The research methodology is divided into three main stages: (1) data preprocessing, (2) creation of labeled data for training, and (3) model training and evaluation. First, preprocessing was conducted on Sentinel-1 GRD and Sentinel-2 L2A data. This preprocessing step is critical to ensure the deep learning model can accurately distinguish between water and non-water areas under various conditions. After preprocessing, the data were divided into patches for AI model training. Precise labeled data are essential for AI model training, so a reference map was created, and label data for water bodies were produced based on established documentation. The labeled Sentinel-1 AI data were classified as Group 1, and the Sentinel-2 AI data as Group 2.
Next, the training datasets for each group were split into training and validation datasets. During this process, the class sampling technique was applied to partially address the issue of data imbalance. Additionally, to compare the performance of the model with and without data augmentation, the dataset with augmentation applied was labeled as Case 2, and without augmentation as Case 1. Finally, model training was conducted for each group using the Attention U-Net model, and performance evaluations were performed to analyze the effectiveness of each dataset and the impact of data augmentation. Through this, the accuracy of water body detection for each case was compared and analyzed.
In this study, we conducted preprocessing steps tailored to each satellite dataset to improve the accuracy of water body detection. For the Sentinel-1 GRD data, we first applied orbit correction to precisely align the satellite’s actual position with the image, reducing geographic positioning errors. This process ensured accurate location information for water bodies and terrain. Next, thermal noise removal was performed to eliminate unnecessary noise from the images, improving the Signal-to-Noise Ratio (SNR) and enhancing the accuracy of water body detection.
Additionally, speckle noise, which can cause irregular variations in backscatter intensity, may occur in SAR images. To address this, we applied a Lee Sigma 7 × 7 filter to effectively remove speckle noise, resulting in clearer boundaries between water and non-water areas. Lastly, we used the Shuttle Radar Topography Mission Digital Elevation Model (SRTM DEM) for terrain correction. This step corrected geometric distortions caused by elevation differences, allowing for more accurate water body detection. After these preprocessing steps, the final Sigma0 image was obtained, which plays a key role in distinguishing water from non-water areas based on backscatter intensity.
For Sentinel-2 optical data, we used the Red, Green, Blue, and Near-Infrared bands with a 10-meter resolution, along with the SWIR1 and SWIR2 bands with a 20-meter resolution. The SWIR1 and SWIR2 bands were resampled to a 10-meter resolution using the bilinear resampling method. This resampling ensured that all bands were consistent in spatial resolution, allowing the AI model to learn information at the same resolution across all bands. Additionally, since Sentinel-2 data is stored in integer format, we converted each band’s data to a floating-point format.
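The resampling and type conversion described above can be illustrated with a minimal sketch. It assumes NumPy, a fixed upsampling factor of 2 (20 m to 10 m), pixel-center alignment, and edge clamping; the function name `upsample_bilinear` is hypothetical, and the resampling tool actually used in the study may treat image borders differently.

```python
import numpy as np

def upsample_bilinear(band, factor=2):
    """Separable bilinear upsampling with pixel-center alignment (edges are clamped)."""
    h, w = band.shape
    # Output pixel centers expressed in input pixel coordinates.
    rows = (np.arange(h * factor) + 0.5) / factor - 0.5
    cols = (np.arange(w * factor) + 0.5) / factor - 0.5
    # Interpolate along the column axis first, then along the row axis.
    tmp = np.stack([np.interp(cols, np.arange(w), r) for r in band])
    out = np.stack([np.interp(rows, np.arange(h), c) for c in tmp.T], axis=1)
    return out.astype(np.float32)

# Synthetic 2 x 2 integer band standing in for a 20 m SWIR tile (values are illustrative).
swir_20m = np.array([[0, 2], [0, 2]], dtype=np.uint16)
swir_10m = upsample_bilinear(swir_20m.astype(np.float32))
```

Converting to floating point before interpolation avoids truncating the intermediate values that bilinear weighting produces.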
After preprocessing the Sentinel-1 SAR and Sentinel-2 optical images, the data was divided by study area and further split into 256 × 256 patches, as shown in Figs. 2(a, b). During the patch-slicing process, a 25% overlap was applied to maintain the spatial continuity of the data while also increasing the diversity of the training data. Finally, min-max normalization was applied to each patch, converting the pixel values to a range between 0 and 1, ensuring that all patches were prepared for training with consistent value ranges.
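As a rough illustration of the patch-slicing and normalization step, the sketch below assumes NumPy and a single-band image. A 25% overlap on 256-pixel patches corresponds to a stride of 192 pixels. The function names and the handling of image edges (trailing pixels that do not fill a patch are dropped) are assumptions, not details taken from the study.

```python
import numpy as np

def slice_patches(image, patch=256, overlap=0.25):
    """Cut a 2-D array into patch x patch tiles whose neighbours overlap by `overlap`."""
    stride = int(patch * (1 - overlap))  # 256 * 0.75 = 192 pixels
    rows = range(0, image.shape[0] - patch + 1, stride)
    cols = range(0, image.shape[1] - patch + 1, stride)
    return [image[r:r + patch, c:c + patch] for r in rows for c in cols]

def min_max_normalize(patch_array):
    """Rescale one patch to [0, 1]; a constant patch maps to all zeros."""
    lo, hi = patch_array.min(), patch_array.max()
    if hi == lo:
        return np.zeros_like(patch_array, dtype=np.float32)
    return ((patch_array - lo) / (hi - lo)).astype(np.float32)

# Synthetic 640 x 640 scene standing in for a preprocessed image.
scene = np.random.default_rng(0).normal(size=(640, 640))
patches = [min_max_normalize(p) for p in slice_patches(scene)]
```

With these settings the 640 × 640 scene yields a 3 × 3 grid of patches starting at offsets 0, 192, and 384 along each axis.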
The performance and reliability of deep learning models largely depend on the precision and consistency of the training data (Gong et al., 2023; Park et al., 2023). In satellite image-based water body detection, creating accurate labeled data is essential. For this reason, in our study, we developed a detailed labeling guideline that considers the characteristics of satellite images and the diverse environmental conditions surrounding water bodies.
If precise annotation is not achieved during the labeling process, the quality of the training data may deteriorate, leading to reduced model performance (Yu et al., 2023). This is especially critical in tasks such as water body detection, where distinguishing fine differences is necessary. Inconsistent data creation during labeling can significantly lower the dataset’s reliability, which could negatively impact the predictive accuracy and consistency of the deep learning model.
To address this issue, our study created reference images tailored to the specific characteristics of each satellite image, ensuring consistent labeling. This approach enabled clear delineation between water and non-water areas and allowed us to build reliable training data, optimizing the model’s learning performance.
In the process of constructing training data for Sentinel-1 SAR images, the Otsu thresholding method was first applied to the preprocessed patches to initially distinguish between water and non-water areas. The Otsu method analyzes the distribution of pixel values within the image and automatically selects the threshold that minimizes the within-class variance of the two classes (water and non-water), which is equivalent to maximizing the variance between them. This resulted in a reference image that classified the water body areas.
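The Otsu criterion can be sketched with a histogram-based search for the split that maximizes the between-class variance. The example assumes NumPy; the function name `otsu_threshold`, the 256-bin histogram, and the synthetic backscatter-like values are illustrative assumptions and do not reproduce the study's data or exact implementation.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Return the histogram split that maximizes the between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    weights = hist / hist.sum()
    w0 = np.cumsum(weights)                 # probability of the darker class
    w1 = 1.0 - w0                           # probability of the brighter class
    cum_mean = np.cumsum(weights * centers)
    total_mean = cum_mean[-1]
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros(bins)
    between[valid] = (total_mean * w0[valid] - cum_mean[valid]) ** 2 / (w0[valid] * w1[valid])
    # The threshold is the upper edge of the last bin assigned to the darker class.
    return edges[1:][np.argmax(between)]

# Synthetic bimodal sample: dark "water" and brighter "land" values (illustrative only).
rng = np.random.default_rng(0)
values = np.concatenate([rng.normal(-20, 1.0, 5000), rng.normal(-8, 1.5, 5000)])
threshold = otsu_threshold(values)
water_mask = values <= threshold
```

Because water appears dark in SAR imagery, pixels at or below the threshold form the initial water class.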
However, due to the characteristics of SAR imagery, distinguishing between water and non-water can sometimes be ambiguous. In particular, the boundary between water surfaces and the surrounding land may not be clear due to the reflection properties of radar signals, and difficulties in water body detection may arise due to speckle noise and geometric distortions. To overcome these limitations, Sentinel-2 optical imagery was referenced to create the water body label data.
Next, a labeling guideline for Sentinel-1 SAR imagery was developed to provide consistent criteria for distinguishing water from non-water areas. Given that SAR imagery can make it difficult to clearly delineate water boundaries, specific rules were necessary to address this issue. The following criteria were established considering various environmental factors and were tailored to the characteristics of SAR imagery (Fig. 4).
First, areas where water flows or accumulates within river boundaries, including areas where water is retained by dams or levees, were defined as water bodies. Second, even in cases where the Side Lobe effect impacted the water body, the affected area was classified as a water body. This effect is caused by the radiating characteristics of SAR signals, which can distort signals around water bodies, influencing actual water body areas. Third, land areas located within water bodies, such as mid-channel islands or riverbanks, were classified as non-water areas. These areas, although located within water bodies, are not submerged and thus need to be distinguished. Fourth, vessels located on water bodies were included and classified as part of the water body. Since vessels are floating structures, they reflect similar backscatter signals to water bodies in SAR images. Finally, bridges crossing water bodies were classified as non-water areas, as they are structures that are not submerged. These guidelines ensured consistent labeling, maximizing the performance of water body detection.
In the process of constructing training data for Sentinel-2 optical images, the NDWI threshold method was used to distinguish between water and non-water areas. NDWI is the normalized difference between reflectance in the green and NIR bands; because water reflects little NIR radiation, the index effectively delineates the boundary between water and non-water areas. This method produced a reference image that classified water body areas. Additionally, a labeling guideline was created for the Sentinel-2 optical images using the same approach as that applied to Sentinel-1 SAR images (Fig. 5).
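A minimal sketch of the NDWI computation follows, using the common green/NIR formulation (Green − NIR) / (Green + NIR). It assumes NumPy; the function name `ndwi`, the small stabilizing constant, the threshold of 0, and the reflectance values are all illustrative assumptions, since the study does not state the threshold it used.

```python
import numpy as np

def ndwi(green, nir, eps=1e-6):
    """Normalized Difference Water Index from green and NIR reflectance."""
    green = np.asarray(green, dtype=np.float32)
    nir = np.asarray(nir, dtype=np.float32)
    return (green - nir) / (green + nir + eps)  # eps avoids division by zero

# Two illustrative pixels: open water (low NIR) and vegetation (high NIR).
green = np.array([0.08, 0.05])
nir = np.array([0.02, 0.40])
index = ndwi(green, nir)
water = index > 0.0  # assumed threshold; a scene-specific value may work better
```

Positive index values indicate that green reflectance exceeds NIR reflectance, which is typical of open water.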
Optical imagery, particularly in comparison to SAR imagery, allows for more intuitive and detailed observation. By utilizing various spectral bands, including visible light, it provides a more precise understanding of the surface’s reflectance characteristics. As a result, the visual distinction between water and non-water areas became clearer, and boundary delineation was more precise. In areas where boundaries were unclear, multiple bands (NIR, SWIR1, SWIR2) were used to further refine the distinction between land and water bodies, ensuring greater accuracy in boundary detection.
To effectively train and validate the deep learning model, the data for each group were divided into training and test datasets. The process of data splitting for model training must account for the seasonal variations in water bodies and the characteristics of the study area. Therefore, the data were classified by season and study area for each satellite, and based on this classification, the training and test datasets were split in a 4:1 ratio. For the test data, specific regions were fixed, as indicated by the orange grid areas shown in Fig. 2. This was done to prevent overfitting, which could occur if similar patches from different seasons were included in both the training and test datasets. By fixing the test data to a specific region, the model could be evaluated consistently.
Additionally, the constructed training data may show significant variations in the proportion of water bodies across patches. For example, some patches may contain little or no water, while others may consist predominantly of water. This data imbalance can lead to biased learning during model training, where the model performs well on patches with a high proportion of water but struggles on patches with little or no water. Such an issue prevents the model from demonstrating consistent performance and may cause it to perform well only under specific conditions when applied in practice. To address this issue, we divided the training and validation data into four groups based on the proportion of water in each patch, applying oversampling to ensure that an equal number of patches were present in each group. This approach balanced the distribution of data, allowing the model to learn from patches with both very low and very high water proportions.
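The class sampling idea can be sketched as follows. The example assumes four equal-width bins of water proportion (0–25%, 25–50%, 50–75%, 75–100%) and random resampling with replacement up to the size of the largest group; the study does not report the bin boundaries, so these choices and the function names are hypothetical.

```python
import random

def water_fraction_group(fraction, n_groups=4):
    """Map a water fraction in [0, 1] to one of n_groups equal-width bins."""
    return min(int(fraction * n_groups), n_groups - 1)

def oversample_by_group(fractions, n_groups=4, seed=0):
    """Return patch indices resampled so every non-empty group matches the largest one."""
    rng = random.Random(seed)
    groups = {}
    for idx, frac in enumerate(fractions):
        groups.setdefault(water_fraction_group(frac, n_groups), []).append(idx)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)                                       # keep every original patch
        balanced.extend(rng.choices(members, k=target - len(members)))  # duplicate at random
    return balanced

# Illustrative water fractions for seven patches, dominated by low-water patches.
fractions = [0.0, 0.05, 0.1, 0.2, 0.3, 0.6, 0.9]
balanced = oversample_by_group(fractions)
counts = [sum(1 for i in balanced if water_fraction_group(fractions[i]) == g) for g in range(4)]
```

Only the training and validation patches would be resampled in this way; the test set keeps its natural distribution.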
Data augmentation is an essential technique for improving the generalization performance of deep learning models (Baek et al., 2022). In tasks like satellite image-based water body detection, the diversity of available data can be limited. To maintain consistent performance across various environmental conditions, more training data is required. However, due to factors such as acquisition timing, weather conditions, and terrain characteristics, the availability of satellite image data is often restricted. To address this issue, data augmentation was applied in this study to artificially expand the training dataset and enhance the robustness of the model.
By applying data augmentation techniques, the original data can be transformed in various ways to generate new training samples. This helps the model develop the capability to detect water and non-water areas under a variety of conditions. In this study, several augmentation techniques were used, including random rotation, horizontal and vertical flipping, affine transformations, cropping, and color distortion. These transformations allowed the model to learn from a wide range of patterns and environments.
To assess the impact of data augmentation, two experimental conditions were set:
Case 1: Training with the original data only, without applying data augmentation.
Case 2: Training with augmented data, using only the transformed data.
In Case 2, the model was designed to become more robust by learning from a variety of augmented data, enabling it to handle geometric distortions and transformations during the water body detection process. On the other hand, in Case 1, where only the original data was used, the model was more likely to overfit specific environmental conditions.
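For segmentation, each geometric transformation has to be applied identically to the image patch and its label so that the two stay aligned. The sketch below assumes NumPy and covers only 90-degree rotations and flips; the affine, cropping, and color-distortion augmentations used in the study are omitted, and the function name `augment_pair` is hypothetical.

```python
import numpy as np

def augment_pair(image, label, rng):
    """Apply the same random 90-degree rotation and flips to an image and its label."""
    k = int(rng.integers(0, 4))                      # number of quarter turns
    image = np.rot90(image, k, axes=(0, 1))
    label = np.rot90(label, k, axes=(0, 1))
    if rng.random() < 0.5:                           # horizontal flip
        image, label = image[:, ::-1], label[:, ::-1]
    if rng.random() < 0.5:                           # vertical flip
        image, label = image[::-1], label[::-1]
    return np.ascontiguousarray(image), np.ascontiguousarray(label)

# Synthetic 8 x 8 example whose first channel equals the label, to check alignment.
rng = np.random.default_rng(1)
label = (rng.random((8, 8)) > 0.5).astype(np.uint8)
image = np.stack([label, 1 - label], axis=-1).astype(np.float32)
aug_image, aug_label = augment_pair(image, label, rng)
```

Color distortion, by contrast, would be applied to the image only, since it does not move pixels.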
The Attention U-Net model was employed in this study for water body detection, improving the original U-Net architecture by adding an Attention Block that helps distinguish boundaries between water and non-water areas more precisely. U-Net is a model designed with an encoder-decoder structure. In the encoder phase, max-pooling reduces the size of the feature map, while in the decoder phase, upsampling restores the high-resolution feature map. During this process, spatial details extracted in the encoder and high-resolution information restored in the decoder are merged, allowing the model to preserve spatial details while learning high-level features.
The Attention U-Net enhances this structure by incorporating an Attention Block, improving the network’s ability to focus on important features. The Attention Block consists of spectral and spatial layers, enabling the model to assign more weight to relevant information while ignoring unnecessary data during training (Woo et al., 2018).
The spectral layer processes the reflectance information of each pixel across various spectral bands, which is particularly useful in optical imagery. It helps the model accurately learn the spectral differences between water and non-water areas. Since water bodies have low reflectance in certain bands, the spectral layer allows for precise distinction of water’s spectral characteristics, clearly separating water boundaries. This layer also enhances detection performance in various environmental conditions, such as changes in weather or lighting.
The spatial layer focuses on learning the spatial structure of the image, emphasizing the morphological differences between water and non-water areas. Water bodies can have irregular or small, dispersed shapes, and the spatial layer highlights these features, contributing to more accurate water body detection. It is particularly effective in suppressing speckle noise in SAR imagery and extracting meaningful spatial patterns.
By adding these Attention Blocks, the Attention U-Net can better focus on critical features, especially in cases where boundaries are ambiguous, like in water body detection. This enables more accurate detection across a variety of environmental conditions in both SAR and optical imagery, providing consistent performance. Fig. 6 shows the structure of the Attention U-Net model used in this study. The input data consisted of 256 × 256 patches; the label data was the same-sized water body-labeled dataset.
The model includes Attention Blocks, Conv Blocks, and pooling layers. Each Conv Block performs convolution operations using a 3 × 3 kernel, followed by batch normalization and the ReLU activation function. This process is repeated three times to ensure that important information is extracted from the feature maps. Additionally, the pooling layer reduces the size of the input feature map by half while doubling the number of channels, enabling the model to learn higher-dimensional features.
The Attention Block is applied in the final decoder stage to assign more weight to important information. After two convolution operations, batch normalization, and activation function applications, the spectral and spatial layers are added, allowing the model to learn spectral and spatial information in detail. This enables the model to differentiate between the boundaries of water and non-water areas more accurately. Finally, a 1 × 1 convolution is used to classify the output into water and non-water regions.
Table 2 presents the key hyperparameters applied in the training of the Attention U-Net model. To compare the water body detection performance based on satellite images from two groups, as well as the effect of data augmentation, the key settings such as kernel initialization, loss function, optimization algorithm, mini-batch size, number of epochs, and learning rate were kept consistent across all experiments. The activation function used for all layers, except for the output layer, was the Rectified Linear Unit (ReLU), while the output layer employed Softmax for multi-class classification. The loss function was set to Intersection over Union (IoU) Loss, and the Adam optimizer was adopted (van Beers et al., 2019). The initial learning rate was set to 0.0001, and it was gradually reduced to 0.000125 as training progressed. The mini-batch size was set to 10, and training was carried out for a total of 500 epochs. After training, the model with the best performance was used for testing.
Table 2 Attention U-Net model hyperparameters for this study
Hyperparameter | Value |
---|---|
Optimizer | Adam |
Learning rate | 0.0001 |
Loss function | IoU |
Batch size | 10 |
Epochs | 500 |
Activation | ReLU, Softmax |
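The study specifies an IoU loss without giving its exact form. One common differentiable formulation, sometimes called the soft Jaccard loss, replaces hard class labels with predicted water probabilities. The NumPy sketch below is an assumption about that formulation, not the authors' implementation; the function name and the smoothing constant are hypothetical.

```python
import numpy as np

def soft_iou_loss(prob_water, target, eps=1e-6):
    """1 - soft IoU between predicted water probabilities and a binary label."""
    prob_water = np.asarray(prob_water, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    intersection = np.sum(prob_water * target)
    union = np.sum(prob_water) + np.sum(target) - intersection
    return 1.0 - (intersection + eps) / (union + eps)  # eps guards empty patches

# Illustrative 2 x 2 label with three candidate predictions.
target = np.array([[1, 0], [1, 0]])
loss_perfect = soft_iou_loss(target, target)
loss_half = soft_iou_loss(np.array([[1, 1], [0, 0]]), target)
loss_inverse = soft_iou_loss(1 - target, target)
```

In the half-overlap case the intersection is 1 pixel and the union is 3 pixels, so the loss is 1 − 1/3.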
In this study, the performance of the Attention U-Net model was evaluated both quantitatively and qualitatively. For the quantitative evaluation, an independent test dataset, which was not included in the training process, was used to verify the model’s prediction accuracy. Various performance metrics based on the confusion matrix were employed to assess the model’s effectiveness.
Firstly, accuracy is a common performance metric that represents the proportion of correctly classified pixels out of all pixels and is widely used in supervised learning. However, in cases where there is data imbalance, a model may achieve high accuracy but perform poorly in predicting certain classes. To address this, additional metrics such as precision, recall, F1-score, and IoU were used for a more comprehensive evaluation (Li et al., 2021b; Yang et al., 2020).
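Accuracy is defined as the ratio of correctly classified pixels out of the total pixels and can be expressed as:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where $TP$, $TN$, $FP$, and $FN$ denote the numbers of true positive, true negative, false positive, and false negative pixels, respectively, with water taken as the positive class.
Precision refers to the proportion of pixels predicted as water that are actually water and is defined as:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
Recall (or sensitivity) indicates the proportion of actual water pixels that were correctly predicted as water and is expressed as:
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
F1-score is the harmonic mean of precision and recall, representing a balance between the two, and is calculated as:
$$\mathrm{F1\text{-}score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
IoU measures the ratio between the intersection and union of the actual and predicted water body areas. A higher IoU value indicates better model performance (Lee et al., 2022). It is expressed as:
$$\mathrm{IoU} = \frac{TP}{TP + FP + FN}$$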
These metrics provided a thorough assessment of the model’s ability to accurately detect water bodies, especially in scenarios where data imbalance might otherwise distort a simpler accuracy-based evaluation.
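The confusion-matrix metrics described above can be computed from a predicted mask and a label mask as in the following minimal sketch. It assumes binary masks in which 1 denotes water; the function name `water_metrics` is hypothetical, and returning 0 for an undefined ratio is a convention chosen for this example.

```python
import numpy as np

def water_metrics(pred, label):
    """Accuracy, precision, recall, F1-score, and IoU for binary water masks."""
    pred = np.asarray(pred).astype(bool)
    label = np.asarray(label).astype(bool)

    tp = int(np.sum(pred & label))      # water predicted as water
    tn = int(np.sum(~pred & ~label))    # non-water predicted as non-water
    fp = int(np.sum(pred & ~label))     # non-water predicted as water
    fn = int(np.sum(~pred & label))     # water predicted as non-water

    def ratio(num, den):
        # Assumed convention: return 0.0 when the denominator is zero
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "iou": ratio(tp, tp + fp + fn),
    }

label = np.array([[1, 1, 0, 0], [1, 0, 0, 0]])
pred = np.array([[1, 0, 0, 0], [1, 1, 0, 0]])
print(water_metrics(pred, label))
```

In this toy example, accuracy (0.75) is noticeably higher than IoU (0.5) because the true negatives of the dominant non-water class inflate accuracy, which is the imbalance effect discussed above.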
Figs. 7 and 8 show the datasets of Sentinel-1 SAR and Sentinel-2 optical imagery, developed in this study for the Han River and Nakdong River. These datasets consist of raw data, reference materials for label creation, and labels produced according to specific criteria.
Figs. 7(a1, c1) and Figs. 7(b1, d1) are Sentinel-1 VV polarization images acquired during the dry and wet seasons, respectively. Figs. 7(a1, c1) appear relatively darker than Figs. 7(b1, d1). The brighter appearance of the wet-season images is due to stronger backscattering caused by rain- and wind-induced waves on the water surface. Figs. 7(a2, c2) are Sentinel-2 true color images, which were used to distinguish shaded areas from water bodies, as the two are difficult to differentiate in SAR images. Figs. 7(a3, b3, c3, d3) show the results of a threshold-based classification method used to separate water bodies from non-water areas. In patches with larger water bodies, the detection was relatively accurate, but in areas with small water bodies, shadows, and speckle noise, some misclassification occurred. Finally, the label data in Fig. 7 represent the precisely delineated water body areas, created by synthesizing the information obtained from these images. Misclassified areas were corrected according to the established criteria, and undetected areas within water bodies and misclassified shadow regions were also adjusted.
Figs. 8(a1, b1, c1, d1) show true color-composite images captured during the dry and wet seasons, respectively, allowing an intuitive view of water bodies and the surrounding terrain. In Figs. 8(a1, c1), which were acquired during the dry season, there is relatively less vegetation, while in Figs. 8(b1, d1), acquired during the wet season, the vegetation is more lush, appearing in darker green (Baek et al., 2021). Figs. 8(a2, b2, c2, d2) are false color-composite images using infrared and SWIR bands, which provide a detailed look at the reflectance characteristics of vegetation and water bodies (Lee et al., 2022). The false color-composite images highlight the strong infrared reflectance, better showing the vigorous activity of vegetation, while the SWIR band emphasizes the difference in reflectance between water and the land surface, making it useful for water body detection. As a result, not only were water bodies detected, but even inland wetlands and riverbanks, which are difficult to detect in Sentinel-1 SAR images, were precisely distinguished.
In Figs. 8(a3, b3, c3, d3), water bodies were generally well detected regardless of the season, though some detection errors occurred in smaller streams. The label data in Fig. 8 were created by comprehensively analyzing these data. Undetected or misclassified regions were corrected using the true and false color-composite images. The boundaries between water and non-water areas are clearly defined, and even narrow rivers and inland wetland areas, which are difficult to observe in Sentinel-1 SAR images, were precisely distinguished. This process helped to correct the parts missed by the NDWI data and establish a consistent dataset.
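For reference, an NDWI-based water mask of the kind mentioned above can be sketched as follows. This is a minimal illustration that assumes the common Green/NIR formulation of NDWI and a threshold of 0; this section does not state which formulation or threshold was used in the study, and the function names `ndwi` and `water_mask` are hypothetical.

```python
import numpy as np

def ndwi(green, nir, eps=1e-12):
    """NDWI = (Green - NIR) / (Green + NIR), computed per pixel.

    green, nir : reflectance arrays of the same shape
    eps        : small constant (assumed) to avoid division by zero
    """
    green = np.asarray(green, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    return (green - nir) / (green + nir + eps)

def water_mask(green, nir, threshold=0.0):
    """Label pixels with NDWI above the (assumed) threshold as water."""
    return ndwi(green, nir) > threshold

# Toy reflectances: a water-like pixel (low NIR) and a vegetation-like pixel (high NIR)
green = np.array([0.10, 0.05])
nir = np.array([0.02, 0.30])
print(ndwi(green, nir))
print(water_mask(green, nir))
```

A fixed threshold of this kind tends to miss narrow streams and mixed pixels, which is consistent with the manual corrections applied when the labels were created.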
In this study, the Attention U-Net model was trained using the developed water body datasets. For each group, the dataset initially consisted of 96 training images and 24 test images, each 256 × 256 pixels in size. To address data imbalance within the patches, a class sampling technique was applied based on the ratio of water pixels, resulting in a final set of 576 training images and 192 test images. Additionally, to compare the effectiveness of data augmentation, a fivefold augmentation was applied to the training data in Case 2 (Table 3).
Table 3. Dataset sizes after data balancing
Group No. | Split | Case 1: Han River Basin | Case 1: Nakdong River Basin | Case 2: Han River Basin | Case 2: Nakdong River Basin |
---|---|---|---|---|---|
Group 1 | Train | 288 | 288 | 1,440 | 1,440 |
Group 1 | Test | 96 | 96 | 96 | 96 |
Group 2 | Train | 288 | 288 | 1,440 | 1,440 |
Group 2 | Test | 96 | 96 | 96 | 96 |
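To illustrate what a fivefold augmentation of the training patches can look like, the sketch below returns each image/label pair together with four geometrically transformed copies. The specific transforms (two flips and two 90° rotations) and the function name `augment_fivefold` are assumptions made for illustration; this section does not list the augmentation techniques actually applied.

```python
import numpy as np

def augment_fivefold(image, label):
    """Return the original pair plus four geometric variants (5 pairs in total).

    image : array of shape (H, W) or (H, W, C), with H == W for the rotations
    label : array of shape (H, W), transformed identically to the image
    """
    transforms = [
        lambda a: a,                                # original
        lambda a: np.flip(a, axis=0),               # vertical flip
        lambda a: np.flip(a, axis=1),               # horizontal flip
        lambda a: np.rot90(a, k=1, axes=(0, 1)),    # 90 degree rotation
        lambda a: np.rot90(a, k=3, axes=(0, 1)),    # 270 degree rotation
    ]
    # The same transform is applied to image and label so they stay aligned
    return [(t(image), t(label)) for t in transforms]

image = np.arange(16, dtype=np.float32).reshape(4, 4)
label = (image > 7).astype(np.uint8)
pairs = augment_fivefold(image, label)
print(len(pairs))  # five pairs per input patch
```

Because every training patch yields five pairs, the training set grows by a factor of five while the test set is left unchanged, matching the Case 1 and Case 2 design.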
Fig. 9 shows the water body detection results for the test data in Group 1. The first row of Fig. 9 presents the VV images, the second row of Fig. 9 shows the label data, and the third and fourth rows of Fig. 9 display the water body detection results before and after data augmentation, respectively. In most cases, Case 2 (with data augmentation) outperformed Case 1 (without augmentation). For example, in Fig. 9(a), some noise-containing pixels in the Han River area were misclassified by Case 1 but correctly classified by Case 2. The areas highlighted in yellow boxes indicate regions where the boundary between water and non-water areas was unclear. Case 1 predicted the boundary incorrectly, while Case 2 made a more accurate prediction. In Fig. 9(b), Case 2 also predicted the boundaries around the bridge area more precisely, while Case 1 made some misclassifications. In Figs. 9(c, d), which represent the Nakdong River area, both cases showed good performance in regions with clear water and non-water boundaries. However, in areas with more complex boundaries, Case 2 provided more precise predictions.
Fig. 10 shows the water body detection results for the test data in Group 2. The first row of Fig. 10 presents the Sentinel-2 optical images, the second row shows the label data, and the third and fourth rows display the water body detection results before and after data augmentation, respectively. In general, for the Sentinel-2 images, Case 2 (with data augmentation) performed better than Case 1. This was especially noticeable in smaller water bodies and areas with complex boundaries, where Case 2 was able to detect these intricate structures more accurately.
Fig. 10(a) illustrates the complex structure of small streams and riverbanks in the Han River area. While Case 1 successfully detected larger water body areas, it missed some smaller streams and boundary sections. In contrast, Case 2 detected these smaller areas more accurately. The yellow boxes highlight regions where the water body boundaries were predicted more precisely, with Case 2 demonstrating better performance in complex areas. A similar result is seen in Fig. 10(b), where both cases detected the larger water body areas well, but Case 1 missed some portions of the complex boundaries.
In Figs. 10(c, d), Case 2 showed better performance in detecting the riverbanks and inland wetlands, though detection accuracy still dropped in areas with highly complex boundaries. Notably, at the small river junctions of the Nakdong River, highlighted by yellow boxes, Case 2 demonstrated more accurate detection.
Table 4 presents the quantitative evaluation results of the water body detection performance of the Attention U-Net model using the developed training datasets. In Group 1, both cases showed accuracy scores above 0.99; however, this high accuracy may be overestimated due to bias toward non-water areas in the data. To address this, additional metrics such as the F1-score and IoU were examined. In Group 1, the F1-score improved from 0.978 to 0.982, and the IoU increased from 0.958 to 0.964. This indicates that through data augmentation, the model was able to learn various patterns, improving the consistency and accuracy of water body detection.
Table 4. Quantitative comparison of Group 1 (Sentinel-1) and Group 2 (Sentinel-2) across Case 1 (without data augmentation) and Case 2 (with data augmentation)
Group No. | Metric | Case 1 | Case 2 |
---|---|---|---|
Group 1 | Precision | 0.988 | 0.985 |
Group 1 | Recall | 0.970 | 0.978 |
Group 1 | F1-score | 0.978 | 0.982 |
Group 1 | Accuracy | 0.995 | 0.996 |
Group 1 | IoU | 0.958 | 0.964 |
Group 2 | Precision | 0.961 | 0.954 |
Group 2 | Recall | 0.916 | 0.918 |
Group 2 | F1-score | 0.938 | 0.936 |
Group 2 | Accuracy | 0.984 | 0.984 |
Group 2 | IoU | 0.884 | 0.880 |
On the other hand, in Group 2, the F1-score and IoU slightly decreased from 0.938 to 0.936 and from 0.884 to 0.880, respectively. This decrease can be attributed to the model’s increased sensitivity to complex terrains like riverbanks and inland wetlands, resulting in a tendency for over-detection. After data augmentation, there was an increase in cases where non-water areas were mistakenly detected as water. Despite this trend, the recall value showed a slight improvement, indicating better detection of actual water areas.
Although Sentinel-1 and Sentinel-2 have the same 10-meter spatial resolution, differences in water body detection performance arise due to the distinct characteristics of each satellite. In the comparison between Group 1 and Group 2, Group 1 achieved a high IoU of 0.964 (Case 2), while Group 2’s IoU was lower, at 0.884 (Case 1) and 0.880 (Case 2). This performance gap is mainly attributed to the fundamental differences between SAR and optical imagery.
Group 1 uses Sentinel-1 SAR imagery, where the boundaries between water and non-water areas are generally more clearly detected due to the nature of SAR. Since SAR imagery relies on radar signals, it can more easily distinguish water bodies based on the difference in backscatter between the smooth water surface and the surrounding land. However, SAR imagery also faces challenges such as geometric distortion and speckle noise, making it difficult to distinguish smaller streams or stagnant inland wetlands, where boundaries may be unclear. To address this, all inland wetlands were labeled as water, which could explain the very high water detection accuracy in Group 1.
On the other hand, Group 2 uses Sentinel-2 optical imagery, which employs various spectral bands, including visible and NIR, for water body detection. Optical imagery allows for finer distinctions of surface features, and it is particularly effective in identifying the boundaries between water and land in complex areas like riverbanks and inland wetlands. This fine detail in optical imagery is advantageous for classifying intricate terrains that are harder to distinguish in SAR imagery. However, due to the limited amount of data created during the data preparation process, which did not fully capture the complexity of these terrains, Group 2’s performance was somewhat lower than the SAR-based Group 1. This suggests that more false positives occurred in Group 2, especially in complex areas like inland wetlands and riverbanks.
In conclusion, SAR imagery showed more stable and accurate performance in detecting large water bodies, while optical imagery, despite being better suited for handling complex terrain, may have been evaluated as less accurate due to limitations in the labeling process. Consequently, Group 2 might have experienced more false detections in complex areas, leading to its comparatively lower performance.
This study provides a comprehensive analysis comparing the water body detection accuracy using Attention U-Net models with Sentinel-1 SAR and Sentinel-2 optical imagery. First, a multi-temporal training dataset was constructed for the Han River and Nakdong River basins, reflecting both seasonal and topographical characteristics to match the features of inland water bodies in Korea. This dataset is essential for allowing the water body detection model to adapt to a wider variety of scenarios. Next, the water body detection results of the two types of imagery were evaluated using the Attention U-Net model. A comparison of Group and Case performances revealed distinct characteristics in water detection accuracy, depending on the specific conditions.
In Group 1, which used Sentinel-1 SAR imagery, the model demonstrated excellent water body detection performance, with an IoU of 0.964 and an F1-score of 0.982. SAR imagery effectively distinguished the boundaries between water and non-water areas. However, it struggled to differentiate complex environments like inland wetlands and narrow riverbanks, leading to all such areas being classified as water. Conversely, in Group 2, which used Sentinel-2 optical imagery, the model’s performance was slightly lower, with an IoU of 0.880 and an F1-score of 0.936 compared to Group 1. However, the optical imagery allowed for more precise detection in complex areas such as inland wetlands and riverbanks. This is because the multiple spectral bands in optical imagery enable finer differentiation between the reflectance properties of water bodies and surrounding terrain.
A comparison of the results with and without data augmentation showed that Case 2, where augmentation was applied, performed better in Group 1 and in the qualitative results of Group 2. In Group 1, Case 2 achieved an IoU of 0.964, higher than Case 1’s 0.958, with the F1-score also improving from 0.978 to 0.982. This suggests that data augmentation helped reduce the effect of speckle noise and improved boundary detection. In Group 2, Case 2 also handled complex boundaries more effectively in the qualitative comparison. However, the F1-score and IoU decreased slightly, which can be attributed to the model’s tendency to over-detect complex boundaries after data augmentation. Optical imagery, with its higher spectral resolution, enables precise detection in areas like riverbanks and inland wetlands, but the model trained with augmented data tended to over-detect boundaries, slightly lowering accuracy. Nonetheless, the recall value improved in Case 2, indicating that the model trained on augmented optical imagery detected water bodies in a broader range of environments. This suggests that as more diverse datasets for water bodies are developed, the model’s accuracy will likely improve.
In summary, SAR imagery demonstrated superior performance in detecting large water bodies, while optical imagery was more suited for distinguishing fine boundaries. When applying data augmentation techniques, it is crucial to select the appropriate methods based on the characteristics of the data, and continuous evaluation of the effects of various techniques and data configurations is important. This study highlights the need for a fusion approach to improve water body detection performance across various environments. Future research should focus on developing models that combine both SAR and optical imagery to maximize water body detection capabilities.
This research was supported by 1) the Institute of Civil Military Technology Cooperation, the Defense Acquisition Program Administration, and the Ministry of Trade, Industry and Energy of Korea (22-CM-EO-02) and 2) the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIT) (No. 2023R1A2C1004395).
No potential conflict of interest relevant to this article was reported.
Il-Hoon Choi1,2, Eu-Ru Lee3,4, Hyung-Sup Jung5,6*
1Master Student, Department of Geoinformatics, University of Seoul, Seoul, Republic of Korea
2Managing Director, Platform Convergence Business Headquarters, Neighbor System, Seoul, Republic of Korea
3Combined MS/PhD Student, Department of Geoinformatics, University of Seoul, Seoul, Republic of Korea
4Combined MS/PhD Student, Department of Smart Cities, University of Seoul, Seoul, Republic of Korea
5Professor, Department of Geoinformatics, University of Seoul, Seoul, Republic of Korea
6Professor, Department of Smart Cities, University of Seoul, Seoul, Republic of Korea
As global warming accelerates greenhouse gas emissions, the frequency and severity of abnormal weather events such as floods and droughts are increasing, complicating disaster management and amplifying socio-economic damage. In response, effective strategies for mitigating water-related disasters and proactively addressing climate change are essential, which can be achieved through the use of satellite imagery. This study aims to compare the water body detection performance of Sentinel-1 Synthetic Aperture Radar (SAR) and Sentinel-2 optical imagery using the Attention U-Net model. Through this comparison, the study seeks to identify the strengths and limitations of each satellite imagery type for water body detection. A 256 × 256-pixel patch dataset was developed using multi-temporal imagery from the Han River and Nakdong River basins to reflect seasonal variations in water bodies, including conditions during wet, dry, and flood seasons. Additionally, the study evaluates the impact of data augmentation techniques on model performance, emphasizing the need to select augmentation methods that align with the specific characteristics of SAR and optical data. The results demonstrate that Sentinel-1 SAR imagery exhibited stable performance in detecting large water bodies, achieving high precision in defining water boundaries (Intersection over Union [IoU]: 0.964, F1-score: 0.982). In contrast, Sentinel-2 optical imagery achieved slightly lower accuracy (IoU: 0.880, F1-score: 0.936) but performed well in detecting complex water boundaries, such as those found in wetlands and riverbanks. While data augmentation techniques improved the performance of the Sentinel-1 SAR dataset, they had only a marginal effect on Sentinel-2 optical imagery, aside from slight improvements in boundary detection under new environmental conditions. Overall, this study underscores the importance of threshold and satellite imagery integration for water body monitoring. 
It further emphasizes the value of selecting appropriate data augmentation techniques tailored to the characteristics of each dataset. The insights from this study offer guidance for developing enhanced water resource management strategies to mitigate the impacts of climate change.
Keywords: Sentinel-1, Sentinel-2, Attention U-Net, Water body detection, Deep learning
As greenhouse gas emissions continue to rise due to global warming, the frequency of extreme weather events is increasing. These climate changes are making hydrological hazards, such as floods and droughts, more common and severe worldwide. In South Korea, specifically, floods and droughts are becoming more frequent and intense, highlighting the need for stronger flood prevention measures. Currently, flood risk management, such as monitoring river flooding, relies on ground-based water level sensors. However, areas where these sensors are not installed cannot be effectively monitored. Thus, there is a need for technologies that can efficiently monitor and analyze river systems, even without water level sensors.
To monitor the condition and changes in river systems, various remote sensing sensors, including LiDAR, Synthetic Aperture Radar (SAR), and multispectral sensors, can be utilized (Jung et al., 2019). These technologies provide valuable data for detecting and assessing water bodies. Optical satellites can collect multispectral data, enabling index-based analyses, such as the Normalized Difference Water Index (NDWI). However, image acquisition is limited during bad weather conditions or at night, making continuous monitoring challenging (Huang et al., 2018).
In contrast, active microwave sensors like SAR have the advantage of stable data collection regardless of weather conditions or time of day, making them highly useful for real-time monitoring (Pradhan et al., 2017; Li et al., 2021a; Liu et al., 2015). SAR imagery captures the amount of energy reflected from the Earth’s surface after transmitting microwave signals, where water appears dark and land appears bright, creating a clear contrast. Due to its unique ability to effectively distinguish surface types, SAR imagery is increasingly being used in water body detection research (Dong et al., 2021; Guo et al., 2022; Lee et al., 2023).
However, SAR imagery often suffers from speckle noise caused by electromagnetic wave interference, which leads to extreme pixel brightness variations. In cases of mixed land and water surfaces, brightness fluctuations become more complex. This presents challenges for traditional detection algorithms, such as edge detection (Liu and Jezek, 2004) and clustering methods (Wu et al., 2018), which face difficulties in accuracy and efficiency when classifying images. As a result, these methods show limitations in practical applications (Kim et al., 1998; 2022; Guo et al., 2022; Dong et al., 2021).
To address these limitations, deep learning algorithms capable of training deep, nonlinear neural networks are increasingly being applied. In particular, Convolutional Neural Networks (CNNs) have shown significant potential for image analysis. Unlike traditional algorithmic methods, CNNs are less affected by speckle noise because their learned weights capture the spatial characteristics of the entire image, which improves classification accuracy (Dong et al., 2021; Guo et al., 2022; Kim et al., 2022; Yu et al., 2022).
The U-Net model, introduced by Ronneberger et al. (2015), along with various other CNN models, has been widely used for water body detection. The U-Net model features a “U”-shaped architecture with skip connections between the encoder (contracting path) and decoder (expanding path), allowing information from the encoder to be directly transferred to the decoder. This structure enables faster training and produces effective results even with relatively small amounts of data. However, it also tends to emphasize irrelevant features outside the focus area, which can affect performance (Amer et al., 2022).
This study aims to compare and analyze the performance of water body detection using Sentinel-1 SAR and Sentinel-2 optical imagery by employing the Attention U-Net model. The Attention U-Net leverages an attention mechanism to emphasize salient features while suppressing irrelevant background information, thereby enhancing detection accuracy and addressing the challenges posed by complex environmental conditions (Mou and Zhu, 2019; Jonnala and Gupta, 2024). Additionally, this study involves the application of data augmentation techniques to overcome the limitations of limited training data.
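The attention mechanism described above can be illustrated with a minimal NumPy sketch of an additive attention gate applied to a skip connection. This is a simplified illustration, not the model implementation used in this study: the 1 × 1 convolutions are written as per-pixel matrix products, bias terms and normalization are omitted, the gating features are assumed to be already resampled to the spatial size of the skip features, and all names and shapes are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, w_x, w_g, psi):
    """Additive attention gate on encoder skip features.

    x   : skip features from the encoder, shape (H, W, Cx)
    g   : gating features from the decoder, shape (H, W, Cg)
          (assumed already resampled to the spatial size of x)
    w_x : weights of a 1x1 convolution on x, shape (Cx, Ci)
    w_g : weights of a 1x1 convolution on g, shape (Cg, Ci)
    psi : weights of a 1x1 convolution to one logit per pixel, shape (Ci, 1)
    """
    # Combine skip and gating features, then apply ReLU
    q = np.maximum(x @ w_x + g @ w_g, 0.0)
    # One attention coefficient in [0, 1] per pixel
    alpha = sigmoid(q @ psi)
    # Scale skip features: salient pixels pass, background is suppressed
    return x * alpha, alpha

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4, 8))
g = rng.normal(size=(4, 4, 16))
gated, alpha = attention_gate(
    x, g,
    rng.normal(size=(8, 4)), rng.normal(size=(16, 4)), rng.normal(size=(4, 1)),
)
print(gated.shape, alpha.shape)
```

Because the coefficients multiply the skip features before they reach the decoder, pixels that receive low coefficients contribute little to the final segmentation.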
This research not only aims to expand the dataset artificially but also seeks to analyze the impact of these augmentation techniques on model performance through quantitative and qualitative assessments (Kim and Han, 2020). The study used Sentinel-1 and Sentinel-2 images captured at the same locations along the Han River and Nakdong River basins, ensuring consistency in environmental conditions across datasets to create the training data for the Artificial Intelligence (AI) model. Multi-temporal datasets were also used so that the model could learn the seasonal characteristics of water bodies from the satellite imagery.
The training and test datasets were then split 8:2 by region to train the AI model, and the final accuracy of the model was compared. We also applied data augmentation techniques to the training dataset only to compare the results before and after augmentation. In Chapter 2, the research methodology is explained. Chapter 3 introduces the data used and the study area. Chapter 4 presents the research results, and finally, Chapter 5 provides the conclusions of the study.
This study followed the process outlined in Fig. 1 to select satellite SAR and optical images during different river change periods, including the wet season, dry season, and flood season. Based on the selected data, a training dataset was constructed and applied to a deep learning model to quantitatively compare and analyze the accuracy of water body detection.
To compare and analyze the performance of water body detection models using SAR and optical images, the study areas were selected from the Han River basin, specifically from Yondam Bridge to Ipo Bridge in the upper Namhan River, and the Nakdong River basin, starting from Sangju Weir to Nakdan Bridge (Fig. 2).
The upper region of the Han River is characterized by mountainous terrain, posing a risk of flash flooding due to rapid currents. Additionally, frequent summer heavy rains in this area result in regular flood damage (Yoon et al., 2014; Lee et al., 2023). The mid-upper region of the Nakdong River also experiences frequent flood damage, such as debris flows, due to the steep slope of the riverbed (Jee, 2003; Lee and Jung, 2023). Continuous monitoring in such regions is crucial for detecting potential risks and preventing water-related disasters. In South Korea, water bodies are classified into flowing rivers and stagnant lakes, each with distinct characteristics. Therefore, this study selected rivers and lakes in the Han River and Nakdong River regions as the target areas for analysis (Cole and Weihe, 2015; Lee and Jung, 2023).
In this study, water body detection was performed using Sentinel-1 Ground Range Detected (GRD) and Sentinel-2 Level 2A (L2A) satellite data, and their detection accuracies were compared. These two satellites have different characteristics, and only the data suited for this research were selected and utilized.
As mentioned above, Sentinel-1 SAR imagery has the advantage of providing stable images under all weather conditions and at any time of day or night. Operating in the C-band, Sentinel-1 is particularly useful for acquiring data even in cloudy or rainy conditions, making it highly effective for water body detection. It provides data with a 10-meter spatial resolution using the Interferometric Wide (IW) mode and collects data in two polarizations: VV and VH.
The VV polarization, which transmits and receives vertically polarized signals, is effective for detecting water bodies as it shows low backscatter from smooth surfaces like water. On the other hand, the VH polarization, which transmits vertically and receives horizontally polarized signals, is better suited for detecting vegetation or structures. However, VH polarization tends to be more sensitive to wind, vegetation, and minor water changes, increasing noise levels and causing ambiguity in distinguishing between water and non-water areas. For this reason, this study used only the VV polarization, which provides higher stability and accuracy.
Sentinel-2 is equipped with a multispectral optical sensor, providing 13 spectral bands ranging from visible light to shortwave infrared. Sentinel-2 primarily collects high-resolution images of land and water bodies under cloud-free conditions, with spatial resolutions ranging from 10 to 60 meters across its various bands. In this study, the Blue, Green, and Red bands, along with the Near InfraRed (NIR) and the Short Wavelength InfraRed (SWIR1 and SWIR2) bands, were selected for water body detection. The Blue, Green, Red, and NIR bands offer a 10-meter resolution, providing high-resolution imagery capable of delineating water boundaries in detail (Lee et al., 2024).
NIR, in particular, plays a crucial role in water body detection as it clearly distinguishes between the reflectance properties of vegetation and water. Additionally, while the SWIR1 (1.6 μm) and SWIR2 (2.2 μm) bands have a lower 20-meter resolution, they are known to be highly effective for water body detection. Since the SWIR bands are sensitive to moisture content, they can sharply differentiate between water bodies and surrounding areas based on moisture levels. Despite their lower resolution, the SWIR bands provide valuable information for distinguishing between water and non-water areas and are less affected by clouds or haze, enabling more reliable water body detection. For these reasons, the SWIR1 and SWIR2 bands were additionally used in this study.
Additionally, by utilizing multi-temporal data, we conducted a more precise analysis of the variability of water bodies in South Korea. Domestic water bodies exhibit significant changes in characteristics due to seasonal variations, and water body detection studies that take these changes into account can yield more reliable results. In particular, it is crucial to analyze data that reflects seasonal characteristics such as the dry season, flood season, and wet season. During the dry season, river and reservoir water levels decrease significantly, exposing large areas of land. In contrast, during the flood season, heavy rainfall can cause water levels to rise rapidly, potentially inundating surrounding areas. Furthermore, in the wet season, changes in wind and rainfall can alter the shape or area of water bodies. Considering these factors, we collected a total of 12 time-series raw datasets of Sentinel-1 SAR and Sentinel-2 optical images (Table 1).
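Table 1. Acquisition dates and areas for Sentinel-1 SAR and Sentinel-2 optical images used in this study
Satellite | Product type | Area | Date |
---|---|---|---|
Sentinel-1 | IW_GRD | Han River Basin | 2022.04.06 |
Sentinel-1 | IW_GRD | Han River Basin | 2022.04.06 |
Sentinel-1 | IW_GRD | Nakdong River Basin | 2022.07.23 |
Sentinel-1 | IW_GRD | Nakdong River Basin | 2022.10.15 |
Sentinel-2 | Sentinel-2 Multispectral Instrument Level 2A | Han River Basin | 2022.04.06 |
Sentinel-2 | Sentinel-2 Multispectral Instrument Level 2A | Han River Basin | 2022.07.23 |
Sentinel-2 | Sentinel-2 Multispectral Instrument Level 2A | Nakdong River Basin | 2022.10.15 |
Sentinel-2 | Sentinel-2 Multispectral Instrument Level 2A | Nakdong River Basin | 2022.04.07 |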
The overall workflow of this study is presented in Fig. 3. The research methodology is divided into three main stages: (1) data preprocessing, (2) creation of labeled data for training, and (3) model training and evaluation. First, preprocessing was conducted on Sentinel-1 GRD and Sentinel-2 L2A data. This preprocessing step is critical to ensure the deep learning model can accurately distinguish between water and non-water areas under various conditions. After preprocessing, the data were divided into patches for AI model training. Precise labeled data are essential for AI model training, so a reference map was created, and label data for water bodies were produced based on established documentation. The labeled Sentinel-1 AI data were classified as Group 1, and the Sentinel-2 AI data as Group 2.
Next, the training datasets for each group were split into training and validation datasets. During this process, the class sampling technique was applied to partially address the issue of data imbalance. Additionally, to compare the performance of the model with and without data augmentation, the dataset with augmentation applied was labeled as Case 2, and without augmentation as Case 1. Finally, model training was conducted for each group using the Attention U-Net model, and performance evaluations were performed to analyze the effectiveness of each dataset and the impact of data augmentation. Through this, the accuracy of water body detection for each case was compared and analyzed.
In this study, we conducted preprocessing steps tailored to each satellite dataset to improve the accuracy of water body detection. For the Sentinel-1 GRD data, we first applied orbit correction to precisely align the satellite’s actual position with the image, reducing geographic positioning errors. This process ensured accurate location information for water bodies and terrain. Next, thermal noise removal was performed to eliminate unnecessary noise from the images, improving the Signal-to-Noise Ratio (SNR) and enhancing the accuracy of water body detection.
Additionally, speckle noise, which can cause irregular variations in backscatter intensity, may occur in SAR images. To address this, we applied a Lee Sigma 7 × 7 filter to effectively remove speckle noise, resulting in clearer boundaries between water and non-water areas. Lastly, we used the Shuttle Radar Topography Mission Digital Elevation Model (SRTM DEM) for terrain correction. This step corrected geometric distortions caused by elevation differences, allowing for more accurate water body detection. After these preprocessing steps, the final Sigma0 image was obtained, which plays a key role in distinguishing water from non-water areas based on backscatter intensity.
For Sentinel-2 optical data, we used the Red, Green, Blue, and Near-Infrared bands with a 10-meter resolution, along with the SWIR1 and SWIR2 bands with a 20-meter resolution. The SWIR1 and SWIR2 bands were resampled to a 10-meter resolution using the bilinear resampling method. This resampling ensured that all bands were consistent in spatial resolution, allowing the AI model to learn information at the same resolution across all bands. Additionally, since Sentinel-2 data is stored in integer format, we converted each band’s data to a floating-point format.
After preprocessing the Sentinel-1 SAR and Sentinel-2 optical images, the data was divided by study area and further split into 256 × 256 patches, as shown in Figs. 2(a, b). During the patch-slicing process, a 25% overlap was applied to maintain the spatial continuity of the data while also increasing the diversity of the training data. Finally, min-max normalization was applied to each patch, converting the backscatter values to a range between 0 and 1, ensuring that all patches were prepared for training with consistent value ranges.
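The patch slicing and per-patch normalization described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the synthetic 640 × 640 scene, and the choice to drop edge regions that do not fill a whole patch are all assumptions.

```python
import numpy as np

def slice_patches(image, patch=256, overlap=0.25):
    """Cut a (H, W) or (H, W, C) array into square patches with fractional overlap."""
    stride = int(patch * (1 - overlap))  # 192 px for 256-px patches at 25% overlap
    h, w = image.shape[:2]
    patches = []
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            # Edge strips that cannot fill a full patch are skipped (assumption).
            patches.append(image[top:top + patch, left:left + patch])
    return patches

def min_max_normalize(patch, eps=1e-12):
    """Rescale one patch to [0, 1]; a constant patch maps to all zeros."""
    patch = patch.astype(np.float32)
    lo, hi = patch.min(), patch.max()
    return (patch - lo) / max(hi - lo, eps)

# Synthetic stand-in for a preprocessed single-band scene (hypothetical data).
scene = np.random.default_rng(0).normal(size=(640, 640)).astype(np.float32)
patches = [min_max_normalize(p) for p in slice_patches(scene)]
```

With a 192-pixel stride, neighboring patches share a 64-pixel strip, which is what the 25% overlap amounts to for 256-pixel patches.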
The performance and reliability of deep learning models largely depend on the precision and consistency of the training data (Gong et al., 2023; Park et al., 2023). In satellite image-based water body detection, creating accurate labeled data is essential. For this reason, in our study, we developed a detailed labeling guideline that considers the characteristics of satellite images and the diverse environmental conditions surrounding water bodies.
If precise annotation is not achieved during the labeling process, the quality of the training data may deteriorate, leading to reduced model performance (Yu et al., 2023). This is especially critical in tasks such as water body detection, where distinguishing fine differences is necessary. Inconsistent data creation during labeling can significantly lower the dataset’s reliability, which could negatively impact the predictive accuracy and consistency of the deep learning model.
To address this issue, our study created reference images tailored to the specific characteristics of each satellite image, ensuring consistent labeling. This approach enabled clear delineation between water and non-water areas and allowed us to build reliable training data, optimizing the model’s learning performance.
In the process of constructing training data for Sentinel-1 SAR images, the Otsu thresholding method was first applied to the preprocessed patches to initially distinguish between water and non-water areas. The Otsu method analyzes the distribution of pixel values within the image and automatically selects the threshold that maximizes the between-class variance (equivalently, minimizes the within-class variance) of the two classes (water and non-water). This resulted in a reference image that classified the water body areas.
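As a rough illustration of the thresholding step, the sketch below implements Otsu's criterion in its usual form, choosing the histogram threshold that maximizes the between-class variance. The synthetic backscatter values in dB, the bin count, and the function name are assumptions for the example and are not taken from the study.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Return the bin center that maximizes the between-class variance."""
    hist, edges = np.histogram(values.ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    prob = hist / hist.sum()
    w0 = np.cumsum(prob)                  # weight of the class at or below each bin
    w1 = 1.0 - w0                         # weight of the class above each bin
    cum_mean = np.cumsum(prob * centers)  # cumulative first moment
    total_mean = cum_mean[-1]
    valid = (w0 > 1e-12) & (w1 > 1e-12)   # skip splits that leave one class empty
    between = np.zeros(bins)
    between[valid] = (total_mean * w0[valid] - cum_mean[valid]) ** 2 / (w0[valid] * w1[valid])
    return centers[np.argmax(between)]

# Hypothetical bimodal backscatter sample: dark water and brighter land, in dB.
rng = np.random.default_rng(0)
water = rng.normal(-22.0, 1.5, 3000)
land = rng.normal(-10.0, 2.0, 7000)
sigma0_db = np.concatenate([water, land])

threshold = otsu_threshold(sigma0_db)
water_mask = sigma0_db < threshold        # low backscatter is treated as water
```

On a clearly bimodal histogram the threshold falls between the two modes; on patches with almost no water the histogram is close to unimodal and the result is less reliable, which is consistent with the manual correction described in the text.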
However, due to the characteristics of SAR imagery, distinguishing between water and non-water can sometimes be ambiguous. In particular, the boundary between water surfaces and the surrounding land may not be clear due to the reflection properties of radar signals, and difficulties in water body detection may arise due to speckle noise and geometric distortions. To overcome these limitations, Sentinel-2 optical imagery was referenced to create the water body label data.
Next, a labeling guideline for Sentinel-1 SAR imagery was developed to provide consistent criteria for distinguishing water from non-water areas. Given that SAR imagery can make it difficult to clearly delineate water boundaries, specific rules were necessary to address this issue. The following criteria were established considering various environmental factors and were tailored to the characteristics of SAR imagery (Fig. 4).
First, areas where water flows or accumulates within river boundaries, including areas where water is retained by dams or levees, were defined as water bodies. Second, even in cases where the Side Lobe effect impacted the water body, the affected area was classified as a water body. This effect is caused by the radiating characteristics of SAR signals, which can distort signals around water bodies, influencing actual water body areas. Third, land areas located within water bodies, such as mid-channel islands or riverbanks, were classified as non-water areas. These areas, although located within water bodies, are not submerged and thus need to be distinguished. Fourth, vessels located on water bodies were included and classified as part of the water body. Since vessels are floating structures, they reflect similar backscatter signals to water bodies in SAR images. Finally, bridges crossing water bodies were classified as non-water areas, as they are structures that are not submerged. These guidelines ensured consistent labeling, maximizing the performance of water body detection.
In the process of constructing training data for Sentinel-2 optical images, the NDWI threshold method was used to distinguish between water and non-water areas. NDWI is the normalized difference between green and NIR reflectance, (Green − NIR) / (Green + NIR). Because water reflects very weakly in the NIR band while vegetation and soil reflect strongly, water pixels take high NDWI values, which effectively delineates the boundary between water and non-water areas. This method produced a reference image that classified water body areas. Additionally, a labeling guideline was created for the Sentinel-2 optical images using the same approach as that applied to Sentinel-1 SAR images (Fig. 5).
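The NDWI reference map can be sketched as below, using the common definition NDWI = (Green − NIR) / (Green + NIR). The threshold of 0 is a frequently used starting point and is an assumption here, since the study does not state the value it used; the reflectance values are made up for the example.

```python
import numpy as np

def ndwi(green, nir, eps=1e-6):
    """Normalized Difference Water Index from green and NIR reflectance."""
    green = green.astype(np.float32)
    nir = nir.astype(np.float32)
    return (green - nir) / (green + nir + eps)  # eps avoids division by zero

# Toy 2 x 2 reflectance patch: left column water-like, right column vegetation-like.
green = np.array([[0.08, 0.10], [0.06, 0.09]], dtype=np.float32)
nir = np.array([[0.02, 0.35], [0.01, 0.40]], dtype=np.float32)

index = ndwi(green, nir)
water_mask = index > 0.0  # assumed threshold; tune per scene in practice
```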
Optical imagery, particularly in comparison to SAR imagery, allows for more intuitive and detailed observation. By utilizing various spectral bands, including visible light, it provides a more precise understanding of the surface’s reflectance characteristics. As a result, the visual distinction between water and non-water areas became clearer, and boundary delineation was more precise. In areas where boundaries were unclear, multiple bands (NIR, SWIR1, SWIR2) were used to further refine the distinction between land and water bodies, ensuring greater accuracy in boundary detection.
To effectively train and validate the deep learning model, the data for each group were divided into training and test datasets. The process of data splitting for model training must account for the seasonal variations in water bodies and the characteristics of the study area. Therefore, the data were classified by season and study area for each satellite, and based on this classification, the training and test datasets were split in a 4:1 ratio. For the test data, specific regions were fixed, as indicated by the orange grid areas shown in Fig. 2. This was done to prevent overfitting, which could occur if similar patches from different seasons were included in both the training and test datasets. By fixing the test data to a specific region, the model could be evaluated consistently.
Additionally, the constructed training data may show significant variations in the proportion of water bodies across patches. For example, some patches may contain little or no water, while others may consist predominantly of water. This data imbalance can lead to biased learning during model training, where the model performs well on patches with a high proportion of water but struggles on patches with little or no water. Such an issue prevents the model from demonstrating consistent performance and may cause it to perform well only under specific conditions when applied in practice. To address this issue, we divided the training and validation data into four groups based on the proportion of water in each patch, applying oversampling to ensure that an equal number of patches were present in each group. This approach balanced the distribution of data, allowing the model to learn from patches with both very low and very high water proportions.
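One way to realize this balancing is sketched below: patches are binned by water fraction and the smaller bins are resampled with replacement until every bin matches the largest one. The bin edges (0.25, 0.5, 0.75) and the function name are assumptions, as the study does not report the boundaries of its four groups.

```python
import numpy as np

def balance_by_water_fraction(masks, edges=(0.25, 0.5, 0.75), seed=0):
    """Return patch indices oversampled so each water-fraction bin has equal count."""
    rng = np.random.default_rng(seed)
    fractions = np.array([m.mean() for m in masks])  # share of water pixels per patch
    bins = np.digitize(fractions, edges)             # bin ids 0..3
    groups = [np.flatnonzero(bins == b) for b in range(len(edges) + 1)]
    target = max(len(g) for g in groups)
    balanced = []
    for g in groups:
        if len(g) == 0:
            continue                                 # nothing to resample from
        extra = rng.choice(g, size=target - len(g), replace=True)
        balanced.extend(g.tolist() + extra.tolist())
    return balanced

def make_mask(water_pixels):
    """Build a 10 x 10 binary mask with a given number of water pixels."""
    m = np.zeros(100, dtype=np.uint8)
    m[:water_pixels] = 1
    return m.reshape(10, 10)

# Ten toy labels: six nearly dry, two at 30% water, one at 60%, one at 90%.
masks = [make_mask(k) for k in [10] * 6 + [30] * 2 + [60] + [90]]
balanced = balance_by_water_fraction(masks)
```

In this toy case each of the four bins ends up with six entries, so the single 60% patch is repeated six times. Oversampling by repetition adds no new information, which is one reason it is often paired with augmentation.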
Data augmentation is an essential technique for improving the generalization performance of deep learning models (Baek et al., 2022). In tasks like satellite image-based water body detection, the diversity of available data can be limited. To maintain consistent performance across various environmental conditions, more training data is required. However, due to factors such as acquisition timing, weather conditions, and terrain characteristics, the availability of satellite image data is often restricted. To address this issue, data augmentation was applied in this study to artificially expand the training dataset and enhance the robustness of the model.
By applying data augmentation techniques, the original data can be transformed in various ways to generate new training samples. This helps the model develop the capability to detect water and non-water areas under a variety of conditions. In this study, several augmentation techniques were used, including random rotation, horizontal and vertical flipping, affine transformations, cropping, and color distortion. These transformations allowed the model to learn from a wide range of patterns and environments.
To assess the impact of data augmentation, two experimental conditions were set:
Case 1: Training with the original data only, without applying data augmentation.
Case 2: Training with augmented data, using only the transformed data.
In Case 2, the model was designed to become more robust by learning from a variety of augmented data, enabling it to handle geometric distortions and transformations during the water body detection process. On the other hand, in Case 1, where only the original data was used, the model was more likely to overfit specific environmental conditions.
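For segmentation, geometric augmentation has to be applied identically to the image and its label. The sketch below covers only flips and rotations by multiples of 90 degrees; the affine, cropping, and color-distortion transforms mentioned in the text are omitted, and the function name, probabilities, and toy data are assumptions.

```python
import numpy as np

def augment_pair(image, label, rng):
    """Apply the same random flip and 90-degree rotation to an image and its label."""
    if rng.random() < 0.5:
        image, label = np.flip(image, axis=1), np.flip(label, axis=1)  # horizontal flip
    if rng.random() < 0.5:
        image, label = np.flip(image, axis=0), np.flip(label, axis=0)  # vertical flip
    k = int(rng.integers(0, 4))                                        # number of 90-degree turns
    image = np.rot90(image, k, axes=(0, 1))
    label = np.rot90(label, k, axes=(0, 1))
    return np.ascontiguousarray(image), np.ascontiguousarray(label)

rng = np.random.default_rng(42)
image = rng.random((256, 256))
label = (image > 0.5).astype(np.uint8)  # toy label derived from the image itself

# Five augmented copies per patch, mirroring a fivefold expansion.
augmented = [augment_pair(image, label, rng) for _ in range(5)]
aug_image, aug_label = augmented[0]
```

Because the toy label is a threshold of the image, checking that the augmented label still equals the thresholded augmented image confirms that the two stayed aligned.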
The Attention U-Net model was employed in this study for water body detection, improving the original U-Net architecture by adding an Attention Block that helps distinguish boundaries between water and non-water areas more precisely. U-Net is a model designed with an encoder-decoder structure. In the encoder phase, max-pooling reduces the size of the feature map, while in the decoder phase, upsampling restores the high-resolution feature map. During this process, spatial details extracted in the encoder and high-resolution information restored in the decoder are merged, allowing the model to preserve spatial details while learning high-level features.
The Attention U-Net enhances this structure by incorporating an Attention Block, improving the network’s ability to focus on important features (Woo et al., 2018). The Attention Block consists of spectral and spatial layers, enabling the model to assign more weight to relevant information while ignoring unnecessary data during training (Woo et al., 2018).
The spectral layer processes the reflectance information of each pixel across various spectral bands, which is particularly useful in optical imagery. It helps the model accurately learn the spectral differences between water and non-water areas. Since water bodies have low reflectance in certain bands, the spectral layer allows for precise distinction of water’s spectral characteristics, clearly separating water boundaries. This layer also enhances detection performance in various environmental conditions, such as changes in weather or lighting.
The spatial layer focuses on learning the spatial structure of the image, emphasizing the morphological differences between water and non-water areas. Water bodies can have irregular or small, dispersed shapes, and the spatial layer highlights these features, contributing to more accurate water body detection. It is particularly effective in suppressing speckle noise in SAR imagery and extracting meaningful spatial patterns.
By adding these Attention Blocks, the Attention U-Net can better focus on critical features, especially in cases where boundaries are ambiguous, like in water body detection. This enables more accurate detection across a variety of environmental conditions in both SAR and optical imagery, providing consistent performance. Fig. 6 shows the structure of the Attention U-Net model used in this study. The input data consisted of 256 × 256 patches; the label data was the same-sized water body-labeled dataset.
The model includes Attention Blocks, Conv Blocks, and pooling layers. Each Conv Block performs convolution operations using a 3 × 3 kernel, followed by batch normalization and the ReLU activation function. This process is repeated three times to ensure that important information is extracted from the feature maps. Additionally, the pooling layer reduces the size of the input feature map by half while doubling the number of channels, enabling the model to learn higher-dimensional features.
The Attention Block is applied in the final decoder stage to assign more weight to important information. After two convolution operations, batch normalization, and activation function applications, the spectral and spatial layers are added, allowing the model to learn spectral and spatial information in detail. This enables the model to differentiate between the boundaries of water and non-water areas more accurately. Finally, a 1 × 1 convolution is used to classify the output into water and non-water regions.
Table 2 presents the key hyperparameters applied in the training of the Attention U-Net model. To compare the water body detection performance based on satellite images from two groups, as well as the effect of data augmentation, the key settings such as kernel initialization, loss function, optimization algorithm, mini-batch size, number of epochs, and learning rate were kept consistent across all experiments. The activation function used for all layers, except for the output layer, was the Rectified Linear Unit (ReLU), while the output layer employed Softmax for multi-class classification. The loss function was set to Intersection over Union (IoU) Loss, and the Adam optimizer was adopted (van Beers et al., 2019). The initial learning rate was set to 0.0001, and it was gradually reduced to 0.000125 as training progressed. The mini-batch size was set to 10, and training was carried out for a total of 500 epochs. After training, the model with the best performance was used for testing.
Table 2. Attention U-Net model hyperparameters for this study.
Hyperparameter | Value |
---|---|
Optimizer | Adam |
Learning rate | 0.0001 |
Loss function | IoU |
Batch size | 10 |
Epochs | 500 |
Activation | ReLU, Softmax |
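The study names an IoU loss without giving its formula. A common differentiable formulation, assumed here, replaces the hard intersection and union with sums over predicted probabilities. The NumPy sketch below computes only the forward value; in training, the same expression would be written in a deep learning framework so that gradients are available.

```python
import numpy as np

def soft_iou_loss(prob_water, label, eps=1e-7):
    """1 - soft IoU between predicted water probabilities and a binary label."""
    prob_water = np.asarray(prob_water, dtype=np.float64).ravel()
    label = np.asarray(label, dtype=np.float64).ravel()
    intersection = np.sum(prob_water * label)
    union = np.sum(prob_water) + np.sum(label) - intersection
    return 1.0 - (intersection + eps) / (union + eps)  # eps guards empty masks

label = np.array([1, 1, 0, 0])
loss_perfect = soft_iou_loss(label, label)             # prediction equals label
loss_half = soft_iou_loss([1.0, 0.0, 0.0, 0.0], label)  # one of two water pixels found
loss_wrong = soft_iou_loss(1 - label, label)           # prediction fully inverted
```

Unlike pixel-wise cross-entropy, this loss is computed over the water region as a whole, so a patch dominated by non-water pixels does not dilute the penalty for missed water.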
In this study, the performance of the Attention U-Net model was evaluated both quantitatively and qualitatively. For the quantitative evaluation, an independent test dataset, which was not included in the training process, was used to verify the model’s prediction accuracy. Various performance metrics based on the confusion matrix were employed to assess the model’s effectiveness.
Firstly, accuracy is a common performance metric that represents the proportion of correctly classified pixels out of all pixels and is widely used in supervised learning. However, in cases where there is data imbalance, a model may achieve high accuracy but perform poorly in predicting certain classes. To address this, additional metrics such as precision, recall, F1-score, and IoU were used for a more comprehensive evaluation (Li et al., 2021b; Yang et al., 2020).
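Accuracy is defined as the ratio of correctly classified pixels to the total number of pixels. With TP, TN, FP, and FN denoting the numbers of true positive, true negative, false positive, and false negative pixels (water being the positive class), it can be expressed as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Precision refers to the proportion of pixels predicted as water that are actually water and is defined as:

$$\text{Precision} = \frac{TP}{TP + FP}$$

Recall (or sensitivity) indicates the proportion of actual water pixels that were correctly predicted as water and is expressed as:

$$\text{Recall} = \frac{TP}{TP + FN}$$

F1-score is the harmonic mean of precision and recall, representing a balance between the two, and is calculated as:

$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

IoU measures the ratio between the intersection and union of the actual and predicted water body areas. A higher IoU value indicates better model performance (Lee et al., 2022).

$$\text{IoU} = \frac{TP}{TP + FP + FN}$$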
These metrics provided a thorough assessment of the model’s ability to accurately detect water bodies, especially in scenarios where data imbalance might otherwise distort a simpler accuracy-based evaluation.
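The five metrics can be computed from a predicted mask and a label mask as in the sketch below. The function name and the toy masks are illustrative assumptions; water is treated as the positive class.

```python
import numpy as np

def water_metrics(pred, truth):
    """Confusion-matrix metrics for binary water masks (water = positive class)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))
    tn = int(np.sum(~pred & ~truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "iou": iou}

# Toy example: 3 water pixels in the label, 2 found, 1 false alarm.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0]
scores = water_metrics(pred, truth)
```

For a single confusion matrix, IoU and F1-score are tied by IoU = F1 / (2 − F1), so the two always rank models in the same order. Here accuracy is 0.75 while IoU is only 0.5, which illustrates how the many correctly classified non-water pixels can inflate accuracy.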
Figs. 7 and 8 show the datasets of Sentinel-1 SAR and Sentinel-2 optical imagery, developed in this study for the Han River and Nakdong River. These datasets consist of raw data, reference materials for label creation, and labels produced according to specific criteria.
Figs. 7(a1, b1, c1, d1) are Sentinel-1 VV polarization images; Figs. 7(a1, c1) were acquired during the dry season and Figs. 7(b1, d1) during the wet season. It can be observed that Figs. 7(a1, c1) appear relatively darker compared to Figs. 7(b1, d1). The brighter appearance of the wet-season images is due to stronger backscattering signals caused by waves on the water surface, induced by rain and wind. Figs. 7(a2, c2) are Sentinel-2 True Color Maps, which were used to distinguish shaded areas and water bodies that are difficult to differentiate in SAR images. Figs. 7(a3, b3, c3, d3) show the results of a threshold-based classification method used to separate water bodies from non-water areas. In patches with larger water bodies, the detection was relatively accurate, but in areas with small water bodies, shadows, and speckle noise, some misclassification occurred. Finally, the label data in Fig. 7 represent the precisely delineated water body areas, created by synthesizing various data obtained from these images. Misclassified areas were corrected according to the established criteria, and undetected areas within water bodies and misclassified shadow regions were also adjusted.
Figs. 8(a1, b1, c1, d1) show true color-composite images captured during the dry and wet seasons, respectively, allowing an intuitive view of water bodies and the surrounding terrain. In Figs. 8(a1, c1), which were acquired during the dry season, there is relatively less vegetation, while in Figs. 8(b1, d1), acquired during the wet season, the vegetation is more lush, appearing in darker green (Baek et al., 2021). Figs. 8(a2, b2, c2, d2) are false color-composite images using infrared and SWIR bands, which provide a detailed look at the reflectance characteristics of vegetation and water bodies (Lee et al., 2022). The false color-composite images highlight the strong infrared reflectance, better showing the vigorous activity of vegetation, while the SWIR band emphasizes the difference in reflectance between water and the land surface, making it useful for water body detection. As a result, not only were water bodies detected, but even inland wetlands and riverbanks, which are difficult to detect in Sentinel-1 SAR images, were precisely distinguished.
In Figs. 8(a3, b3, c3, d3), water bodies were generally well detected regardless of the season, though some detection errors occurred in smaller streams. Fig. 8 represents the water body label data created by thoroughly analyzing the data. Areas of undetected or misclassified regions were corrected by using true and false color-composite images. The boundaries between water and non-water areas are clearly defined, and even narrow rivers and inland wetland areas, which are difficult to observe in Sentinel-1 SAR images, were precisely distinguished. This process helped to correct the parts missed by the NDWI data and establish a consistent dataset.
In this study, the Attention U-Net model was trained using the water body datasets that were developed. The training data consisted of 96 images for training and 24 images for testing, each with a size of 256 × 256 pixels, for each group. To address data imbalance within the patches, a class sampling technique was applied based on the ratio of water pixels, resulting in a final set of 576 training images and 192 test images. Additionally, to compare the effectiveness of data augmentation techniques, for Case 2, a fivefold augmentation was applied to the training data (Table 3).
Table 3. Dataset sizes after data balancing.
Group | Split | Case 1, Han River Basin | Case 1, Nakdong River Basin | Case 2, Han River Basin | Case 2, Nakdong River Basin |
---|---|---|---|---|---|
Group 1 | Train | 238 | 238 | 1,440 | 1,440 |
Group 1 | Test | 96 | 96 | 96 | 96 |
Group 2 | Train | 238 | 238 | 1,440 | 1,440 |
Group 2 | Test | 96 | 96 | 96 | 96 |
Fig. 9 shows the water body detection results for the test data in Group 1. The first row of Fig. 9 presents the VV images, the second row of Fig. 9 shows the label data, and the third and fourth rows of Fig. 9 display the water body detection results before and after data augmentation, respectively. In most cases, Case 2 (with data augmentation) outperformed Case 1 (without augmentation). For example, in Fig. 9(a), some noise-containing pixels in the Han River area were misclassified by Case 1 but correctly classified by Case 2. The areas highlighted in yellow boxes indicate regions where the boundary between water and non-water areas was unclear. Case 1 predicted the boundary incorrectly, while Case 2 made a more accurate prediction. In Fig. 9(b), Case 2 also predicted the boundaries around the bridge area more precisely, while Case 1 made some misclassifications. In Figs. 9(c, d), which represent the Nakdong River area, both cases showed good performance in regions with clear water and non-water boundaries. However, in areas with more complex boundaries, Case 2 provided more precise predictions.
Fig. 10 shows the water body detection results for the test data in Group 2. The first row of Fig. 10 presents the Sentinel-2 images, the second row shows the label data, and the third and fourth rows display the water body detection results before and after data augmentation, respectively. In general, for the Sentinel-2 images, Case 2 (with data augmentation) performed better than Case 1. This was especially noticeable in smaller water bodies and areas with complex boundaries, where Case 2 was able to detect these intricate structures more accurately.
Fig. 10(a) illustrates the complex structure of small streams and riverbanks in the Han River area. While Case 1 successfully detected larger water body areas, it missed some smaller streams and boundary sections. In contrast, Case 2 detected these smaller areas more accurately. The yellow boxes highlight regions where the water body boundaries were predicted more precisely, with Case 2 demonstrating better performance in complex areas. A similar result is seen in Fig. 10(b), where both cases detected the larger water body areas well, but Case 1 missed some portions of the complex boundaries.
In Figs. 10(c, d), Case 2 showed better performance in detecting the riverbanks and inland wetlands, though detection accuracy still dropped in areas with highly complex boundaries. Notably, at the small river junctions of the Nakdong River, highlighted by yellow boxes, Case 2 demonstrated more accurate detection.
Table 4 presents the quantitative evaluation results of the water body detection performance of the Attention U-Net model using the developed training datasets. In Group 1, both cases showed accuracy scores above 0.99; however, this high accuracy may be overestimated due to bias toward non-water areas in the data. To address this, additional metrics such as the F1-score and IoU were examined. In Group 1, the F1-score improved from 0.978 to 0.982, and the IoU increased from 0.958 to 0.964. This indicates that through data augmentation, the model was able to learn various patterns, improving the consistency and accuracy of water body detection.
Table 4. Quantitative comparison of Group 1 (Sentinel-1) and Group 2 (Sentinel-2) across Case 1 (without data augmentation) and Case 2 (with data augmentation).
Group | Metric | Case 1 | Case 2 |
---|---|---|---|
Group 1 | Precision | 0.988 | 0.985 |
Group 1 | Recall | 0.970 | 0.978 |
Group 1 | F1-score | 0.978 | 0.982 |
Group 1 | Accuracy | 0.995 | 0.996 |
Group 1 | IoU | 0.958 | 0.964 |
Group 2 | Precision | 0.961 | 0.954 |
Group 2 | Recall | 0.916 | 0.918 |
Group 2 | F1-score | 0.938 | 0.936 |
Group 2 | Accuracy | 0.984 | 0.984 |
Group 2 | IoU | 0.884 | 0.880 |
On the other hand, in Group 2, the F1-score and IoU slightly decreased from 0.938 to 0.936 and from 0.884 to 0.880, respectively. This decrease can be attributed to the model’s increased sensitivity to complex terrains like riverbanks and inland wetlands, resulting in a tendency for over-detection. After data augmentation, there was an increase in cases where non-water areas were mistakenly detected as water. Despite this trend, the recall value showed a slight improvement, indicating better detection of actual water areas.
Although Sentinel-1 and Sentinel-2 have the same 10-meter spatial resolution, differences in water body detection performance arise due to the distinct characteristics of each satellite. In the comparison between Group 1 and Group 2, Group 1 achieved its best IoU of 0.964 in Case 2, while Group 2’s best IoU was lower, at 0.884 in Case 1. This performance gap is mainly attributed to the fundamental differences between SAR and optical imagery.
Group 1 uses Sentinel-1 SAR imagery, where the boundaries between water and non-water areas are generally more clearly detected due to the nature of SAR. Since SAR imagery relies on radar signals, it can more easily distinguish water bodies based on the differences in backscatter between the water surface and surrounding land. However, SAR imagery also faces challenges such as geometric distortion and speckle noise, making it difficult to distinguish smaller streams or stagnant inland wetlands, where boundaries may be unclear. To address this, all inland wetlands were labeled as water, which could explain the very high water detection accuracy in Group 1.
On the other hand, Group 2 uses Sentinel-2 optical imagery, which employs various spectral bands, including visible and NIR, for water body detection. Optical imagery allows for finer distinctions of surface features, and it is particularly effective in identifying the boundaries between water and land in complex areas like riverbanks and inland wetlands. This fine detail in optical imagery is advantageous for classifying intricate terrains that are harder to distinguish in SAR imagery. However, due to the limited amount of data created during the data preparation process, which did not fully capture the complexity of these terrains, Group 2’s performance was somewhat lower than the SAR-based Group 1. This suggests that more false positives occurred in Group 2, especially in complex areas like inland wetlands and riverbanks.
In conclusion, SAR imagery showed more stable and accurate performance in detecting large water bodies, while optical imagery, despite being better suited for handling complex terrain, may have been evaluated as less accurate due to limitations in the labeling process. Consequently, Group 2 might have experienced more false detections in complex areas, leading to its comparatively lower performance.
This study provides a comprehensive analysis comparing the water body detection accuracy using Attention U-Net models with Sentinel-1 SAR and Sentinel-2 optical imagery. First, a multi-temporal training dataset was constructed for the Han River and Nakdong River basins, reflecting both seasonal and topographical characteristics to match the features of inland water bodies in Korea. This dataset is essential for allowing the water body detection model to adapt to a wider variety of scenarios. Next, the water body detection results of the two types of imagery were evaluated using the Attention U-Net model. A comparison of Group and Case performances revealed distinct characteristics in water detection accuracy, depending on the specific conditions.
In Group 1, which used Sentinel-1 SAR imagery, the model demonstrated excellent water body detection performance, with an IoU of 0.964 and an F1-score of 0.982. SAR imagery effectively distinguished the boundaries between water and non-water areas. However, it struggled to differentiate complex environments like inland wetlands and narrow riverbanks, leading to all such areas being classified as water. Conversely, in Group 2, which used Sentinel-2 optical imagery, the model’s performance was slightly lower, with an IoU of 0.880 and an F1-score of 0.936 compared to Group 1. However, the optical imagery allowed for more precise detection in complex areas such as inland wetlands and riverbanks. This is because the multiple spectral bands in optical imagery enable finer differentiation between the reflectance properties of water bodies and surrounding terrain.
A comparison of the results with and without data augmentation showed that Case 2, where augmentation was applied, generally performed better. In Group 1, Case 2 achieved an IoU of 0.964, higher than Case 1’s 0.958, with the F1-score also improving from 0.978 to 0.982. This suggests that data augmentation made the model more robust to speckle noise and improved boundary detection. In Group 2, Case 2 also handled complex boundaries more effectively in the qualitative comparison, but the F1-score and IoU decreased slightly, which can be attributed to the model’s tendency to over-detect complex boundaries after augmentation. Optical imagery, with its higher spectral resolution, enables precise detection in areas like riverbanks and inland wetlands, but the model trained with augmented data tended to over-detect boundaries, slightly lowering precision. Nonetheless, the recall value improved in Case 2, indicating that the model trained on augmented optical imagery detected water bodies in a broader range of environments. This suggests that as more diverse datasets for water bodies are developed, the model’s accuracy will likely improve.
In summary, SAR imagery demonstrated superior performance in detecting large water bodies, while optical imagery was more suited for distinguishing fine boundaries. When applying data augmentation techniques, it is crucial to select the appropriate methods based on the characteristics of the data, and continuous evaluation of the effects of various techniques and data configurations is important. This study highlights the need for a fusion approach to improve water body detection performance across various environments. Future research should focus on developing models that combine both SAR and optical imagery to maximize water body detection capabilities.
This research was supported by 1) the Institute of Civil Military Technology Cooperation, the Defense Acquisition Program Administration, and the Ministry of Trade, Industry and Energy of Korea (22-CM-EO-02) and 2) the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIT) (No. 2023R1A2C1004395).
No potential conflict of interest relevant to this article was reported.
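Table 1. Acquisition dates and areas for Sentinel-1 SAR and Sentinel-2 optical images used in this study.
Satellite | Product type | Area | Date |
---|---|---|---|
Sentinel-1 | IW_GRD | Han River Basin | 2022.04.06 |
Sentinel-1 | IW_GRD | Han River Basin | 2022.04.06 |
Sentinel-1 | IW_GRD | Nakdong River Basin | 2022.07.23 |
Sentinel-1 | IW_GRD | Nakdong River Basin | 2022.10.15 |
Sentinel-2 | Sentinel-2 Multispectral Instrument Level 2A | Han River Basin | 2022.04.06 |
Sentinel-2 | Sentinel-2 Multispectral Instrument Level 2A | Han River Basin | 2022.07.23 |
Sentinel-2 | Sentinel-2 Multispectral Instrument Level 2A | Nakdong River Basin | 2022.10.15 |
Sentinel-2 | Sentinel-2 Multispectral Instrument Level 2A | Nakdong River Basin | 2022.04.07 |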