APPLICATION OF A COMBINED DAILY RAIN GAUGES AND RAINFALL SATELLITE ESTIMATES SCHEME FOR BASIN MANAGEMENT¶

After: Daniel Vila (INPE-CPTEC, Brazil) and Cesar Luis Garcia (CREAN-UNC and Catholic University of Cordoba, Argentina)

Notebook prepared by Ben Maathuis and Bas Retsios. ITC-University of Twente, Enschede. The Netherlands

Estimating daily rainfall over land, using a satellite-based algorithm and rain gauge networks involves two major issues: firstly to define the algorithm to derive the satellite based precipitation and secondly define and design a merging technique. In the first case, the “Hydroestimator” algorithm is used as the base algorithm for retrieving precipitation. The second issue has been largely discussed in several papers over the last years, but the general assumption is that rain gauge observations have lower bias. Therefore they will prevail over any satellite retrievals in those regions with dense networks, while over the ocean (not analyzed in this application) and non-well gauged areas, the multi-satellite estimates have a larger weight in the final analysis.

For more information on the procedure implemented in the notebook code fields - see as reference: Chapter 11 from the DevCoCast Manual: https://filetransfer.itc.nl/pub/52n/gnc_devcocast_applications/description/devcocast_application_manual.pdf.

Here it is assumed that the sample data folder '/Cosch' is situated within the notebook folder! It is furthermore assumed that you have locally installed ILWIS386 for data visualization when data is written to disk.

In [1]:
import os
import ilwis
import numpy as np
import pandas as pd
import struct
import matplotlib.pyplot as plt
In [2]:
print(ilwis.version())
1.0 build 20250715

Preparation of the input data¶

In [3]:
# set working folder
work_dir = os.getcwd() + r'/Cosch'
ilwis.setWorkingCatalog(work_dir)
print(work_dir)
d:\jupyter\notebook_scripts\Special\Cosch/data1

The rainfall information is be obtained from NOAA’s Climate Prediction Centre Unified Gauge-Based Analysis of Global Daily Precipitation, see: https://psl.noaa.gov/data/gridded/data.cpc.globalprecip.html.

The CPC data can be downloaded and the gauge map should be processed to match the resolution and geographic boundaries of the daily RFS. The CPC images have a spatial resolution of 0.5 degree and have global coverage, but here only a part is required covering the South American region with a resolution of 0.25 degrees. If you have an internet connection you can go to CPC ftp site and check the data available. In the sample data folder you will be able to find the file: “PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx. 20110103.RT”.

The procedure to import this image requires a closer look at the meta data provided:

  • The file format is binary, real (float) values in a little Endian byte order (no ‘swap’ required) where undefined / no data = -999.0;
  • header size = 0, data is stored in a band sequential (BSQ) file structure
  • Number of Rows / Columns = 360 / 720;
  • Minimum / maximum Latitude of corner coordinates -90 / 90; Minimum / maximum Longitude of corner coordinates 0 / 360;
  • Number of bands is 2: Rain = the grid analysis (0.1mm/day) and Gnum = the number of gauges.
In [4]:
# read prcp file with precipitation content into memory

cols = 720
rows = 360
bands = 2 
bytesperpixel = 4 # float

file = work_dir +'/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20110103.RT'
f = open(file, 'rb')
buffer = f.read(rows * cols * bands * bytesperpixel)
it = struct.iter_unpack("<f", buffer)
data_prcp = np.array([el[0] for el in it])
In [5]:
data_prcp.shape
Out[5]:
(518400,)
In [6]:
data_3d = data_prcp.reshape(2, 360, 720)
data_3d.shape
Out[6]:
(2, 360, 720)
In [7]:
#display the numpy array created in the previous steps using matplotlib 
fig = plt.figure(figsize =(6, 6))
plt.imshow(data_3d[0], interpolation='none')
Out[7]:
<matplotlib.image.AxesImage at 0x1bb4ca8d010>
No description has been provided for this image
In [8]:
#mirror rotate the image first
mirrored_rotated = data_3d[:, ::-1, ::-1]

#Split into western and eastern halves
#take the Western half: columns 0–359
eastern_half = mirrored_rotated[:, :, 360:]
#and the Eastern half: columns 360–719
western_half = mirrored_rotated[:, :, :360]

#flip the two halves
flipped_w = western_half[:, :, ::-1]
flipped_e = eastern_half[:, :, ::-1]

#concatonate the two flipped halves to recompose the full map
data_swapped_lr = np.concatenate((flipped_w, flipped_e), axis=2)
In [9]:
#display the first layer of the 3d-numpy array created in the previous step using matplotlib 
fig = plt.figure(figsize =(6, 6))
plt.imshow(data_swapped_lr[0], interpolation='none');
No description has been provided for this image
In [10]:
grf = ilwis.GeoReference("code=georef:type=corners,csy=epsg:4326,envelope=-180 90 180 -90, gridsize=720 360,cornerofcorners=yes,name=pcp.grf")
dfNum = ilwis.DataDefinition(ilwis.NumericDomain('code=value'), ilwis.NumericRange(-99999, 99999, 0)) 
rcNew = ilwis.RasterCoverage()
rcNew.setSize(ilwis.Size(720, 360, 2))
rcNew.setGeoReference(grf)
rcNew.setDataDef(dfNum)
In [11]:
rcNew.array2raster(data_swapped_lr)
In [12]:
print(rcNew.size().xsize)
print(rcNew.size().ysize)
print(rcNew.size().zsize)
720
360
2
In [13]:
rcNew.store('pcp_in.mpl')
In [14]:
# Use is made of the final georeference for south america 
georef025 = ilwis.GeoReference('sudamerica_0_25.grf')
In [15]:
# Resample the map list and store the results
pcp025 = ilwis.do("resample", rcNew, georef025, "nearestneighbour")
#pcp025.store('pcp_025_check.mpl')
In [16]:
# Create 2 output maps, for the rainfall and gauge (locations / number of gauges) respectively
rc_pcp = ilwis.do('selection',pcp025,"rasterbands(0)")
rc_pcp.store('pcp_025.mpr')
rc_gauge = ilwis.do('selection',pcp025,"rasterbands(1)")
rc_gauge.store('gauge_025.mpr')

Calculation of the additive and ratio bias and apply combined scheme¶

To start, only use those pixels from the cpc_gauge map that contain at least 1 gauging station (a). This map is resampled from 0.5 degrees to 0.25 degrees (b). The hydroestimator map was resampled to 0.25 degrees (c) by aggregation. Subsequently the 2 bias maps (additive and ratio) are calculated (d) and a filter is applied to obtain the mean additive and ratio bias (which can also be assigned to neighbouring pixels) (e). These mean bias maps are used to correct the hydroestimator. One particular scheme (additive or ratio) is selected for each pixel based on the minimum difference between that particular bias correction and the observation (cpc_precipitation) and the best estimate is selected (f). Pixels in ungauged areas will remain untouched and the original estimated value will be retained.

In [17]:
# # Load the satellite derived rainfall estimate layers
rc1 = ilwis.RasterCoverage('hydroestimator_025.mpr')
In [18]:
#some inital data preparation handling 0 values and no data
rc2 = ilwis.do('mapcalc', 'iff(@1 >= 1,1,?)', rc_gauge)
rc3 = ilwis.do('mapcalc', 'iff(@1 >= 0,@1,?)',rc_pcp)
rc4 = ilwis.do('mapcalc', '@1 * (@2/10)',rc2,rc3)

Additive Bias¶

In [19]:
#define the filter to be used 3 by 3 sum filter, last value is the gain factor
map_filter = 'code=3,3,1 1 1 1 1 1 1 1 1, 0.1111111'
In [20]:
#see reference for more information - equation 1
rc5 = ilwis.do('mapcalc', '@2 - @1', rc1, rc4)
rc6 = ilwis.do('mapcalc', 'ifnotundef(@1,@1,0)', rc5)
rc7 = ilwis.do('linearrasterfilter', rc6, map_filter)
rc8 = ilwis.do('mapcalc', 'iff(@1==0,0,1)', rc6)
rc9 = ilwis.do('linearrasterfilter', rc8, map_filter)
rc10 = ilwis.do('mapcalc', '(@1/@2)', rc7, rc9)
rc11 = ilwis.do('mapcalc','ifnotundef(@1,@1,0)',rc10)
rc12 = ilwis.do('mapcalc','(@1 + @2)',rc1, rc11)
rc12.store('hydro_add_jnb.mpr')

Ratio Bias¶

In [21]:
#see reference for more information - equation 2
rc14 = ilwis.do('mapcalc','iff(@1>0,@2/@1,1)',rc1,rc4)
rc15 = ilwis.do('mapcalc','ifnotundef(@1,@1,0)',rc14)
rc16 = ilwis.do('linearrasterfilter', rc15, map_filter)
rc17 = ilwis.do('mapcalc','iff(@1==0,0,1)',rc15)
rc18 = ilwis.do('linearrasterfilter', rc17, map_filter)
rc19 = ilwis.do('mapcalc', '(@1/@2)', rc16, rc18)
rc20 = ilwis.do('mapcalc','ifnotundef(@1,@1,1)',rc19)
rc21 = ilwis.do('mapcalc','(@1 > 0 , @1 * @2,@1)',rc1, rc20)
rc21.store('hydro_rat_jnb.mpr')

Calculation Combined Scheme (Cosch)¶

The difference between additive/ratio bias correction and values obtained from CPC analysis is performed. One particular scheme (additive or ratio) is selected for each pixel based on the minimum difference between that particular bias correction and the observation. For each non-masked pixel one particular method is assigned and for the rest of the pixels the original hydroestimator value is set.

In [22]:
rc13 = ilwis.do('mapcalc','abs(@1 - @2)',rc4, rc12)
rc22 = ilwis.do('mapcalc','abs(@1 - @2)',rc4, rc21)
rc24 = ilwis.do('mapcalc','abs(@1)',rc12)
rc25 = ilwis.do('mapcalc','abs(@1)',rc21)
rc26 = ilwis.do('mapcalc','iff(@2 < @4, @1, @3)',rc24, rc13, rc25, rc22)
rc27 = ilwis.do('mapcalc','iff(@1==?,@2,@1)',rc26,rc1)
In [23]:
#store results
rc27.store('cosch_jnb.mpr')

Create some Basin statistics¶

After deriving the daily precipitation using the Combined Scheme the next step is to obtain the statistical values for every ROI and export them into a table.

In [24]:
# Load the respective basin layer
rc100 = ilwis.RasterCoverage('Cuencas.mpr')
In [25]:
ct = ilwis.do('cross', rc100, rc27, 'ignoreundef')
In [26]:
print()
print(ct.columnCount(), ct.recordCount())
print(ct.columns())
5 7294
('cross_combinations', 'first_raster', 'second_raster', 'pixel_count', 'pixel_area')
In [27]:
#uncomment code lines below to see the full table listing
# for i in range(ct.recordCount()):
#     rec = ct.record(i)
#     print(rec)
In [28]:
ct = ilwis.do('tabcalc', 'iff(@1 >= 1, @1, 0)', ct, 'CoschGE1mm', 'no', 'second_raster')
In [29]:
#ilwis cross table to pandas dataframe
data = {}
for col in list(ct.columns()):
    data[col] = list(ct.column(col))
df = pd.DataFrame(data)
In [30]:
df = df.rename(columns={'first_raster': 'catchment_ID','second_raster': 'rainfall'})
In [31]:
df
Out[31]:
cross_combinations catchment_ID rainfall pixel_count pixel_area CoschGE1mm
0 0 3 27.893553 1 0.0625 27.893553
1 1 11 30.949806 1 0.0625 30.949806
2 2 7 55.637112 1 0.0625 55.637112
3 3 11 1.535156 1 0.0625 1.535156
4 4 11 33.087307 1 0.0625 33.087307
... ... ... ... ... ... ...
7289 7289 1 5.188477 1 0.0625 5.188477
7290 7290 1 5.215430 1 0.0625 5.215430
7291 7291 1 10.201757 1 0.0625 10.201757
7292 7292 1 3.400195 1 0.0625 3.400195
7293 7293 1 7.356445 1 0.0625 7.356445

7294 rows × 6 columns

Aggregate the rainfall over the catchments, calculate weighted mean of rainfall per catchment:

  • For rainfall (column = rainfall), include zeros in the weighted mean

  • Use pixel_count (column = pixel_count) as the weight

  • Group by catchment (catchment_ID)

In [32]:
# Calculate weighted average rainfall per catchment (include all values, including also the areas without rainfall = 0
rainfall_grouped = df.groupby('catchment_ID').agg(
    total_weighted_rainfall=('rainfall', lambda x: (x * df.loc[x.index, 'pixel_count']).sum()),
    total_pixel_count_rainfall=('pixel_count', 'sum')
)
rainfall_grouped['weighted_rainfall'] = rainfall_grouped['total_weighted_rainfall'] / rainfall_grouped['total_pixel_count_rainfall']
rainfall_grouped = rainfall_grouped[['weighted_rainfall']]
In [33]:
rainfall_grouped
Out[33]:
weighted_rainfall
catchment_ID
0 0.493714
1 4.798162
2 1.614744
3 15.138227
4 1.345299
5 1.851354
6 2.712959
7 25.193674
8 15.345473
9 6.864099
10 0.438687
11 8.803077

Create a new DataFrame that contains:

  • One row per catchment (catchment_ID)

  • The weighted mean rainfall (using CoschGE1mm as values and pixel_count as weights)

  • Excluding rows where CoschGE1mm == 0

In [34]:
# exclude areas with rainfall <1 mm
cosch_filtered = df[df['CoschGE1mm'] != 0].copy()

# Step 3: Weighted mean for CoschGE1mm (excluding 0s)
cosch_grouped = cosch_filtered.groupby('catchment_ID').agg(
    total_weighted_cosch=('CoschGE1mm', lambda x: (x * cosch_filtered.loc[x.index, 'pixel_count']).sum()),
    total_pixel_count_cosch=('pixel_count', 'sum')
)
cosch_grouped['weighted_CoschGE1mm'] = cosch_grouped['total_weighted_cosch'] / cosch_grouped['total_pixel_count_cosch']
cosch_grouped = cosch_grouped[['weighted_CoschGE1mm']]
In [35]:
#uncomment to see the resulting table
#cosch_grouped
In [36]:
#Merge both results
rainfall = rainfall_grouped.join(cosch_grouped, how='outer').reset_index()
#uncomment to see the resulting table
#rainfall
In [37]:
max_rain_df = (df.groupby('catchment_ID', as_index=False)['CoschGE1mm'].max().rename(columns={'CoschGE1mm': 'max_rainfall'}))
#uncomment to see the resulting table
#max_rain_df
In [38]:
rainfall = rainfall.merge(max_rain_df, on='catchment_ID', how='left')
#uncomment to see the resulting table
#rainfall

Compute the normalized area with rainfall ≥ 1mm per catchment. For each catchment (catchment_ID), compute:

  • The total area where CoschGE1mm >= 1

  • Then normalize it by dividing by the total area of that catchment

The result is a value between 0 and 1 and transformed into a percentage

In [39]:
#Total area per catchment
total_area = df.groupby('catchment_ID')['pixel_area'].sum().reset_index(name='total_area')

#Area with CoschGE1mm >= 1 per catchment
area_GE1 = (
    df[df['CoschGE1mm'] >= 1]
    .groupby('catchment_ID')['pixel_area']
    .sum()
    .reset_index(name='area_GE1mm')
)

#Merge and normalize
normalized_area_df = total_area.merge(area_GE1, on='catchment_ID', how='left')
normalized_area_df['area_GE1mm'] = normalized_area_df['area_GE1mm'].fillna(0)  # Fill missing with 0
normalized_area_df['norm_area_GE1mm'] = normalized_area_df['area_GE1mm'] / normalized_area_df['total_area']*100

# keep only relevant columns
result_df = normalized_area_df[['catchment_ID', 'norm_area_GE1mm']]

#uncomment to see the resulting table
#result_df
In [40]:
rainfall = rainfall.merge(result_df, on='catchment_ID', how='left')
#uncomment to see the resulting table
#rainfall
In [41]:
basinList = ('Magdalena','Chubut','Negro','Parnaiba','Salado','Colorado','Uruguay','Tocantins','San Francisco','Del Plata','Orinoco','Amazonas') 
In [42]:
rainfall = rainfall.sort_values('catchment_ID').reset_index(drop=True)
rainfall['basin'] = basinList
In [43]:
#uncomment to see the resulting table
#rainfall
In [44]:
df_final = rainfall[['catchment_ID','basin', 'weighted_rainfall', 'weighted_CoschGE1mm', 'max_rainfall', 'norm_area_GE1mm']]
df_final
Out[44]:
catchment_ID basin weighted_rainfall weighted_CoschGE1mm max_rainfall norm_area_GE1mm
0 0 Magdalena 0.493714 6.194022 16.033594 7.926829
1 1 Chubut 4.798162 8.029857 31.743166 59.507042
2 2 Negro 1.614744 7.012731 36.923439 22.891566
3 3 Parnaiba 15.138227 20.536871 164.978042 73.611111
4 4 Salado 1.345299 11.916673 45.530277 11.250000
5 5 Colorado 1.851354 9.967701 42.856640 18.373494
6 6 Uruguay 2.712959 7.635795 31.271484 35.150376
7 7 Tocantins 25.193674 30.270202 185.751572 83.169291
8 8 San Francisco 15.345473 23.701367 133.533401 64.634146
9 9 Del Plata 6.864099 22.125763 125.884583 30.970149
10 10 Orinoco 0.438687 8.924037 41.493164 4.881657
11 11 Amazonas 8.803077 18.593650 154.816940 47.246868
In [ ]:
 
In [ ]:
 
In [ ]: