ILWISPy for time series data processing

Notebook prepared by Ben Maathuis, Bas Retsios and Willem Nieuwenhuis. ITC - University of Twente, Enschede, The Netherlands

Within this Notebook you are going to use ILWISPy in conjunction with a number of commonly used Python libraries, like NumPy and Matplotlib, which have already been introduced before. In addition, use is made here of Pandas.

Within this notebook we focus on time series data. A number of open data sources are used:

Attention will be given to the following aspects:

Before we start, first load the required libraries to run this Notebook. If an error message appears after executing the cell below, install the missing package from within your Python folder using 'python -m pip install ...package...', e.g. for Pandas: 'python -m pip install pandas'.

Sample data is available at: https://filetransfer.itc.nl/pub/52n/ilwis_py/sample_data/Intro_Timeseries_ILWISPy.zip. Unzip the file. It is assumed here that the data folder '/Intro_Timeseries_ILWISPy' is situated within the notebook folder! It is furthermore assumed that you have ILWIS 386 installed locally for visualization of the data written to disk. This is especially important when visualizing the time series data and anomalies created in the later part of this notebook, e.g. using the animation functionality of ILWIS 386, in order to check the results obtained.

Prerequisite: You have reviewed the notebooks 'Intro_ILWISPy.ipynb' and 'Intro_RS_ILWISPy', both available at: https://filetransfer.itc.nl/pub/52n/ilwis_py/notebooks/. Note that the files with the postfix _result.html show the processed results obtained when the notebook is executed.

Import required libraries and set your ILWISPy working directory
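
A minimal setup sketch is given below; it assumes the unzipped sample data folder sits next to the notebook.

```python
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import ilwis

# point ILWISPy to the folder holding the unzipped sample data (assumed location)
work_dir = os.path.join(os.getcwd(), 'Intro_Timeseries_ILWISPy')
ilwis.setWorkingCatalog(work_dir)
print(ilwis.version())
```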

Copernicus Climate Data Store GloFAS

The gridded river discharge data used below has been downloaded for a selected area, the Buzi catchment in Mozambique, for the years 2020 and 2021. The data represents the discharge in m3/sec (per pixel) generated by the GloFAS system (see: https://www.globalfloods.eu/). The data has already been retrieved in grib2 format and was downloaded from: https://cds.climate.copernicus.eu/cdsapp#!/dataset/cems-glofas-historical?tab=form. The data is going to be assigned to an ILWISPy raster coverage in the cells below and the dimensions are checked (note that 2020 is a leap year!)
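
A possible way to load the two yearly stacks is sketched below; the grib file names are placeholders for the files in the sample data folder.

```python
# file names are placeholders; use the GloFAS grib files from the sample data folder
rc_2020 = ilwis.RasterCoverage('glofas_buzi_2020.grib')
rc_2021 = ilwis.RasterCoverage('glofas_buzi_2021.grib')

# 2020 is a leap year: expect 366 layers, 2021 should give 365
print(rc_2020.size())
print(rc_2021.size())
```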

Retrieve some meta data information, like raster stack dimensions, some additional information on geometry, projection and map extent, and for 2020 some additional statistics

Check some of the other meta data

Transform the stack (in ILWIS format) to a 1D numpy array (using np.fromiter(iter())) for the two years under consideration, combine the two 1D arrays (using append) and transform back to a 3D array (using reshape)
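
A sketch of these steps, assuming the xsize / ysize attributes of the ILWISPy Size object:

```python
# ILWIS raster stack -> 1D numpy array, one array per year
arr_2020 = np.fromiter(iter(rc_2020), np.float64)
arr_2021 = np.fromiter(iter(rc_2021), np.float64)

# combine the two years and reshape to (time, rows, cols)
sz = rc_2020.size()
arr_3d = np.append(arr_2020, arr_2021).reshape((366 + 365, sz.ysize, sz.xsize))
print(arr_3d.shape)
```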

Create a simple interactive plot of the first layer using Matplotlib imshow
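
A minimal (non-interactive) version of such a plot could look like this:

```python
plt.figure(figsize=(6, 4))
plt.imshow(arr_3d[0], cmap='viridis')
plt.colorbar(label='discharge (m3/s)')
plt.title('GloFAS river discharge - first time step of 2020')
plt.show()
```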

We continue with inspection of the data. Navigate with your mouse cursor over the map in the cell above and check the location X=50 and Y=28; this is the outlet location of the Buzi River!

In the cell below the data for the outlet location is retrieved for 2020
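
A simple way to do this from the numpy stack created above (remember numpy indexes as [time, row, column]):

```python
# outlet of the Buzi river: column X=50, row Y=28
q_2020 = list(arr_3d[0:366, 28, 50])
print(len(q_2020), q_2020[:5])
```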

Repeat the procedure for 2021

The two lists obtained above are now merged into a single list

Create an appropriate time stamp for the X-axis and check the output index list created
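
A pandas date range covering both years is one way to create the time stamps:

```python
# daily time stamps covering the leap year 2020 and the year 2021
dates = pd.date_range(start='2020-01-01', end='2021-12-31', freq='D')
print(len(dates))  # 731 = 366 + 365
```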

Copernicus Climate Data Store ERA5-Land

Hourly temperature of air at 2m above the surface of land, sea or inland waters is retrieved from the ERA5-Land Reanalysis for 20230101, from 00:00 to 23:00 UTC, in grib format, see also: https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-land?tab=overview

Load the data.

Retrieve the other meta data

Display the imported band using Matplotlib imshow

Extract the temperature value for a certain position; the position can be selected using the interactive plot above.
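
A sketch, assuming the 24 hourly layers have been transferred to a numpy array (here called 'arr_t2m', an assumed name) and the position is taken from the plot above:

```python
# xs, ys are placeholder coordinates read from the interactive plot
xs, ys = 30, 20
print('t2m at first hour:', arr_t2m[0, ys, xs])
```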

Create a sub-map

Now some calculations to obtain the mean, min, max and standard deviation of the 24 map layers, using the operation 'aggregaterasterstatistics'. Here the mean is further elaborated upon; you can replace it with the other statistics.
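
A sketch of the aggregation call; 'rc_era5' (assumed name) refers to the imported hourly stack, and the exact statistic keywords depend on your ILWISPy version:

```python
# mean of the 24 hourly layers; replace 'mean' by e.g. 'min' or 'max'
t2m_mean = ilwis.do('aggregaterasterstatistics', rc_era5, 'mean')
print(t2m_mean.size())
```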

Plot water level data from Copernicus Global Land Service

We continue working with data obtained from altimetry measurements over lakes and major rivers. See also: http://land.copernicus.eu/global/products/wl. Two files are provided, one covering Lake Kariba (example for lake) and the other file covers the Jamuna (example for river). The data is provided in json format. You can also inspect the file using a text editor, note the attributes contained in the sections 'type', 'geometry', 'properties' and 'data'.

The example provided below shows how to retrieve and process the data; uncomment the input file for the river or the lake accordingly in the cell below.

Get some idea of what is contained in the data set

Check the feature data types / domain

Read the information provided for column index 1 'basin', etc. Also get an idea of the location, using the information provided by the geometry

All information can be obtained using a loop over the attribute table

Another way to loop over the attribute table is provided in the cell below

Having inspected the meta data, we are going to load the 'data' section as a Pandas dataframe and create a plot of the water level above the water surface reference datum. This is required because the data provided does not fully adhere to the common json format: a single point location is provided together with a number of observations, whereas normally you would expect a corresponding observation for each location (multi-point). As this is not the case here, we open the file once more, but now with Pandas.
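
A sketch of reading the 'data' section with Pandas; the file name is a placeholder and it is assumed the section can be passed to a DataFrame directly:

```python
import json

# placeholder file name: use the lake (Kariba) or river (Jamuna) json file provided
with open('WL_lake_kariba.json') as f:
    wl = json.load(f)

# the 'data' section holds the individual altimetry observations
df = pd.DataFrame(wl['data'])
df.head()
```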

Implement some small adjustments in the active Pandas data frame, e.g. rename some columns and create a new column with only the date
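
A sketch of such adjustments; the column names used here are placeholders, adapt them to what df.head() shows for your file:

```python
# placeholder column names
df = df.rename(columns={'water_surface_height_above_reference_datum': 'water_level',
                        'water_surface_height_uncertainty': 'uncertainty'})

# keep only the date part of the full time stamp
df['date'] = pd.to_datetime(df['datetime']).dt.date
df.head()
```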

Create a final plot of the water level time series, including the uncertainty parameter and subsequently save this plot as a jpg image
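
One way to plot the series with its uncertainty and save it, assuming the column names introduced above:

```python
fig, ax = plt.subplots(figsize=(10, 4))
ax.errorbar(df['date'], df['water_level'], yerr=df['uncertainty'],
            fmt='-o', markersize=3, ecolor='lightgrey', capsize=2)
ax.set_xlabel('date')
ax.set_ylabel('water level above reference datum (m)')
ax.set_title('Water level time series')
fig.savefig('water_level_timeseries.jpg', dpi=150, bbox_inches='tight')
plt.show()
```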

Timesat filtering, extraction of climatology and calculation of anomalies

Data import

Provided is a time series of MODIS NDVI maps (scaled 0-255) in GeoTIFF format, from 'year - day of year' 2000-001 to 2020-353 (21 years), with a 16 day repeat interval: a total of 483 time steps (21 years with 23 NDVI maps per year). The maps are clipped to the Woreda Fagta Lakoma, situated in the zone called Awi Agew in the Amhara region, Ethiopia.

Map showing the Area of Interest (AOI.jpg).

Let's have a look at the values for a specific pixel

Create a simple plot to inspect whether there are anomalies - noise - in the time series extracted for this pixel

As can be observed from the plot generated above, there are quite a number of 'dips', which might represent no data (e.g. due to persistent cloud cover, even during 16 days / multiple overpasses) or other atmospheric issues that mainly result in ndvi values which are lower or higher than expected. This can be corrected for using a temporal filtering procedure. See also: https://www.sciencedirect.com/science/article/pii/S0034425718303985

Timesat Filtering

Satellite-derived NDVI time-series products are usually affected by contamination due to the atmosphere (aerosols and dust) or clouds, which results in a data set containing spikes and dips; it is therefore necessary to reduce this residual noise to improve the quality of the NDVI time-series data.

TimeSat Filter options explanation:

Note that the timesat filter might take some time to process the imported time series

After processing has been completed, let's check the values for the same pixel, but now for the filtered results

Create a plot showing the original and filtered data. First create the date stamps for the x-axis
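
A sketch of building the 483 date stamps and plotting both series; 'ts_raw' and 'ts_filtered' are assumed names for the pixel values extracted before and after filtering:

```python
# 23 composites of 16 days per year, 'year - day of year' 001 to 353, for 2000-2020
stamps = [pd.to_datetime(f'{year}-{doy:03d}', format='%Y-%j')
          for year in range(2000, 2021) for doy in range(1, 369, 16)]
print(len(stamps), stamps[0], stamps[-1])   # 483 time steps

plt.figure(figsize=(12, 4))
plt.plot(stamps, ts_raw, alpha=0.5, label='original')
plt.plot(stamps, ts_filtered, label='Timesat filtered')
plt.ylabel('NDVI (scaled 0-255)')
plt.legend()
plt.show()
```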

Derive Climatology

To get an idea whether a certain observation is in line with the average prevailing conditions, you can compare the observation with the longer term average, or climatology. Now that we have a filtered time series with the noise removed, we can calculate, per time step, the average over the years of observation. The time series used here covers 21 years with 23 time steps per year, so the result should be a map stack of 23 layers containing the average (over the years under consideration) for each specific time step.

In the cell below a definition is created which subsequently is executed in the cell further below. The time series is assumed to have the same number of time steps for each year under consideration. Here calendar years 2001 - 2020 are used.

Retrieve the layers to calculate the short term climatology: 20 years of data = 20 years * 23 layers = 460 layers, here starting from the second year (2001) and covering the next 20 years
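
A sketch of the band selection, assuming the ILWIS 'selection' operation and the filtered stack in a variable called 'rc_filtered':

```python
# second year (2001) starts at band index 23; 20 years * 23 layers = 460 bands -> 23..482
ts_sub = ilwis.do('selection', rc_filtered, 'rasterbands(23..482)')
print(ts_sub.size())
```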

Check map stack dimensions

In the cell below a definition is created to calculate the climatology. Review the inline comments given below and execute the definition
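
For reference, a numpy-based sketch of the same idea (not necessarily identical to the definition used in the cell below):

```python
def climatology(arr_stack, steps_per_year=23):
    """Average each time step of the year over all years.

    arr_stack: numpy array shaped (years * steps_per_year, rows, cols)
    returns:   numpy array shaped (steps_per_year, rows, cols)
    """
    n_years = arr_stack.shape[0] // steps_per_year
    # reshape to (years, steps, rows, cols) and average over the year axis
    per_year = arr_stack.reshape(n_years, steps_per_year,
                                 arr_stack.shape[1], arr_stack.shape[2])
    return per_year.mean(axis=0)

# clim = climatology(arr_460)   # arr_460: the filtered 460-layer stack as numpy array
```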

Get the map stack details on the climatology computed

Select climatology for a given location for all time steps

Extract the year 2013 (below average) to compare with the climatology for the selected location: 13 years x 23 time steps per year = 299; index / counting starts from 0, so the first time step = 298 and the last time step is 320, hence for the parameter 'rasterbands(b_start..b_end)' the index numbers are 'rasterbands(298..320)'

Extract the year 2018 (above average) to compare with the climatology for the selected location: 18 years x 23 time steps per year = 414; index / counting starts from 0, so the first time step = 413 and the last time step is 435, hence for the parameter 'rasterbands(b_start..b_end)' the index numbers are 'rasterbands(413..435)'

Convert the climatology, as a list, to an ILWIS Map List. Define the number of layers and read the full list as a 1D array. Prepare the ILWIS destination Raster Coverage (x, y and z dimensions), reading the georeference and data type from a known source. Finally create a 3D array. Check the output printed.
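
A heavily hedged sketch of these steps; the method names setSize, setGeoReference and array2raster are assumptions based on common ILWISPy usage and should be checked against your ILWISPy version:

```python
# climatology list -> 1D array with z * y * x values
clim_flat = np.asarray(clim, dtype=np.float64).ravel()

# prepare the destination raster coverage (assumed ILWISPy method names)
rc_clim = ilwis.RasterCoverage()
rc_clim.setSize(ilwis.Size(int(sz.xsize), int(sz.ysize), 23))   # x, y and z dimensions
rc_clim.setGeoReference(rc_filtered.geoReference())             # georeference from a known source
rc_clim.array2raster(clim_flat)                                 # push the values into the stack
print(rc_clim.size())
```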

Store the results obtained as an ILWIS MapList, then visualize the anomaly map stack created using the animation functions in ILWIS 386. Use 'anom.rpr' as Representation

Compute anomalies

Anomalies are the deviation between the NDVI climatology (here the short term average) calculated and the actual values observed for the same time step for a given year. The year 2013 is used.
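
In numpy terms the anomaly computation reduces to a per-layer subtraction; 'arr_2013' and 'clim' are assumed names for the 2013 layers and the climatology:

```python
# anomaly = observed value - climatology, per 16-day time step
anom_2013 = arr_2013 - clim
print(anom_2013.shape, np.nanmin(anom_2013), np.nanmax(anom_2013))
```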

Store your results and create an animation using ILWIS 386 to get an impression of the spatial distribution of the anomalies obtained.

Time Series change detection

This notebook ends with an example of the use of multi-temporal satellite image assessment for change detection. As an example the Grand Ethiopian Renaissance Dam (see: https://en.wikipedia.org/wiki/Grand_Ethiopian_Renaissance_Dam) is used. To assess the changes, images as recorded by Sentinel-1 are used. For information on the data, review: https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S1_GRD. The data, retrieved from Google Earth Engine, was pre-processed. Some of the details:

The 3 images are already provided in an ILWIS file format, their data type is value and the backscatter unit is dB

Get yourself a bit familiar with the image statistics; note the unit is in dB, check the minimum and maximum values of each of the images
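
A quick numpy-based check of the value range per image; the variable names are placeholders for the three loaded Sentinel-1 rasters:

```python
# min/max of the backscatter (dB) per image
for name, rc in [('image 1', s1_img1), ('image 2', s1_img2), ('image 3', s1_img3)]:
    arr = np.fromiter(iter(rc), np.float64)
    print(name, 'min:', np.nanmin(arr), 'max:', np.nanmax(arr))
```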

From the histogram plot provided below, give some special attention to the negative value range. Which values represent water, and could you think of a value providing the upper water limit?

Some more calculation examples using time series

Aridity Index

Time series analysis requires calculations to be performed on map lists / map stacks. To demonstrate the map list calculation functionality an aridity index (AI) is going to be calculated, to obtain a numerical indication of the degree of dryness of the climate at a given location. Here the AI as proposed by De Martonne is used. The 'De Martonne' Aridity Index (IndexDM) defines aridity as the ratio of precipitation to mean temperature according to the equation below and classifies the result into different climate types:

$$Index_{DM} = \frac{P}{T + 10}$$

Where: P is the total annual precipitation (mm) and T is the mean annual temperature (°C).

Climatologies of the two parameters are provided, first load these as Raster Coverages

To calculate the De Martonne Index (DM), first the sum of the precipitation has to be obtained as well as the average temperature
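
A sketch of these steps; 'rc_precip' and 'rc_temp' are assumed names for the two climatology stacks, and the mapcalc expression syntax is assumed from common ILWISPy usage:

```python
p_sum = ilwis.do('aggregaterasterstatistics', rc_precip, 'sum')   # annual precipitation (mm)
t_avg = ilwis.do('aggregaterasterstatistics', rc_temp, 'mean')    # mean annual temperature (degC)

# De Martonne: IndexDM = P / (T + 10)
dm_index = ilwis.do('mapcalc', '@1 / (@2 + 10)', p_sum, t_avg)
print(dm_index.size())
```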

Visualize the input data

Now we still need to classify the data according to the Aridity Index as proposed by De Martonne; the classes are given below.

| Climate type | Aridity Index | Code |
|---|---|---|
| Arid | 0 - 10 | 1 |
| Semi-arid | 10 - 20 | 2 |
| Mediterranean | 20 - 24 | 3 |
| Semi-humid | 24 - 28 | 4 |
| Humid | 28 - 35 | 5 |
| Very humid | 35 - 55 | 6 |
| Extremely humid | > 55 | 7 |

Start the De Martonne Index calculation using the statistics from the time series obtained.

To classify the index we are going to create a domain and slice the 'continuous' data range of 'dm_index' into a number of discrete classes, specified according to the table above.

From the operation MetaData it can be seen that a domain (specifying the class names and data ranges) is required to slice the full data range. In the cell below an interval domain is created according to the class assignment proposed by De Martonne.

Conduct the slicing operation; to check the results obtained in ILWIS 386, uncomment the last line in the cell below

The same can also be done using the map calculation operator; note the equation provided below and execute the cell

Precipitation Concentration Index

As can be observed above, there is a large variability in precipitation in time and space. An index is also available to evaluate the rainfall distribution and rain concentration. Here the Precipitation Concentration Index (PCI) method according to Michiels, Gabriels and Hartmann (1992) is going to be applied. In this index, the higher the PCI, the more irregular and greater the precipitation variability. To estimate this variability the input required is the monthly precipitation for a given year in mm/month; a map list with the twelve monthly precipitation layers of a year is required to execute the calculations. The index applied firstly determines the coefficient of variation (CV):

$$CV = \frac{\sigma}{\bar{P}}$$

Where: $\sigma$ is the standard deviation of the monthly precipitation and $\bar{P}$ is the mean monthly precipitation.

Subsequently the Precipitation Concentration Index (PCI) is related to the coefficient of variation (CV) using the following equation:

$$PCI = \left(CV^2 + 1\right)\cdot\frac{100}{12}$$

Finally the classification as provided in the table below is applied to characterize the PCI.

| PCI temporal concentration | PCI | Index code |
|---|---|---|
| Uniform | < 10 | 1 |
| Moderately concentrated | 11 - 15 | 2 |
| Concentrated | 16 - 20 | 3 |
| Strongly concentrated | > 20 | 4 |

Use the precipitation time series: first calculate the sum and average, then conduct the other required calculations to derive the PCI. Note the combined use of time series data (containing multiple time steps) with a single layer map.
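
A sketch of the PCI steps, assuming a 12-layer monthly precipitation stack called 'rc_monthly' and the same operation syntax as used for the aridity index; the standard-deviation keyword of 'aggregaterasterstatistics' may differ per ILWISPy version:

```python
p_sum = ilwis.do('aggregaterasterstatistics', rc_monthly, 'sum')
p_avg = ilwis.do('aggregaterasterstatistics', rc_monthly, 'mean')
p_std = ilwis.do('aggregaterasterstatistics', rc_monthly, 'standarddev')

# CV = std / mean, then PCI = (CV^2 + 1) * 100 / 12
cv = ilwis.do('mapcalc', '@1 / @2', p_std, p_avg)
pci = ilwis.do('mapcalc', '((@1 * @1) + 1) * 100.0 / 12.0', cv)
print(pci.size())
```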

Conduct the slicing operation; to check the results obtained in ILWIS 386, uncomment the last line in the cell below

The same can also be done using the map calculation operator; note the equation provided below and execute the cell

Visualize your final results obtained

Concluding remark:

The combined use of ILWISPy with other Python site-packages provides a powerful toolbox to handle and analyse time series data. A number of examples are provided within this notebook. To make the notebook more attractive, Matplotlib has been used for visualization of selected graphics and maps. For (time series) visualization it is recommended to use the existing (animation) visualization capability as offered by ILWIS 386.