Spatial correlation
Functionality
Point statistics can help you get an impression of the nature of your point data, for instance prior to a point interpolation, and find the necessary input parameters for Kriging, Anisotropic Kriging or Universal Kriging. First, the distance and optionally the direction are calculated between the points of all possible point pairs; these distances and directions are also known as separation vectors. Subsequently, autocorrelation, autovariance and experimental semi-variogram values are calculated from the values of those point pairs that fall within the same user-specified lag, i.e. the same distance (and direction) class.
Spatial autocorrelation measures dependence among nearby values in a spatial distribution. Variables may be correlated because they are affected by similar processes, or phenomena, that extend over a larger region. Odland (1988, p.7) mentions that spatial autocorrelation 'exists whenever a variable exhibits a regular pattern over space in which values at a certain set of locations depend on values of the same variable at other locations'. For example, if the concentration of a certain pollutant is very high at a certain location, it will most likely also be high in the direct surroundings. In other words, the concentration is autocorrelated at small distances. At larger distances, it is less likely that the concentration will be equally high. The correlation will probably be lower, and the variance higher.
By plotting the autocorrelation values against the distance classes, you can see up to which distance spatial autocorrelation exists between point pairs. This distance can be used as the limiting distance in point interpolations such as moving average and moving surface. Furthermore, you are encouraged to compare your data set with a data set consisting of the same point locations and a set of attribute values in approximately the same range as the measured variable, but created at random (using one of the RND functions in Table Calculation). If the graphs for the measured data and the random data are very similar, no spatial autocorrelation exists between the data points, and point interpolation is not useful. For more information, see Interpretation of Moran's I and Geary's c below.
The semi-variogram is a basic geostatistical measure of the rate of change of a regionalized variable along a specific orientation (usually distances). An experimental semi-variogram value is defined as the sum of the squared differences between the values of point pairs separated by a certain distance, divided by two times the number of point pairs in that distance class. By plotting experimental semi-variogram values against distance classes in a graph, you obtain a semi-variogram. By finding a model or function which fits these experimental semi-variogram values, you can obtain the necessary input information (such as model type, sill, range, and nugget) for a Simple or Ordinary Kriging, an Anisotropic Kriging or a Universal Kriging operation later on. For more information, see the Additional info on semi-variograms below.
Tip:
When you suspect anisotropy in your input data, you can first perform the Variogram Surface operation. The output of this operation will show you the direction of the anisotropy. You can then do Spatial Correlation using the bi-directional method.
General process of this operation (omni directional):
- First, the distances between all points are calculated.
- Then, distance classes are determined (output column Distance). This is usually done according to the user-specified lag spacing: in the output table, records will appear for each multiple of the user-specified lag spacing. When specifying a lag spacing of 500 m., the values in the Distance column in the output table will be 0, 500, 1000, 1500, etc.
However, these values in the output column Distance represent the middle value of a distance class, thus for lag spacing 500, distance 500 represents the distance interval of 250-750m, distance 1000 represents the distance interval of 750-1250m, etc.
When a variable was sampled at regular distances, you can use this distance for the lag spacing.
- Subsequently, for each distance class, the number of point pairs is counted whose points lie at such a distance from each other.
Thus, when the user-specified lag spacing is 500 m.:
- the first record in the output table has value 0 in column Distance; this first distance class is only half a distance class: it contains all point pairs of which the distance of the points towards each other is 0-250 m.;
- the second record in the output table has value 500 in column Distance; this distance class contains all point pairs of which the distance of the points towards each other is between 250-750 m.;
- the third record in the output table has value 1000 in column Distance; this distance class contains all point pairs of which the distance of the points towards each other is between 750-1250 m. etc.;
On the command line, you can also use a certain expression to obtain log-scaled distance classes.
- Then, for all the point pairs within a certain distance class, the following statistical values are calculated:
- spatial autocorrelation (as Moran's I)
- spatial variance (as Geary's c)
- semi-variogram values.
The formula to calculate experimental semi-variogram values reads:
$$\gamma(h) = \frac{\sum (z_i - z_{i+h})^2}{2n}$$
where:
- $\gamma(h)$: experimental semi-variogram value of points that have a certain distance (h) towards each other
- $z_i$: the value of point i
- $z_{i+h}$: the value of a point at distance h from point i
- $\sum (z_i - z_{i+h})^2$: the sum of the squared differences between point values of all point pairs within a certain distance class
- $n$: the number of point pairs within a distance class
For more information on formulas, see Spatial correlation : algorithm.
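The omni directional steps above can be sketched in a few lines of Python. This is only an illustration of the binning and of the formula, not the code used by the operation; the function name `experimental_semivariogram`, the input layout (a list of x, y, value tuples) and the handling of distances that fall exactly on a class boundary are assumptions.

```python
import math
from collections import defaultdict


def experimental_semivariogram(points, lag):
    """Omni directional sketch.

    points: list of (x, y, value) tuples (hypothetical input layout).
    lag:    lag spacing in map units.
    Returns {class_middle: (nr_pairs, avg_lag, semivar)}.
    """
    # Per class: [number of pairs, sum of distances, sum of squared differences]
    sums = defaultdict(lambda: [0, 0.0, 0.0])
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            d = math.hypot(xj - xi, yj - yi)
            # Nearest multiple of the lag spacing: class 0 covers 0 to lag/2,
            # class 1 covers lag/2 to 3*lag/2, etc. A distance exactly on a
            # boundary goes to the higher class here (an assumption).
            k = int(d / lag + 0.5)
            acc = sums[k * lag]
            acc[0] += 1
            acc[1] += d
            acc[2] += (zi - zj) ** 2
    return {
        mid: (n, sum_d / n, sum_sq / (2 * n))
        for mid, (n, sum_d, sum_sq) in sorted(sums.items())
    }


result = experimental_semivariogram(
    [(0, 0, 1.0), (100, 0, 2.0), (500, 0, 4.0)], 500
)
for middle, (nr_pairs, avg_lag, semivar) in result.items():
    print(middle, nr_pairs, avg_lag, semivar)
```

The three values per class correspond to what the output columns NrPairs, AvgLag and SemiVar describe; classes without point pairs simply do not appear in the returned dictionary.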
Methods:
In the dialog box, you can choose to use either the omni directional or the bi-directional method:
- The omni directional method simply determines all distances between all point pairs, regardless of any direction, i.e. in all directions. Thus, all point pairs that have a certain distance towards each other will be counted in a certain distance class.
Then, Moran's I, Geary's c, and experimental semi-variogram values are calculated for all point pairs within each distance class.
- The bi-directional method first counts, just like the omni directional method, all pairs of points that have a certain distance to each other, and then calculates the Moran's I and Geary's c for these point pairs within each distance class. Furthermore, all point pairs are counted with a certain distance to each other and with a certain direction towards each other. For the point pairs in a certain distance class and in the correct direction, experimental semi-variogram values will be calculated. Then, also, for the direction perpendicular to the specified direction, point pairs are counted and experimental semi-variogram values calculated.
For both the omni directional and the bi-directional method, linear distance intervals are created where the middle values of these distance classes are multiples of the user-specified lag spacing.
To calculate experimental semi-variogram values in a certain direction, you thus have to use the bi-directional method. The parameters for the bi-directional method are schematically presented in Figure 1.
- The direction angle is measured clockwise from the Y-axis and defines the direction in which points should be located relative to each other. When you use a direction angle of 90°, it means that only point pairs for which the points are located in West-East or in East-West direction will be considered (i.e. +90° clockwise from the Y-axis).
When choosing the bi-directional method, semi-variogram values will always be calculated for the point pairs in the specified direction and the perpendicular direction.
- The tolerance angle is a parameter with which you can limit the number of point pairs.
When a tolerance of 45° is used, all point pairs in the map will contribute to calculated semi-variogram values.
When using a tolerance of 10°, the direction between the 2 points of a pair may differ by at most 10° to either side of the specified direction (90°). So, in fact, all point pairs whose points lie at an angle of 80° to 100° to one another are valid pairs. Then, for the valid point pairs, the distance class to which they belong is determined.
- Optionally, you can specify a third parameter, the band width (m), to limit the tolerance angle to a certain width.

Fig. 1: Schematic explanation of the bi-directional method when experimental semi-variogram values are calculated for the specified direction as well as for the perpendicular direction. The user has to specify a Direction (blue angle) and a Tolerance (red angle), and optionally, also a Band width (green distance in meters) can be specified. These parameters are used to find valid point pairs. When an input point is located at the origin of this picture, it is calculated whether any other input point is within the specified direction, tolerance angle and band width. If this is the case, the 2 points are a valid point pair; otherwise the pair is ignored. For each valid point pair, the distance between the 2 points is calculated, and the point pair is counted in the appropriate distance class.
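The test for a valid point pair can be sketched as follows. This is a hedged illustration only: the function name `is_valid_pair` is hypothetical, and the band width is assumed here to be the maximum perpendicular offset of the second point from the line through the first point in the specified direction, which may differ from how the operation applies it.

```python
import math


def is_valid_pair(p, q, direction_deg, tolerance_deg, band_width=None):
    """Return True when points p and q (x, y tuples) form a valid pair.

    direction_deg is measured clockwise from the Y-axis. Pairs are treated
    as undirected: West-East and East-West both match a direction of 90.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        return False  # coinciding points have no direction
    # Azimuth of the pair, clockwise from the Y-axis.
    azimuth = math.degrees(math.atan2(dx, dy))
    # Smallest angle (0..90) between the pair and the direction line.
    diff = abs(azimuth - direction_deg) % 180.0
    diff = min(diff, 180.0 - diff)
    if diff > tolerance_deg:
        return False
    if band_width is not None:
        # Perpendicular offset from the direction line (assumed meaning).
        offset = dist * math.sin(math.radians(diff))
        if offset > band_width:
            return False
    return True


print(is_valid_pair((0, 0), (100, 0), 90, 10))    # West-East pair
print(is_valid_pair((0, 0), (0, 100), 90, 10))    # South-North pair
print(is_valid_pair((0, 0), (0, 100), 180, 10))   # perpendicular direction
```

To find the pairs for the perpendicular direction, the same test can be called with `direction_deg + 90`.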
Finally, from the command line, you can even use another method by which logarithmic distance intervals are used. The lag spacing increases with the distance.
Spherical distance:
Optionally, when using the 'omni directional' method, you can choose to calculate with spherical distances, i.e. distances calculated over the sphere using the projection that is specified in the coordinate system used by the input point map. It is advised to use this spherical distance option for maps that comprise large areas (countries or regions) and for maps that use LatLon coordinates. In more general terms, spherical distance should be used when there are 'large' scale differences within a map as a consequence of projecting the globe-shaped earth surface onto a plane.
When the spherical distance option is not used, distances will be calculated in a plane as Euclidean distances.
Tip: When you used the spherical distance option in the Spatial Correlation operation, you should also use the spherical distance option in a subsequent point interpolation operation, or in a subsequent Kriging operation.
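To illustrate why the choice matters for LatLon data, the sketch below computes a great-circle distance with the haversine formula. It is not the calculation performed by the operation; the function name `spherical_distance` and the mean Earth radius of 6371 km are assumptions made for this example.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius in meters


def spherical_distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two LatLon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# One degree of longitude at the equator and at 60 degrees latitude.
print(spherical_distance(0, 0, 0, 1))
print(spherical_distance(60, 0, 60, 1))
```

One degree of longitude at 60° latitude is only about half as long as at the equator, whereas a Euclidean distance computed directly on LatLon coordinates would treat both as equal.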
Input map requirements:
The input point map should either be a value map itself, or a Class or ID point map which has a linked attribute table with one or more value columns.
Output table:
An output table with domain None is created.
When you use the option Omni directional, the output table will contain 6 columns:
- Column Distance lists the middle values of the distance intervals;
- Column NrPairs lists for each distance interval, the number of point pairs found at these distances towards each other;
- Column I lists for each distance interval, the spatial autocorrelation of the point pairs in this distance interval;
- Column c lists for each distance interval, a statistic for spatial variance of the point pairs in this distance interval;
- Column AvgLag lists for each distance interval, the average distance between points of point pairs in this distance interval;
- Column Semivar lists for each distance interval, the experimental semi-variogram value of the point pairs in this distance interval.
When you use the option Bi-directional, the output table will contain 10 columns:
- Column Distance lists the middle values of the distance intervals;
- Column NrPairs lists for each distance interval, the number of point pairs found at these distances towards each other;
- Column I lists for each distance interval, the spatial autocorrelation of the point pairs in this distance interval;
- Column c lists for each distance interval, a statistic for spatial variance of the point pairs in this distance interval;
- Column AvgLag1 lists for each distance interval, the average distance between points of point pairs in this distance interval;
- Column NrPairs1 lists for each distance interval, the number of point pairs found in the user-specified direction and at these distances towards each other;
- Column Semivar1 lists for each distance interval, the experimental semi-variogram value of the point pairs found in the user-specified direction and at these distances towards each other;
- Column AvgLag2 lists for each distance interval, the average distance between points of point pairs in this distance interval;
- Column NrPairs2 lists for each distance interval, the number of point pairs found perpendicular to the user-specified direction and at these distances towards each other;
- Column Semivar2 lists for each distance interval, the experimental semi-variogram value of the point pairs found perpendicular to the user-specified direction and at these distances towards each other.
Mind:
When no point pairs are found in a distance interval, the values in columns I, c, AvgLag and SemiVar will be undefined for that distance interval.
From the results of the Spatial correlation operation, you can make a semi-variogram. In the semi-variogram, the discrete experimental semi-variogram values that are the outcome of Spatial correlation will be modeled by a continuous function so that a semi-variogram value γ will be available for any desired distance h (and optionally direction) for a Simple or Ordinary Kriging, an Anisotropic Kriging or a Universal Kriging operation later on.
How to display a semi-variogram:
Display the input table of the Spatial correlation operation or a histogram of an input value map in a table window.
- Determine the variance (s²) of your input variable. The variance of a column can be calculated by using an expression like OUT = var(columnname)
Display the output table of the Spatial correlation operation in a table window. Inspect the following columns in the output table:
- columns Distance, and AvgLag/AvgLag1/AvgLag2: usually not more than half of the total sampled distance should be taken into account; the larger the distance between the point pairs, the fewer point pairs there are, and the less reliable the outcome;
- columns NrPairs/NrPairs1/NrPairs2: for reliable semi-variogram values, distance classes should at least contain 30 point pairs.
Create point graphs, i.e. experimental semi-variogram(s), from the Distance and SemiVar columns in the output table of Spatial Correlation.
- From the File menu in the table window, choose the Create Graph command.
- In the Create Graph dialog box, when you used the omni directional method:
- choose for the X-axis, the Distance column or the AvgLag column;
- choose for the Y-axis, the SemiVar column.
When you used the bi-directional method, you can draw two graphs (e.g. in two graph windows):
- Distance or AvgLag1 against SemiVar1 (semi-variogram values in the specified direction), and
- Distance or AvgLag2 against SemiVar2 (semi-variogram values in the perpendicular direction).
- The experimental semi-variogram values will automatically be displayed as point graphs.
- You may wish to adapt the boundaries of the X-axis from 0 to more or less half of the total distance between the samples, and the Y-axis from 0 to more or less the expected variance (s²) of your input sample values.
In literature, the shown graph is called a discrete experimental semi-variogram.
Figure 2 below shows a semi-variogram depicting a spherical model:
- When the distance between sample points is 0, the difference between sampled values is also expected to be 0. Thus, the semi-variogram value at distance 0 equals 0, i.e. γ(0) = 0.
- Samples that are at a very small distance to each other are expected to have almost the same values; thus, the squared differences between sample values are expected to be small positive values at small distances.
- With increasing distance between point pairs, the expected squared differences between point values will also increase.
- At some distance the points that are compared are so far apart that they are no longer related to each other, i.e. the sample values become independent of one another. Then, half the average squared difference of the point values, i.e. the semi-variogram value, becomes equal in magnitude to the variance of the variable. The semi-variogram no longer increases and develops a flat region, called the sill. The distance at which the semi-variogram approaches the variance is referred to as the range or the span of the variable.

Fig. 2: A semi-variogram depicting a spherical model.
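The shape described above can be written down with the commonly used form of the spherical model. The sketch below is an illustration under stated assumptions: the function name `spherical_model` is hypothetical, and the sill is treated as the total sill (nugget included), which may differ from how the semi-variogram model dialog box interprets these parameters.

```python
def spherical_model(h, nugget, sill, range_):
    """Semi-variogram value of a spherical model at distance h.

    nugget: jump at very small distances.
    sill:   level of the flat region (assumed to include the nugget).
    range_: distance at which the sill is reached.
    """
    if h <= 0:
        return 0.0          # semi-variogram value at distance 0 equals 0
    if h >= range_:
        return sill         # flat region beyond the range
    r = h / range_
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


for h in (0, 250, 500, 1000, 2000):
    print(h, spherical_model(h, nugget=1.0, sill=10.0, range_=1000.0))
```

Evaluating such a function at the Distance or AvgLag values of the output table gives model values that can be compared with the experimental semi-variogram values.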
Remarks on semi-variograms:
- A semi-variogram with a nugget effect is a semi-variogram that goes from 0 to the level of the nugget effect in a distance less than the sampling distance. The semi-variogram model shows the semi-variogram value 0 at distance 0 and a discontinuity (jump) to a semi-variogram value at an extremely small distance. A nugget effect indicates that the variable is erratic over very short distances, i.e. highly variable over distances less than the specified lag spacing or the sampling interval.
- For semi-variograms which, after a flat sill level, show an ongoing increase in semi-variogram values, probably a trend has to be taken into account for the longer distances. However, the experimental semi-variogram values for the distances up to the sill, are probably accurate enough to be used in a model.
- Possible dips in the semi-variogram indicate that at certain distances between points there is less difference between the samples than at other distances; this might indicate periodic trends.
The next step, before Kriging, is to model the discrete values of your experimental semi-variogram by a continuous function which will give an expected value for any desired distance.
- From the Edit menu in the graph window, choose the Add Graph Semi-variogram Model command.
- In the Add Graph Semi-variogram Model dialog box, you can choose a type of semi-variogram model (spherical, exponential, etc.), and you can fill out values for the sill, range and nugget. A line will then be drawn according to the model you selected and the values you selected for sill, range and nugget.
- You are advised to visually experiment a little with models and sill, range, and nugget values to find the best line through your experimental semi-variogram values. You can edit a semi-variogram model by double-clicking it in the Graph Management pane. For more information, refer to the Graph Options - Semi-variogram Model dialog box.
- When finished, you can save the graph by choosing the Save or the Save As commands from the File menu of the graph window.
To find which semi-variogram model fits your experimental semi-variogram values best, you can also use the Column SemiVariogram operation. This operation calculates semi-variogram values according to a user-specified semi-variogram model and parameters and stores calculated semi-variogram values in an output column.
Once you have decided which semi-variogram model, and which values for sill, range and nugget fit your data best, you can continue with the Simple or Ordinary Kriging operation, the Anisotropic Kriging operation or the Universal Kriging operation.
For all point pairs in a distance/direction class, you obtain a value for Moran's I and Geary's c; the formulae for these statistical measures can be found in the topic Spatial correlation : algorithm. Geary's c compares the squared differences of point pair values to the mean of all values. Moran's I relates the product of differences of point pair values to the overall difference.
The general interpretation of both statistics can be summarized as:
| Geary's c | Interpretation | Moran's I |
|---|---|---|
| 0 < c < 1 | Strong positive autocorrelation | I > 0 |
| c > 1 | Strong negative autocorrelation | I < 0 |
| c = 1 | Random distribution of values | I = 0 |
Geary's c multiplied by the variance of the input equals the semi-variogram values.
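The relation between the three statistics can be checked with a small sketch. It uses the standard textbook definitions of Moran's I and Geary's c with weight 1 for every point pair in the distance class and 0 otherwise; this weighting, the function name `moran_geary` and the use of the sample variance (divisor N - 1) are assumptions, so consult Spatial correlation : algorithm for the formulas the operation actually applies.

```python
def moran_geary(values, pairs):
    """Moran's I, Geary's c and semi-variogram value for one distance class.

    values: values of all input points.
    pairs:  (i, j) index pairs that fall in the distance class.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    sum_sq_dev = sum(d * d for d in dev)      # sum of squared deviations
    m = len(pairs)                            # number of point pairs
    cross = sum(dev[i] * dev[j] for i, j in pairs)
    sq_diff = sum((values[i] - values[j]) ** 2 for i, j in pairs)
    moran_i = (n / m) * cross / sum_sq_dev
    geary_c = (n - 1) * sq_diff / (2 * m * sum_sq_dev)
    semivar = sq_diff / (2 * m)
    return moran_i, geary_c, semivar


values = [1.0, 2.0, 3.0, 4.0]                 # four points on a line
pairs = [(0, 1), (1, 2), (2, 3)]              # neighbouring points only
moran_i, geary_c, semivar = moran_geary(values, pairs)
sample_variance = sum((v - 2.5) ** 2 for v in values) / (len(values) - 1)
print(moran_i, geary_c, semivar, geary_c * sample_variance)
```

For this steadily increasing series, Moran's I is positive and Geary's c is below 1, and Geary's c multiplied by the sample variance reproduces the semi-variogram value.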
References:
- Berry, J. K. 1999. Beyond Mapping: extending spatial dependency to maps. Geoworld 12(1): 26-27.
- Clark, I. 1979. Practical geostatistics. Applied Science Publishers, London. 129 pp.
- Davis, J. C. 1973. Statistics and data analysis in geology. Wiley, New York. 646 pp.
- Isaaks, E. H., and R. M. Srivastava. 1989. An introduction to applied geostatistics. Oxford University Press, New York. 561 pp.
- Odland, J. 1988. Spatial autocorrelation. In: G.I. Thrall (Ed.), Sage University Scientific Geography Series no. 9. Sage Publications, Beverly Hills. 87 pp.
See also:
Point statistics
Variogram surface : functionality
Spatial correlation : dialog box
Spatial correlation : command line
Spatial correlation : algorithm
Graph window : Add semi-variogram model (dialog box)
Graph options - Semi-variogram Model (dialog box)
Point interpolation
Kriging : functionality
Anisotropic Kriging : functionality
Universal Kriging : functionality