Many of the array access efficiency techniques described in this section are applied automatically by the Visual Fortran loop transformation optimizations (set at /optimize:5).
Several aspects of array access can improve run-time performance:
Rather than use explicit loops for array access, use elemental array operations, such as the following line that increments all elements of array variable A:
A = A + 1.
When reading or writing an array, use the array name rather than a DO loop or an implied-DO loop that specifies each element number. Fortran 90 array syntax allows you to reference a whole array by using its name in an expression. For example:
REAL :: A(100,100)
A = 0.0
A = A + 1. ! Increment all elements of A by 1
.
.
.
WRITE (8) A ! Fast whole array use
Similarly, you can use derived-type array structure components, such as:
TYPE X
  INTEGER A(5)
END TYPE X
.
.
.
TYPE (X) Z
WRITE (8) Z%A ! Fast array structure component use
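As a minimal sketch of the two recommendations above, the following free-form program compares an explicit loop nest with the equivalent elemental operation, then writes and rereads the result with whole-array I/O statements. The program name, the second array B, and the scratch file opened on unit 8 are assumptions made for illustration only.

```fortran
PROGRAM WHOLE_ARRAY_DEMO
  IMPLICIT NONE
  REAL :: A(100,100), B(100,100)
  INTEGER :: I, J

  A = 0.0
  B = 0.0

  ! Explicit loops: every element is named individually
  DO J = 1, 100
    DO I = 1, 100
      A(I,J) = A(I,J) + 1.0
    END DO
  END DO

  ! Equivalent elemental operation on the whole array
  B = B + 1.0
  IF (ANY(A /= B)) STOP 'mismatch'

  ! Whole-array unformatted I/O: one statement, one record
  OPEN (UNIT=8, STATUS='SCRATCH', FORM='UNFORMATTED')
  WRITE (8) B
  REWIND (8)
  READ (8) A
  CLOSE (8)

  PRINT *, 'Arrays agree after round trip: ', ALL(A == B)
END PROGRAM WHOLE_ARRAY_DEMO
```

The implied-DO form that the text advises against would be WRITE (8) ((B(I,J), I=1,100), J=1,100), which transfers the same data but names each element.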
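Traverse multidimensional arrays in natural column-major order, where the leftmost subscript varies most rapidly. Avoid row-major order, as is done by C, where the rightmost subscript varies most rapidly.
For example, consider the nested DO loops that access a two-dimensional array with the J loop as the innermost loop: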
INTEGER X(3,5), Y(3,5), I, J
Y = 0
DO I=1,3                  ! I outer loop varies slowest
  DO J=1,5                ! J inner loop varies fastest
    X(I,J) = Y(I,J) + 1   ! Inefficient row-major access order
  END DO                  ! (rightmost subscript varies fastest)
END DO
.
.
.
END PROGRAM
Because J varies the fastest and is the second array subscript in the expression X(I,J), the array is accessed in row-major order.
To access the array in natural column-major order, examine the array algorithm and the data being modified.
Arrays X and Y can be accessed in natural column-major order by changing the nesting order of the DO loops so that the innermost loop variable corresponds to the leftmost array dimension:
INTEGER X(3,5), Y(3,5), I, J
Y = 0
DO J=1,5                  ! J outer loop varies slowest
  DO I=1,3                ! I inner loop varies fastest
    X(I,J) = Y(I,J) + 1   ! Efficient column-major storage order
  END DO                  ! (leftmost subscript varies fastest)
END DO
.
.
.
END PROGRAM
Fortran whole array access (X = Y + 1) uses efficient column-major order. However, if the application requires that J vary the fastest or if you cannot modify the loop order without changing the results, consider modifying the application program to use a rearranged order of array dimensions. Program modifications include rearranging the order of the dimensions in the declarations of X and Y (here X(5,3) and Y(5,3)) and the order of the subscripts in every reference to those arrays (here X(J,I) and Y(J,I)).
In this case, the original DO loop nesting is used where J is the innermost loop:
INTEGER X(5,3), Y(5,3), I, J
Y = 0
DO I=1,3                  ! I outer loop varies slowest
  DO J=1,5                ! J inner loop varies fastest
    X(J,I) = Y(J,I) + 1   ! Efficient column-major storage order
  END DO                  ! (leftmost subscript varies fastest)
END DO
.
.
.
END PROGRAM
Code written to access multidimensional arrays in row-major order (like C) or random order can often make inefficient use of the CPU memory cache. For more information on using natural storage order during record I/O operations, see Write Array Data in the Natural Storage Order.
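To observe the cache effect on a particular system, the two loop orders can be timed side by side. The following is a rough sketch, not a rigorous benchmark: the array size N, the program name, and the use of the SYSTEM_CLOCK intrinsic for timing are assumptions chosen for illustration, and the clock rate may be reported as zero on a system without a clock.

```fortran
PROGRAM LOOP_ORDER_TIMING
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 2000          ! assumed size, large enough to exceed cache
  INTEGER, ALLOCATABLE :: X(:,:), Y(:,:)
  INTEGER :: I, J, T0, T1, T2, RATE

  ALLOCATE (X(N,N), Y(N,N))
  Y = 0

  CALL SYSTEM_CLOCK (T0, RATE)

  ! Row-major access: rightmost subscript varies fastest
  DO I = 1, N
    DO J = 1, N
      X(I,J) = Y(I,J) + 1
    END DO
  END DO

  CALL SYSTEM_CLOCK (T1)

  ! Column-major access: leftmost subscript varies fastest
  DO J = 1, N
    DO I = 1, N
      X(I,J) = Y(I,J) + 1
    END DO
  END DO

  CALL SYSTEM_CLOCK (T2)

  IF (RATE > 0) THEN
    PRINT *, 'Row-major access (s):    ', REAL(T1 - T0) / REAL(RATE)
    PRINT *, 'Column-major access (s): ', REAL(T2 - T1) / REAL(RATE)
  END IF
  PRINT *, 'Checksum: ', SUM(X)           ! uses X so the loops are not discarded

  DEALLOCATE (X, Y)
END PROGRAM LOOP_ORDER_TIMING
```

At higher optimization levels the compiler may interchange the first loop nest itself, in which case the two timings can be nearly identical.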
Whenever possible, use Fortran 95/90 array intrinsic procedures instead of creating your own routines to accomplish the same task. Fortran 95/90 array intrinsic procedures are designed for efficient use with the various Visual Fortran run-time components.
Using the standard-conforming array intrinsics can also make your program more portable.
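As an illustration, the sketch below computes a sum with a hand-written loop nest and then with the SUM intrinsic, and shows a few other standard array intrinsics (MAXVAL, TRANSPOSE, MATMUL). The program name, array shapes, and data values are assumptions chosen only to keep the example small.

```fortran
PROGRAM INTRINSIC_DEMO
  IMPLICIT NONE
  REAL :: A(3,4), C(3,3), TOTAL
  INTEGER :: I, J

  ! Fill A with simple test data
  DO J = 1, 4
    DO I = 1, 3
      A(I,J) = REAL(I + J)
    END DO
  END DO

  ! Hand-written reduction
  TOTAL = 0.0
  DO J = 1, 4
    DO I = 1, 3
      TOTAL = TOTAL + A(I,J)
    END DO
  END DO

  ! The same reduction, and others, using standard intrinsics
  PRINT *, 'Loop sum:        ', TOTAL
  PRINT *, 'SUM intrinsic:   ', SUM(A)
  PRINT *, 'Column sums:     ', SUM(A, DIM=1)
  PRINT *, 'Largest element: ', MAXVAL(A)

  ! Matrix product of A with its transpose (3x4 times 4x3 gives 3x3)
  C = MATMUL(A, TRANSPOSE(A))
  PRINT *, 'Diagonal of A*A^T: ', (C(I,I), I = 1, 3)
END PROGRAM INTRINSIC_DEMO
```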
Because cache sizes are a power of two, a leftmost array dimension that is also a power of two (such as 256 or 512) may make inefficient use of cache when array access is noncontiguous.
One work-around is to increase the dimension to allow some unused elements, making the leftmost dimension larger than actually needed. For example, increasing the leftmost dimension of A from 512 to 520 would make better use of cache:
REAL A(512,100)
DO I = 2,511
  DO J = 2,99
    A(I,J) = (A(I+1,J-1) + A(I-1,J+1)) * 0.5
  END DO
END DO
In this code, array A has a leftmost dimension of 512, a power of two. The innermost loop accesses the rightmost dimension (row major), causing inefficient access. Increasing the leftmost dimension of A to 520 (REAL A (520,100)) allows the loop to provide better performance, but at the expense of some unused elements.
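A padded version of the loop might look like the following sketch. The parameter names NROWS and NPAD, the program name, and the initial values are assumptions for illustration; only the change of the leftmost dimension from 512 to 520 comes from the text.

```fortran
PROGRAM PADDED_ARRAY
  IMPLICIT NONE
  INTEGER, PARAMETER :: NROWS = 512   ! rows the algorithm actually uses
  INTEGER, PARAMETER :: NPAD  = 520   ! padded leftmost dimension (not a power of two)
  REAL :: A(NPAD,100)
  INTEGER :: I, J

  A = 1.0

  ! Same loop bounds as before; rows NROWS+1 through NPAD are never referenced
  DO I = 2, NROWS-1
    DO J = 2, 99
      A(I,J) = (A(I+1,J-1) + A(I-1,J+1)) * 0.5
    END DO
  END DO

  PRINT *, A(2,2), A(NROWS-1,99)
END PROGRAM PADDED_ARRAY
```

With the padded declaration, successive elements along the rightmost dimension are 520 elements apart in memory rather than 512, so they are less likely to compete for the same cache locations.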
Because each assignment reads neighboring elements, A(I+1,J-1) and A(I-1,J+1), that other iterations of the loops modify, changing the nesting order of the DO loops changes the results.
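The order dependence can be checked on a small case. The sketch below applies the same update to two copies of a 5 by 5 array, once with the original nesting and once with the loops interchanged; the array size, the initial values, and the names A1 and A2 are assumptions made for illustration.

```fortran
PROGRAM NEST_ORDER_DEMO
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 5
  REAL :: A1(N,N), A2(N,N)
  INTEGER :: I, J

  ! Identical starting data in both copies
  DO J = 1, N
    DO I = 1, N
      A1(I,J) = REAL(I*I + J)
    END DO
  END DO
  A2 = A1

  ! Original nesting: I outer, J inner
  DO I = 2, N-1
    DO J = 2, N-1
      A1(I,J) = (A1(I+1,J-1) + A1(I-1,J+1)) * 0.5
    END DO
  END DO

  ! Interchanged nesting: J outer, I inner
  DO J = 2, N-1
    DO I = 2, N-1
      A2(I,J) = (A2(I+1,J-1) + A2(I-1,J+1)) * 0.5
    END DO
  END DO

  PRINT *, 'Results differ: ', ANY(A1 /= A2)
  PRINT *, 'A(3,2), original order:     ', A1(3,2)
  PRINT *, 'A(3,2), interchanged order: ', A2(3,2)
END PROGRAM NEST_ORDER_DEMO
```

With this starting data, element (3,2) is 12.5 under the original nesting and 12.0 after interchange, because the original order reads A(2,3) after it has been updated while the interchanged order reads it before.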
For more information:
On arrays and their data declaration statements, see Arrays.