Accessing Arrays Efficiently

Many of the array access efficiency techniques described in this section are applied automatically by the Visual Fortran loop transformation optimizations (set at /optimize:5).

Several aspects of array access can improve run-time performance:

Array access is fastest when the whole array, or most of it, is accessed contiguously. Perform one or a few array operations that access all of the array or major parts of it, rather than numerous operations on scattered array elements.
Rather than use explicit loops for array access, use elemental array operations, such as the following line that increments all elements of array variable A:
```
  A = A + 1.
```
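As a minimal sketch of the point above, the following program performs the same increment twice, once with an explicit DO loop and once with an array assignment, and checks that the results match. The program name, the array B, and the size N are illustrative assumptions, not part of the original example.

```fortran
PROGRAM ELEMENTAL_DEMO
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 1000   ! Illustrative size (assumption)
  REAL :: A(N), B(N)
  INTEGER :: I

  A = 0.0
  B = 0.0

  ! Explicit loop: one element is named per iteration
  DO I = 1, N
    B(I) = B(I) + 1.
  END DO

  ! Array assignment: the same increment in a single statement
  A = A + 1.

  IF (ALL (A == B)) THEN
    PRINT *, 'Loop and array assignment agree'
  ELSE
    PRINT *, 'Results differ'
  END IF
END PROGRAM ELEMENTAL_DEMO
```

The array assignment states the intent directly and leaves the compiler free to choose the traversal order.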
When reading or writing an array, use the array name and not a DO loop or an implied DO-loop that specifies each element number. Fortran 90 array syntax allows you to reference a whole array by using its name in an expression. For example:
```
  REAL ::  A(100,100)
  A = 0.0
  A = A + 1.           ! Increment all elements of A by 1
  .
  .
  .
  WRITE (8) A          ! Fast whole array use
```
Similarly, you can use derived-type array structure components, such as:
```
  TYPE X
    INTEGER A(5)
  END TYPE X
  .
  .
  .
  TYPE (X) Z
  WRITE (8) Z%A     ! Fast array structure component use
```
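The two fragments above can be combined into one runnable sketch. It assumes that unit 8 is connected to an unformatted scratch file; the program name and the variables R, R2, and K are hypothetical and exist only to read the data back and check the round trip.

```fortran
PROGRAM WHOLE_ARRAY_IO
  IMPLICIT NONE
  TYPE X
    INTEGER :: A(5)
  END TYPE X
  TYPE (X) :: Z
  REAL    :: R(100,100), R2(100,100)
  INTEGER :: K(5), I

  R   = 1.5
  Z%A = (/ (I, I = 1, 5) /)

  ! Assumption: unit 8 is an unformatted scratch file
  OPEN (UNIT=8, STATUS='SCRATCH', FORM='UNFORMATTED')

  WRITE (8) R          ! Whole array written as one record
  WRITE (8) Z%A        ! Array structure component written as one record

  REWIND (8)
  READ (8) R2          ! Whole array read back by name
  READ (8) K
  CLOSE (8)

  IF (ALL (R2 == R) .AND. ALL (K == Z%A)) THEN
    PRINT *, 'Round trip succeeded'
  ELSE
    PRINT *, 'Round trip failed'
  END IF
END PROGRAM WHOLE_ARRAY_IO
```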
Make sure multidimensional arrays are referenced using proper array syntax and are traversed in the "natural" ascending storage order, which is column-major order for Fortran. With column-major order, the leftmost subscript varies most rapidly with a stride of one. Whole array access uses column-major order.
Avoid row-major order, as used by C, where the rightmost subscript varies most rapidly.

For example, consider the following nested DO loops, which access a two-dimensional array with the J loop as the innermost loop:
```
  INTEGER  X(3,5), Y(3,5), I, J
  Y = 0
  DO I=1,3                  ! I outer loop varies slowest
    DO J=1,5                ! J inner loop varies fastest
      X (I,J) = Y(I,J) + 1  ! Inefficient row-major storage order
    END DO                  ! (rightmost subscript varies fastest)
  END DO
  .
  .
  .
  END PROGRAM
```
Because J varies the fastest and is the second array subscript in the expression X (I,J), the array is accessed in row-major order.

To access the array in natural column-major order, examine the array algorithm and the data being modified.

Arrays X and Y can be accessed in natural column-major order by changing the nesting order of the DO loops so that the innermost loop variable corresponds to the leftmost array dimension:
```
  INTEGER  X(3,5), Y(3,5), I, J
  Y = 0

  DO J=1,5                  ! J outer loop varies slowest
    DO I=1,3                ! I inner loop varies fastest
      X (I,J) = Y(I,J) + 1  ! Efficient column-major storage order
    END DO                  ! (leftmost subscript varies fastest)
  END DO
   .
   .
   .
  END PROGRAM
```
Fortran whole array access (X = Y + 1) uses efficient column-major order. However, if the application requires that J vary the fastest, or if you cannot modify the loop order without changing the results, consider modifying the application program to rearrange the order of the array dimensions. Program modifications include rearranging the order of:
- Dimensions in the declaration of the arrays X(5,3) and Y(5,3)
- The assignment of X(J,I) and Y(J,I) within the DO loops
- All other references to arrays X and Y
In this case, the original DO loop nesting is used where J is the innermost loop:
```
  INTEGER  X(5,3), Y(5,3), I, J
  Y = 0
  DO I=1,3                  ! I outer loop varies slowest
    DO J=1,5                ! J inner loop varies fastest
      X (J,I) = Y(J,I) + 1  ! Efficient column-major storage order
    END DO                  ! (leftmost subscript varies fastest)
  END DO
  .
  .
  .
  END PROGRAM
```
Code written to access multidimensional arrays in row-major order (like C) or random order can often make inefficient use of the CPU memory cache. For more information on using natural storage order during record I/O operations, see Write Array Data in the Natural Storage Order.
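One way to see the cache effect is to time both traversal orders on arrays that are much larger than the cache. The sketch below is illustrative only: the program name, the array size N, and the use of the SYSTEM_CLOCK intrinsic for timing are assumptions, and the measured times depend on the processor and the optimization level. If the compiler's loop transformations interchange the loops, the two timings may be nearly identical.

```fortran
PROGRAM TRAVERSAL_TIMING
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 2000        ! Illustrative size (assumption)
  REAL, ALLOCATABLE :: X(:,:), Y(:,:)
  INTEGER :: I, J, T0, T1, T2, RATE

  ALLOCATE (X(N,N), Y(N,N))
  X = 0.0
  Y = 0.0

  CALL SYSTEM_CLOCK (T0, RATE)

  DO I = 1, N                 ! Row-major order: rightmost subscript
    DO J = 1, N               ! varies fastest
      X(I,J) = Y(I,J) + 1.0
    END DO
  END DO

  CALL SYSTEM_CLOCK (T1)

  DO J = 1, N                 ! Column-major order: leftmost subscript
    DO I = 1, N               ! varies fastest
      Y(I,J) = X(I,J) + 1.0
    END DO
  END DO

  CALL SYSTEM_CLOCK (T2)

  IF (RATE > 0) THEN
    PRINT *, 'Row-major order, seconds:   ', REAL (T1 - T0) / REAL (RATE)
    PRINT *, 'Column-major order, seconds:', REAL (T2 - T1) / REAL (RATE)
  ELSE
    PRINT *, 'No system clock available'
  END IF

  ! Use the results so that the loops are not optimized away
  PRINT *, 'Checksum:', SUM (Y)

  DEALLOCATE (X, Y)
END PROGRAM TRAVERSAL_TIMING
```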
Use the available Fortran 95/90 array intrinsic procedures rather than creating your own routines to accomplish the same task. These intrinsic procedures are designed for efficient use with the various Visual Fortran run-time components.

Using the standard-conforming array intrinsics can also make your program more portable.
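As a brief illustration, the sketch below computes a sum and a maximum twice, once with hand-written loops and once with the standard SUM and MAXVAL intrinsics. The program and variable names are hypothetical; only the intrinsic procedures themselves come from the Fortran standard.

```fortran
PROGRAM INTRINSIC_DEMO
  IMPLICIT NONE
  REAL    :: A(3,4)
  REAL    :: TOTAL, BIGGEST, LOOP_TOTAL, LOOP_BIGGEST
  INTEGER :: I, J

  ! Fill A with 1.0, 2.0, ..., 12.0 in column-major order
  A = RESHAPE ((/ (REAL (I), I = 1, 12) /), (/ 3, 4 /))

  ! Hand-written reduction
  LOOP_TOTAL   = 0.0
  LOOP_BIGGEST = A(1,1)
  DO J = 1, 4
    DO I = 1, 3
      LOOP_TOTAL = LOOP_TOTAL + A(I,J)
      IF (A(I,J) > LOOP_BIGGEST) LOOP_BIGGEST = A(I,J)
    END DO
  END DO

  ! Equivalent array intrinsics
  TOTAL   = SUM (A)
  BIGGEST = MAXVAL (A)

  PRINT *, 'SUM:        ', TOTAL,   '  loop:', LOOP_TOTAL
  PRINT *, 'MAXVAL:     ', BIGGEST, '  loop:', LOOP_BIGGEST
  PRINT *, 'Column sums:', SUM (A, DIM=1)   ! Reduction along dimension 1
END PROGRAM INTRINSIC_DEMO
```

Other standard array intrinsics, such as MATMUL, DOT_PRODUCT, TRANSPOSE, and MINVAL, follow the same pattern.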
With multidimensional arrays where access to array elements will be noncontiguous, avoid leftmost array dimensions that are a power of two (such as 256 or 512). Because cache sizes are a power of two, array dimensions that are also a power of two may make inefficient use of the cache when array access is noncontiguous.
At higher levels of optimization (/optimize=3 or higher), the compiler pads certain power-of-two array sizes to minimize this inefficiency.

One workaround is to increase the dimension to allow some unused elements, making the leftmost dimension larger than actually needed. For example, in the following code, increasing the leftmost dimension of A from 512 to 520 would make better use of the cache:
```
   REAL A (512,100)
   DO I = 2,511
     DO J = 2,99
       A(I,J)=(A(I+1,J-1) + A(I-1, J+1)) * 0.5
     END DO
   END DO
```
In this code, array A has a leftmost dimension of 512, a power of two. The innermost loop accesses the rightmost dimension (row major), causing inefficient access. Increasing the leftmost dimension of A to 520 (REAL A (520,100)) allows the loop to provide better performance, but at the expense of some unused elements.

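Changing the nesting order of the DO loops is not an option here. Each A(I,J) is computed from the neighboring elements A(I+1,J-1) and A(I-1,J+1), which the same loop nest also updates, so interchanging the loops changes which neighbors have already been updated and therefore changes the results.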
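The following sketch applies the padded declaration and also shows why the loops cannot simply be interchanged. The program name, the named constants NROWS and NPAD, the second array B, and the initial values I*J are assumptions made for illustration; the loop body is the one from the example above.

```fortran
PROGRAM PADDED_STENCIL
  IMPLICIT NONE
  INTEGER, PARAMETER :: NROWS = 512   ! Rows actually used
  INTEGER, PARAMETER :: NPAD  = 520   ! Padded leftmost dimension
  REAL    :: A(NPAD,100), B(NPAD,100)
  INTEGER :: I, J

  ! Illustrative starting data (assumption)
  DO J = 1, 100
    DO I = 1, NPAD
      A(I,J) = REAL (I*J)
    END DO
  END DO
  B = A

  ! Original nesting: J is the innermost loop
  DO I = 2, NROWS-1
    DO J = 2, 99
      A(I,J) = (A(I+1,J-1) + A(I-1,J+1)) * 0.5
    END DO
  END DO

  ! Interchanged nesting: I is the innermost loop
  DO J = 2, 99
    DO I = 2, NROWS-1
      B(I,J) = (B(I+1,J-1) + B(I-1,J+1)) * 0.5
    END DO
  END DO

  IF (ANY (A(1:NROWS,:) /= B(1:NROWS,:))) THEN
    PRINT *, 'Loop interchange changed the results'
  ELSE
    PRINT *, 'Results are identical'
  END IF
  PRINT *, 'A(3,2) =', A(3,2), '  B(3,2) =', B(3,2)
END PROGRAM PADDED_STENCIL
```

With this starting data, element (3,2) is 4.5 under the original nesting and 5.0 under the interchanged nesting, because each order sees a different mix of old and updated neighbors. Rows 513 through 520 are the unused padding.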
To minimize data storage and memory cache misses with arrays, use 32-bit data rather than 64-bit data, unless you require the greater range and precision of double precision floating-point numbers or the numeric range of 8-byte integers.
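
To make the trade-off concrete, the sketch below declares one array of each real kind and reports its precision, range, and relative record length. It assumes that SELECTED_REAL_KIND (6) and SELECTED_REAL_KIND (15) select the 32-bit and 64-bit real kinds, as is typical; the program name, the kind constants SP and DP, and the array size are hypothetical.

```fortran
PROGRAM KIND_DEMO
  IMPLICIT NONE
  ! Assumption: these select 32-bit and 64-bit reals on this processor
  INTEGER, PARAMETER :: SP = SELECTED_REAL_KIND (6)
  INTEGER, PARAMETER :: DP = SELECTED_REAL_KIND (15)
  REAL (KIND=SP) :: S(1000)
  REAL (KIND=DP) :: D(1000)
  INTEGER :: LEN_S, LEN_D

  S = 1.0_SP
  D = 1.0_DP

  ! IOLENGTH reports the unformatted record length in
  ! processor-dependent units, so only the ratio is compared
  INQUIRE (IOLENGTH=LEN_S) S
  INQUIRE (IOLENGTH=LEN_D) D

  PRINT *, 'Single: decimal digits', PRECISION (S), '  exponent range', RANGE (S)
  PRINT *, 'Double: decimal digits', PRECISION (D), '  exponent range', RANGE (D)
  PRINT *, 'Storage ratio (double/single):', REAL (LEN_D) / REAL (LEN_S)
END PROGRAM KIND_DEMO
```

If the single-precision digits and range are sufficient for the application, the smaller kind lets roughly twice as many elements fit in each cache line.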

For more information:

On arrays and their data declaration statements, see Arrays.