OONSTD: Abstract Data Types

E. Robert Tisdale (edwin@maxwell.hrl.hac.com)
Tue, 7 Jul 1998 16:41:05 -0700 (PDT)

Standards are necessary to support portable high-performance applications.
I believe that there will be a standard for object oriented numerical computing.
There will be defacto standards even if there is no formal standard.
There is no need to impose object oriented numerical standards on anyone.
Compiler vendors should NOT be required to provide numerical libraries.
(Perhaps valarray should be removed from the Standard Template Library.)
Third party vendors should provide object oriented numerical libraries.
Compliance with standards would help third party vendors sell their libraries.
Standards should NOT be designed by the standards committee.
Competing standards should be proposed by individuals. The standards committee
should simply select the best proposal and reject all other proposals.

The first step in designing a library standard is to define the ADT
(Abstract Data Type). The ADT includes the Data Structures and the Methods
which operate on those data structures without specifying any details
about how the ADT is actually implemented. The methods required
should be obvious once the data structures have been determined.
The next step is to design the API (Application Programmer's Interface)
which specifies how the programmer will access the methods implemented
in the library. The last step is to design the language binding
for each [objected oriented] language of interest.

Vector, matrix and tensor arithmetic library standards
should be our first priority because so many of the other
object oriented numerical libraries depend upon them.
As expected, all of the existing libraries are very similar
differing mainly in the names given to functions and types.
None of them support exactly the same set of functions
and there are some subtle differences in functionality
but most of these difference could be resolved
by simply including a few standard functions.

All of the existing vector, matrix and tensor libraries probably implement
no more than a handful of ADTs which are both fundamentally different from
each other and are useful to significant a number of application programmers.
We could, if necessary, specify standards for each of them. Personally,
I am not interested just now in convenient rapid prototyping systems
like Matlab, Mathematica, Octave, MatClass, etc. or sparse matrix libraries.
I am not particularly interested in any programming languages except C++.
I am interested in a high-performance C++ class library which supports
scalar, vector, matrix and tensor arithmetic on dense arrays of real numbers.

The ADT that I have in mind would permit the application programmer
to view a one-dimensional array of real numbers as a vector, matrix
or [third order] tensor like many of the existing libraries
except that the actual representation of the one-dimensional array
would be hidden from the application programmer in order to permit
library developers to implement it as they see fit. For example,
the library developer might decide to store all or part of the 1D array
in cache memory or distribute it across the nodes of a multi-processor.
It may not possible to return a pointer or a reference to any element
of the 1D array so special library functions are required to access elements.
The C++ language binding would define SubScalar, SubVector, SubMatrix
and SubTensor classes to reference all or part of the 1D array
as a scalar, vector, matrix and tensor respectively.
These classes would contain a "handle" to reference the 1D array,
an offset from the beginning of the 1D array to the first element
of the subscalar, subvector, submatrix or subtensor, and a stride and length
for each dimension of the subvector, submatrix or subtensor.
Vector, Matrix and Tensor classes would be derived
from SubVector, SubMatrix and SubTensor classes respectively.
SubScalar, SubVector, SubMatrix and SubTensor classes
reference a 1D array created by Vector, Matrix or Tensor classes
which allocate their own 1D array when they are created
then deallocate their own 1D array when they are destroyed.

This design is similar to MV++, Blitz++, etc. but is more general.
If the library developer implemented the 1D array using an ordinary
one dimensional C array of type float or double, it may be possible
to use BLAS and LAPACK to implement some of the desired functionality.
Otherwise, the library developer would need to provide functions
compatible with the chosen representation. Bob Tisdale