Re: OONSTD: Abstract Data Types

Robert Clay (rlclay@ca.sandia.gov)
Wed, 08 Jul 1998 20:17:56 -0700

I'm coming into this discussion late, but want to share a few thoughts and
pieces of info.

At 09:08 AM 7/8/98 , Kent Budge wrote:
>
>E. Robert Tisdale writes:
>>
>> Standards are necessary to support portable high-performance applications.
>> I believe that there will be a standard for object oriented numerical
computing.
>> There will be defacto standards even if there is no formal standard.
>> There is no need to impose object oriented numerical standards on anyone.
>> Compiler vendors should NOT be required to provide numerical libraries.
>
>I'd like my compiler vendor to supply the basic transcendental functions,
>STL, and the complex number library. Beyond that, you may have a point.
>
>I certainly have no interest in using a linear solver that is supplied
>by a compiler vendor. I am perfectly happy using Sandia's AZTEC
>package, which runs on every imaginable platform, can be called from
>just about any language I'm interested in (it's written in C), is
>efficient, and is well-documented. (And it's free, at least for me.)
>I might wrap a set of C++ classes around it, but so far I haven't felt
>a great need to do so.

Aztec already has at least one C++ wrapper via the ISIS++ solver framework,
and I believe may have others. See http://z.ca.sandia.gov/isis for further
info.

Further, and more generally relevant, there exists a working group w/in the
DOE w/ external (i.e., non-DOE) representation which is specifically
working on definition of a "standard" set of interfaces for (sparse linear)
equation solvers. This effort (named ESI = eqn solver interface) grew out
of a perceived need for consolidation of our tools for both developers and
end-users.
The ESI effort is specifically addressing component objects for scalable
solvers, and hence reflects our bias towards OO design methodology
(although not necessarily OO languages) and large-scale problems (typical
of DOE apps). We believe this work will provide a sound basis for other
classes of users (e.g., serial codes). Finally, we know this will provide a
means for the developers to share relatively low-level modules (e.g.,
preconditioners), and perhaps more importantly, a means for end-users to
readily access the superset of methods/libs available. A demonstation of
this latter point is the FEI (finite element interface), which presents a
common interface to solvers for FEM developers. This has already proven
useful, and is being actively extended. Info on both the ESI and FEI work
can be found at http://z.ca.sandia.gov/esi and http://z.ca.sandia.gov/fei.

Updates to both of these web sites are immenent, including initial drafts
of the ESI specs and a new release of the FEI v1.0 spec and code for the
ISIS++ & Aztec implementations.

//robert

>I think what most of us have in mind by "numerical C++ library" is a set
>of classes that support stuff numerical programmers do a lot, but no one
>else is much interested in. I could compile a pretty respectable list
>of what these things are by simply copying the table of contents of
>the latest edition of "Numerical Recipes in C."
>

>But how much of this stuff really needs class interfaces? Or use of
>templates? I see no need to ++ code for which the existing C interface
>is adequate. For that matter, how much of this stuff really needs to be
>standardized? I could get an awful lot of mileage out of just two
>++izations: a set of function classes to replace every function
>pointer appearing in a Numerical Recipes argument list, and a simple
>1-D array to be used with computed indices.
>
>One thing that requires a standard class is complex numbers.
>complex<T> is part of the proposed standard C++ library. And that's
>where it belongs. Because, although no one who isn't a numerical
>programmer is much interested in it, there's almost no disagreement
>about what the class should look like. We just need to settle on the
>spelling. And it's surprisingly easy for a tyro to botch the
>implementation.
>
>> (Perhaps valarray should be removed from the Standard Template Library.)
>
>Valarray isn't part of the Standard Template Library. It's part of the
>numerics library, along with complex numbers and a number of generic
>algorithms. I discussed its history and motivation earlier in this
>thread.
>
>> Third party vendors should provide object oriented numerical libraries.
>
>They won't unless there's money to be made in doing so. And the evidence
>right now is that there is not. We're all having too much fun building
>our own and arguing their merits.
>
>There were two principal objections to valarray when it was first
>proposed to X3J16. The first was a technical objection from the Rogue
>Wave people that it didn't support the kind of math libraries they were
>writing. This objection was answered by modifying and extending the
>proposal. (In the process, the proposal was much improved.) The
>second objection was an economic one from the Microsoft and Borland
>representatives: Valarray costs money to implement, and at least 99%
>of their customers will never use it.
>
>This second objection, incidentally, is still the most persuasive
>argument for killing valarray, if that's what you want to do.
>Unfortunately, this is a razor that would shave complex<T> as well.
>(Heck, there are still purists around who object to the decision to
>support floating point numbers in C.)
>
>> Compliance with standards would help third party vendors sell their
libraries.
>> Standards should NOT be designed by the standards committee.
>> Competing standards should be proposed by individuals. The standards
committee
>> should simply select the best proposal and reject all other proposals.
>
>This isn't how the standardization process works. The function of an
>ANSI or ISO committee is to standardize existing practice. This does
>*not* mean taking proposals and choosing the one that wins a plurality
>on a committee vote. Rather, the committee is supposed to collect
>information on what the existing practice is, identify inconsistencies,
>and come to a consensus (e.g. almost a unanimous vote) on how the
>inconsistencies should be resolved. If a consensus cannot be reached
>on some topic, this is taken as an indication that there is not yet a

>sufficient body of existing practice to justify standardizing that
>particular area.
>
>Of course, it is widely recognized that X3J16 has wildly exceeded this
>charter. I believe the root cause of this was that the charter for
>X3J16 was unusually broad to begin with. Templates and exceptions were
>recognized as something that should be in the language, but prior art
>(existing practice) simply didn't exist. (No one had a working
>implementation except Stroustrup's group.) So the X3J16 charter
>specifically noted that the portion of the standard dealing with
>templates and exceptions would not be based on existing practice. This
>implied a creative process for standardizing these parts of the
>language. As the standard for templates (in particular) came together,
>it was found that numerous adjustments were required to other portions
>of the language to allow templates to do the things that people came to
>expect from them. And once the creative juices began to flow, it was
>hard to resist the temptation to add other things that solved important
>problems and seemed oh so innocuous: run-time type identification,
>new-style casts, and so on.
>
>Another reason for X3J16 exceeding the original charter was
>Stroustrup's oft-voiced opinion that the biggest mistake he made in
>releasing C++ to the public as early as he did was that he had not yet
>assembled an adequate class library to go with it. This was a strong
>hint to those of us in the library working group that we ought to set
>our sights high, and (you guessed it) be creative. The ultimate
>expression of this was STL. STL is a brilliant concept. It belongs in
>the standard library. But an adequate specification for STL required
>extensive changes to the specification for templates, which impacted the
>entire rest of the language. It also led to the unhappy precedent of
>standardizing language features that had not yet been implemented by
>anyone. (This may explain why so many current compilers gag on STL.)
>
>In any case, the standardization process very nearly broke down. When
>I first joined the committee, I witnessed occasions on which a straw
>vote over a proposed resolution to an issue was 40 in favor, 3 against
>-- and the three objections were enough to send the proposal back to
>the working group for further study. The desire for consensus was that
>strong. But two years later, at the last meeting I was able to attend,
>a very important extension was approved -- and adopted -- by the barest
>of majorities. The desire to get a standard out Real Soon Now had
>become greater than the desire for consensus.
>
>This should give you some idea why the committee didn't just vote on
>proposals submitted by outsiders. The committee received many such
>proposals -- each of us received six mailings a year, of at least three
>hundred pages, a good share of which was proposals from people on or
>off the committee. I daresay none of them was acceptable in the
>original form. Even the few that were not rejected out of hand
>required adjustments either to the proposal or to the rest of the
>language -- and every such adjustment became controversial. STL was

>such a proposal. It was so obviously brilliant that a majority of the
>committee became committed to seeing it in the standard. But a large
>minority fought it for the sole reason that it came late in the process
>and required huge adjustments to everything else. (I don't recall
>anyone objecting to it on its technical merits alone.)
>
>...
>>
>> All of the existing vector, matrix and tensor libraries probably implement
>> no more than a handful of ADTs which are both fundamentally different from
>> each other and are useful to significant a number of application
programmers.
>> We could, if necessary, specify standards for each of them. Personally,
>> I am not interested just now in convenient rapid prototyping systems
>> like Matlab, Mathematica, Octave, MatClass, etc. or sparse matrix
libraries.
>> I am not particularly interested in any programming languages except C++.
>> I am interested in a high-performance C++ class library which supports
>> scalar, vector, matrix and tensor arithmetic on dense arrays of real
numbers.
>
>I, too, have only a limited interest in languages other than C++. Java
>is C++ Lite without the ethanol. Eiffel is worthy of more serious
>consideration, but I find it rather overconstrained at times, and it
>doesn't support value types or genericity as well as I like. I don't
>know anything about object-oriented ADA other than that it exists and
>was standardized before C++.
>
>Fortran-90 is not an object-oriented language, notwithstanding claims to
>the contrary. I'd have a hard time taking it seriously, except that so
>many others do.
>
>Sparse matrices are of great interest to most numerical programmers.
>They are the representation of local operators, which are awfully
>important for (among other things) solving differential equations. I
>think we could get an awful lot of mileage out of standardizing just
>two layouts -- the layout of banded matrices, which should not be
>controversial; and the layout of arbitrary sparse matrices, which is
>more so. I would be happy to settle for modified sparse row (MSR)
>format, which is recognizec by AZTEC and which has been successfully
>generalized for distributed computing. It is more efficient (if
>slightly less intuitive) than the Yale format used by ITPACK. But I
>could probably live with two standard formats, so long as efficient
>conversion routines exist. (Copying memory is relatively cheap.) I
>believe both formats are described in "Numerical Recipes in C."
>
>No doubt someone will want to add triangular matrices to the list, or
>skyline matrices. I'm not particularly interested in the former and
>I think a general sparse matrix layout is adequate for the latter.
>
>The only thing controversial about the layout of dense arrays on single
>processors is whether they should be row-dominant or column-dominant. It's
>a toss-up, because Fortran chose one and C chose the other, so either way
>you're equally screwed. (The "Numerical Recipes in C" approach is not
>worthy of serious consideration.) The layout on parallel machines is
>probably not controversial, either -- the only sane decomposition is by

>block, which brings you back to the row/column dominance problem.
>
>>
>> The ADT that I have in mind would permit the application programmer
>> to view a one-dimensional array of real numbers as a vector, matrix
>> or [third order] tensor like many of the existing libraries
>> except that the actual representation of the one-dimensional array
>> would be hidden from the application programmer in order to permit
>> library developers to implement it as they see fit. For example,
>> the library developer might decide to store all or part of the 1D array
>> in cache memory or distribute it across the nodes of a multi-processor.
>> It may not possible to return a pointer or a reference to any element
>> of the 1D array so special library functions are required to access
elements.
>> The C++ language binding would define SubScalar, SubVector, SubMatrix
>> and SubTensor classes to reference all or part of the 1D array
>> as a scalar, vector, matrix and tensor respectively.
>> These classes would contain a "handle" to reference the 1D array,
>> an offset from the beginning of the 1D array to the first element
>> of the subscalar, subvector, submatrix or subtensor, and a stride and
length
>> for each dimension of the subvector, submatrix or subtensor.
>> Vector, Matrix and Tensor classes would be derived
>> from SubVector, SubMatrix and SubTensor classes respectively.
>> SubScalar, SubVector, SubMatrix and SubTensor classes
>> reference a 1D array created by Vector, Matrix or Tensor classes
>> which allocate their own 1D array when they are created
>> then deallocate their own 1D array when they are destroyed.
>>
>> This design is similar to MV++, Blitz++, etc. but is more general.
>> If the library developer implemented the 1D array using an ordinary
>> one dimensional C array of type float or double, it may be possible
>> to use BLAS and LAPACK to implement some of the desired functionality.
>> Otherwise, the library developer would need to provide functions
>> compatible with the chosen representation. Bob Tisdale
>
>What you are proposing sounds very much like valarray<T>, except that
>you put greater emphasis on hiding the representation. Why? Programs
>rarely have direct control over cache memory, and some exposure of the
>implementation is required if you want to cross languages. (Like it or
>not, most of us will be interfacing with FORTRAN and/or C for a long
>time to come.) A distributed array doesn't make sense to me; the
>communications cost would be prohibitive. Message passing is a more
>sensible approach to distributed computing.
>
>Kent G. Budge
>Sandia National Laboratories
>

______________________________________________________________

Robert L. Clay email: rlclay@ca.sandia.gov
Distributed Applications Research phone: 510/294-3193
Sandia National Labs fax: 510/294-2234
Livermore, CA 94550 USA http://z.ca.sandia.gov/rlc
______________________________________________________________