As a temporary mechanism enabling the support of generic classes, the GNU C++ Library distribution contains a directory (`g++-include') of files designed to serve as the basis for generating container classes of specified elements. These files can be used to generate `.h' and `.cc' files in the current directory via a supplied shell script program that performs simple textual substitution to create specific classes.
While these classes are generated independently, and thus share no code,
it is possible to create versions that do share code among subclasses. For
example, using typedef void* ent
, and then generating a
entList
class, other derived classes could be created using the
void*
coercion method described in Stroustrup, pp204-210.
This very simple class-generation facility is useful enough to serve current purposes, but will be replaced with a more coherent mechanism for handling C++ generics in a way that minimally disrupts current usage. Without knowing exactly when or how parametric classes might be added to the C++ language, provision of this simplest possible mechanism, textual substitution, appears to be the safest strategy, although it does require certain redundancies and awkward constructions.
Specific classes may be generated via the `genclass' shell script
program. This program has arguments specifying the kinds of base types(s)
to be used. Specifying base types requires two arguments. The first is the
name of the base type, which may be any named type, like int
or
String
. Only named types are supported; things like int*
are
not accepted. However, pointers like this may be used by supplying the
appropriate typedefs (e.g., editing the resulting files to include
typedef int* intp;
). The type name must be followed by one of the
words val
or ref
, to indicate whether the base elements
should be passed to functions by-value or by-reference.
You can specify basic container classes using genclass base
[val,ref] proto
, where proto
is the name of the class being
generated. Container classes like dictionaries and maps that require
two types may be specified via genclass -2 keytype [val, ref],
basetype [val, ref] proto
, where the key type is specified first and
the contents type second. The resulting classnames and filenames are
generated by prepending the specified type names to the prototype names,
and separating the filename parts with dots. For example,
genclass int val List
generates class intList
residing in
files `int.List.h' and `int.List.cc'. genclass -2 String
ref int val VHMap
generates (the awkward, but unavoidable) class name
StringintVHMap
. Of course, programmers may use typedef
or
simple editing to create more appropriate names. The existence of dot
seperators in file names allows the use of GNU make to help automate
configuration and recompilation. An example Makefile exploiting such
capabilities may be found in the `libg++/proto-kit' directory.
The genclass
utility operates via simple text substitution using
sed
. All occurrences of the pseudo-types <T>
and <C>
(if there are two types) are replaced with the indicated type, and
occurrences of <T&>
and <C&>
are replaced by just the types,
if val
is specified, or types followed by "&" if ref
is
specified.
Programmers will frequently need to edit the `.h' file in order to
insert additional #include
directives or other modifications. A
simple utility, `prepend-header' to prepend other `.h' files
to generated files is provided in the distribution.
One dubious virtue of the prototyping mechanism is that, because sources files, not archived library classes, are generated, it is relatively simple for programmers to modify container classes in the common case where slight variations of standard container classes are required.
It is often a good idea for programmers to archive (via ar
)
generated classes into `.a' files so that only those class
functions actually used in a given application will be loaded.
The test subdirectory of the distribution shows an example of this.
Because of #pragma interface
directives, the `.cc' files
should be compiled with -O
or -DUSE_LIBGXX_INLINES
enabled.
Many container classes require specifications over and above the base class type. For example, classes that maintain some kind of ordering of elements require specification of a comparison function upon which to base the ordering. This is accomplished via a prototype file `defs.hP' that contains macros for these functions. While these macros default to perform reasonable actions, they can and should be changed in particular cases. Most prototypes require only one or a few of these. No harm is done if unused macros are defined to perform nonsensical actions. The macros are:
DEFAULT_INITIAL_CAPACITY
<T>EQ(a, b)
<T>LE(a, b)
<T>CMP(a, b)
<T>HASH(a)
Nearly all prototypes container classes support container
traversal via Pix
pseudo indices, as described elsewhere.
All object containers must perform either a X::X(X&)
(or
X::X()
followed by X::operator =(X&)
) to copy objects into
containers. (The latter form is used for containers built from C++
arrays, like VHSets
). When containers are destroyed, they invoke
X::~X()
. Any objects used in containers must have well behaved
constructors and destructors. If you want to create containers that
merely reference (point to) objects that reside elsewhere, and are not
copied or destroyed inside the container, you must use containers
of pointers, not containers of objects.
All prototypes are designed to generate HOMOGENOUS container classes. There is no universally applicable method in C++ to support heterogenous object collections with elements of various subclasses of some specified base class. The only way to get heterogenous structures is to use collections of pointers-to-objects, not collections of objects (which also requires you to take responsibility for managing storage for the objects pointed to yourself).
For example, the following usage illustrates a commonly encountered danger in trying to use container classes for heterogenous structures:
class Base { int x; ...} class Derived : public Base { int y; ... } BaseVHSet s; // class BaseVHSet generated via something like // `genclass Base ref VHSet' void f() { Base b; s.add(b); // OK Derived d; s.add(d); // (CHOP!) }
At the line flagged with `(CHOP!)', a Base::Base(Base&)
is
called inside Set::add(Base&)
---not
Derived::Derived(Derived&)
. Actually, in VHSet
, a
Base::operator =(Base&)
, is used instead to place the element in
an array slot, but with the same effect. So only the Base part is
copied as a VHSet
element (a so-called chopped-copy). In this
case, it has an x
part, but no y
part; and a Base, not
Derived, vtable. Objects formed via chopped copies are rarely
sensible.
To avoid this, you must resort to pointers:
typedef Base* BasePtr; BasePtrVHSet s; // class BaseVHSet generated via something like // `genclass BasePtr val VHSet' void f() { Base* bp = new Base; s.add(b); Base* dp = new Derived; s.add(d); // works fine. // Don't forget to delete bp and dp sometime. // The VHSet won't do this for you. }
The prototypes can be difficult to use on first attempt. Here is an example that may be helpful. The utilities in the `proto-kit' simplify much of the actions described, but are not used here.
Suppose you create a class Person
, and want to make an Map that
links the social security numbers associated with each person. You start
off with a file `Person.h'
#include <String.h> class Person { String nm; String addr; //... public: const String& name() { return nm; } const String& address() { return addr; } void print() { ... } //... }
And in file `SSN.h',
typedef unsigned int SSN;
Your first decision is what storage/usage strategy to use. There are several reasonable alternatives here: You might create an "object collection" of Persons, a "pointer collection" of pointers-to-Persons, or even a simple String map, housing either copies of pointers to the names of Persons, since other fields are unused for purposes of the Map. In an object collection, instances of class Person "live" inside the Map, while in a pointer collection, the instances live elsewhere. Also, as above, if instances of subclasses of Person are to be used inside the Map, you must use pointers. In a String Map, the same difference holds, but now only for the name fields. Any of these choices might make sense in particular applications.
The second choice is the Map implementation strategy. Either a tree
or a hash table might make sense. Suppose you want an AVL tree Map.
There are two things to now check. First, as an object collection,
the AVLMap requires that the elsement class contain an X(X&)
constructor. In C++, if you don't specify such a constructor, one
is constructed for you, but it is a very good idea to always do this
yourself, to avoid surprises. In this example, you'd use something like
class Person { ...; Person(const Person& p) :nm(p.nm), addr(p.addr) {} };
Also, an AVLMap requires a comparison function for elements in order to maintain order. Rather than requiring you to write a particular comparison function, a `defs' file is consulted to determine how to compare items. You must create and edit such a file.
Before creating `Person.defs.h', you must first make one additional
decision. Should the Map member functions like m.contains(p)
take arguments (p
) by reference (i.e., typed as
int Map::contains(const Person& p)
or by value (i.e., typed as
int Map::contains(const Person p)
. Generally, for user-defined
classes, you want to pass by reference, and for builtins and pointers,
to pass by value. SO you should pick by-reference.
You can now create `Person.defs.h' via genclass Person ref defs
.
This creates a simple skeleton that you must edit. First, add
#include "Person.h"
to the top. Second, edit the <T>CMP(a,b)
macro to compare on name, via
#define <T>CMP(a, b) ( compare(a.name(), b.name()) )
which invokes the int compare(const String&, const String&)
function from `String.h'. Of course, you could define this in any
other way as well. In fact, the default versions in the skeleton turn
out to be OK (albeit inefficient) in this particular example.
You may also want to create file `SSN.defs.h'. Here, choosing call-by-value makes sense, and since no other capabilities (like comparison functions) of the SSNs are used (and the defaults are OK anyway), you'd type
genclass SSN val defs
and then edit to place #include "SSN.h"
at the top.
Finally, you can generate the classes. First, generate the base class for Maps via
genclass -2 Person ref SSN val Map
This generates only the abstract class, not the implementation, in file `Person.SSN.Map.h' and `Person.SSN.Map.cc'. To create the AVL implementation, type
genclass -2 Person ref SSN val AVLMap
This creates the class PersonSSNAVLMap
, in
`Person.SSN.AVLMap.h' and `Person.SSN.AVLMap.cc'.
To use the AVL implementation, compile the two generated `.cc' files, and specify `#include "Person.SSN.AVLMap.h"' in the application program. All other files are included in the right ways automatically.
One last consideration, peculiar to Maps, is to pick a reasonable default contents when declaring an AVLMap. Zero might be appropriate here, so you might declare a Map,
PersonSSNAVLMap m((SSN)0);
Suppose you wanted a VHMap
instead of an AVLMap
Besides
generating different implementations, there are two differences in
how you should prepare the `defs' file. First, because a VHMap
uses a C++ array internally, and because C++ array slots are initialized
differently than single elements, you must ensure that class Person
contains (1) a no-argument constructor, and (2) an assignment operator.
You could arrange this via
class Person { ...; Person() {} void operator = (const Person& p) { nm = p.nm; addr = p.addr; } };
(The lack of action in the constructor is OK here because Strings
possess usable no-argument constructors.)
You also need to edit `Person.defs.h' to indicate a usable hash function and default capacity, via something like
#include <builtin.h> #define <T>HASH(x) (hashpjw(x.name().chars())) #define DEFAULT_INITIAL_CAPACITY 1000
Since the hashpjw
function from `builtin.h' is
appropriate here. Changing the default capacity to a value
expected to exceed the actual capacity helps to avoid
"hidden" inefficiencies when a new VHMap is created without
overriding the default, which is all too easy to do.
Otherwise, everything is the same as above, substituting
VHMap
for AVLMap
.