Go to the first, previous, next, last section, table of contents.

Introduction to container class prototypes

As a temporary mechanism enabling the support of generic classes, the GNU C++ Library distribution contains a directory (`g++-include') of files designed to serve as the basis for generating container classes of specified elements. These files can be used to generate `.h' and `.cc' files in the current directory via a supplied shell script program that performs simple textual substitution to create specific classes.

While these classes are generated independently, and thus share no code, it is possible to create versions that do share code among subclasses. For example, using typedef void* ent, and then generating a entList class, other derived classes could be created using the void* coercion method described in Stroustrup, pp204-210.

This very simple class-generation facility is useful enough to serve current purposes, but will be replaced with a more coherent mechanism for handling C++ generics in a way that minimally disrupts current usage. Without knowing exactly when or how parametric classes might be added to the C++ language, provision of this simplest possible mechanism, textual substitution, appears to be the safest strategy, although it does require certain redundancies and awkward constructions.

Specific classes may be generated via the `genclass' shell script program. This program has arguments specifying the kinds of base types(s) to be used. Specifying base types requires two arguments. The first is the name of the base type, which may be any named type, like int or String. Only named types are supported; things like int* are not accepted. However, pointers like this may be used by supplying the appropriate typedefs (e.g., editing the resulting files to include typedef int* intp;). The type name must be followed by one of the words val or ref, to indicate whether the base elements should be passed to functions by-value or by-reference.

You can specify basic container classes using genclass base [val,ref] proto, where proto is the name of the class being generated. Container classes like dictionaries and maps that require two types may be specified via genclass -2 keytype [val, ref], basetype [val, ref] proto, where the key type is specified first and the contents type second. The resulting classnames and filenames are generated by prepending the specified type names to the prototype names, and separating the filename parts with dots. For example, genclass int val List generates class intList residing in files `int.List.h' and `int.List.cc'. genclass -2 String ref int val VHMap generates (the awkward, but unavoidable) class name StringintVHMap. Of course, programmers may use typedef or simple editing to create more appropriate names. The existence of dot seperators in file names allows the use of GNU make to help automate configuration and recompilation. An example Makefile exploiting such capabilities may be found in the `libg++/proto-kit' directory.

The genclass utility operates via simple text substitution using sed. All occurrences of the pseudo-types <T> and <C> (if there are two types) are replaced with the indicated type, and occurrences of <T&> and <C&> are replaced by just the types, if val is specified, or types followed by "&" if ref is specified.

Programmers will frequently need to edit the `.h' file in order to insert additional #include directives or other modifications. A simple utility, `prepend-header' to prepend other `.h' files to generated files is provided in the distribution.

One dubious virtue of the prototyping mechanism is that, because sources files, not archived library classes, are generated, it is relatively simple for programmers to modify container classes in the common case where slight variations of standard container classes are required.

It is often a good idea for programmers to archive (via ar) generated classes into `.a' files so that only those class functions actually used in a given application will be loaded. The test subdirectory of the distribution shows an example of this.

Because of #pragma interface directives, the `.cc' files should be compiled with -O or -DUSE_LIBGXX_INLINES enabled.

Many container classes require specifications over and above the base class type. For example, classes that maintain some kind of ordering of elements require specification of a comparison function upon which to base the ordering. This is accomplished via a prototype file `defs.hP' that contains macros for these functions. While these macros default to perform reasonable actions, they can and should be changed in particular cases. Most prototypes require only one or a few of these. No harm is done if unused macros are defined to perform nonsensical actions. The macros are:

DEFAULT_INITIAL_CAPACITY
The initial capacity for containers (e.g., hash tables) that require an initial capacity argument for constructors. Default: 100
<T>EQ(a, b)
return true if a is considered equal to b for the purposes of locating, etc., an element in a container. Default: (a == b)
<T>LE(a, b)
return true if a is less than or equal to b Default: (a <= b)
<T>CMP(a, b)
return an integer < 0 if a<b, 0 if a==b, or > 0 if a>b. Default: (a <= b)? (a==b)? 0 : -1 : 1
<T>HASH(a)
return an unsigned integer representing the hash of a. Default: hash(a) ; where extern unsigned int hash(<T&>). (note: several useful hash functions are declared in builtin.h and defined in hash.cc)

Nearly all prototypes container classes support container traversal via Pix pseudo indices, as described elsewhere.

All object containers must perform either a X::X(X&) (or X::X() followed by X::operator =(X&)) to copy objects into containers. (The latter form is used for containers built from C++ arrays, like VHSets). When containers are destroyed, they invoke X::~X(). Any objects used in containers must have well behaved constructors and destructors. If you want to create containers that merely reference (point to) objects that reside elsewhere, and are not copied or destroyed inside the container, you must use containers of pointers, not containers of objects.

All prototypes are designed to generate HOMOGENOUS container classes. There is no universally applicable method in C++ to support heterogenous object collections with elements of various subclasses of some specified base class. The only way to get heterogenous structures is to use collections of pointers-to-objects, not collections of objects (which also requires you to take responsibility for managing storage for the objects pointed to yourself).

For example, the following usage illustrates a commonly encountered danger in trying to use container classes for heterogenous structures:

class Base { int x; ...}
class Derived : public Base { int y; ... }

BaseVHSet s; // class BaseVHSet generated via something like
             // `genclass Base ref VHSet'

void f()
{
  Base b;
  s.add(b); // OK

  Derived d;
  s.add(d);  // (CHOP!)
}

At the line flagged with `(CHOP!)', a Base::Base(Base&) is called inside Set::add(Base&)---not Derived::Derived(Derived&). Actually, in VHSet, a Base::operator =(Base&), is used instead to place the element in an array slot, but with the same effect. So only the Base part is copied as a VHSet element (a so-called chopped-copy). In this case, it has an x part, but no y part; and a Base, not Derived, vtable. Objects formed via chopped copies are rarely sensible.

To avoid this, you must resort to pointers:

typedef Base* BasePtr;

BasePtrVHSet s; // class BaseVHSet generated via something like
                // `genclass BasePtr val VHSet'

void f()
{
  Base* bp = new Base;
  s.add(b);

  Base* dp = new Derived;
  s.add(d);  // works fine.

  // Don't forget to delete bp and dp sometime.
  // The VHSet won't do this for you.
}

Example

The prototypes can be difficult to use on first attempt. Here is an example that may be helpful. The utilities in the `proto-kit' simplify much of the actions described, but are not used here.

Suppose you create a class Person, and want to make an Map that links the social security numbers associated with each person. You start off with a file `Person.h'


#include <String.h>

class Person
{
  String nm;
  String addr;
  //...
public:
  const String& name() { return nm; }
  const String& address() { return addr; }
  void          print() { ... }
  //...
}

And in file `SSN.h',

typedef unsigned int SSN;

Your first decision is what storage/usage strategy to use. There are several reasonable alternatives here: You might create an "object collection" of Persons, a "pointer collection" of pointers-to-Persons, or even a simple String map, housing either copies of pointers to the names of Persons, since other fields are unused for purposes of the Map. In an object collection, instances of class Person "live" inside the Map, while in a pointer collection, the instances live elsewhere. Also, as above, if instances of subclasses of Person are to be used inside the Map, you must use pointers. In a String Map, the same difference holds, but now only for the name fields. Any of these choices might make sense in particular applications.

The second choice is the Map implementation strategy. Either a tree or a hash table might make sense. Suppose you want an AVL tree Map. There are two things to now check. First, as an object collection, the AVLMap requires that the elsement class contain an X(X&) constructor. In C++, if you don't specify such a constructor, one is constructed for you, but it is a very good idea to always do this yourself, to avoid surprises. In this example, you'd use something like

class Person 
{ ...; 
    Person(const Person& p) :nm(p.nm), addr(p.addr) {}
};

Also, an AVLMap requires a comparison function for elements in order to maintain order. Rather than requiring you to write a particular comparison function, a `defs' file is consulted to determine how to compare items. You must create and edit such a file.

Before creating `Person.defs.h', you must first make one additional decision. Should the Map member functions like m.contains(p) take arguments (p) by reference (i.e., typed as int Map::contains(const Person& p) or by value (i.e., typed as int Map::contains(const Person p). Generally, for user-defined classes, you want to pass by reference, and for builtins and pointers, to pass by value. SO you should pick by-reference.

You can now create `Person.defs.h' via genclass Person ref defs. This creates a simple skeleton that you must edit. First, add #include "Person.h" to the top. Second, edit the <T>CMP(a,b) macro to compare on name, via

#define <T>CMP(a, b) ( compare(a.name(), b.name()) )

which invokes the int compare(const String&, const String&) function from `String.h'. Of course, you could define this in any other way as well. In fact, the default versions in the skeleton turn out to be OK (albeit inefficient) in this particular example.

You may also want to create file `SSN.defs.h'. Here, choosing call-by-value makes sense, and since no other capabilities (like comparison functions) of the SSNs are used (and the defaults are OK anyway), you'd type

genclass SSN val defs

and then edit to place #include "SSN.h" at the top.

Finally, you can generate the classes. First, generate the base class for Maps via

genclass -2 Person ref SSN val Map

This generates only the abstract class, not the implementation, in file `Person.SSN.Map.h' and `Person.SSN.Map.cc'. To create the AVL implementation, type

genclass -2 Person ref SSN val AVLMap

This creates the class PersonSSNAVLMap, in `Person.SSN.AVLMap.h' and `Person.SSN.AVLMap.cc'.

To use the AVL implementation, compile the two generated `.cc' files, and specify `#include "Person.SSN.AVLMap.h"' in the application program. All other files are included in the right ways automatically.

One last consideration, peculiar to Maps, is to pick a reasonable default contents when declaring an AVLMap. Zero might be appropriate here, so you might declare a Map,

PersonSSNAVLMap m((SSN)0);

Suppose you wanted a VHMap instead of an AVLMap Besides generating different implementations, there are two differences in how you should prepare the `defs' file. First, because a VHMap uses a C++ array internally, and because C++ array slots are initialized differently than single elements, you must ensure that class Person contains (1) a no-argument constructor, and (2) an assignment operator. You could arrange this via

class Person 
{ ...; 
    Person() {}
  void operator = (const Person& p) { nm = p.nm; addr = p.addr; }
};

(The lack of action in the constructor is OK here because Strings possess usable no-argument constructors.)

You also need to edit `Person.defs.h' to indicate a usable hash function and default capacity, via something like

#include <builtin.h>
#define <T>HASH(x)  (hashpjw(x.name().chars()))
#define DEFAULT_INITIAL_CAPACITY 1000

Since the hashpjw function from `builtin.h' is appropriate here. Changing the default capacity to a value expected to exceed the actual capacity helps to avoid "hidden" inefficiencies when a new VHMap is created without overriding the default, which is all too easy to do.

Otherwise, everything is the same as above, substituting VHMap for AVLMap.


Go to the first, previous, next, last section, table of contents.