Dynamic Attributes

[Excerpt from IUE User Manual, November 1992:]

Dynamic Attributes

[by Sam Fenster and Terry Boult]

In the IUE, there is a pervasive need to be able to associate arbitrary attributes with objects (and even collections of objects). The iue-context class of objects was designed to do this in a flexible, efficient manner, using dynamic attributes. Although traditional property lists might suffice, we propose a mechanism which is more flexible and may also solve other difficult design issues within the IUE, especially the space problem for seldom-used attributes/slots.

First we introduce dynamic attributes with a high-level description of their use and function. We then discuss the background on inheritance which is a major motivation for dynamic attributes. This section has a strong software flavor. For those readers mostly interested in IU, this section can be skipped.

Introduction

At its simplest, a dynamic attribute can just be an attribute of an object. It can function just as a slot (member) of an object does. Attributes are accessed with methods called put-att and get-att, which take an attribute name as an argument, and accept or return a value.

The first difference between dynamic attributes and ordinary object slots is that one may give an object arbitrary attributes, regardless of its class. Attributes specific to its class are designated as slots at compile time, and, for convenience, access methods are provided for these; but any arbitrary attribute may be accessed with put-att and get-att. (For efficiency, some attributes which are designated as slots of a class may in fact be stored in slots provided by the underlying language. But certain advantages are lost if this is done.)

The second way that dynamic attributes differ is that an attribute may be associated with a combination of objects. To put it another way, one may specify an object's attributes in the context of one or more other objects. Thus, an attribute may have different values in combination with different other objects. The ``dictionary'' in which the attributes of a combination of zero or more objects is looked up is an iue-context object.

Thirdly, values can be automatically defaulted. An attribute's value need not be specified for every combination of objects; a subset of the objects (a less specific context for an object) may have a value for it. As a very useful instance of this, if a certain object has no value for some attribute, the empty set may have one. This value is the attribute's default for any object. (This default is in the scope of a particular iue-context; other iue-contexts may have different defaults.) Through the defaulting mechanism, an attribute's value in a large number of objects may be controlled at once. The objects may be heterogeneous (of many different classes).

Fourthly, attributes may have hooks. Thus, a value may be calculated from other values rather than being stored; certain attributes may be read-only; and when an attribute is changed, it may cause other things to happen.

An object's put-att and get-att methods look in an iue-context object. Users need never see iue-contexts, because by default, put-att and get-att look at predefined iue-contexts which exist for each class in the IUE. But users can also do useful things by declaring their own iue-contexts. Let us start by seeing how they work.

`iue-context` Objects

Dynamic attributes are stored in an iue-context object, which is an indexed table. The key is a fixed-length sequence of iue-object identifiers. Each key accesses a property list (a table of attribute/value pairs, indexed by attribute name). During execution, iue-contexts can be connected into a hierarchy for value inheritance. This is accomplished by giving each iue-context an ordered list of parents, and, for each parent, a specification of what subset of the key is used to access it. This search list is specified (and can be modified) during execution, though startup code can exist to link iue-contexts into an initial hierarchy.

Given a key (representing a combination of zero or more iue-objects) and an attribute, an iue-context looks to see if it has a value stored. If it does, it returns it. If not, it sequentially queries each parent iue-context until a value is returned. These queries can recurse, searching the data inheritance hierarchy up as many levels as desired. Cutoff conditions, based on depth, key value and parent identity, may be imposed.

Storing a value for an attribute always makes a local entry in the iue-context, not in a parent. By default, inherited values do not change when a parent's value is changed. (A copy-on-write policy makes it appear that each iue-context has a unique copy of each attribute, even if it was actually obtained by inheritance.) There is, however, a method for forcibly updating an inherited attribute in the descendants of an iue-context. (Simply omit the copy-on-write. Then descendants will see the updated value.) Finally, there is a mechanism allowing an iue-context (in a hierarchy) to maintain a collection of all objects that have queried it, and save/restore such collections.

While an iue-context may seem like a form of indexed database, it has a number of important differences. First, a key is always a sequence of object IDs, not values of arbitrary type. Secondly, the inheritance mechanism, with copy on write, allows for significantly greater flexibility and decreased storage requirements. For instance, there is an iue-context provided for each subclass of iue-object. Its key is a single iue-object ID, and it is used by default to store attributes for each object of that class. Its sole parent is an iue-context whose key is null. The parent thus provides intelligent defaults for attributes of that class. This is all done without taking space for each attribute in every object.

Figure 1: An illustration of dynamic attributes. In an iue-context, attributes are associated with a combination of objects. If no value is stored for this combination, the parents of the iue-context are queried, using a subset of the combination. [diagram]

An example is shown in figure 1. We want to know what color to draw the chimney of a particular house in image 1. We query an iue-context (called pbi-context) which stores attributes indexed by [part, building, image]. Our particular key is the ordered triple of object IDs [Chimney, Our House, Image 1]. We may use the get-att method of pbi-context, passing it the key and the attribute name ``color.'' We may also use the get-att method of the first iue-object in the key, Chimney. We pass ``color'' and the key [Our House, Image 1], to which Chimney's ID will automatically be prepended.

If pbi-context has a color stored for this triple, it returns it. If not, it checks its parents. The first parent is an iue-context of [part, building] pairs. Here we check if there is a color for Chimney on Our House (regardless of what image it is in) using the key [Chimney, Our House]. If no color is found, two parents are checked. One uses the key [Chimney], the other [Our House]. These would specify an inherent color for Chimney (in any image, as part of any building) or for any part of Our House (in any image). If no color is stored in any of the above iue-contexts, one more is checked -- the other parent of the original iue-context, pbi-context. It is given the key [Image 1]. It may have a color which is to be used by default for anything drawn in Image 1. If it does not, the get-att fails.

Parent links may have a limit on how many levels up to search. If this limit is 1, no recursion is done; only the parents explicitly specified in the list are searched. If it is 0, there is no limit. Also, a parent link may specify particular iue-contexts above which the search should not go. A parent link may also specify that if the key matches certain patterns, the link should or should not be used.

In the Class Hierarchy

Every iue-object has a put-att method and a get-att method. These look up attributes for the object in the class' iue-context (by default), or in a user-specified iue-context. The default key is a singleton, the object's ID; if a key is specified, the object's ID is prepended. A class' methods themselves use put-att and get-att to access attributes. Each class' default iue-context (context1) stores attributes on a per-object basis. It has a parent (context0) with a single set of default values for all objects of that class (the key is null). context0's parents are the context0s of classes from which this class is derived in the class hierarchy.

Each IUE class has a meta-object which describes it. A meta-object is of class iue-class. It has a list of the slots (attributes) of the class it describes. Each slot has a data type. For each attribute foo in a class' slot list, there are accessor methods put-foo and get-foo that accept or return the appropriate data type. Of course, methods put-att and get-att may be used too, just as they are used with attributes that are not slots of the class.

Most slots are ``pseudo-slots,'' a.k.a. ``soft slots.'' Like attributes that are not class slots, their values are stored in iue-contexts. But there are attributes which an object's put-att and get-att methods do not look up in the iue-context. This can happen in two ways.

First, certain attributes of a class may be hooks. These are functions that put-att or get-att call instead of looking in the iue-context. Hooks are specified as part of the definition of a class. They can calculate values for attributes. Thus, they allow the values of attributes to be constrained. They can update other attributes when a given attribute is changed. They allow attributes to be read-only. For example, the get-att hook for the diameter of a circle might actually return twice the circle's radius. If there were no corresponding put-att hook, the diameter attribute would be read-only. If there were one, it might actually change the circle's radius attribute. It would set it to half of the specified diameter.

Second, for efficiency, a class may have hard slots. These are attributes which are actually stored in the objects, as members. When the class' put-att or get-att methods access a hard slot, they do not look in an iue-context. They look directly in the object. For hard slots, the put-slotname and get-slotname methods can be just as fast as accessing a member using language primitives, because they are in-line functions (macros) which expand to those primitives. The advantage of hard slots is speed. The disadvantage is the loss of capabilities -- context-specific values, defaulting and (often) reduced storage.

If an attribute is not a hard slot in a class, it may be made a hard slot in a subclass. But if a class has a hard slot, all of its subclasses have it too. They cannot change it to a soft slot or a hook. (Aside from this, any kind of slot may be changed to any other kind.) Thus, classes which are intended to have subclasses should use few hard slots, so that the subclasses are not stuck with unnecessary hard slots. For efficiency, a class with soft slots may have a subclass in which those slots are ``hardened.'' The subclass need not reimplement its parent's methods -- put-slotname and get-slotname in the parent's methods will access the hard slots of the child. For the utmost speed, however, the methods may be copied, so that put-slotname and get-slotname will be expanded in-line by the compiler.

Hard slots and hooks are only accessible through the put-att and get-att methods of an iue-object. The put-att and get-att methods of an iue-context do not examine any class' slot list. They always just look up an attribute in the iue-context, using the key.

Figure 2: How dynamic attributes are used in the class hierarchy. [diagram]

Figure 2 shows an example. The class iue-circle inherits from iue-ellipse. Thus, iue-circle should have all the slots (attributes) and all the methods of iue-ellipse. For speed and efficiency, the radius and center of the circle are stored as hard slots. The two foci and the eccentricity of the ellipse are normal ``soft slots,'' or ``pseudo-slots.'' They cannot be hard slots, because then they would be hard slots for the circle too. We will want them to be hooks, not hard slots, in the circle. The color of a circle or ellipse is not stored in a class slot; it is an ordinary attribute, not specially associated with the definitions of the circle or ellipse classes. Any IUE object can have such arbitrary attributes. They only differ from soft slots in that no type checking is done when they are set, and the class provides no accessor methods (like put-center or get-center) for them.

Since a circle is a kind of ellipse, it must store eccentricity and foci. The methods for an ellipse (which exist for circles too) access those slots. But they must not be modified independently of the slots for radius and center! Thus, for the circle, they are hook functions. The get-att hook for eccentricity always returns 1. The get-att hook for each focus returns the center. There is no put-att hook for eccentricity, making it a read-only attribute. For the foci, the put-att hooks simply modify the center.

Let's say a get-att is done on circle23 without explicitly specifying an iue-context. If the attribute were a hook or a hard slot, no iue-context would be searched. Otherwise, by default, iue-circle's context1 is used. If the requested attribute is found there for circle23, it is returned. If the attribute is not there, its parent, iue-circle's context0, is searched. This iue-context does not have separate sets of attributes for individual circles; rather, it has one set of attributes for all circles. It is used only if no value is found for circle23 in context1. It has default values for radius and center. Its parent is iue-ellipse's context0, which has default values for eccentricity, foci and color. Eccentricity and foci will actually never be seen by iue-circles, since they have hook functions for those attributes; but iue-circles will see the default color. Thus circle17 is explicitly red and circle23 is, like all ellipses, black by default.

iue-circle's context0 and iue-ellipse's context0 have values for attributes which are hard slots. These will never be seen by ordinary put-atts and get-atts for circles and ellipses since they are hard slots. But they are used when instances are created, to give hard slots initial values. Also, arbitrary iue-contexts may use them as parents for defaulting purposes.

How Values Are Stored

Every attribute value in an iue-context is stored as an iue-value. An iue-value can be undefined, or any of several atomic types (integer, string, real number, byte, character), or an iue-object. It can also be a pointer to any of these, or an array of any of these. When an attribute's value is set to an iue-object, a natural copy of the iue-object is stored. If a retrieved value is an iue-object, a reference to the actual stored instance is returned. Its type method returns (as a string) its true data type (a subclass of iue-object). This can then be used in a switch statement to cast it down to the correct type. iue-value has a type method of its own which returns a name for the atomic type or the subclass of iue-object. The get-att and put-att methods have forms allowing the user to get any of the base C++ types directly.

Background: Inheritance and Hierarchies

We propose a scheme for dynamic attribute handling. It provides dynamic (user-configurable) data-inheritance hierarchies. As a consequence, it also permits data inheritance based on the static class hierarchy. Before we can discuss the advantages of this mechanism we need to address general issues of inheritance. The discussion assumes an object-oriented programming language (OOPL) that provides for subtyping, interface sharing and code sharing (both C++ and CLOS do).

Almost by definition, static class inheritance does not provide much flexibility for the user. The user can inherit from classes in the existing system, but cannot add new inheritance links or modify existing links. Yet, in many IU applications (and AI applications in general), there is a common need to group objects together and associate properties with the group, where each element in that group inherits any property shared by the group. For single-level grouping, standard language mechanisms can provide the needed functionality. A set object with attributes could do it. But this would not allow for hierarchical groupings. Thus, building an is-a hierarchy would be impossible for a user. The dynamic attribute mechanism being proposed solves this by providing the user with dynamic grouping via inheritance. We note that the compiler cannot optimize access to arbitrary attributes, since inheritance links do not necessarily exist at compile time. (There might be interesting run-time path compression and/or caching mechanisms to make this access faster at run-time).

In object-oriented programming (OOP), static inheritance has been used for numerous (not always consistent) purposes including subtyping, shared code, shared interfaces and shared data fields. Let us now look at these different intended purposes and their implications for the underlying system.

Subtyping refers to the use of inheritance to provide another class which, in a strict mathematical sense, is a subtype (a restricted form) of its parents. It has all of its parents' methods; it may contain additional information and methods; the information it stores may be constrained in some way. For example, a class of ellipse might have a subclass (subtype) of circle, since every circle can be considered an as an ellipse. This approach to inheritance provides mathematically consistent hierarchies and, in our opinion, logically cleaner designs. While it is not required to do so, subtyping can also provide a basis for some code sharing, data sharing and interface sharing. Subtyping is the main goal in the class inheritance hierarchy of IUE. The code, data and interface sharing that comes along with that approach is a secondary concern.

Interface sharing refers to the use of inheritance to provide interfaces which are shared by all subclasses of a given class. It also allows (even encourages) classes to provide more efficient local versions of methods, as long as the interface is the same. This is one of the primary forces in OOP today, and definitely a feature of inheritance used in the IUE. This type of inheritance provides consistency and generality, in that any method of a class should also work for any subclass. (It will function correctly only if the semantics of the interface are also maintained). A hierarchy built using subtyping naturally provides shared interfaces since, mathematically, an instance of a subclass includes an instance of its superclass(es). Locally reimplementing inherited methods can provide greater efficiency, and may be necessary if the internal representation has changed.

Code sharing refers to the use of inheritance to provide code in one class that can be used, directly, by a subclass. This has two main advantages: It saves coding during development, and can simplify maintenance by localizing code changes. It has two main difficulties. The first is that designing for code reuse can cause difficulties since it involves producing generic code for any subclass. This can be very difficult if the data of a superclass is not shared with a subclass, or if a subclass is a mathematical restriction of a superclass. The second and more dangerous difficulty is that it can lead to implementation-dependent hierarchies. This occurs because unrelated objects which happen to be implemented in a similar manner may be connected in a hierarchy just so they can share code. This can lead to difficulties in maintenance since changing one implementation might have a significant impact on the hierarchy. Obviously, if a hierarchy shares code, it also shares interfaces.

Data sharing refers to the use of inheritance to provide shared data field definitions for all descendants of a given class. This is probably the weakest reason for building a static hierarchy, though many OOPLs (both C++ and CLOS) actually have this as a natural part of their inheritance mechanisms. In this approach, data fields are added to classes as one proceeds down from the root, with all fields below a node necessarily containing at least as much data. As we shall see, this mechanism is often at odds with subtyping and code sharing.

We now briefly review some of interactions of these different goals in inheritance schemes. We will often use the example of an ellipse class and a circle subclass/subtype. This provides for a concrete discussion, though the issues being discussed are all general. We proceed assuming that subtyping is the main goal of the hierarchy, with interface sharing and code sharing as secondary goals. The main problem occurs because subtyping is often directly at odds with data sharing via inheritance and causes difficulties with code sharing.

The incompatible interaction of subtyping and data sharing occurs because a subtype often requires less data than a supertype. It is very common, from a mathematical point of view, to generate a subtype by restricting a parameter or property of an existing class. This often takes the form of setting the parameter to a constant, or constraining a number of parameters to a functional relation. This restriction is generally dependent on the representation of the superclass and need not be unique. E.g., for the ellipse vs. circle, we can see a circle as an ellipse with both foci at the same point, or as an ellipse with major and minor axes of the same length (and a meaningless orientation parameter) or as an ellipse where the curvature is a constant function of arc length. But in an OOPL where data fields of a class are automatically data-fields of all its subclasses (C++ and CLOS, for example), almost any representation of a ellipse will result in extraneous fields for a circle. This obviously wastes space, and in a deep hierarchy (which is mathematically appropriate) with many instances of the leaf classes, this space problem can become acute. In the IUE, if a 2D-line (which only needs 4 parameters) inherits from a 2D-conic (which has, say, 4 more slots) which inherits from 2D-curve (another 2 slots) which inherits from 3D-curve (another 4 parameters) which inherits from spatial-object (no new slots), which inherits from iue-object (3 slots), we end up with an object with 17 slots instead of the 4-7 which are needed. If we have 10,000 or so 2D-lines in a large image and 10-100 images, the wasted space is staggering.

The other problems with subtype-based inheritance have to do with code sharing, and are intimately tied to the storage/representation problem mentioned above. One issue arises in methods that read slots, another in methods that write them. If we want all read-only methods of an ellipse to be applicable to a circle (which, if we are subtyping, they mathematically are) then we must have ways of having them access all of the same slot values (e.g. first and second foci). If we represent a circle as having only one focus, which we call the center, then any methods that access the foci will fail. We could, of course, recode these methods, but that defeats the goals of code reuse. In some OOPLs we could simply provide reader methods for the non-existent slots. However in some OOPLs, such as C++, slot access is not by method invocation but rather by direct access.

The second subtyping/code reuse issue has to do with methods that modify slots. If a method modifies one focus of an ellipse, the ellipse changes shape. If one modifies the first focus of a circle then either one intends the circle to change shape or one intends the circle to move. If it is the first case, we believe the user should first promote the object to be an ellipse (an explicit call) and then modify its shape. For a subtype that is defined by mathematical constraints, the attempt to modify a restricted slot should either modify it according to the restriction (e.g. change both foci), or return that the slot cannot be changed. As we shall see, the proposed dynamic attributed mechanism allows either forcing the slot to be read-only or modifying other slots so that constraints are satisfied.

Motivation: Dynamic Attributes

This section discusses most of the motivating factors in development of the dynamic attributes.

First and foremost, we believe in the subtyping-based approach to building hierarchies, and the biggest motivation for the class of dynamic attributes is to help solve some of the subtyping-related problems discussed above. Chief among the subtyping problems we hope to solve was the issue of space vs. code sharing and flexibility. To allow code to be reused, we wanted to be able to access all important parameters, in at least a read-only manner, for all subtypes. Thus, if our implementation of an ellipse uses 2 foci, we should be able to access both foci of a circle, without requiring storage for two. The dynamic attribute mechanism solves this by allowing objects to have hooks for certain attributes. These attributes are not looked up in an iue-context. Instead, a hook function is called. In the case of a read (a get-att), this function may calculate the required value from other attributes. The focus2 attribute of a circle might call a hook that actually looks up the circle's center attribute. If a user tried to change such an attribute (with a put-att), two approaches could be used. There could be no hook for changing it, in which case the attempt would be an error, and the attribute would be read-only; or there could be a hook which changed the appropriate attributes, e.g. the circle's center. Hooks are specified as part of the definition of a class.

If the definition of an ellipse is modified to no longer have slots for foci, the definition of circle does not need to be changed, because the mathematical relation between circle and foci is still, of course, true. Note that hooks also provide a mechanism for defining parameters which are always constant for a subtype. The eccentricity attribute of a circle could be fixed at 1.

A related motivation for DAs has to do with attributes we would like to associate with many (most) objects. Such attributes are only occasionally used, and/or have predetermined intelligent defaults, and/or have a common, shared value for many objects. Examples of these types of attributes include: the comment field associated with an object; the reflectance model for a surface; the display color of an object. For uncommon attributes, allocation of a slot in the object to allow storage of this field would generally waste space. Yet it is convenient for the interface to assume that such a field can always be accessed (maybe with intelligent defaults if it is not there). For intelligent defaults, it again would be a waste to store such information in each instance of the class since it is likely to be identical for almost all instances. Shared attributes can be common to objects that are potentially unrelated in the static class hierarchy, e.g. the attribute of display-color. While we might store this with each object, if we have a collection of 10,000 edges and want them all the same color, it is again a waste of space to store it for each one separately. An ancillary benefit of sharing the attribute is that by changing one value (in a destructive manner, which is not the default behavior), we can change the property for a large collection of related objects. Again, display color provides a convenient example.

Some of these attributes would be stored in a class' default iue-context at compile time. Intelligent defaults would most likely be declared at compile time and stored there. Other attributes, such as those shared by arbitrary groups of objects, would be stored at run time in iue-contexts created by user code. In any case, an object's get-att method first checks the iue-context to see if there is a value specific to it. (In fact, it may be checking something even more specific -- a value for that object in combination with particular other objects.) If the attribute is not found, a value that is shared with other objects may be found in one of the parents of the iue-context. For instance, each class' default iue-context stores attributes on a per-object basis; it has a parent with a single set of values for all objects of that class (the key is null).

These sharing examples also provide a motivation for the multi-object keys used by DA. Often, we might want these shared attributes to be associated with other objects. An object may have different values of an attribute in combination with different other objects. This motivates the need for a hierarchical structure with flexibility in the way attributes are searched for. For example, we might want a collection of line segments to be colored blue when they are displayed in window 1 and green when they are in window 2. But we want line segment 94 to be an exception: It is red in window 1 and ochre in window 2. Window 3 has no default color for line segments; they are displayed in whatever inherent color they possess. This could be specific to a line segment, or it could be the class default for line segments. (Or maybe windows have a default display color. Maybe line segments do not.) An iue-context hierarchy, using different subsets of the key (line-ID, window-ID), could automatically find a display color by sequentially checking all of these possibilities.

In addition, we wanted a mechanism to allow users to add properties, not necessarily anticipated by the system builders, to objects and later be able to retrieve these properties. (Again, because these properties are often context sensitive, we still wanted a multi-object key to access the properties.) The difficulty here is that the types of such attributes will not be known ahead of time, which is a problem in C++. The solution is to store every value as an iue-value. An iue-value can be undefined, or any of several atomic types (integer, string, real number, byte, character), or an iue-object, or a pointer to or array of any of these. Its type is returned by the type method. For iue-objects, the specific type of object is returned. The get-att and put-att methods will have forms allowing the user to get any of the base C++ types directly. (These methods will exist in the CLOS side too. Maybe they will be useful for compilers.)

We wanted the user not only to be able to define data hierarchies dynamically, allowing an object to have a property that it inherits from another, but also to save these related objects. That is, we want the DA mechanism to support a kind of object collection. Because there may be a large space overhead for this ability (we need to keep at least 1 extra pointer for each object in the collection), we chose to have a subclass of iue-context which support collection. These collectible iue-contexts have additional overhead at runtime. There are still some thorny issues about exactly what should be in the collection, and what additional methods should be supported on collections.

Recall that an iue-context is basically an indexed table of attribute lists. The key is a fixed-length sequence of object identifiers. In addition, each iue-context can be part of a hierarchy. This is accomplished with a parent list, which for each parent specifies which part of the key to use. The parent list is set at run-time, though certain special iue-contexts, and their parent links, are declared at compile time. The lookup mechanism works as follows: given a key and an attribute, an iue-context sees if it has a value stored. If it does, it returns it. If it does not, it asks its parents, in order, possibly recursively. Modifying an attribute always makes a local entry in the iue-context. To make inheritance transparent to the user (by default), a copy-on-write mechanism makes each iue-context appear to have its own copy of a value, even if it was actually obtained by inheritance. There is, however, a method to make a change in value visible to an iue-context's children. Finally, there is a mechanism to allow an iue-context (in a hierarchy) to maintain a collection of all objects that have queried it, and save/restore such collections.

Contact: Sam Fenster, Computer Science Dept., Columbia University