Attribute Lists and the ATL project


An attribute list is a set of name-value pairs.  Attribute lists provide a mechanism through which operational information and configuration values for various layers of the communication system can be passed through other layers that have no prior knowledge of them.
Attribute lists are used in various places in the ECho and CM systems.  Generally, each layer defines the set of attributes that it creates and understands.  This document does not describe individual attributes; it defines only the general properties and use of attribute lists.
 

History

Version 1 -- 1996-1999

Attribute lists were first used in the Object Transport Layer (OTL), a fast, configurable object infrastructure whose development began in 1996.  One of the key decisions in the early implementation concerned the nature of the name portion of the name-value pairs.  In particular, since attributes were a core component of OTL, a decision was made not to use strings for the name because of the overhead of doing string comparisons in attribute lookup.  So the initial implementation introduced atoms, a concept borrowed from the X Window System.  In X, atoms are 32-bit values which are associated with particular strings and are used to represent characteristics of windows and other X entities.  The X server is responsible for assigning an atom value to a particular string and for maintaining that association.

In OTL, the string-atom association was maintained by the atom_server, a third-party server running on a well-known host/port.  Individual client processes would request the atom value for a particular string using the OTL subroutine, attr_atom_from_string().  That subroutine would contact the atom_server to get the atom associated with the string.  A client-side cache was used to limit traffic to the atom_server and speed requests.  The cache could be pre-loaded with all the associations known to the atom_server.  This implementation had two key characteristics:

A typical use of attributes looked like this:
{
    static atom_t home_host_atom = 0;
    attr_list attrs;
    attrs = create_attr_list();
    if (home_host_atom == 0) {
        /* first time through, get atom translation */
        home_host_atom = attr_atom_from_string("OTL:HOME_HOST");  
    }
    add_attr(attrs, home_host_atom, Attr_String, (attr_value)strdup(host_name));
}

Version 2 -- 1999-2002

In 1999, a new package, Connection Manager (CM), was introduced.  CM was designed as a configurable communication system that used attributes to pass information and configuration parameters from applications down to plug-and-play transport layers.  To support this, attribute list support was separated from OTL and placed in its own project, ATL.  However, because the original atom_server was based on DataExchange, an incompatible communication system, the use of a third-party server to maintain the string/atom association was eliminated.  Essentially, the implementation kept the caches of the original implementation but eliminated the server communication by assigning atom values within each client.  As a result, the attribute API was unchanged from Version 1, but atom values were no longer guaranteed to agree between clients, so the 32-bit atoms could not be used on the wire.  String names were used instead, with the result that attribute communication was slow and encoded attributes were bulky.  A further complication in this version was that the internal representation of attribute lists used a C-style union to save memory.  This made it difficult to use PBIO to transport attributes, because PBIO does not support union types.

Version 3 -- August 2002

The current version of ATL has been significantly reworked, at the cost of changes to the API.  The principal goal of the redesign was to allow attributes to be used in efficient interprocess communication.

At the heart of the ATL changes was a reworking of atom assignment.  We wanted to return to the use of 32-bit atoms on the wire, but without reintroducing required communication with a third-party atom server.  The atom_server had two disadvantages: we were pushing our software into kernels and embedded systems, where off-client communication was difficult, and we were venturing into wide-area systems, where requiring all communicating components to use a single common server was impractical.

Our approach is based on the observation that if two software layers wish to communicate via attributes, they must first agree upon a name for the attribute.  Given this, we felt it reasonable to require those layers to also agree upon a 32-bit atom value to be associated with that string name.  A 32-bit atom space allows for roughly 4 billion possible values, making accidental collisions unlikely.  (Though our suggested approach for defining names probably increases the likelihood by limiting the space that is typically used.)

We still maintain a third-party atom_server, but rather than being a necessary part of the system, it now has a lesser role.  In particular, while the atom server is advised of the atom/string translations that are in use, it no longer assigns them.  Instead, it is present to warn of possible collisions and to allow textual "dumping" of attribute lists by layers that might not be privy to the atom/string assignments made by others.  Communication with the new atom_server is via UDP, and normal operation should be possible even if the atom_server is not present.

Generally, we recommend picking a 32-bit atom value by combining the binary values of 4 chosen characters.  For example, the CM package has a "PEER_IP" attribute for which it uses a 32-bit atom value composed of the characters 'C', 'P', 'I', and 'P' concatenated (0x43504950).  The "PEER_HOSTNAME" attribute uses 'C', 'P', 'H', 'O' (0x4350484F) and the "IP_PORT" attribute uses 'C', 'I', 'I', 'P' (0x43494950).  It is hoped that with judicious choices of character values and the use of the atom_test program to probe the in-use atom name space, conflicts will be infrequent.

As a result of the redesign, the old API is deprecated.  Parts of it still work at the moment, but its behavior is not guaranteed and it should not be used in new code.  Instead, this is the recommended approach to atom assignment and attribute use.

#define CM_IP_HOSTNAME ATL_CHAR_CONS('C','I','P','A')

{
    static int atom_init = 0;   /* static, so the flag persists across calls */
    if (atom_init == 0) {
        /* first time through, set string/atom association */
        set_attr_atom_and_string("IP_HOST", CM_IP_HOSTNAME);
        atom_init++;
    }
    add_attr(attrs, CM_IP_HOSTNAME, Attr_String, (attr_value)strdup(host_name));
}
Note the use of the ATL_CHAR_CONS() macro to construct the 4-byte atom value.  This is just a shortcut to avoid looking up the hex values of characters and typing them in.  The call to set_attr_atom_and_string() sends a message to the atom server to register the string/atom association.  If the atom server was previously unaware of the association, or if it matches an association that was already known, the atom server does not respond.  If the atom server sees a conflict with a previously registered association, it returns a warning message to the client.  However, communication with the atom server is asynchronous, so the warning is not printed immediately; it may instead appear during a later communication with the atom server.
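
The registration rule just described (silent for new or matching associations, a warning on conflict) can be illustrated with a small in-memory table.  This is a sketch only, not the atom_server's implementation: register_assoc(), the reg_result codes, and the fixed-size table are all hypothetical, and the sketch assumes that a conflict means the same atom paired with a different string or the same string paired with a different atom.  The real server is contacted over UDP and reports conflicts asynchronously.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_ASSOC 64   /* hypothetical table capacity */
#define MAX_NAME  32   /* hypothetical name length limit, including '\0' */

struct assoc {
    uint32_t atom;
    char name[MAX_NAME];
};

static struct assoc table[MAX_ASSOC];
static int table_count = 0;

enum reg_result { REG_OK = 0, REG_CONFLICT = 1, REG_FULL = 2 };

/* Record a string/atom association.
 * REG_OK:       the pair is new, or identical to one already recorded.
 * REG_CONFLICT: the atom or the name is already bound to something else;
 *               the existing entry is left unchanged.
 * REG_FULL:     the table is full or the name is too long. */
static enum reg_result register_assoc(const char *name, uint32_t atom)
{
    assert(name != NULL);
    for (int i = 0; i < table_count; i++) {
        int same_atom = (table[i].atom == atom);
        int same_name = (strcmp(table[i].name, name) == 0);
        if (same_atom && same_name)
            return REG_OK;          /* already known, nothing to report */
        if (same_atom || same_name)
            return REG_CONFLICT;    /* this is where a warning would be issued */
    }
    if (table_count == MAX_ASSOC || strlen(name) >= MAX_NAME)
        return REG_FULL;
    table[table_count].atom = atom;
    strcpy(table[table_count].name, name);
    table_count++;
    return REG_OK;
}
```

For example, registering "IP_HOST" with 0x43495041 ('CIPA') twice returns REG_OK both times, while a later attempt to register a different name with that same atom returns REG_CONFLICT.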

One of the most-used calls from the prior ATL implementation, attr_atom_from_string(), has no role in the current implementation.  Essentially, any layer which knows the string name of an attribute should already know the 32-bit value associated with it.  The sister call, attr_string_from_atom(), should be used only by routines "dumping" unknown attributes.

Another change in the most recent version of ATL is a modification of the "stringification" technique used for attributes.  In prior versions, attributes could be converted to and from strings with the calls attr_list_to_string() and attr_list_from_string().  The strings produced looked something like this: "{IP_ADDR,4,-2100361916},{IP_PORT,4,63715},"  That is, they were human-readable, but contained special characters like '{' that were meaningful to the shell.  This was a nuisance when attempting to pass event channel names or CM contact lists as command line arguments.  Because of this, ATL now uses a base64 encoding of the marshalled form of the attribute list as its "stringified" form.  This new form looks more like: "AQIAAENJUEGCzwVEQ0lQUAAA+04=".  This form is more efficient to create and decode and has no special character problems, but readability is lost.  To counter the lost readability, a new utility program has been introduced.  Running 'attr_dump' with an encoded attribute list as an argument produces a textual dump of the attribute list:

latte% attr_dump "AQIAAENJUEGCzwVEQ0lQUAAA+O4="
[
    { IP_ADDR ('CIPA'), Attr_Int4, -2100361916 }
    { IP_PORT ('CIPP'), Attr_Int4, 63726 }
]
latte%
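
To see what the encoded string carries, it can be run through an ordinary base64 decoder.  The sketch below is a minimal standard-alphabet decoder written for this illustration; b64_value() and b64_decode() are hypothetical helper names, not ATL calls, and the code assumes an ASCII execution character set.  The layout of ATL's marshalled form is not described in this document, so the only claim made here is about the decoded bytes themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map one base64 character to its 6-bit value, or -1 if it is not in
 * the standard alphabet.  Assumes ASCII, where letters and digits are
 * contiguous. */
static int b64_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Decode base64 text into out, stopping at the end of the string or at
 * the first '=' padding character.  Returns the number of bytes written,
 * or -1 on an invalid character or if out is too small. */
static int b64_decode(const char *in, unsigned char *out, size_t out_cap)
{
    unsigned int acc = 0;   /* bit accumulator; only the low bits matter */
    int bits = 0;           /* number of not-yet-emitted bits in acc */
    size_t n = 0;

    assert(in != NULL && out != NULL);
    for (; *in != '\0' && *in != '='; in++) {
        int v = b64_value(*in);
        if (v < 0)
            return -1;
        acc = (acc << 6) | (unsigned int)v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            if (n == out_cap)
                return -1;
            out[n++] = (unsigned char)((acc >> bits) & 0xFF);
        }
    }
    return (int)n;
}
```

Decoding the argument passed to attr_dump above yields 20 bytes.  Bytes 4 through 7 are the characters 'CIPA' and bytes 12 through 15 are 'CIPP', matching the atoms in the dump.  The four bytes after each atom are 82 CF 05 44 and 00 00 F8 EE, which read as big-endian 32-bit integers are -2100361916 and 63726, so the values appear to be carried in network byte order.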
 

 



This page is maintained by Greg Eisenhauer