Directory Services Overview

This chapter describes the fundamental concepts of X.500 and related directory services. Each aspect of the directory service is described by a corresponding model [30].


Information Model

Directories contain information about real world objects. Information concerning one object is stored in an entry--the basic building block of a directory (see Figure [*]).

Figure: DIT and Entry Structure [38]

An entry is a collection of name-value pairs called attributes. There can be more than one value associated with an attribute name--a person might have multiple first names, for example. However, some attributes may only have one value--a person has exactly one date of birth. Attributes of that kind are said to be single-valued; the others are called multi-valued.

Each attribute has a syntax, which describes the type of information that can be stored in this attribute. This might be a character string, a number, a photo or some complex data type, e.g. a certificate for digital signatures. The definition of an attribute also lists its matching rules. These rules govern how values of this attribute type are compared. It is possible to specify rules for equality and substring matching, as well as a rule that determines how values of this attribute are ordered--the ordering matching rule.
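
The relationship between entries, attributes and matching rules can be illustrated with a small Python sketch. It is not taken from any directory implementation: the class names, the ``dateOfBirth'' attribute and the case-ignoring equality rule are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def case_ignore(value: str) -> str:
    """Hypothetical equality rule: ignore case and redundant whitespace."""
    return " ".join(value.split()).lower()


@dataclass(frozen=True)
class AttributeType:
    name: str
    single_valued: bool = False
    # The equality matching rule normalises values before they are compared.
    equality: Callable[[str], str] = case_ignore


@dataclass
class Entry:
    attributes: Dict[str, List[str]] = field(default_factory=dict)

    def add_value(self, attr: AttributeType, value: str) -> None:
        values = self.attributes.setdefault(attr.name, [])
        if attr.single_valued and values:
            raise ValueError(f"{attr.name} is single-valued")
        # Two values that are equal under the matching rule count as duplicates.
        if any(attr.equality(v) == attr.equality(value) for v in values):
            raise ValueError(f"{attr.name} already contains {value!r}")
        values.append(value)

    def matches(self, attr: AttributeType, value: str) -> bool:
        wanted = attr.equality(value)
        stored = self.attributes.get(attr.name, [])
        return any(attr.equality(v) == wanted for v in stored)
```

With this sketch, a multi-valued attribute accepts several first names, whereas a second value for a single-valued attribute is rejected.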

Objects in the real world often show similarities and are therefore said to be of a particular kind. The directory reflects this grouping principle through object classes. Object classes specify the attributes that an entry must or may contain. Attributes that an entry must contain are called mandatory; attributes that may be present are called optional. It is possible to derive an object class from another one, which allows the design of an object class hierarchy. Subclasses inherit all mandatory and optional attributes of their superior class and can incorporate additional ones. It is also possible to declare an inherited optional attribute as mandatory for the subclass.
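
The inheritance of mandatory and optional attributes can be sketched as follows. The attribute lists are simplified, and the ``exampleEmployee'' subclass is hypothetical; it only serves to show an inherited optional attribute being declared mandatory.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ObjectClass:
    name: str
    must: FrozenSet[str] = frozenset()
    may: FrozenSet[str] = frozenset()
    superior: Optional["ObjectClass"] = None

    def all_must(self) -> FrozenSet[str]:
        """Mandatory attributes, including those inherited from superiors."""
        inherited = self.superior.all_must() if self.superior else frozenset()
        return inherited | self.must

    def all_may(self) -> FrozenSet[str]:
        """Optional attributes, including inherited ones."""
        inherited = self.superior.all_may() if self.superior else frozenset()
        # An attribute promoted to mandatory is no longer merely optional.
        return (inherited | self.may) - self.all_must()


top = ObjectClass("top", must=frozenset({"objectClass"}))
person = ObjectClass(
    "person",
    must=frozenset({"cn", "sn"}),
    may=frozenset({"telephoneNumber", "description"}),
    superior=top,
)
# Hypothetical subclass: telephoneNumber becomes mandatory.
employee = ObjectClass(
    "exampleEmployee",
    must=frozenset({"telephoneNumber"}),
    superior=person,
)
```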

Three types of object classes exist: abstract, structural and auxiliary. Object classes of the abstract type are used to form the upper levels of an object class hierarchy. Entries can only be added to the directory if they meet the requirements of at least one structural object class. Structural object classes reflect the principal fabric of an object. Common structural object classes are, for example, ``person'' and ``organization''. The object classes to which an entry conforms are listed in its ``objectClass'' attribute. The ``objectClass'' attribute is introduced as a mandatory attribute by the ``top'' object class. ``top'' forms the root of the object class hierarchy. All other object classes are directly or indirectly derived from it. This ensures that every entry has at least one object class.

Sometimes the need arises to store additional data that is not strictly tied to the structure of an object or that may not be present for all objects of a particular class. This additional data can be stored in attributes that are governed by auxiliary object classes. With auxiliary classes, attributes can be introduced as a mandatory requirement for a subset of entries that have the same structural class. An auxiliary class can also be added to entries that have different structural classes, e.g. both a person and an organization could have a homepage, which would be stored in the common ``labeledUri'' attribute.

When several applications with partly different needs employ the directory as their data storage, the advantage of introducing auxiliary classes over deriving from structural object classes becomes obvious: the latter approach would require a structural class for every combination of attributes (or sets of attributes), whereas auxiliary classes allow the addition of object classes to existing entries as needed.
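
A minimal sketch of how an entry could be checked against one structural class plus any number of auxiliary classes is given below. The schema table is a simplified illustration, not a complete schema definition, and the function name check_entry is an assumption of this sketch.

```python
# Simplified, illustrative schema: class name -> (kind, mandatory, optional).
SCHEMA = {
    "person": ("structural", {"cn", "sn"}, {"telephoneNumber"}),
    "organization": ("structural", {"o"}, {"description"}),
    "labeledURIObject": ("auxiliary", set(), {"labeledURI"}),
}


def check_entry(object_classes, attributes):
    """Return a sorted list of schema violations (empty if the entry conforms)."""
    problems = []
    kinds = [SCHEMA[oc][0] for oc in object_classes]
    if "structural" not in kinds:
        problems.append("entry needs at least one structural object class")
    # The allowed attributes are the union over all listed object classes.
    must = set().union(*(SCHEMA[oc][1] for oc in object_classes))
    may = set().union(*(SCHEMA[oc][2] for oc in object_classes))
    for name in must - set(attributes):
        problems.append(f"missing mandatory attribute {name}")
    for name in set(attributes) - must - may:
        problems.append(f"attribute {name} not allowed")
    return sorted(problems)
```

The same auxiliary class can be attached to a person and to an organization, so no additional structural class is needed for either combination.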

There is one very special structural object class: alias. Entries of this type do not actually contain information on objects but rather act as placeholders that point to other entries. With the use of aliases it is possible to access the same data under a different name.

The collection of syntax, matching rule, attribute type and object class definitions is called the schema. This meta information governs what may be stored in the directory.


Naming Model

All entries in the directory are arranged in a hierarchical manner, forming the Directory Information Tree (DIT) (see Figure [*]). In this respect, directories are similar to file systems. There is, however, one decisive difference: an entry in a directory can hold information itself in addition to being a container for other entries, whereas an object in a file system is either a directory or a file.

Each entry in a directory is identified by its Distinguished Name (DN). It is composed by concatenating the entry's Relative Distinguished Name (RDN) with those of its superiors along the path to the root entry, whose RDN is a hypothetical empty string (``''). Figure [*] shows how entries in a directory could be named. An RDN consists of an attribute name, an equals sign and the attribute value (e.g. ``cn=Sam Carter''). The attribute used in the RDN is called a naming attribute. RDNs must be different for all siblings in the tree. This ensures that every entry has a unique DN.
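
The composition of a DN from RDNs can be sketched in a few lines of Python. The sketch makes several assumptions: RDNs are given as (attribute, value) pairs ordered from the root downwards, escaping of special characters is ignored, and sibling RDNs are compared without regard to case.

```python
from typing import List, Tuple

RDN = Tuple[str, str]  # (naming attribute, value)


def format_dn(path_from_root: List[RDN]) -> str:
    """Build a DN string; the entry's own RDN comes first."""
    # Note: special characters in values are not escaped in this sketch.
    return ",".join(f"{attr}={value}" for attr, value in reversed(path_from_root))


def siblings_unique(rdns: List[RDN]) -> bool:
    """Check that no two siblings share the same RDN (case-insensitively)."""
    normalised = [(attr.lower(), value.lower()) for attr, value in rdns]
    return len(set(normalised)) == len(normalised)


path = [("dc", "de"), ("dc", "dfn"), ("cn", "Sam Carter")]
print(format_dn(path))
```

An empty path yields the empty string, which corresponds to the hypothetical RDN of the root entry.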

Figure: A namespace example

X.500 initially followed a naming scheme based on geographic or national regions. Since acquiring a registered name in this scheme proved cumbersome, a new naming scheme based on the Domain Name System (DNS) was introduced [50]. To map a DNS name to a DN, the dc attribute--short for domain component--is used: ``directory.dfn.de'' thus maps to ``dc=directory, dc=dfn, dc=de''.
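
The mapping from a DNS name to a DN is mechanical and could be sketched as follows. The function name domain_to_dn is an assumption of this sketch, and the result is written without spaces after the commas.

```python
def domain_to_dn(domain: str) -> str:
    """Map a DNS name to a DN built from dc (domain component) RDNs."""
    # Drop a trailing root dot and ignore empty labels.
    labels = [label for label in domain.strip(".").split(".") if label]
    return ",".join(f"dc={label}" for label in labels)


print(domain_to_dn("directory.dfn.de"))
```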


Functional Model

The functional model describes the means by which information is accessed in the directory. It defines the operations by which the user--by means of a program called Directory User Agent (DUA)--interacts with the application providing the directory service--the Directory System Agent (DSA). Operations can be classified into three groups [30].


Interrogation

The primary use of directories is to provide information. To request such information the search operation is used. How a search is actually performed is controlled by the following parameters:
baseDN--gives the node in the DIT from which the search starts.
scope--either ``base'', ``one-level'' or ``subtree''. If ``base'' is specified, only the entry specified by baseDN is searched. With ``one-level'', only entries directly below the baseDN entry are considered as candidates in the search. As the name implies, ``subtree'' searches through all entries in the subtree whose root is formed by the baseDN entry.
filter--what is actually searched for. This could be as simple as ``givenname = Sam'' or a complex nested logical expression, e.g.:
``( & (objectclass=person) (modifyTimeStamp>=200101010000Z)
( | (gn=Sam) (gn=Ted) ) )''
Other parameters determine which attributes should be returned, whether time and size limits have to be observed, and whether aliases should be dereferenced.
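
How baseDN, scope and filter interact can be illustrated with a small in-memory sketch. It rests on several assumptions that are not part of any standard: DNs are represented as tuples of RDN strings with the entry's own RDN first, the filter is an ordinary Python predicate rather than a parsed filter string, values are compared case-sensitively, and timestamps are compared as plain strings. The sample entries are invented.

```python
from typing import Callable, Dict, List, Tuple

DN = Tuple[str, ...]  # RDNs, entry's own RDN first
Entry = Dict[str, List[str]]


def in_scope(dn: DN, base: DN, scope: str) -> bool:
    """Decide whether dn is a search candidate for the given base and scope."""
    depth = len(dn) - len(base)
    # dn lies at or below base only if base is a suffix of dn.
    if depth < 0 or dn[depth:] != base:
        return False
    if scope == "base":
        return depth == 0
    if scope == "one-level":
        return depth == 1
    if scope == "subtree":
        return True
    raise ValueError(f"unknown scope: {scope}")


def search(dit: Dict[DN, Entry], base: DN, scope: str,
           matches: Callable[[Entry], bool]) -> List[DN]:
    return [dn for dn, entry in dit.items()
            if in_scope(dn, base, scope) and matches(entry)]


def example_filter(entry: Entry) -> bool:
    """Predicate equivalent of the nested filter shown in the text."""
    return ("person" in entry.get("objectClass", [])
            and any(ts >= "200101010000Z" for ts in entry.get("modifyTimeStamp", []))
            and any(gn in ("Sam", "Ted") for gn in entry.get("gn", [])))


dit = {
    ("dc=de",): {"objectClass": ["domain"]},
    ("dc=dfn", "dc=de"): {"objectClass": ["domain"]},
    ("cn=Sam Carter", "dc=dfn", "dc=de"): {
        "objectClass": ["person"], "gn": ["Sam"],
        "modifyTimeStamp": ["200106151200Z"]},
    ("cn=Ted Morris", "ou=Staff", "dc=dfn", "dc=de"): {
        "objectClass": ["person"], "gn": ["Ted"],
        "modifyTimeStamp": ["200103010800Z"]},
}
```

Searching below ``dc=dfn, dc=de'' with scope ``one-level'' considers only direct children of that entry, whereas ``subtree'' also reaches entries further down.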

The compare operation checks whether a specific entry contains an attribute with a given value. Its semantics can be reproduced by a tailored search operation; it remains in the standard for historical reasons.


Modification

Before information can be retrieved from the directory, it first has to be populated with entries. Two operations exist to add and delete entries in the directory, namely add and delete. Furthermore, the modifyDN operation renames and/or moves an entry in the DIT. Entries can also be changed at the attribute level: using the modify operation, new values can be added to an attribute, particular values removed from an attribute, or all values replaced by new ones.

Although directories offer no support for transaction mechanisms, operations that modify an entry are atomic. Either all attribute changes can be committed to the entry or none at all.
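
The all-or-nothing behaviour of a modify operation can be sketched by applying the changes to a working copy and returning it only if every change succeeds. The (operation, attribute, values) change format and the error handling are assumptions of this sketch, not the wire format of any protocol.

```python
import copy


def modify(entry, changes):
    """Apply (op, attribute, values) changes atomically; return the new entry."""
    working = copy.deepcopy(entry)  # the original stays untouched on failure
    for op, attr, values in changes:
        current = working.setdefault(attr, [])
        if op == "add":
            for value in values:
                if value in current:
                    raise ValueError(f"{attr} already has value {value!r}")
                current.append(value)
        elif op == "delete":
            for value in values:
                if value not in current:
                    raise ValueError(f"{attr} has no value {value!r}")
                current.remove(value)
        elif op == "replace":
            working[attr] = list(values)
        else:
            raise ValueError(f"unknown operation: {op}")
        if not working[attr]:
            del working[attr]  # an attribute without values is not stored
    return working
```

If any single change raises an error, the caller never sees the partially modified copy, so the entry is either changed completely or not at all.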


Authentication and Control

Some operations in the directory require special privileges. Before these can be carried out, the identity of the client has to be ascertained. This is done using the bind operation (see also Section [*]). A client can terminate a connection by issuing the unbind request. Upon receiving such a request, the DSA will discard all outstanding requests from this client and close the connection. The abandon operation is sent by the DUA if the results of an operation initiated earlier are no longer required--e.g. when a user has clicked on a ``Cancel'' button in a graphical user interface (GUI).


Distribution Model

Due to the hierarchical structure of directories, directory services are well suited to be implemented as distributed systems. To achieve this, the DIT is partitioned into smaller areas, each of which is a connected subtree that does not overlap with any other partition.

Figure: DIT Partitioning and Knowledge Information [4]

Figure [*] shows this approach. A separate server masters each such partition. The links between partitions are known as knowledge information and can themselves be stored in the directory. If a client issues a request that affects entries outside the partition mastered by the server, the server has two options. The first one is to retrieve this information from the responsible server on behalf of the client. This feature is called chaining. The other option is to refer the client to the right server. It is then the client's task to follow this referral and contact the new server to retrieve the information.
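
The difference between chaining and referrals can be sketched with two in-memory servers. The sketch is deliberately simplified: it models only references to subordinate partitions, represents DNs as tuples of RDN strings with the entry's own RDN first, and uses invented server names.

```python
def is_suffix(suffix, dn):
    """True if dn lies at or below the entry named by suffix."""
    return len(suffix) <= len(dn) and dn[len(dn) - len(suffix):] == suffix


class Server:
    def __init__(self, name, entries, chaining=False):
        self.name = name
        self.entries = entries    # DN tuple -> entry held in this partition
        self.chaining = chaining
        self.knowledge = {}       # root DN of a subordinate partition -> Server

    def read(self, dn):
        """Return ("result", entry) or ("referral", server name)."""
        for root, other in self.knowledge.items():
            if is_suffix(root, dn):
                if self.chaining:
                    # Chaining: query the responsible server for the client.
                    return other.read(dn)
                # Referral: the client has to contact the other server itself.
                return ("referral", other.name)
        return ("result", self.entries.get(dn))
```

A client that receives a ("referral", name) tuple would repeat the request against the named server, whereas with chaining it never learns that a second server was involved.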

To increase the availability of a directory service, multiple hosts can run directory servers that hold the data of a partition. Such a setup is also useful if a directory service is to be provided to two different sites that are connected by a WAN link. The clients at each site can then access the local server, thus avoiding the use of the slower inter-site connection. As the data needs to be kept in sync, changes made by clients have to be propagated to all other servers holding a copy of the affected entries. This procedure is called replication. The current standard only defines a replication architecture in which a single server masters the data of a partition. In such a single-master environment, all clients have to contact the master server whenever they need to update the directory. Some proprietary implementations such as NDS or Active Directory have multi-master capabilities: clients can connect to any server to request operations that modify entries in the directory.
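
A single-master setup can be sketched as one writable server that pushes every change to read-only copies. The class names are hypothetical, and propagation happens synchronously here for simplicity, which a real replication protocol need not do.

```python
class ReplicaServer:
    def __init__(self, name):
        self.name = name
        self.entries = {}

    def read(self, dn):
        return self.entries.get(dn)

    def apply_update(self, dn, entry):
        # Called by the master when it propagates a change.
        self.entries[dn] = dict(entry)

    def write(self, dn, entry):
        # Replicas do not accept client updates in a single-master setup.
        raise PermissionError(f"{self.name} is read-only; contact the master")


class MasterServer(ReplicaServer):
    def __init__(self, name, replicas):
        super().__init__(name)
        self.replicas = list(replicas)

    def write(self, dn, entry):
        self.apply_update(dn, entry)
        for replica in self.replicas:
            replica.apply_update(dn, entry)
```

Clients at a remote site can read from their local replica, but every update has to go to the master.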

Norbert Klasen 2001-10-22