Pierangelo Masarati had the idea of storing bibliographical references in a directory. His LDAP2BibTeX package provides a schema and two utility programs: one converts an existing collection of bibliographical references from BibTeX bib format to LDIF, which can be loaded into a directory server; the other retrieves the information from the directory again.
\cite{} command and a ``bibtexEntryType'' attribute that
describes the kind of publication, e.g. book, manual or article. All
other attributes are optional.
The approach in this thesis, however, was to build a two-level object class
hierarchy. The abstract ``bibtexEntry'' class is a collection of
attributes common to all BibTeX resource types. It includes the
mandatory ``cn'' (short for ``common name'') attribute, whose value is
used as the parameter of the LaTeX \cite{} command. ``cn'' is also
the naming attribute for ``bibtexEntry''. Structural object classes
exist for every BibTeX resource type. These classes are derived from
``bibtexEntry''. Attributes that are required for a resource type,
e.g. the publisher of a book, are declared as mandatory in these object
classes.
With the modified schema, applications can take advantage of the information contained in the schema without having to be adapted specifically to BibTeX. For example, a dialog for creating a new reference could first enumerate the available resource types and then display only the fields for those attributes that make sense for the chosen type.
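The schema-driven dialog described above can be sketched as follows. This is only an illustration under stated assumptions: the `ObjectClass` type is a simplified stand-in for a real LDAP schema definition, ``bibtexArticle'', ``bibtexJournal'' and ``bibtexNote'' are hypothetical names that do not appear in the text, and the lists of mandatory attributes follow the usual BibTeX required fields rather than the schema actually defined in this thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectClass:
    """Simplified stand-in for an LDAP object class definition."""
    name: str
    kind: str                          # "ABSTRACT" or "STRUCTURAL"
    superior: "ObjectClass | None" = None
    must: tuple = ()                   # mandatory attributes declared here
    may: tuple = ()                    # optional attributes declared here

    def all_must(self):
        """Mandatory attributes, including those inherited from the superior."""
        inherited = self.superior.all_must() if self.superior else ()
        return tuple(dict.fromkeys(inherited + self.must))

    def all_may(self):
        """Optional attributes that a subclass has not made mandatory."""
        inherited = self.superior.all_may() if self.superior else ()
        must = set(self.all_must())
        merged = dict.fromkeys(inherited + self.may)
        return tuple(a for a in merged if a not in must)


# Level 1: the abstract class with the attributes shared by all types.
BIBTEX_ENTRY = ObjectClass(
    "bibtexEntry", "ABSTRACT",
    must=("cn",),
    may=("bibtexAuthor", "bibtexTitle", "bibtexYear", "bibtexNote"),
)

# Level 2: one structural class per resource type (attribute lists assumed).
BIBTEX_BOOK = ObjectClass(
    "bibtexBook", "STRUCTURAL", BIBTEX_ENTRY,
    must=("bibtexAuthor", "bibtexTitle", "bibtexPublisher", "bibtexYear"),
)
BIBTEX_ARTICLE = ObjectClass(
    "bibtexArticle", "STRUCTURAL", BIBTEX_ENTRY,
    must=("bibtexAuthor", "bibtexTitle", "bibtexJournal", "bibtexYear"),
)

SCHEMA = [BIBTEX_ENTRY, BIBTEX_BOOK, BIBTEX_ARTICLE]


def resource_types(schema):
    """Names a generic dialog could offer in its 'new reference' menu."""
    return [oc.name for oc in schema if oc.kind == "STRUCTURAL"]


def form_fields(object_class):
    """Fields to display for one resource type, split by obligation."""
    return {
        "required": list(object_class.all_must()),
        "optional": list(object_class.all_may()),
    }
```

A generic client would call `resource_types(SCHEMA)` to fill the type menu and `form_fields(...)` for the selected class. Nothing in these two functions is specific to BibTeX, which is the point of encoding the requirements in the schema.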
Another guideline in designing the schema was to store information in
human-readable and standards-compliant form, and still keep the semantics of
TeX. This makes information accessible with standard tools while avoiding any
information loss. TeX specials can generally be converted to the corresponding
Unicode characters. However, no Unicode representation exists for
mathematical formulas, which sometimes appear in titles. The ``author''
attribute is also an area of concern. First, BibTeX uses curly brackets to get
name prefixes right. Curly brackets are also used to mark those words in titles
whose case must be preserved. Second, all authors of a document appear on one
line, separated by ``and''. When stored in the directory, the author
field should be a multi-valued attribute. This allows for better search
capabilities. However, LDAP does not guarantee the order of values in a
multi-valued attribute. To cope with these problems, all values that have
special TeX code in them are additionally stored in an attribute subtype that
is identified by the ;lang-x-tex tag. The ;lang-x-tex form
should be used by BibTeX and also for editing purposes. If information is
displayed by non-BibTeX-aware applications, the base form is used instead.
For example, the entry for [30] would look like this in LDIF notation:
dn: cn=Howes:1999:UDL, cn=Bibliography
cn: Howes:1999:UDL
objectclass: top
objectclass: bibtexEntry
objectclass: bibtexBook
bibtexAuthor: T. Howes
bibtexAuthor: M. Smith
bibtexAuthor: G. Good
bibtexAuthor;lang-x-tex: T. Howes and M. Smith and G. Good
bibtexTitle: Understanding and Deploying LDAP Directory Services
bibtexTitle;lang-x-tex: Understanding and Deploying {LDAP} Directory Services
bibtexYear: 1999
bibtexPublisher: Macmillan Technical Publishing
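How the base values and the ;lang-x-tex values in such an entry might be derived from a BibTeX field is sketched below. This is a minimal illustration, not the conversion tool used in this thesis: the function names are hypothetical, only grouping braces are removed, and the conversion of TeX specials to Unicode characters is left out entirely.

```python
TEX_TAG = ";lang-x-tex"


def split_authors(field):
    """Split a BibTeX author field on the word 'and' outside of braces."""
    authors, current, depth = [], [], 0
    for word in field.split():
        if word == "and" and depth == 0:
            authors.append(" ".join(current))
            current = []
            continue
        # Track brace nesting so that '{Barnes and Noble}' stays one name.
        depth += word.count("{") - word.count("}")
        current.append(word)
    if current:
        authors.append(" ".join(current))
    return authors


def strip_braces(text):
    """Drop TeX grouping braces to obtain the human-readable base form."""
    return text.replace("{", "").replace("}", "")


def to_attributes(name, value, multi=False):
    """Return (attribute, value) pairs for one BibTeX field.

    The base form is always emitted. The ;lang-x-tex form is added only
    when it carries extra information: TeX markup, or the author order
    that a multi-valued attribute cannot guarantee.
    """
    parts = split_authors(value) if multi else [value]
    pairs = [(name, strip_braces(part)) for part in parts]
    needs_tex_form = (multi and len(parts) > 1) or strip_braces(value) != value
    if needs_tex_form:
        pairs.append((name + TEX_TAG, value))
    return pairs
```

Applied to the fields of the entry above, `to_attributes("bibtexAuthor", "T. Howes and M. Smith and G. Good", multi=True)` yields three base values plus one ;lang-x-tex value, while a plain field such as the year yields only its base form.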
Separate tools have been developed for two special bibliographical collections:
A pool of persistent connections to the directory server is used to maximise performance. To serve an HTTP request, a connection is taken from the pool, which avoids the overhead of establishing a new LDAP connection for each request. For tasks that modify entries in the directory, authenticated LDAP connections are used; these are managed through the integrated session management of the servlet container.
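The pooling idea can be illustrated with a small, language-neutral sketch. It is written in Python rather than as servlet code and is not the implementation used here: `ConnectionPool` and its `connect` factory argument are hypothetical names, the connections are opened eagerly, and a real pool would additionally have to detect and replace broken connections and handle the per-session authenticated connections separately.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool of long-lived connections (illustrative only)."""

    def __init__(self, connect, size=4):
        # 'connect' is any zero-argument callable that opens a connection.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        """Borrow a connection for one request and always return it."""
        conn = self._idle.get(timeout=timeout)   # blocks while the pool is empty
        try:
            yield conn
        finally:
            self._idle.put(conn)                 # reuse instead of closing
```

A request handler would wrap its directory access in `with pool.connection() as conn:`, so that the cost of opening a connection is paid once per pool slot rather than once per request.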
Norbert Klasen 2001-10-22