The OpenDJ directory server is highly scalable and can process all sorts of requests from different types of clients over various protocols. The following diagram provides an overview of how OpenDJ processes these requests. (See The OpenDJ Architecture for a more detailed description of each component.)
Note: The following information has been taken from ForgeRock’s OpenDJ Administration, Maintenance and Tuning Class and has been used with the permission of ForgeRock.
Client requests are accepted and processed by an appropriate Connection Handler. The Connection Handler decodes the request according to the protocol (LDAP, JMX, SNMP, etc.) and either responds immediately or converts it into an LDAP Operation Object that is added to the Work Queue.
Analogy: I like to use the analogy of the drive-through window at a fast food restaurant when describing this process. You are the client making a request of the establishment. The Connection Handler is the person who takes your order; they take your request and enter it into their ordering system (the Work Queue). They do not prepare your food; their job is simply to take the order as quickly and efficiently as possible.
Worker Threads monitor and detect items on the Work Queue and respond by processing them in a first in, first out fashion. Requests may be routed or filtered based on the server configuration and then possibly transformed before the appropriate backend is selected.
Analogy: Continuing with the fast food analogy, the Worker Threads are similar to the people who prepare your food. They monitor the order system (Work Queue) for any new orders and process them in a first in, first out fashion.
Note: OpenDJ routing is currently limited to the server’s determination of the appropriate backend. In future versions, this may take on more of a proxy or virtual directory type of implementation.
The result is returned to the client by the Worker Threads using the callback method specified by the Connection Handler.
Analogy: Once your order is completed, the food (or the results of your request) is given to you by one of the Worker Threads who has been tasked with that responsibility. This is the only place where the analogy somewhat breaks down. In older fast food restaurants (ones with only one window) this may sometimes be the person who took your order in the first place. In our analogy, however, the Connection Handler never responds to your request. This model is more closely attuned to more recent fast food establishments where they have two windows and there is a clear delineation of duties between the order taker (Connection Handler) and the one who provides you with your food (the Worker Thread).
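To make the flow concrete, here is a minimal Python sketch of the same hand-off: a connection handler that only decodes and enqueues, and worker threads that pull operations off a first in, first out queue and answer through a callback. This is not OpenDJ code; the function names, the dictionary that stands in for the LDAP Operation Object, and the sentinel used for shutdown are all assumptions made for illustration.

```python
import queue
import threading

work_queue = queue.Queue()   # FIFO queue: the "ordering system"
_STOP = object()             # hypothetical sentinel telling a worker to exit


def connection_handler(raw_request, send_response):
    """Decode the request and enqueue it; never process it here."""
    operation = {"type": "search", "filter": raw_request.strip()}
    work_queue.put((operation, send_response))


def worker():
    """Take operations off the queue in FIFO order and reply via callback."""
    while True:
        item = work_queue.get()
        if item is _STOP:
            break
        operation, send_response = item
        send_response("result for " + operation["filter"])


def run_demo(requests, worker_count=2):
    """Push requests through the queue and collect what the workers return."""
    responses = []
    lock = threading.Lock()

    def send_response(result):
        with lock:
            responses.append(result)

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in workers:
        thread.start()
    for raw in requests:
        connection_handler(raw, send_response)
    for _ in workers:
        work_queue.put(_STOP)   # one sentinel per worker, queued after all requests
    for thread in workers:
        thread.join()
    return responses
```

Note that the order taker never touches the response list; only the workers call the callback, which mirrors the two-window restaurant.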
Other services such as access control processing (ACIs), Logging, and Monitoring provide different access points within the request processing flow and are used to control, audit, and monitor how the requests are processed.
So, what do OpenDJ and McDonald’s have in common? They are both highly efficient entities that have been streamlined to process requests in the most efficient manner possible.
An understanding of the components that make up the OpenDJ Architecture is useful for administering, configuring, or troubleshooting the OpenDJ server.
The following information has been taken from ForgeRock’s OpenDJ Administration, Maintenance and Tuning Class and has been used with the permission of ForgeRock.
The OpenDJ server has been developed using a modular architecture in which most or all components are written to a well-defined specification. The image above provides an overview of these components. The following sections provide a brief description of some of the more prevalent components shown in this image.
The OpenDJ Configuration Handler is responsible for managing configuration information within OpenDJ’s configuration files (i.e. config.ldif). Configuration information may impact one or more components; as such, the Configuration Handler is responsible for notifying appropriate components when a configuration change occurs.
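The notification idea can be sketched as a simple listener registry. This is only an illustration of the pattern, assuming a flat dictionary of settings; the class, its methods, and the `listen-port` setting name are hypothetical and are not taken from OpenDJ.

```python
class ConfigurationHandler:
    """Holds settings and tells interested components when one changes."""

    def __init__(self, initial):
        self._values = dict(initial)
        self._listeners = {}   # setting name -> list of callbacks

    def register(self, name, callback):
        """A component asks to be told when the named setting changes."""
        self._listeners.setdefault(name, []).append(callback)

    def get(self, name):
        return self._values[name]

    def set(self, name, value):
        """Store the new value and notify listeners only on a real change."""
        old = self._values.get(name)
        self._values[name] = value
        if old != value:
            for callback in self._listeners.get(name, []):
                callback(name, old, value)


# Hypothetical usage: a connection handler watching its port setting.
changes = []
config = ConfigurationHandler({"listen-port": 389})
config.register("listen-port", lambda name, old, new: changes.append((name, old, new)))
config.set("listen-port", 1389)
```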
Connection and request handlers manage all interaction with LDAP clients. This includes accepting new connections and reading and responding to client requests. Connection handlers are responsible for any special processing that might be required for this communication, including managing encryption or performing protocol translation. Multiple connection handler implementations can be active at the same time; OpenDJ includes connection handlers that support the various forms of communication clients use to interact with the server (JMX, LDAP, LDAPS, LDIF, SNMP). Administrators can enable or disable these connection handlers to support their client environment.
Note: ForgeRock is currently working on REST and JSON interfaces to provide direct access to directory server data.
Connection handlers place client requests onto OpenDJ’s Work Queue. Worker threads detect requests placed on the work queue and are responsible for performing the processing necessary to respond to the request. Today’s directory servers must be able to handle a tremendous number of requests in a short period of time; as such, OpenDJ’s Work Queue has been built to be both highly efficient and high performing.
A backend database serves as a repository for searching, retrieving, and storing directory data. OpenDJ supports multiple backends including those considered typical databases (such as Oracle, MySQL, and Berkeley DB) as well as file-based and memory-based backends. There can be multiple backend databases active at any given time, each of which handles a mutually exclusive subset of the data (selection of the appropriate database is based on the root suffix specified in the operation). OpenDJ facilitates interaction with these backends and provides tools for enabling, disabling, creating, removing, backing up, and restoring the databases independently from each other without impacting other backends.
Note: Backends may consist of local or remote repositories (i.e., the database is stored on a remote machine). Remote repositories are found in cases where the backend interacts with a proxy or a virtual server. Support for proxy and virtual server backends is scheduled for a future release.
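Suffix-based selection can be sketched in a few lines of Python. This is my own simplification, not OpenDJ's implementation: the backend names are made up, DNs are split naively on commas (escaped commas are ignored), and when two suffixes both match I assume the longer one wins.

```python
def normalize_dn(dn):
    """Split a DN into lower-cased RDNs (naive: ignores escaped commas)."""
    return [rdn.strip().lower() for rdn in dn.split(",") if rdn.strip()]


def select_backend(entry_dn, backends):
    """Return the backend whose suffix is the longest match for entry_dn."""
    target = normalize_dn(entry_dn)
    best_name, best_length = None, 0
    for suffix, name in backends.items():
        base = normalize_dn(suffix)
        is_suffix = len(base) <= len(target) and target[len(target) - len(base):] == base
        if is_suffix and len(base) > best_length:
            best_name, best_length = name, len(base)
    return best_name


# Hypothetical suffix-to-backend mapping.
backends = {
    "dc=example,dc=com": "userRoot",
    "o=partners": "partnerData",
}
```

With this mapping, `select_backend("uid=bnelson,ou=People,dc=example,dc=com", backends)` picks `userRoot`, and a DN under an unknown suffix returns `None`.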
OpenDJ has a robust logging capability that allows server information to be retained in various repositories. The most common loggers are as follows:
- Access Logger – stores server operations (binds, searches, modifications, etc.)
- Error Logger – stores warnings, errors, and significant events that occur with the server
- Debug Logger – records debug information when the server is run with debugging enabled and Java assertions are active.
Multiple loggers can be configured for each of these and each logger may be actively storing different information (filtered or not) in different formats in different repositories.
Note: Some error loggers can be used as an alerting mechanism to actively notify administrators of potential problems.
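As a rough analogy only, Python's standard `logging` module can show the same idea of several loggers writing different, filtered information in different formats to different places. Nothing here is OpenDJ configuration; the logger name, the formats, and the in-memory streams standing in for repositories are assumptions.

```python
import io
import logging

access_out = io.StringIO()   # stands in for an access log repository
error_out = io.StringIO()    # stands in for an error log repository

logger = logging.getLogger("directory-demo")
logger.setLevel(logging.INFO)
logger.propagate = False     # keep the demo output out of the root logger

# "Access" publisher: only routine operations, in its own format.
access_handler = logging.StreamHandler(access_out)
access_handler.addFilter(lambda record: record.levelno < logging.WARNING)
access_handler.setFormatter(logging.Formatter("ACCESS %(message)s"))

# "Error" publisher: warnings and above, in a different format.
error_handler = logging.StreamHandler(error_out)
error_handler.setLevel(logging.WARNING)
error_handler.setFormatter(logging.Formatter("ERROR %(levelname)s %(message)s"))

logger.addHandler(access_handler)
logger.addHandler(error_handler)

logger.info("BIND dn=uid=bnelson,ou=People,dc=example,dc=com")
logger.warning("disk space low")
```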
The LDAP protocol supports two methods that clients may use to authenticate to the server:
- LDAP simple authentication
- Simple Authentication and Security Layer (SASL)
SASL is an authentication framework that supports multiple authentication mechanisms including ANONYMOUS, CRAM-MD5, DIGEST-MD5, EXTERNAL, GSSAPI, and PLAIN.
OpenDJ includes a set of handlers that implement each of these SASL mechanisms in order to determine the identity of the client.
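To give a feel for what one of these handlers has to do, here is a small sketch that decodes the credentials of the PLAIN mechanism, whose message is an optional authorization identity, a NUL byte, the authentication identity, another NUL byte, and the password. The function name is hypothetical, and a real handler would go on to map the identity to an entry and check the password.

```python
def parse_sasl_plain(credentials):
    """Split SASL PLAIN credentials: [authzid] NUL authcid NUL passwd."""
    parts = credentials.split(b"\x00")
    if len(parts) != 3:
        raise ValueError("expected authzid NUL authcid NUL passwd")
    authzid, authcid, password = (part.decode("utf-8") for part in parts)
    if not authcid or not password:
        raise ValueError("authcid and passwd must not be empty")
    # An empty authzid means "act as the authenticated identity".
    return (authzid or None, authcid, password)
```

For example, `parse_sasl_plain(b"\x00bnelson\x00secret")` yields `(None, "bnelson", "secret")`.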
OpenDJ contains an access control module that is used to determine if a client is permitted to perform a particular request or not.
OpenDJ includes several password storage modules that can be used to obscure user passwords using a reversible or one-way algorithm. Password storage schemes encode new passwords provided by users so that they are stored in an encoded manner. This makes it difficult or impossible for someone to determine the clear-text passwords from the encoded values. They can also be used to determine whether a clear-text password provided by a client matches the encoded value stored in the server.
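Here is a minimal sketch of a one-way scheme, using the widely seen salted SHA-1 convention in which the stored value is `{SSHA}` followed by the base64 of the digest plus the salt. I am using it only to illustrate encoding and matching; it is not a copy of OpenDJ's code, the function names are my own, and SHA-1 is shown for familiarity rather than as a recommendation.

```python
import base64
import hashlib
import hmac
import os

PREFIX = "{SSHA}"
DIGEST_LENGTH = 20   # a SHA-1 digest is 20 bytes


def encode_password(password, salt=None):
    """Encode a clear-text password as {SSHA}base64(sha1(password+salt)+salt)."""
    if salt is None:
        salt = os.urandom(8)
    digest = hashlib.sha1(password.encode("utf-8") + salt).digest()
    return PREFIX + base64.b64encode(digest + salt).decode("ascii")


def password_matches(password, stored):
    """Check a clear-text password against a stored {SSHA} value."""
    if not stored.startswith(PREFIX):
        raise ValueError("unsupported storage scheme")
    raw = base64.b64decode(stored[len(PREFIX):])
    digest, salt = raw[:DIGEST_LENGTH], raw[DIGEST_LENGTH:]
    candidate = hashlib.sha1(password.encode("utf-8") + salt).digest()
    return hmac.compare_digest(candidate, digest)
```

Because the salt is random, encoding the same password twice gives two different stored values, yet both still match the original clear text.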
OpenDJ includes a series of modules that define logic used to determine whether a user’s password meets minimum requirements or not.
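A validator chain of this kind might look like the following sketch. The two rules and their messages are examples I made up; they are not OpenDJ's validators.

```python
def min_length(required):
    """Build a validator that rejects passwords shorter than `required`."""
    def check(password):
        if len(password) >= required:
            return None
        return "must be at least {} characters".format(required)
    return check


def requires_digit(password):
    """Reject passwords that contain no digit."""
    if any(character.isdigit() for character in password):
        return None
    return "must contain a digit"


def validate(password, validators):
    """Run every validator and return the list of failure messages."""
    messages = (validator(password) for validator in validators)
    return [message for message in messages if message]
```

An empty list from `validate` means the password met every requirement.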
Syntax and Matching Rules
Attributes must follow a particular syntax and search filters determine matches based on a set of matching rules. OpenDJ contains a set of syntaxes and matching rules that define the logic for dealing with different kinds of attributes.
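As a simplified illustration of what a matching rule does, the sketch below compares values the way a case-insensitive rule and a case-sensitive rule would: both squeeze insignificant whitespace, and only the first also folds case. Real matching rules handle far more (character preparation, substring and ordering matches), so treat this as an assumption-laden toy.

```python
def squeeze_spaces(value):
    """Trim the value and collapse runs of whitespace to a single space."""
    return " ".join(value.split())


def case_ignore_match(asserted, stored):
    """Equality that ignores case and insignificant whitespace."""
    return squeeze_spaces(asserted).lower() == squeeze_spaces(stored).lower()


def case_exact_match(asserted, stored):
    """Equality that ignores insignificant whitespace but respects case."""
    return squeeze_spaces(asserted) == squeeze_spaces(stored)
```

So a filter value of `bill nelson` would match a stored `Bill  Nelson` under the first rule but not under the second.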
Interacting with data in memory is much faster than interacting with data on disk. As such, OpenDJ includes a database caching module that loads directory data into memory.
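The benefit is easy to see in a small sketch. The cache below keeps recently used entries in memory and only goes to "disk" on a miss; the least-recently-used eviction policy, the class name, and the loader callback are my assumptions, not a description of how OpenDJ's cache is built.

```python
from collections import OrderedDict


class EntryCache:
    """Keep up to `capacity` entries in memory, evicting the least recently used."""

    def __init__(self, capacity, load_from_disk):
        self._capacity = capacity
        self._load = load_from_disk   # hypothetical slow lookup
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, dn):
        if dn in self._entries:
            self._entries.move_to_end(dn)   # mark as most recently used
            self.hits += 1
            return self._entries[dn]
        self.misses += 1
        entry = self._load(dn)
        self._entries[dn] = entry
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)   # drop the oldest entry
        return entry
```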
Directory Services Timeline
The Most Complete History of Directory Services You Will Ever Find
(Until the next one comes along)
|1969||First Arpanet node comes online; first RFC published.|
|1973||Ethernet invented by Xerox PARC researchers.|
|1982||TCP/IP replaces older Arpanet protocols on the Internet.|
|1982||First distributed computing research paper on Grapevine published by Xerox PARC researchers.|
|1984||Internet DNS comes online.|
|1986||IETF formally chartered.|
|1989||Quipu (X.500 software package) released.|
|1990||Estimated number of Internet hosts exceeds 250,000.|
|1990||First version of the X.500 standard published.|
|1991||A team at CERN headed by Tim Berners-Lee releases the first World Wide Web software.|
|1992||University of Michigan developers release the first LDAP software.|
|1993||NDS debuts in Netware 4.0.|
|July 1993||LDAP specification first published as RFC 1487.|
|December 1995||First standalone LDAP server (SLAPD) ships as part of U-M LDAP 3.2 release.|
|April 1996||Consortium of more than 40 leading software vendors endorses LDAP as the Internet directory service protocol of choice.|
|1996||Netscape hires Tim Howes, Mark Smith, and Gordon Good from the University of Michigan. Howes serves as a directory server architect.|
|September 1997||Sun Microsystems releases Sun Directory Services 1.0, derived from U-M LDAP 3.2.|
|November 1997||LDAPv3 named the winner of the PC Magazine Award for Technical Excellence.|
|December 1997||LDAPv3 approved as a proposed Internet Standard.|
|1998||The OpenLDAP Project was started by Kurt Zeilenga. The project started by cloning the LDAP reference source from the University of Michigan.|
|January 1998||Netscape ships the first commercial LDAPv3 directory server.|
|March 1998||Innosoft acquires Mark Wahl’s Critical Angle company and releases LDAP directory server product 4.1 one month later.|
|July 1998||Sun Microsystems ships Sun Directory Server 3.1, implementing LDAPv3 standards.|
|July 1998||Estimated number of Internet hosts exceeds 36 million.|
|1999||AOL acquires Netscape and forms the iPlanet Alliance with Sun Microsystems.|
|March 1999||Innosoft team, led by Mark Wahl, releases Innosoft Distributed Directory Server 5.0.|
|March 2000||Sun Microsystems acquires Innosoft and merges the Innosoft directory code with iPlanet. This forms the foundation for the iPlanet Directory Access Router.|
|October 2001||The iPlanet Alliance ends and Sun and Netscape fork the codebase.|
|October 2004||Apache Directory Server Top Level Project is formed after one year in incubation.|
|December 2004||Red Hat purchases Netscape server products.|
|2005||Sun Microsystems initiates the OpenDS project, an open source directory server based on the Java platform.|
|June 2005||Red Hat releases Fedora Directory Server.|
|October 2006||Apache Directory Server 1.0 is released.|
|2007||UnboundID releases its directory server.|
|2008||AOL stops supporting Netscape products.|
|April 2009||Oracle purchases Sun Microsystems.|
|May 2009||Red Hat renames Fedora Directory Server to 389 Directory Server.|
|Feb 1, 2010||ForgeRock is founded.|
|Dec 2010||ForgeRock releases OpenDJ.|
|July 2011||Oracle releases Oracle Unified Directory.|
(1) Understanding and Deploying LDAP Directory Services; Second Edition; Timothy A. Howes, Ph.D., Mark C. Smith, and Gordon S. Good.
(2) 389 Directory Server; History (http://directory.fedoraproject.org/wiki/History).
(3) Email exchange with Ludovic Poitou (ForgeRock).
(4) Press Release, March 16th, 1998; “Innosoft Acquires LDAP Technology Leader Critical Angle Inc.” (http://www.pmdf.process.com/press/critical-angle-acquire.html).
(5) OpenLDAP; Wikipedia (http://en.wikipedia.org/wiki/OpenLDAP).
(6) iPlanet; Wikipedia (http://en.wikipedia.org/wiki/IPlanet).
(7) OpenDS; Wikipedia (http://en.wikipedia.org/wiki/OpenDS).
(8) Netscape; Wikipedia (http://en.wikipedia.org/wiki/Netscape).
(9) Press Release, April 20th, 2009; “Oracle Buys Sun” (http://www.oracle.com/us/corporate/press/018363).
(10) 389 Directory Server; 389 Change FAQ (http://directory.fedoraproject.org/wiki/389_Change_FAQ).
(11) OpenDJ; Wikipedia (http://en.wikipedia.org/wiki/OpenDJ).
(12) Email exchange with Nick Crown (UnboundID).
(13) Press Release, July 20th, 2011; “Oracle Announces Oracle Unified Directory 11g” (http://www.oracle.com/us/corporate/press/434211).
An interesting question was posed on LinkedIn that asked, “If you were the architect of LinkedIn, MySpace, Facebook or other social networking sites and wanted to model the relationships amongst users and had to use LDAP, what would the schema look like?”
You can find the original post and responses here.
After reading the responses from other LinkedIn members, I felt compelled to add my proverbial $.02.
Directory Servers are simply special purpose data repositories. They are great for some applications and not so great for others. You can extend the schema and create a tree structure to model just about any kind of data for any type of application. But just because you “can” do something does not mean that you “should” do it.
The question becomes “Should you use a directory server or should you use a relational database?” For some applications a directory server would be a definite WRONG choice, for others it is clearly the RIGHT one, and for yet others the choice is not so clear. So, how do you decide?
Here are some simple rules of thumb that I have found work for me:
1) How often does your data change?
Keep in mind that directory servers are optimized for reads — this oftentimes comes at the expense of write operations. The reason is that directory servers typically implement extensive indexes that are tied to schema attributes (which by the way are tied to the application fields). So the question becomes, how often do these attributes change? If they do so often, then a directory server may not be the best choice (as you would be constantly rebuilding the indexes). If, however, they are relatively static, then a directory server would be a great choice.
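A toy example shows why indexes favor reads over writes. In the sketch below, an equality search is a single dictionary lookup, but every write has to touch each index that covers a changed entry. The class and its update counter are purely illustrative assumptions (single-valued attributes, in-memory dictionaries) and say nothing about how a real directory server stores its indexes.

```python
class IndexedStore:
    """Entries plus one equality index per indexed attribute."""

    def __init__(self, indexed_attributes):
        self.entries = {}
        self.indexes = {name: {} for name in indexed_attributes}
        self.index_updates = 0   # counts the extra work done on writes

    def put(self, dn, attributes):
        """Add or replace an entry, keeping every index in step."""
        old = self.entries.get(dn)
        if old is not None:
            self._unindex(dn, old)
        self.entries[dn] = dict(attributes)
        self._index(dn, attributes)

    def _index(self, dn, attributes):
        for name, index in self.indexes.items():
            if name in attributes:
                index.setdefault(attributes[name], set()).add(dn)
                self.index_updates += 1

    def _unindex(self, dn, attributes):
        for name, index in self.indexes.items():
            if name in attributes:
                index[attributes[name]].discard(dn)
                self.index_updates += 1

    def search_equal(self, name, value):
        """Indexed read: one lookup, no scan of the entries."""
        return sorted(self.indexes[name].get(value, set()))
```

Adding one entry with two indexed attributes costs two index updates; changing it later costs four more (two removals and two additions), while each search stays a single lookup.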
2) What type of data are you trying to model?
If your data can be described as an attribute:value pair (e.g., name:Bill Nelson), then a directory server would be a good choice. If, however, your data is not so discrete, then a directory server should not be used. For instance, uploads to YouTube should NOT be kept in a directory server. User profiles in LinkedIn, however, would be a good fit.
3) Can your data be modeled in a hierarchical (tree-like) structure?
Directory servers implement a hierarchical structure for data modeling (similar to a file system layout). A benefit of a directory server is the ability to apply access control at a particular point in the tree and have that apply to all child elements in the tree structure. Additionally, you can start a search at a lower point in the tree (a child element) and improve search performance (much like selecting the proper starting point for the Unix “find” command). Relational databases cannot do this. You have to search all entries in the table. If your data lends itself to a hierarchical structure then a directory server might be a good choice.
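The effect of choosing a lower starting point can be sketched with a tiny in-memory tree. The `Node` class, the sample entries, and the visit counter are assumptions for illustration; a real server would also consult its indexes rather than walk every node.

```python
class Node:
    """One entry in the tree, identified by its relative DN."""

    def __init__(self, rdn, attributes=None):
        self.rdn = rdn
        self.attributes = attributes or {}
        self.children = []

    def add(self, rdn, attributes=None):
        child = Node(rdn, attributes)
        self.children.append(child)
        return child


def subtree_search(base, base_dn, attribute, value):
    """Search base and everything beneath it; return (matches, nodes visited)."""
    matches, visited = [], 0
    stack = [(base, base_dn)]
    while stack:
        node, dn = stack.pop()
        visited += 1
        if node.attributes.get(attribute) == value:
            matches.append(dn)
        for child in node.children:
            stack.append((child, child.rdn + "," + dn))
    return sorted(matches), visited


root = Node("dc=example,dc=com")
people = root.add("ou=People")
groups = root.add("ou=Groups")
people.add("uid=bnelson", {"sn": "Nelson"})
groups.add("cn=admins", {"description": "Administrators"})

whole_tree = subtree_search(root, "dc=example,dc=com", "sn", "Nelson")
people_only = subtree_search(people, "ou=People,dc=example,dc=com", "sn", "Nelson")
```

Both searches find the same entry, but starting at `ou=People` visits two nodes instead of five.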
I am a big fan of directory servers and have architected/implemented projects that sit 100% on top of a directory, 100% on top of relational databases, and a hybrid of both. Directory servers are extremely fast, flexible, and scalable, and they handle the type of traffic you see on the Internet very well. Their ability to implement chaining, referrals, web services, and a flexible data modeling structure makes them a very nice choice as a data repository for many applications, but I would not always lead with a directory server for every application.
So how do you decide which is best? It all comes down to the application, itself, and the way you want to access your data.
A site like LinkedIn might actually be modeled pretty well with a directory server, as quite a bit of the content is actually static, lends itself well to attribute:value pairs, and can easily be modeled in a hierarchical structure. The user profiles for a site like Facebook or YouTube could easily be modeled in a directory server, but I would NOT attempt to store the YouTube or Facebook uploads or the “what are you working on now” status in a directory server, as that content is constantly changing.
If you do decide to use a directory server, here are the general steps you should consider for development (your mileage may vary, but probably not too much):
- Evaluate the data fields that you want to access from your application.
- Map the fields to existing directory server schema (extend if necessary).
- Build a hierarchical structure to model your data as appropriate (this is called the directory information tree, or DIT).
- Architect a directory solution based on where your applications reside throughout the world (do you need one, two, or multiple directories?) and then determine how you want your data to flow through the system (chaining, referrals, replication).
- Implement the appropriate access control for attributes or the DIT in general.
- Implement an effective indexing strategy to increase performance.
- Test, test, test.
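The first three steps can be sketched in a few lines: take the application's profile fields, map them to standard `inetOrgPerson` attributes, and place the entry under a branch of the DIT. The field names, the mapping, and the `ou=People,dc=example,dc=com` branch are hypothetical, and the sketch skips the escaping and base64 encoding that real LDIF needs for special values.

```python
# Hypothetical mapping from application fields to inetOrgPerson attributes.
FIELD_TO_ATTRIBUTE = {
    "username": "uid",
    "full_name": "cn",
    "last_name": "sn",
    "email": "mail",
    "job_title": "title",
}

OBJECT_CLASSES = ["top", "person", "organizationalPerson", "inetOrgPerson"]


def profile_to_ldif(profile, base_dn="ou=People,dc=example,dc=com"):
    """Render an application profile as a single LDIF entry."""
    lines = ["dn: uid={},{}".format(profile["username"], base_dn)]
    lines.extend("objectClass: " + name for name in OBJECT_CLASSES)
    for field, attribute in FIELD_TO_ATTRIBUTE.items():
        if field in profile:
            lines.append("{}: {}".format(attribute, profile[field]))
    return "\n".join(lines) + "\n"


profile = {
    "username": "bnelson",
    "full_name": "Bill Nelson",
    "last_name": "Nelson",
    "email": "bnelson@example.com",
}
ldif = profile_to_ldif(profile)
```

Fields that change constantly, such as a status update, would simply be left out of the mapping and kept in another store.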