This page documents how 389DS can be used as an LDAP key/value store for simpkv. It provides a basic overview of simpkv and then discusses Directory Information Trees (DITs) that can be used to organize the data, an Object Identifier (OID) tree and custom schemas to support those DITs, and the technology choices for implementing the simpkv plugin interface to a 389DS instance.
simpkv Overview
The simp/simpkv module provides a library that allows Puppet to access one or more key/value stores (aka backends), each of which can be used to store global keys and keys specific to Puppet environments. This section presents an overview of simpkv. Please refer to the module design documentation, README, and library documentation for more details.
Operations supported
The operations simpkv supports are as follows:
Function Name | Description |
---|---|
simpkv::delete | Deletes a key from a backend. |
simpkv::deletetree | Deletes an entire folder from a backend. |
simpkv::exists | Returns whether a key or key folder exists in a backend. |
simpkv::get | Retrieves the value and any metadata stored for a key from a backend. |
simpkv::list | Returns a listing of all keys and sub-folders in a folder in a backend. The list operation does not recurse through any sub-folders. Only information about the specified key folder is returned. |
simpkv::put | Sets the value and optional, user-provided metadata for a key in a backend. |
Backend logical structure
Logically, keys for a specific backend are organized within global and environment directory trees below that backend's root directory. You can visualize this tree as a filesystem in which a leaf node is a file named for the key and whose contents are the value for that key. For example,
To facilitate implementations of this tree, key and folder names are restricted to sequences of alphanumeric, '.', '_', and '-' characters, with '/' used as the path separator. Furthermore, when specifying the path to a key in an access operation, the path cannot contain relative path subsequences (e.g., '/./' or '/../').
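The naming rules above can be checked mechanically. The following is a minimal Ruby sketch, not simpkv's actual validation code: the helper name `valid_key_path?` is hypothetical, and treating a leading or trailing '/' as invalid is an assumption made for this example.

```ruby
# Hypothetical helper illustrating the key path rules described above.
# Each path segment may contain only letters, digits, '.', '_', and '-'.
VALID_SEGMENT = /\A[A-Za-z0-9._-]+\z/.freeze

def valid_key_path?(path)
  # A limit of -1 keeps trailing empty segments, so 'a/b/' is detected.
  segments = path.split('/', -1)
  return false if segments.empty?

  segments.all? do |segment|
    # Empty segments fail the regex; '.' and '..' are relative-path
    # segments and are rejected explicitly.
    segment.match?(VALID_SEGMENT) && segment != '.' && segment != '..'
  end
end
```

With this sketch, `valid_key_path?('app1/key1')` is accepted, while `'app1/../key1'`, `'app1//key1'`, and `'app 1/key1'` are rejected.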
Backend selection
simpkv allows the user to select and configure one or more backends to be used when simpkv::* functions are called in Puppet manifests during catalog compilation. The configuration is largely made via hieradata.
Each backend has its own configuration.
All backends must specify a simpkv plugin type (e.g., 'file', 'ldap') and a user-provided instance identifier.
A plugin is a backend interface written in Ruby and conforming to an explicit plugin API and expected behavior. It is the logic that actually effects the keystore operation when a simpkv::* function is called during a Puppet catalog compilation. For the 'ldap' plugin, this will be the software that modifies key/value pairs stored in an LDAP server.
The same plugin can be used for multiple backend instances.
The combination of plugin type and instance identifier uniquely identifies a backend instance.
Each backend may specify plugin-specific configuration (such as LDAP server URL and port, TLS configuration,…).
simpkv plugin internals (10,000 foot view)
Internally, simpkv constructs a plugin object for each unique backend and uses the plugin object to interface with its corresponding backend. When a simpkv::* function is called, an internal adapter calls the plugin's corresponding API method with normalized arguments to effect the operation. The adapter then (de)normalizes the results of the operation and reports them back to the calling simpkv::* function. For example, for a simpkv::put operation using an LDAP plugin, the sequence of operations is notionally as follows:
One of the normalizations done by the simpkv adapter involves the value and optional, user-provided metadata associated with a key. In a simpkv::put operation, the simpkv adapter serializes a key's value and optional metadata into a single JSON string and then sends that to the plugin for storage in the backend. Then, after a key's information has been retrieved by a plugin during a simpkv::get or simpkv::list operation, the simpkv adapter deserializes each JSON string back into the key's value and metadata objects before serving the results back to the calling function. This encoding of a key's value and metadata into a single string with a known, parsable format is intended to simplify backend operations.
The table below shows a few examples of the serialization for clarification.
Value Type | Serialization Example |
---|---|
Basic value* without metadata | |
Basic value with user-provided metadata**** | |
Complex value** with basic sub-elements with no user-provided metadata | |
Binary value*** transformed by simpkv with no user-provided metadata | |
*‘Basic value’ refers to a string, boolean, or numeric value.
**'Complex value’ refers to an array or hash constructed from basic values.
***simpkv currently provides limited support for binary data.
simpkv attempts to detect when the value is a Puppet Binary type, transforms it into Base64, and records the transformation with 'encoding' and 'original_encoding' attributes in the JSON. It then uses those attributes to properly deserialize back to the binary on a retrieval operation.
simpkv does not support binary data in arrays, hashes, or the metadata.
****simpkv currently only supports metadata hashes composed of basic values.
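The serialization round trip described above can be sketched in Ruby as follows. This is an illustration under stated assumptions rather than the adapter's actual code: the helper names are hypothetical, and the JSON member names 'value' and 'metadata' are assumed for this example. The Base64 handling of binary values is omitted.

```ruby
require 'json'

# Hypothetical helper: packs a key's value and optional metadata into a
# single JSON string. The member names 'value' and 'metadata' are assumed.
def serialize_entry(value, metadata = {})
  JSON.generate({ 'value' => value, 'metadata' => metadata })
end

# Hypothetical helper: restores the value and metadata from the JSON string
# that a plugin retrieved from the backend.
def deserialize_entry(json)
  parsed = JSON.parse(json)
  { value: parsed['value'], metadata: parsed['metadata'] }
end
```

Under these assumptions, `serialize_entry('hello')` yields `{"value":"hello","metadata":{}}`, and passing that string to `deserialize_entry` returns the original value with an empty metadata hash. The plugin only ever stores and returns the string, so it needs no knowledge of the value's type.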
LDAP Directory Information Tree design
Requirements
There must be one LDAP backend DIT for all SIMP application data.
This is distinct from the DIT containing user accounts data.
Data to be stored must include simpkv data.
Data to be stored may in the future include other application data (e.g., IP firewall data).
The simpkv data must be a subtree of the DIT.
The simpkv subtree must support partitioning the data into LDAP backend instances.
The simpkv subtree must allow storage of per-LDAP-backend-instance global and environment-specific key/value entries.
Entries may be stored in subtrees within the LDAP instance subtree.
Each key/value entry must be a leaf node in the LDAP instance subtree.
The Distinguished Name (DN) of each key/value entry must be unique throughout the entire DIT.
The JSON value of the key/value entry must be stored in some form in the key/value entry.
The key/value entry may have an attribute containing the JSON-encoded value.
The key/value entry may have attributes that map to the value’s JSON attributes.
The tree must support efficient simpkv::get, simpkv::exists, and simpkv::list operations.
Folder and/or key objects may store data in attributes to leverage LDAP search capabilities.
The simpkv LDAP plugin should not have to retrieve the entire tree or subtree in order to fulfill any of these operations.
Any custom schema attributeType or objectClass will be specified with an Object Identifier (OID) below the official SIMP OID.
Design Considerations
At first blush, the mapping of the logical simpkv tree structure into an LDAP DIT appears to be straightforward, because LDAP is fundamentally a tree whose leaf nodes hold data. For example, we could design a tree as follows:
Use Organizations or Organizational Units to represent folders in a key path and other grouping (e.g., environments).
Create a custom schema element with key name and value attributes to represent a key/value entry.
Construct the DN for the key as a Relative DN (RDN) with the key name, followed by a sequence of RDNs in which each one represents a folder in the key's path.
So, for a key path production/app1/key1 the key/value pair could be found at the DN simpkvKey=key1,ou=app1,ou=production,ou=environments,<root DN for the backend instance>, where simpkvKey is an attribute of a simpkvEntry LDAP object used to store the key/value pair. Visually, this subtree in the DIT would look something like the following:
Unfortunately, there is a nuance in 389DS that complicates that simple mapping: 389DS instances treat DNs as case-insensitive strings. So, the key paths production/app1/key1 and production/App1/Key1 both resolve to the same DN inside of 389DS, even though from simpkv's perspective they were intended to be distinct. This unexpected collision in the backend needs to be addressed either by simpkv or within the DIT itself.
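The collision can be illustrated with a short Ruby sketch. The helper names and the root DN value are hypothetical, and comparing downcased strings is only an approximation of how a case-insensitive directory matches DNs. It is enough to show why the two key paths cannot coexist.

```ruby
# Hypothetical helper: maps an environment key path to a DN using the simple
# scheme described above (key RDN first, then folders from innermost to
# outermost, then the 'environments' grouping node and the root DN).
def key_dn(key_path, root_dn)
  *folders, key = key_path.split('/')
  rdns = ["simpkvKey=#{key}"] + folders.reverse.map { |folder| "ou=#{folder}" }
  (rdns + ['ou=environments', root_dn]).join(',')
end

# Approximation of case-insensitive DN matching: two DNs that differ only by
# letter case are treated as the same entry.
def same_entry?(dn_a, dn_b)
  dn_a.downcase == dn_b.downcase
end

# Placeholder root DN used only for this example.
EXAMPLE_ROOT_DN = 'ou=default,ou=instances,ou=simpkv,o=example'
```

Here `key_dn('production/app1/key1', EXAMPLE_ROOT_DN)` and `key_dn('production/App1/Key1', EXAMPLE_ROOT_DN)` produce different strings, yet `same_entry?` reports them as the same entry, so a put to one path would overwrite the other.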
Root Tree
The proposed root tree to hold all SIMP data in LDAP is as follows:
This trivial root tree can be expanded in the future to hold data for other Puppet applications or even site-specific data not associated with Puppet, if necessary.
simpkv Subtree Option 1
The simplest design option enforces DN case invariance by requiring all the values of all attributes used in a DN for a key/value pair to be lowercase. In other words, change the experimental simpkv API to only allow lowercase letters, numerals, and '.', '_', and '-' characters for all key names, folder names, and plugin instance identifiers. Then, because each key's DN is unique and case invariant, the simple mapping scheme described in 'Design Considerations' can be used.
With this simple mapping, the proposed simpkv LDAP subtree will look nearly identical to the logical key/value tree. It just inserts a few extra “folders” into the tree in order to clarify the roles of the nodes beneath them. The new “folders” are:
‘instances’ under which you will find an individual subtree for each backend instance
‘globals’ under which you will find a subtree for global keys for a backend instance
‘environments’ under which you will find individual subtrees for each Puppet environment for a backend instance.
Below is an example of the DIT in which simpkvEntry is a custom LDAP object class with simpkvKey and simpkvJsonValue attributes holding the key and value, respectively:
simpkv Subtree Option 2
The second design option enforces DN case invariance without impacting the existing simpkv API. Its simpkv subtree has the same layout as that of Option 1, including the use of the 'instances', 'globals', and 'environments' grouping “folders”. However, in this design the LDAP plugin transforms any problematic attributes that are to be used in a DN for a key/value pair to an encoded representation (e.g., hexadecimal, Base64). For example, with a hexadecimal transformation, all backend instance identifiers, key names, and folder names would be represented in hex, minus the '0x' or '0X' prefix**. So, key paths production/app1/key1 and production/App1/Key1 would be mapped to simpkvHexId=6b657931,simpkvHexId=61707031,ou=production,ou=environments,... and simpkvHexId=4b657931,simpkvHexId=41707031,ou=production,ou=environments,... respectively, where simpkvHexId is an attribute of both a simpkvFolder LDAP object used to represent backend identifiers/folders and a simpkvEntry LDAP object used to store the key/value pair. As in the Option 1 DN, the key's RDN comes first, followed by the RDNs of its enclosing folders.
**Puppet environment names are not allowed to include uppercase letters.
In addition, in this design each node with an encoded identifier in its RDN would contain an attribute with the raw identifier.
This additional information is necessary in order to support external searches of the LDAP tree using the raw backend instance identifiers, key names, and folder names.
Some 'OrganizationalUnits' in Option 1 would now be represented by a custom object that has encoded and raw identifier attributes.
The custom class for the key/value nodes would have encoded and raw key attributes.
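The hexadecimal transformation and the pairing of encoded and raw identifiers can be sketched in Ruby as follows. The helper names are hypothetical. The attribute names simpkvHexId and simpkvId come from the design described here, and the hash is only a stand-in for an LDAP entry's attributes.

```ruby
# Hypothetical helper: hex-encodes an identifier without any '0x' prefix.
# String#unpack1('H*') emits two lowercase hex digits per byte.
def hex_id(identifier)
  identifier.unpack1('H*')
end

# Hypothetical helper: reverses hex_id.
def raw_id(hex)
  [hex].pack('H*').force_encoding('UTF-8')
end

# Stand-in for the attributes of a node whose RDN uses the encoded
# identifier; the raw identifier is kept so external searches can use it.
def node_attributes(identifier)
  { 'simpkvHexId' => hex_id(identifier), 'simpkvId' => identifier }
end
```

For example, `hex_id('app1')` is `61707031` and `hex_id('App1')` is `41707031`. Hex digits 0-9 cover all decimal digits in these values, and where letters a-f do appear they are always lowercase, so identifiers that differ only in letter case stay distinct even when the directory compares DNs case-insensitively.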
Below is an example of the DIT in which
simpkvFolder is a custom LDAP object class with simpkvHexId and simpkvId attributes holding the transformed backend identifier/folder and raw identifier/folder, respectively.
simpkvEntry is a custom LDAP object class with simpkvHexId, simpkvId, and simpkvJsonValue attributes holding the transformed key, raw key, and value, respectively.
Recommendation
Option 1 is the recommended solution for the following reasons:
It yields a DIT that is simple to understand and navigate.
API change is not unexpected for simp/simpkv since it is still experimental (version < 1.0.0) and not used by default.
SIMP can help with the transition to lowercase key names for any existing simpkv key paths or simplib::passgen password names (whether using legacy mode or simpkv mode).
Any SIMP-provided modules that use simplib::passgen can be modified to ensure the password names are downcased.
The simplib::passgen function that uses simpkv can be modified to downcase existing password names that have any uppercase letters and then to emit a warning.
In the script SIMP will provide to import any existing simpkv key entries or simplib::passgen passwords into an LDAP simpkv backend, there can be a check for uppercase letters in the destination key paths. The script can either skip the import of the problematic entries or warn the user of the conversion. Then, it would be up to the user to make any adjustments to their manifests.
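The uppercase check for an import script could look roughly like the Ruby sketch below. It is an assumption-laden illustration, not the script SIMP will provide: the function name and the shape of the returned plan are hypothetical. It also flags source paths that would collide once downcased, which is a case a skip-or-warn policy would need to handle.

```ruby
# Hypothetical helper: plans an import of existing key paths into a backend
# that only allows lowercase names.
def plan_import(key_paths)
  entries = key_paths.map do |path|
    destination = path.downcase
    # 'converted' marks entries the script should warn about or skip.
    { source: path, destination: destination, converted: destination != path }
  end

  # Destinations claimed by more than one source path after downcasing.
  collisions = entries.group_by { |entry| entry[:destination] }
                      .select { |_destination, group| group.size > 1 }
                      .keys

  { entries: entries, collisions: collisions }
end
```

Given `['app1/key1', 'App1/Key1', 'app2/key']`, the plan marks only the second entry as converted and reports `app1/key1` as a colliding destination.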
OID Subtree Design and Custom LDAP Schema
Either option for the LDAP DIT for SIMP data requires at least one custom LDAP object class. The LDAP object class, in turn, must be specified by a unique OID. This section proposes a SIMP OID subtree design to support LDAP OIDs and then uses the OIDs in schemas for the two DIT options discussed above.
SIMP OID Subtree
SIMP has an officially registered OID, 1.3.6.1.4.1.47012, under which all OIDs for Puppet, SNMP, etc., should reside. Once an OID is in use, its definition is not supposed to change. In other words, an OID can be deprecated, but not removed or reassigned a different name. So, the OID tree must be designed to allow future expansion.
Below is the proposed SIMP OID subtree showing the parent OIDs for attributes and class objects needed for the SIMP DIT.
LDAP Schema Elements
Technologies for Plugin Implementation
Requirements
Plugins are written in Ruby and implement the simpkv plugin API.
Plugins must be multi-thread safe.
Manifests that use simpkv::* functions must be able to be compiled with puppet agent, puppet apply, or Bolt commands. This means the plugin code will run in JRuby in the puppetserver, run in the Ruby installed with puppet-agent, or run using the Bolt user's Ruby into which the puppet gem is installed.
Options Considered
Option | PROs | CONs |
---|---|---|
Tools provided by openldap-utils RPM | | |
net-ldap Ruby gem | | |
Support both tools provided by openldap-utils and net-ldap Ruby gem, using whichever it discovers is available | More installation flexibility when not on isolated networks. | |
Recommendation
Option 1 without the auto-discovery mechanism is recommended for the following reasons:
Options 2 and 3 require additional packaging in order to work on isolated networks for Bolt users. If a Bolt user must install a package anyway, it might as well be an existing vendor package.
The auto-discovery mechanism can be added after the initial implementation.
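As a rough illustration of the recommended approach, the Ruby sketch below builds an ldapsearch command line that a plugin could run to test whether an entry exists. It is a sketch under stated assumptions, not the plugin's implementation: the function name and its parameters are hypothetical, and only standard ldapsearch options are used (-x, -LLL, -H, -D, -y, -b, -s, and the attribute selector 1.1, which requests no attributes).

```ruby
# Hypothetical helper: returns the argument array for an ldapsearch call that
# looks up a single DN. An array (rather than one shell string) avoids shell
# quoting problems when passed to Open3.capture3(*command).
def ldapsearch_command(uri:, bind_dn:, password_file:, dn:, scope: 'base')
  [
    'ldapsearch',
    '-x',                 # simple authentication
    '-LLL',               # plain LDIF output without comments or version
    '-H', uri,            # LDAP server URI
    '-D', bind_dn,        # DN to bind as
    '-y', password_file,  # file whose contents are the bind password
    '-b', dn,             # search base
    '-s', scope,          # 'base' for one entry, 'one' for direct children
    '1.1'                 # request no attributes; only DNs are returned
  ]
end
```

Because the command is rebuilt from its arguments on every call and no state is shared, this style fits the requirement that plugins be multi-thread safe. A list operation could reuse the same helper with the scope set to 'one' and a different attribute selection.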