SIMP 6.5+: Safely Referencing SIMP assets from multiple Puppet Environments

This document requires familiarity with Puppet Environments and their relationship to SIMP Environments.


SIMP-3479

Managing Multiple Puppet environments (without SIMP)

Control repositories

Outside of SIMP, git-based control repositories are the dominant technology to manage and deploy Puppet code.

At a high level, a control repository workflow involves:

  • A Puppetfile that defines a collection of Puppet modules and where to get them
  • A (Git) Control repository, where each branch defines a complete Puppet environment (which includes a Puppetfile)
  • A tool (r10k or PE Code Manager) to deploy Puppet environments from the control repository branches
  • (Advanced, but preferred) Repository webhooks that automatically trigger environment deployments on the Puppet master(s) whenever a branch is updated.

The upshot is that a control repository branch defines everything needed to recreate a specific Puppet environment. 

Puppet environments as code

Defining everything needed to create a Puppet environment allows Puppet Environments to be handled like code:

  • Adding or removing a git branch in the control repository will add or remove that environment on the Puppet master(s) 
    • (using r10k/Code Manager + webhook triggers)
  • The Puppet environment change control process can be handled as merge/pull requests 
  • Developers can test, diff, and review Puppet environment changes before promoting them into important environments
  • As control repo branches are updated, CI pipelines can automatically trigger tests and master-side deployments
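The branch-to-environment relationship described above can be sketched as a small reconciliation step. This is only a model of the idea, not how r10k or Code Manager are implemented; the function name `plan_environment_sync` and every branch name other than production are made up for illustration.

```python
def plan_environment_sync(branches, deployed):
    """Plan the changes that make deployed environments mirror the branches.

    Illustrative model only: every branch is (re)deployed, and any deployed
    environment without a matching branch is scheduled for removal.
    """
    branches = set(branches)
    deployed = set(deployed)
    to_deploy = sorted(branches)
    to_purge = sorted(deployed - branches)
    return to_deploy, to_purge


# Hypothetical control repository with three branches; "old_feature" was
# deleted from the repository but is still present on the master.
to_deploy, to_purge = plan_environment_sync(
    branches=["production", "dev", "new_env"],
    deployed=["production", "dev", "old_feature"],
)
print("deploy:", to_deploy)
print("purge:", to_purge)
```

The point of the sketch is that the set of branches is the only input: nothing outside the control repository decides which environments exist.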

Multiple Puppet environments

Because 1 control repo branch = 1 Puppet environment, managing multiple Puppet environments is as simple as using git.

Scaling Puppet with Control Repository-based workflows

  • r10k and Code Manager make it easy to deploy consistent Puppet environments across Puppet architectures that involve multiple masters
  • Bolt makes it possible to manage remote "agentless" hosts by checking out an environment's modules (using bolt puppetfile install) before running bolt apply

SIMP + Multiple Puppet environments

Limitations and conflicts

It has been technically possible to use multiple Puppet environments and control repositories since SIMP 5.0 (SIMP 5.0-6.3). 

However, in practice this has been cumbersome and easy to get wrong. Getting it wrong can accidentally leak or lose site data (such as passgen secrets and PKI files). Getting it right requires in-depth knowledge of Puppet environments and SIMP's "extra" environments. Even then, there are some significant limitations. For instance, it's not possible to pool redundant (compile) masters behind a VIP unless site administrators implement their own solution to keep the SIMP "extra" environment data in sync across each server.


Issue 1: SIMP RPMs and tools interfere with files under the Puppet and SIMP environment directories 

The specifics of this issue have changed over the years, and were largely "solved" when SIMP 6.4+ stopped deploying RPMs into environment directories.

Starting with 6.4.0 (SIMP 6.4), SIMP safely supports Puppetfile deployments (e.g., r10k puppetfile install from a Puppet environment directory) into a single, permanent Puppet environment (i.e., production).

  1. SIMP 5.0-6.3 RPM updates and simp_rpm_helper could modify existing files in Puppet and SIMP environment directories
  2. SIMP 5.0-6.3 User-initiated tools like simp config could modify existing files in Puppet and SIMP environment directories
  3. SIMP 6.4 PARTIAL FIX The user-initiated tool simp environment new can ensure that corresponding secondary and writable environments exist
    1. This is a safety improvement from earlier releases, because it only alters environment files when the user initiates it
    2. However, the simp environment tool is only partially implemented, and the remaining actions cannot be implemented safely.
    3. As things stand, this means that SIMP 6.4.0:
      1. Safely supports SIMP extra data in the Local (Puppetfile-only) deployment scenario with a single environment (production)
      2. Does not safely support SIMP extra data in the Control Repository deployment scenarios (without additional conventions and limitations)

Issue 2: SIMP expects Secondary and Writable asset paths for each Puppet environment


SIMP has always assumed that each Puppet environment directory will be accompanied by two SIMP-specific "environment" data directories:

  1. SIMP (ALL) "Secondary" data/assets under /var/simp/environments/$environment/.
    These are files too sensitive and/or too large to check into git in plaintext, maintained by site admins.
    1. "Site Files" modules, like pki_files (keydist) and krb5_files
    2. The FakeCA support script, including the FakeCA's private key
    3. RSync directories, used by the simp::server::rsync_shares profile to serve files for various modules.

  2. SIMP (ALL) "Writable" data/assets under /opt/puppetlabs/server/data/puppetserver/simp/environments/$environment/
    These are secrets, read by special SIMP Puppet functions while compiling catalogs (and automatically generated if they are missing).
    There are currently only two Puppet functions in SIMP that use the writable environment directory:
    1. simplib::passgen()
      • a popular function, used to generate secret passwords
      • Used in 14 SIMP modules
      • Usually (but not always) exposed as a parameter default
      • Secrets read by this function can also be generated by the simp passgen CLI tool
    2. ssh::autokey(): this function is unused by SIMP modules. It is provided to give local admins a way to automatically arrange SSH keys for service accounts.
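The directory layout above can be summarized with a small sketch that maps an environment name to the paths SIMP expects. The secondary and writable roots are the ones given in the list; the Puppet environment root is an assumption (the stock Puppet location), and the helper names are hypothetical, not part of any SIMP tool.

```python
from pathlib import PurePosixPath

# Roots taken from the list above.
SECONDARY_ROOT = PurePosixPath("/var/simp/environments")
WRITABLE_ROOT = PurePosixPath(
    "/opt/puppetlabs/server/data/puppetserver/simp/environments"
)
# Assumption: stock Puppet environment root; the text above does not state it.
PUPPET_ROOT = PurePosixPath("/etc/puppetlabs/code/environments")


def expected_environment_paths(environment):
    """Return the directories SIMP expects to exist for one environment."""
    return {
        "puppet": PUPPET_ROOT / environment,
        "secondary": SECONDARY_ROOT / environment,
        "writable": WRITABLE_ROOT / environment,
    }


def missing_directories(environment, exists):
    """List which expected directories are absent.

    `exists` is any callable taking a path string, e.g. os.path.isdir.
    """
    return [
        name
        for name, path in expected_environment_paths(environment).items()
        if not exists(str(path))
    ]


for name, path in expected_environment_paths("production").items():
    print(f"{name:10} {path}")
```

On a real server, `missing_directories("new_env", os.path.isdir)` (after `import os`) would report which of the three directories still need to be created for a freshly deployed environment.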

SIMP running a single Puppet environment (safe)

A stock SIMP install starts with a unified SIMP Omni-Environment, production.

Issues with SIMP and multiple Puppet environments

After deploying a new Puppet environment (new_env, forked from production), there are several problems:



  1. (Red lines) The modulepath setting in environment.conf still points to production.
    1. (warning) Puppet catalogs compiled in new_env still source secondary module data (FakeCA PKI, Kerberos) from production!
    2. Any pki/krb5 files under /var/simp/environments/new_env/site_files are never used.

  2. (Blue lines) The SIMP "extra" ("secondary" and "writable") environment directories don't exist for new_env yet!
    1. Puppet code using SIMP's rsync type will fail in the new_env environment, because the source path will not exist.
      • (The rsync type is used in 14 SIMP modules)
    2. (warning) simplib::passgen() will silently create new and different secrets for each identifier in the new_env environment
      • This breaks authentication with passgen-configured accounts/services still in production.
        • e.g., TPM/TPM2 owner authentication, kdb5 passwords, rsync servers, SIMP GitLab auth
      • This is especially destructive to canary nodes: after successfully testing a new account/service that uses passgen-configured credentials, authentication will break after the node is returned to production.
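Both problems can be illustrated with a toy model. The sketch below is not SIMP code: `foreign_site_files` and `ToyWritableStore` are hypothetical names, the modulepath value is an assumed example of what a forked environment.conf might contain, and the password generator is a stand-in that only mimics the "generate if missing, per environment" behaviour described above.

```python
import secrets


def foreign_site_files(modulepath, environment,
                       secondary_root="/var/simp/environments"):
    """Return modulepath entries under another environment's secondary dir."""
    own_prefix = f"{secondary_root}/{environment}/"
    return [
        entry
        for entry in modulepath.split(":")
        if entry.startswith(f"{secondary_root}/")
        and not entry.startswith(own_prefix)
    ]


class ToyWritableStore:
    """Stand-in for per-environment writable directories (illustration only)."""

    def __init__(self):
        self._secrets = {}  # {environment: {identifier: password}}

    def passgen(self, environment, identifier):
        """Return the stored secret, silently generating one if it is missing."""
        env_secrets = self._secrets.setdefault(environment, {})
        if identifier not in env_secrets:
            env_secrets[identifier] = secrets.token_urlsafe(24)
        return env_secrets[identifier]


# Assumed modulepath, copied unchanged from production into new_env.
modulepath = "modules:/var/simp/environments/production/site_files:$basemodulepath"
print("entries still pointing elsewhere:", foreign_site_files(modulepath, "new_env"))

store = ToyWritableStore()
prod_secret = store.passgen("production", "kdb5")
new_secret = store.passgen("new_env", "kdb5")
print("same secret in both environments:", prod_secret == new_secret)
```

The second half shows why the passgen problem is easy to miss: nothing fails at generation time, and the mismatch only surfaces when a node moves between environments.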


Environment safety improvements in SIMP 6.4.0

SIMP 6.4.0 addressed many of SIMP's tool and RPM-related problems by making Puppetfile-based module deployments the preferred method to deploy modules.

  • Puppetfile-based deployment tools like r10k or Code Manager are now preferred to deploy SIMP's Puppet modules
    • To prevent conflicts with these tools, SIMP no longer interferes with any files under the Puppet or SIMP environment directories, unless directed to by the user.
  • SIMP module RPMs now install tagged versions into local git repositories, for use in local Puppetfiles
    • The simp puppetfile tool was introduced to automatically generate Puppetfiles based on the current module RPMs.
  • The simp environment new tool was introduced to help users ensure that corresponding SIMP and Puppet environments are created.


Safely Referencing SIMP assets from multiple Puppet Environments

SE01. SIMP 6.4 PARTIAL FIX (Okay:) Ensuring new SIMP extra environments with simp environment new

Site admins must ensure that SIMP "extra" environment directories exist for every Puppet environment.  There are basically three strategies for this:

  1. Create new "clean" SIMP extra environment directories for the new environment
  2. Link to another SIMP extra environment to keep all assets the same
    1. pros: keeps environments' 
    2. cons: (warning) linked directories can fool admins into thinking it is safe to alter/remove data from a 
  3. Copy directories from another SIMP extra environment

These choices have been automated in SIMP 6.4.0 by the simp environment new tool.
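The difference between the three strategies, and the hazard of linking, can be demonstrated in a throwaway directory. This sketch does not reproduce what simp environment new does internally; `create_extra_env` and the file names are hypothetical, and it assumes a POSIX system where symlinks are available.

```python
import os
import shutil
import tempfile


def create_extra_env(root, new_env, strategy, source_env=None):
    """Create an extra environment directory using one of three strategies."""
    target = os.path.join(root, new_env)
    if strategy == "fresh":
        os.makedirs(target)
    elif strategy == "link":
        os.symlink(os.path.join(root, source_env), target)
    elif strategy == "copy":
        shutil.copytree(os.path.join(root, source_env), target)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return target


with tempfile.TemporaryDirectory() as root:
    # A pretend "production" extra environment holding one asset.
    os.makedirs(os.path.join(root, "production"))
    asset = os.path.join(root, "production", "asset.pem")
    with open(asset, "w") as handle:
        handle.write("important\n")

    create_extra_env(root, "fresh_env", "fresh")
    create_extra_env(root, "copied_env", "copy", "production")
    create_extra_env(root, "linked_env", "link", "production")

    # Deleting from the copy leaves production untouched.
    os.remove(os.path.join(root, "copied_env", "asset.pem"))
    print("after removing from copy, production asset exists:",
          os.path.exists(asset))

    # Deleting through the link removes the production asset itself.
    os.remove(os.path.join(root, "linked_env", "asset.pem"))
    print("after removing from link, production asset exists:",
          os.path.exists(asset))
```

The last step is the failure mode to keep in mind: the path names an unimportant-looking environment, but the file removed belongs to production.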

Advantages

  • simp environment new automates the error-prone process of ensuring all three members of the SIMP Omni-environment exist on a local Puppet master

Drawbacks

  • Automates a mis-modeled workaround (devised long ago, when Puppet environments were new)
  • Doesn't make SIMP assets safer or easier to manage over time
  • Can't work across multiple Puppet masters
  • Can't be implemented to safely remove extra environments

During the development of SIMP 6.4.0, it became apparent that the simp environment workflow would have problems down the road:

  • Requiring a SIMP writable and secondary environment to exist for every Puppet environment was a mistake. 

    • It requires coarse workarounds like `simp environment new [--copy|--link]`

      • every time a new Puppet environment is deployed

      • even when environments used the same (or similar) resources

    • It prevents referring to a mix of the same assets in some cases

    • It adds a source of truth that is independent from the control repository

    • Linked SIMP extra environments make it easy to assume that it is safe to alter/delete assets, because the path shows an unimportant name

  • The proposed simp environment rm command would make it too easy to permanently lose data in linked environments


SE02. SIMP 5.0-6.3 (Good:) Using hiera-eyaml in the control repo to replace Writable environment data

Site admins can prevent SIMP from using the Writable environment directory by overriding all uses of simplib::passgen() with the Hiera eyaml backend.

Advantages

  • Secret data now scales with additional compile masters as part of the r10k/Code Manager deployment
  • Secret data is encrypted and versioned
    • PKCS7 and GPG are supported
  • It is simple to configure multiple Hiera eyaml backend tiers, with separate keys
    • Separate teams can encrypt their secrets with their own key

Drawbacks

  • Site administrators are responsible for managing and distributing the hiera-eyaml key files (independently of SIMP)
    • Key files must exist on compile masters at the paths hiera.yaml expects
  • (warning) This approach is not possible for some SIMP users

Suggested improvements

  • FUTURE Expose all uses of simplib::passgen() in SIMP classes as parameter defaults, so users can override them via hiera-eyaml
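Why a parameter default matters can be shown with a simplified model of class parameter resolution: a value found in Hiera wins, and the default expression is only evaluated when no value is found. The sketch below is a deliberately reduced illustration (real Puppet lookup has more steps and options); `resolve_class_parameter` and the placeholder values are hypothetical.

```python
def resolve_class_parameter(key, hiera_data, default):
    """Return the Hiera value if present, else evaluate the default.

    `default` is a callable so that it is only run when actually needed,
    mirroring a parameter default that calls a secret-generating function.
    """
    if key in hiera_data:
        return hiera_data[key]
    return default()


generated = []


def passgen_default():
    # Stand-in for a default that would touch the writable directory.
    generated.append("kdb5")
    return "generated-on-the-master"


# Pretend this value was decrypted from an eyaml tier.
hiera_data = {"krb5::kdc::config::kdb5_password": "value-from-eyaml"}

overridden = resolve_class_parameter(
    "krb5::kdc::config::kdb5_password", hiera_data, passgen_default
)
print(overridden, "| generator calls:", len(generated))

fallback = resolve_class_parameter("some::other::password", hiera_data, passgen_default)
print(fallback, "| generator calls:", len(generated))
```

A passgen call buried inside a class body offers no such override point, which is the motivation for exposing each use as a parameter default.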

Examples

  • Example tiers from control repository's hiera.yaml:

    ---
    version: 5
    hierarchy:
      # [...]
    
      # ----------------------------------------------------------------------------
      # NOTE: This tier determines which secrets to use via the top-scope variable 
      #       `$::hostgroup` (set by ENC or manifests/site.pp, prior to any lookups)
      # ----------------------------------------------------------------------------
      - name: "Per-hostgroup data (encrypted)"
        lookup_key: eyaml_lookup_key
        path: "secrets/hostgroups/%{::hostgroup}.eyaml"
        options:
          pkcs7_private_key: /etc/puppetlabs/puppet/eyaml/hostgroups/%{::hostgroup}__private_key.pkcs7.pem
          pkcs7_public_key:  /etc/puppetlabs/puppet/eyaml/hostgroups/%{::hostgroup}__public_key.pkcs7.pem
    
      # ----------------------------------------------------------------------------
      # WARNING: The interpolations in this tier rely on an agent-determined fact 
      #          (`%{facts.datacenter}`) to determine which secrets will be returned
      #           during lookups.  
      #
      #          When possible, it is STRONGLY recommended to use *trusted* facts
      #          instead of agent-determined facts for this purpose.
      #          (i.e., `%{trusted.datacenter}` instead of `%{facts.datacenter}`).
      # 
      #          However, this requires either:
      #
      #            1. Baking the information into the Puppet certificate at the time
      #               it is signed, with CSR attributes/certificate extensions and
      #               setting short names in the `config_file_oid_map.yaml` file:
      #
      #               * https://puppet.com/docs/puppet/5.5/config_file_oid_map.html 
      #               * https://puppet.com/docs/puppet/5.5/ssl_attributes_extensions.html
      #
      #            2. Use the (experimental as of Puppet 6.11) `trusted_external_command`
      #               setting to add trusted facts from an external source during 
      #               catalog compilation.
      #                
      #                * https://tickets.puppetlabs.com/browse/PUP-9994
      #                * https://puppet.com/docs/puppet/latest/release_notes_puppet.html#experimental-feature:-catalog-compilation-with-external-trusted-data-from-third-parties
      # 
      # ----------------------------------------------------------------------------
      - name: "Per-datacenter secret data (encrypted)"
        lookup_key: eyaml_lookup_key
        path: "secrets/datacenters/%{facts.datacenter}.eyaml"
        options:
          pkcs7_private_key: /etc/puppetlabs/puppet/eyaml/datacenters/%{facts.datacenter}__private_key.pkcs7.pem
          pkcs7_public_key:  /etc/puppetlabs/puppet/eyaml/datacenters/%{facts.datacenter}__public_key.pkcs7.pem
    
      - name: "Site-wide secret data (encrypted)"
        lookup_key: eyaml_lookup_key
        path: "secrets/site.eyaml"
        options:
          pkcs7_private_key: /etc/puppetlabs/puppet/eyaml/site__private_key.pkcs7.pem
          pkcs7_public_key:  /etc/puppetlabs/puppet/eyaml/site__public_key.pkcs7.pem
      # [...]


  • Example encrypted data in `data/secrets/site.eyaml`:

    ---
    simp::puppetdb::database_password: >
        ENC[PKCS7,Y22exl+OvjDe+drmik2XEeD3VQtl1uZJXFFF2NnrMXDWx0csyqLB/2NOWefv
        NBTZfOlPvMlAesyr4bUY4I5XeVbVk38XKxeriH69EFAD4CahIZlC8lkE/uDh
        jJGQfh052eonkungHIcuGKY/5sEbbZl/qufjAtp/ufor15VBJtsXt17tXP4y
        l5ZP119Fwq8xiREGOL0lVvFYJz2hZc1ppPCNG5lwuLnTekXN/OazNYpf4CMd
        /HjZFXwcXRtTlzewJLc+/gox2IfByQRhsI/AgogRfYQKocZgFb/DOZoXR7wm
        IZGeunzwhqfmEtGiqpvJJQ5wVRdzJVpTnANBA5qxeA==]
    
    krb5::kdc::config::kdb5_password: > 
        ENC[PKCS7,Y22exl+OvjDe+drmik2XEeD3VQtl1uZJXFFF2NnrMXDWx0csyqLB/2NOWefv
        NBTZfOlPvMlAesyr4bUY4I5XeVbVk38XKxeriH69EFAD4CahIZlC8lkE/uDh
        jJGQfh052eonkungHIcuGKY/5sEbbZl/qufjAtp/ufor15VBJtsXt17tXP4y
        l5ZP119Fwq8xiREGOL0lVvFYJz2hZc1ppPCNG5lwuLnTekXN/OazNYpf4CMd
        /HjZFXwcXRtTlzewJLc+/gox2IfByQRhsI/AgogRfYQKocZgFb/DOZoXR7wm
        IZGeunzwhqfmEtGiqpvJJQ5wVRdzJVpTnANBA5qxeA==]
    
    # ...
  •  For more details, see:

Better: [WIP] Define the site_files directory independently of the environment