Idiomatic Containers

Using a container properly can be the difference between success and failure. Doing so comes down to following a few simple idiomatic patterns.

You may have heard that the ideal pattern for containers is one process per container. That is a simplified statement of one of the idiomatic patterns. Idiomatic patterns are not hard rules; they are simply the more natural or appropriate way to use containers, and staying close to them will help you get the most out of your own containers.

I have created this list from my own observations as well as other sources, most notably the Twelve-Factor App, established by a team from Heroku. These Idiomatic Container patterns are oriented specifically to the operational behavior of building and running container systems with full pipelines, whereas the Twelve-Factor App is oriented towards a hands-free service such as Heroku and focuses more on the application and the developer impact. Both provide value, and studying the Twelve-Factor App is left as an exercise for the reader.

Elements of a Service

Before discussing Idiomatic patterns, let’s first discuss what elements are typically part of a running service:

  • Stack — The ecosystem which runs your application (operating system plus any language frameworks, such as Node, Python, Ruby, Java, etc).
  • Code — the code that defines the service, in addition to all required dependent libraries and add-on modules (in a node world you would include npm install in this element, where node itself is considered part of the Stack).
  • Config — the environment specific configurations and secrets that describe how the application runs within a single environment or for a single customer.
  • Data — all of the persistent data used by the application, regardless of the storage format.

Idiomatic Container Patterns

This list is a Work-In-Progress and is part of an ongoing series on Docker, started with The Docker Mind Shift.

  1. One Function per Container — Each running container should only focus on one function. This keeps the way you build it simple and makes managing it easier.
  2. Loosely Coupled Backing Services — use external services for Data elements that must persist, and couple to them loosely.
  3. Ephemeral — your container should be disposable, requiring no persistent state between restarts, and no updates to state during the course of its lifecycle. This is targeted to the Config and Data elements.
  4. Logs as Streams — Send your logs to stdout and stderr, do not log anything to syslog or local disk. The container service should take care of these streams for you.
  5. Configs at Runtime — Introduce your Config elements at runtime. Do not put these into the image.
  6. Immutable Releases — When you make a software release, build an image that is immutable, and reuse that image between lanes in your pipeline. Do not rebuild the image to address Config element changes for each lane.
  7. Real Service Monitoring — Define a method to test the actually running service through its external Port Binding.
  8. CI Data Migrations — Data itself should be treated in the same manner as code, with a pipeline and quality assurance testing and tests around them. Nobody should be manually adjusting data or data schemas on live systems.

1. One Function per Container

Each running container should only focus on one function for its lifecycle. This keeps the way you build it simple and makes managing it easier. If you have a service which needs a cache component, resist the desire to embed Redis into the container. Instead, break it out as a separate function.

The mantra of “one process per container” has good intentions, but it can be interpreted too strictly. You can have cases where multiple processes in a container still serve one function, such as a service runner that manages a queue of processes all doing the same thing.

The goal of this idiomatic pattern is to simplify the container. If you have a webservice, focus on it being a webservice in a container. If you also need to do batch processing, run that in a different container and have the two communicate over the network.

This also helps you keep your Dockerfiles simple. You can find more on doing this in the document Best Practices for Dockerfiles.

2. Loosely Coupled Backing Services

Backing Services are used for Data elements (such as databases). These should be external to the container, and connections to them should be loosely established, so that if the external service goes down or is replaced with a new one, your container can survive the change. In effect, this lets those other services follow the same Idiomatic Container patterns.
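
One way to keep the coupling loose is to wrap every call to the backing service in a retry with backoff, so a brief outage or a swapped-out instance does not take the container down with it. The sketch below is only an illustration: call_with_retry and BackingServiceUnavailable are hypothetical names, and it assumes the client library raises ConnectionError when the service is unreachable.

```python
import time


class BackingServiceUnavailable(Exception):
    """Raised when the backing service is still unreachable after all retries."""


def call_with_retry(operation, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying with exponential backoff on ConnectionError.

    operation is any zero-argument callable that talks to the backing
    service (hypothetical; for example a function that opens a connection
    and runs one query).
    """
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError as exc:
            if attempt == attempts - 1:
                # Out of retries: surface one clear error to the caller.
                raise BackingServiceUnavailable(str(exc)) from exc
            # Wait base_delay, then 2x, 4x, ... before trying again.
            sleep(base_delay * (2 ** attempt))
```

If the operation looks up the service address each time it is called, rather than caching it at startup, a replaced backing service is picked up on the next retry.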

3. Ephemeral

Your container should run in a throwaway manner, meaning that anything that happens as part of its running behavior is discarded and not persistent between restarts. You do not update the running container after the service is started. If a change is needed, you redeploy a new container.

To this end, processes should start up quickly and shut down gracefully. Ideally, a process takes only a few seconds from start to operational readiness. This aids robustness and scalability. The process should also shut down gracefully when it receives a SIGTERM. If there is shutdown work involved, the service should first cease listening on the port (no longer taking new connections) and then resolve any outstanding work.
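
The shutdown order described above can be sketched roughly as follows. This is a minimal illustration, not a full server: accept_work, stop_listening, and finish_outstanding are hypothetical callables that your service would supply.

```python
import signal
import threading

# Set by the signal handler, polled by the main loop.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    # Only record the request; the main loop performs the actual cleanup.
    shutdown_requested.set()


def install_handler():
    # Must be called from the main thread of the process.
    signal.signal(signal.SIGTERM, handle_sigterm)


def serve(accept_work, stop_listening, finish_outstanding, poll_seconds=0.1):
    """Accept work until shutdown is requested, then drain in order."""
    while not shutdown_requested.is_set():
        accept_work()
        # Returns early if the event is set while waiting.
        shutdown_requested.wait(poll_seconds)
    stop_listening()       # first: take no new connections
    finish_outstanding()   # then: resolve work already in flight
```

Keeping the handler down to setting a flag means the cleanup runs in normal program flow instead of inside the signal handler.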

4. Logs as Streams

Logs are time-ordered data output from a running process describing its behavior, for many purposes including diagnostics, performance, analysis, and auditing.

Logs should be formatted as a continuous stream, one event per line, sent to stdout and stderr of the process, leaving it up to the container management infrastructure to manage the streams.

The application running within a container should never concern itself with routing or storage of log data. It should not attempt to manage log files nor write to syslog.
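
As a rough sketch of what this can look like in application code, the logger below writes one JSON object per line to stdout and nothing else. The names JsonLineFormatter and build_logger are made up for this example, and JSON is just one reasonable choice of line format.

```python
import json
import logging
import sys


class JsonLineFormatter(logging.Formatter):
    """Render each log record as one JSON object on a single line."""

    def format(self, record):
        # json.dumps escapes embedded newlines, so one event stays on one line.
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
        })


def build_logger(name="service", stream=None):
    """Return a logger that writes only to the given stream (stdout by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()      # no file or syslog handlers
    logger.propagate = False
    handler = logging.StreamHandler(sys.stdout if stream is None else stream)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    return logger
```

Calling build_logger().info("service started") then emits a single line on stdout, and the container infrastructure decides where that line ends up.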

5. Configs at Runtime

The configuration that defines how the service executes in an environment is delivered at runtime. This is similar to 12-Factor III. Config; however, many have taken that factor to mean the literal OS "environment," which is easy, yet insecure.

The point of this idiomatic pattern is that nothing is stored within the packaged app to tell it how to run for a given environment, but rather that it is introduced in the general running environment of the application.

Additionally, these configurations should be delivered securely, which may be more than simple OS environment variables. Ideally, the secrets are delivered directly to the application as an input stream (stdin), avoiding any disk and notably avoiding the OS Environment space. Reflex Engine supports this.
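
To illustrate the stdin approach, here is a small sketch of an application reading its configuration from an input stream at startup. It assumes the launcher pipes a JSON object into the process; the function name load_config and the JSON format are assumptions for this example, not how any particular tool delivers secrets.

```python
import json
import sys


def load_config(stream=None):
    """Parse a JSON config document from stdin (or any other text stream).

    Nothing is written to disk and nothing is read from the OS environment.
    """
    stream = sys.stdin if stream is None else stream
    config = json.load(stream)
    if not isinstance(config, dict):
        raise ValueError("config must be a JSON object")
    return config
```

The service would call load_config() once at startup and keep the result in memory. If you log anything about it, log only the key names and never the values.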

See also Container ENV configs are an Anti-Pattern

6. Immutable Releases

Building on 12-Factor V. Build, Release, Run, the Immutable Releases pattern is that the final product of an assembled solution, including all dependencies (i.e. elements Stack and Code), is packaged as a container image that is then reused throughout the delivery pipeline. The container image for a given release is not rebuilt for different environments, and can instead be delivered to any number of environments. This is then paired with pattern #5, Configs at Runtime, to define how it should run.

7. Real Service Monitoring

Validating that the process works as expected is mandatory for a stable pipeline between the various steps in an environment.

The service itself should be monitored in a manner similar to how it is used. Do not treat a process that is merely executing with open ports as sufficient evidence that it is operating as expected. The test should validate that the service is online and ready for use (more than a simple port check), and should also detect when it has stopped operating as expected.

If it is a web service, web calls can be made directly to the service. For services which do not have an external surface to connect to (such as batch jobs), use heartbeats that push out signals at regular intervals.
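
Both kinds of check can be sketched in a few lines. This is only an illustration: the function names are made up, the URL you pass in would be whatever real endpoint your service exposes, and the heartbeat check assumes the batch job records a timestamp somewhere the monitor can read it.

```python
import http.client
import urllib.request


def service_is_healthy(url, timeout=2.0, opener=urllib.request.urlopen):
    """Return True only if the service answers a real request with HTTP 200.

    opener is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts, and HTTP error statuses all land here.
        return False


def heartbeat_is_fresh(last_beat_epoch, now_epoch, max_age_seconds=60):
    """For batch jobs: True if the most recent heartbeat is recent enough."""
    return (now_epoch - last_beat_epoch) <= max_age_seconds
```

The first check goes through the same port binding a client would use, which is why it tells you more than a port check. The second covers jobs that have no external surface to call.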

8. CI Data Migrations

Data itself should be treated in the same manner as code, with a pipeline and quality assurance tests around it. Nobody should be manually adjusting data or data schemas on live systems. Suggested reading: Avoiding Calamitous Data Migrations
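
As a small sketch of what "tests around data" can mean, the example below keeps a migration as a versioned script and runs it against a scratch database in a test before it ever touches a live system. The users table and the migration itself are hypothetical, and SQLite stands in for whatever database you actually run.

```python
import sqlite3

# Hypothetical migration: split a single "name" column into two columns.
MIGRATION_SQL = """
ALTER TABLE users ADD COLUMN first_name TEXT;
ALTER TABLE users ADD COLUMN last_name TEXT;
UPDATE users
   SET first_name = substr(name, 1, instr(name, ' ') - 1),
       last_name  = substr(name, instr(name, ' ') + 1);
"""


def apply_migration(conn):
    """Apply the migration script to an open database connection."""
    conn.executescript(MIGRATION_SQL)


def test_migration_splits_names():
    # Build a throwaway database that mirrors the pre-migration schema.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (name) VALUES ('Ada Lovelace')")
    apply_migration(conn)
    row = conn.execute("SELECT first_name, last_name FROM users").fetchone()
    assert row == ("Ada", "Lovelace")
```

A pipeline would run tests like this on every change to the migration, in the same way it runs tests on code.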

When you are building your containers, consider these patterns. Do they help you build better container systems? Do you think any are missing?

Image courtesy of Dmitri Popov