Docker and containers are the latest "hot thing" in tech. But it is easy to make mistakes, and with so many options available the first step can seem like an insurmountable cliff.
Before you begin a docker implementation, regardless of which platform you are using, you should know what you are expecting to get out of it. Understanding these mistakes may help you to avoid them as you investigate your own docker solutions.
The list is ranked in reverse, counting down to the biggest mistake. The first few mistakes are really about not understanding the reasons behind the top two.
6. Not understanding your Pipeline (CI & Delivery)
How you get your software delivered is a challenge. Docker is great, but the real value comes in how it creates an immutable image which can be leveraged in a pipeline. This isn't necessarily an easy task, and there are a LOT of options available.
Before tackling the problem you should understand: what does your delivery pipeline look like? How many steps in the process do you need? (Dev, Test, Prod, Sandbox, etc). The next big question is: how do you plan on testing and moving between these steps in the pipeline?
5. Confusion about how to handle configurations
To run docker in an ideal manner, you should not store any configuration that may change at run-time within the docker image. See mistake #2: not understanding the value-add.
But this means you must make an immutable image that can somehow learn its configuration at run time. That is a tall order, and the industry is still wrestling with this challenge. Generally speaking, it means you need some mechanism that can quickly configure your system at run time. A few options:
- Use environment variables (the common default). This is easy, but quickly becomes a management nightmare. It is also a security risk (see mistake #4 on secrets).
- Use a launch script to figure out the configurations and then run your app after the container is ready.
- Use something in your code like Node Config which can bring in the configurations from an external source.
- Use an ephemeral config management system like Reflex Engine.
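As a rough illustration of the first option, here is a minimal Python sketch of an app reading its non-secret settings from the environment at start-up. The variable names (APP_API_BASE_URL, LOG_LEVEL, PORT), the defaults, and the AppConfig class are all hypothetical, chosen only for this example. The point is that nothing stage-specific is baked into the image, so the same image can run in any step of the pipeline.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    """Settings the container learns at run time (hypothetical names)."""
    api_base_url: str
    log_level: str
    port: int


def load_config(env=None):
    """Build the config from environment variables when the app starts."""
    env = os.environ if env is None else env
    # There is no sensible default for the API URL, so fail fast if it is missing.
    if "APP_API_BASE_URL" not in env:
        raise RuntimeError("APP_API_BASE_URL must be set at run time")
    return AppConfig(
        api_base_url=env["APP_API_BASE_URL"],
        log_level=env.get("LOG_LEVEL", "info"),
        port=int(env.get("PORT", "8080")),
    )
```

Failing fast on a missing value keeps a misconfigured container from starting quietly with the wrong settings. Note that this sketch deliberately carries no passwords or keys.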
4. Not appropriately handling secrets
Secrets are the passwords, database strings and other sensitive, highly valuable bits that control how your app reaches its other information or that make it unique. These may be SSL private keys, connection strings, or db passwords. The problem is, secrets are always hard to manage, and containers make it more so.
It is easy to be lulled into the lemming reasoning that, "everybody is doing it so it must be okay" and then store your secrets in an environment variable. And the problem is, with the newness of the docker ecosystem, there are not many good options available. However, this doesn't make it right. Some things to look into:
- Docker has started to address this by adding Docker Secrets. Kubernetes also has secrets. The challenge with both of these is your app must figure out how to get them out--it isn't as convenient as an environment variable.
- Reflex Engine - ABAC based encrypted engine supporting json objects and inheritance, which can write secrets at run time directly to STDIN, the environment, or local files before your program launches.
- Vault - an encrypted key-value store.
- Keywhiz - uses gpg and TLS certificates.
Make sure you spend time to study the options and use something smart. Don't make the mistake of embedding secrets in your container, or take the easy but dangerous route of using environment variables.
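To give a feel for the "your app must figure out how to get them out" step: Docker Secrets are exposed to a swarm service as files under /run/secrets/, and Kubernetes can likewise mount secrets as files at a path you choose. A minimal Python sketch of reading one might look like the following. The function name read_secret and the secret name db_password are hypothetical, and this is only one way to do it.

```python
from pathlib import Path


def read_secret(name, secrets_dir="/run/secrets"):
    """Return the contents of a secret that the platform mounted as a file."""
    path = Path(secrets_dir) / name
    try:
        # Secret files often end with a newline added by the tool that wrote them.
        return path.read_text(encoding="utf-8").rstrip("\n")
    except FileNotFoundError:
        raise RuntimeError(f"secret {name!r} not found in {secrets_dir}") from None


# Hypothetical usage inside the app:
# db_password = read_secret("db_password")
```

The secret stays out of the image and out of the environment, at the cost of a little extra code in your app.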
3. Confusion around Building vs Testing
Once you have your application building in a container, there is a lot of work to figure out how to build it with a service. Do you build all of the testing into your container image, then run it and have it self test, or do you build a production container, and test it externally? Do you rebuild your container for each step in the pipeline?
There are a lot of SaaS offerings that do automated builds of code and have easy-to-use automatic code validation and testing systems (Travis CI, CircleCI, etc)... but these run from within a container already! Do you build your code twice: once by the build/test service, and then once again in a container?
Consider a node application being built with Travis CI. It is easy to have Travis build and run your tests. But if your npm install takes 5 minutes to complete, do you want to let Travis CI do its "test" build, only to turn around and then build a container, re-doing the npm install again? This effectively doubles your build time, and you may end up having a different image in the second build than the first!
Some work must be spent figuring out the best way for your needs. Ideally, the docker image that is tested is the same one delivered to production, but this takes more orchestration, and you lose the value-add of some off-the-shelf build services.
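One way to picture "the image that is tested is the same one delivered" is to track an image by its content digest and refuse to move forward if a stage ran anything else. The Python sketch below is only a model of that rule, not a real pipeline tool: the stage names, the ImagePromotion class, and the digest strings are hypothetical.

```python
STAGES = ("dev", "test", "prod")  # hypothetical pipeline steps, in order


class ImagePromotion:
    """Tracks one built image as it moves through the pipeline stages."""

    def __init__(self, image_digest):
        self.image_digest = image_digest  # digest of the image built once
        self.completed = []               # stages this image has passed

    def next_stage(self):
        """Return the next stage to run, or None when all are done."""
        if len(self.completed) == len(STAGES):
            return None
        return STAGES[len(self.completed)]

    def promote(self, stage, deployed_digest):
        """Record a stage, but only for the exact image that was built."""
        if deployed_digest != self.image_digest:
            raise ValueError("stage ran a different image than the one built")
        if stage != self.next_stage():
            raise ValueError(f"expected stage {self.next_stage()!r}, got {stage!r}")
        self.completed.append(stage)
```

A rebuild between stages produces a new digest, so the second check-in fails instead of letting an untested image through.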
2. Not understanding the value-add
The ideal value-add of containers is that they make it easy to realize the true goal of simple and secure management of services. This is something the industry has struggled with for some time, and by using Ephemeral Applications we are finally able to encapsulate things into small building blocks, which helps us crest the ridge so that things start to look easy.
But to do this, we cannot just turn our containers into virtual machines. That is why it is extremely important to understand the reason you may want containers. Some indicators:
- Do you understand and believe in The Factors of Ephemeral Applications?
- Does your service run as a single process, or is it an ecosystem of many different processes and tasks?
If what you are trying to do is run your container as a standalone machine... you might be using the tool incorrectly.
1. Trying to do it all at once
The biggest mistake people make is trying to do everything docker at the same time, hoping they can just "throw docker" at the challenge of microservices, for example, and have their problem solved. The problem is that docker can change so much that it feels like you must do it all at once (metaphorically speaking, like trying to boil the entire ocean).
Sometimes it is best to start small. Rather than figuring out the complete ecosystem of managing services, configurations, deployments, scaling of the nodes, scaling of your application, failover, and more, perhaps break it into smaller steps.
The major pitfalls of this step usually come from the challenge of Traffic Routing, or how to "address" your ephemeral container instances as they change ports and addresses. By their very nature, when run in an ideal configuration, the ports and addresses will change. This makes the instances difficult to point traffic at, so you need something in front of them to balance the traffic and keep up with the ports and nodes of the cluster as they change.
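To make the "something in front of it" idea concrete, here is a toy Python sketch of a pool that tracks container addresses as they come and go and hands out the next one in round-robin order. It is a simplified model, not how any particular load balancer or orchestrator works, and the BackendPool class and its method names are hypothetical.

```python
class BackendPool:
    """Keeps a changing list of (host, port) backends and rotates through them."""

    def __init__(self):
        self._backends = []
        self._next = 0

    def register(self, host, port):
        """Add a container instance when it starts."""
        addr = (host, port)
        if addr not in self._backends:
            self._backends.append(addr)

    def deregister(self, host, port):
        """Remove a container instance when it stops or moves."""
        addr = (host, port)
        if addr in self._backends:
            self._backends.remove(addr)

    def pick(self):
        """Return the next backend in round-robin order."""
        if not self._backends:
            raise LookupError("no backends registered")
        addr = self._backends[self._next % len(self._backends)]
        self._next += 1
        return addr
```

In a real cluster the register and deregister calls would be driven by whatever knows when containers start and stop, which is exactly the part that is hard to get right on a first iteration.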
Suggested ways to simplify:
- Start by just getting your application to build in a container (local machine), following The Factors of Ephemeral Applications. There is no reason to deep dive into Kubernetes, Docker Datacenter or some of the more unique systems like Rancher or Nanobox, if your application isn't a good fit.
- Figure out how you want to resolve the earlier challenges listed here.
- Skip the traffic routing problem as a first iteration. If you already have a server and application system in place and you are happy with its traffic routing, consider changing only how the application is delivered onto the server: deliver it in a container instead.
- Identify what is good enough. There are so many software solutions available, and more are coming every day. Be careful of the kid in a candy shop problem. You don't need everything at first. Layer your solution, improve upon it.