17 July 2007 / HowTo

Application Installation Delivery

Note: This concept predates containers by some time and has since been superseded by them. However, if you are still limited to running without containers, it still has merit.

The AID approach to application installations is geared towards not just documenting how an application is installed, but following capable and mature methods for installation and automation so that the complete system can be refreshed or reloaded in a new environment, fully patched and operational, in as short a time as possible.

The AID process maturity supports many business objectives:

Documentation - Helps to prove documentation is complete through frequent re-deployments and reduces reliance on tribal knowledge.
Implementation Timeline - Assists with application installation and implementation timelines by providing an easy-to-use reset method.
Security - A system built using AID methodology is typically more secure.
Life Cycle - Reduces long-term "lint" growth of systems that nobody dares to touch.
HA/BC/DR - Facilitates High Availability, Business Continuance and Disaster Recovery scenarios with much less cost than working these processes out after the fact.

Precepts

There are four fundamental precepts to the AID methodology. These precepts form the foundation of what AID is:

Automation - reproduction and inherent documentation of tasks, through reliable computer scripted actions instead of relying upon human actions.
Rapid Refresh - the ability to wipe and reload a server, bringing it back to fully patched and operational capability in as short a time as possible, whether during patching, implementation, or cloning to a new environment.
Encapsulation - vital to containing the application data sprawl to a manageable state.
Security - by building to a high security profile out of the gate, the overall pain that security can cause in the long run is dramatically reduced.

Automation

Anybody who has spent any time with system administration has experienced the common problem where, even with copious documentation, when a new system is built it is discovered that either A) somebody forgot to document a step (usually a critical one), or B) a step was missed during the installation, making for different installed systems depending upon who did the installation. This is unavoidable: people are not exacting compute systems that reliably reproduce the same results with the same inputs every time. That is what automation is for.

With a proper framework, automation can provide that reliable reproduction of steps without forgetting anything, waiting for personnel availability, doing things in the wrong order, or forgetting to document a step.
The struggle with automation is to determine the best degree of automation you can support. It is easy to get stuck spending more time automating a task than is necessary. The key part is to analyze your installation and identify the key steps that can be easily and simply automated, using existing frameworks or extensions to new frameworks. Automation must become a part of the organization's DNA.
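
One way to picture such a framework is an ordered list of steps, each pairing a check with an action, so that a rerun skips whatever is already in place. The following is a minimal sketch only, not a reference to any particular automation tool; the names Step, run_steps and make_dir_step are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


@dataclass
class Step:
    """One installation step: a name, a check, and an action (hypothetical)."""
    name: str
    is_done: Callable[[], bool]   # returns True when no work is needed
    apply: Callable[[], None]     # performs the step


def run_steps(steps: List[Step]) -> List[str]:
    """Run steps in the listed order, skipping any that are already done.

    The returned log records what was skipped and what was applied, so the
    run documents itself.
    """
    log = []
    for step in steps:
        if step.is_done():
            log.append(f"skip  {step.name}")
            continue
        step.apply()
        if not step.is_done():
            raise RuntimeError(f"step did not take effect: {step.name}")
        log.append(f"apply {step.name}")
    return log


def make_dir_step(path: Path) -> Step:
    """Illustrative step: ensure a directory exists."""
    return Step(
        name=f"create {path}",
        is_done=path.is_dir,
        apply=lambda: path.mkdir(parents=True, exist_ok=True),
    )
```

Running the same list twice applies each step the first time and skips it the second, which is the property that makes frequent re-deployment cheap.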

At a minimum, Automation should include scripted installation of the operating system (not templates*) and Post Installation of Operating System configurations and libraries to support the local baseline build. The Post Installation steps should bring a system from the OS baseline to a fully secured STIG** installation.

* why not templates? The best form of automation is one which starts from the source (OS distribution) and applies all of your normal changes. Templates have a tendency to invite manual tweaks over time, creating a manually calibrated drift from automated deployments, and go against the entire purpose of having Automation.

** STIG = Security Technical Implementation Guide. Originally developed by the DoD, STIGs are a highly secured baseline for an Operating System. Oftentimes a STIG configuration is mostly non-functional for any application, but it is the high-water mark of security settings.

Encapsulation

Applications install in their own varied manner. Each vendor does its own thing. Despite repeated attempts by the industry to standardize, there is always a new standard! Because of this, applications sprawl across an operating system at will. From vendor to vendor there is no conventional method for where different parts of the application may live.

The key to AID Rapid Deployment is to standardize for yourself where applications are installed. This can be a struggle because each vendor wants to do their own thing. Some install into the OS (/usr), some install into the 1990s-era "local" install directory (/usr/local), and some install into the optional software folder (/opt, \Program Files\ in Windows, or /Applications in OS X). But the way they install underneath these locations is almost always unique to the application, although it frequently follows a Unix standard structure (even in Windows) of a bin, lib, and config or etc directory. Furthermore, applications often blur the line between software and content by storing content in the same location as the software. This makes upgrades very difficult.

The AID methodology uses an alternative, which is to define a new application installation domain (/app) and a new content domain (/data). These are only for the applications the system exists to support, not for the system itself. This distinction can sometimes be difficult to make. What it comes down to is that a system is built for a purpose other than to live on its own. It will have an operating system with built-in libraries and configurations, and then it will have the application it is supporting installed on top. The latter is what should be encapsulated.

Having the application encapsulated in this manner makes it easy to back up the encapsulated location, format the host, reload the OS from a new automated install, and restore the application. In this manner AID supports rapid refresh of an application installation, as well as DR and other similar technologies.
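
The back up and restore cycle described above can be sketched with Python's standard tarfile module. This is an illustration under assumptions rather than a prescribed AID tool: the function names are hypothetical, and a real deployment would also need to account for ownership, permissions and any running services.

```python
import tarfile
from pathlib import Path


def backup_tree(root: Path, archive: Path) -> None:
    """Archive one encapsulated tree (for example /app) into a tarball.

    Symbolic links are stored as links, so version links survive the trip.
    """
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps member paths relative so the archive restores anywhere.
        tar.add(root, arcname=root.name)


def restore_tree(archive: Path, parent: Path) -> None:
    """Unpack the tarball under parent, recreating the tree on a fresh host."""
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(parent)
```

On a rebuilt host, calling restore_tree with the archive and Path("/") would put the tree back at its original location, assuming the archive was taken from /app or /data.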

Pathing

The AID methodology defines a uniform location as an application root which cannot conflict with other installations: /app. Underneath the application root AID defines a naming standard to be followed which allows for easy updates, rollbacks and clean naming conventions.

Application Root (/app)

Within /app is a folder for each application, named using a simple root for the application, and the version:

    {application}-{version}

A symbolic link is then created to the application version, and all configuration files and external scripts should use the abstract symbolic link. The benefit here is transparent application upgrades and rollback (just relink to the new or old version and restart).

A few examples:

    /app/apache-2 -> apache-2.0.53
    /app/apache-2.0.53
    /app/apache-1 -> apache-1.3.33
    /app/apache-1.3.33
    /app/java1.3 -> java1.3.1_02
    /app/java1.3.1_02
    /app/mysql -> mysql-4.1.5-sparc32
    /app/mysql32 -> mysql-4.1.5-sparc32
    /app/mysql64 -> mysql-4.1.5-sparc64
    /app/mysql-4.1.5-sparc32
    /app/mysql-4.1.5-sparc64
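
The relink step behind an upgrade or rollback can be sketched as follows. This is a minimal example assuming a POSIX system; relink and current_version are hypothetical helper names, and restarting the application afterwards is left out.

```python
import os
from pathlib import Path


def relink(app_root: Path, link_name: str, version_dir: str) -> None:
    """Point app_root/link_name at app_root/version_dir.

    The new link is created under a temporary name and then renamed over
    the old one, so the abstract name never disappears during the switch.
    """
    target = app_root / version_dir
    if not target.is_dir():
        raise FileNotFoundError(f"no such installed version: {target}")
    tmp = app_root / f".{link_name}.tmp"
    if tmp.is_symlink():
        tmp.unlink()              # clear a leftover from an interrupted run
    os.symlink(version_dir, tmp)  # relative target, as in the examples above
    os.replace(tmp, app_root / link_name)


def current_version(app_root: Path, link_name: str) -> str:
    """Return the version directory the link currently points to."""
    return os.readlink(app_root / link_name)
```

With this sketch, relink(Path("/app"), "apache-2", "apache-2.0.53") would produce the first link in the list above, and a rollback is the same call with the older directory name.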

Data Root (/data)

Within /data exist folders, similar to their application parents. Because data needs are more varied, the data folders may be named after the application's abstract name, or they may not. The application version should only be used where it is relevant. For instance, a database may change file formats between major versions. This is a good reason for keeping one database data root separate from another. There may also be more than one folder for the application.

The purpose of the data pathing is to give software configurations a stable location to reference. The AID data path should be used in configurations, not the real path on the local host, as that may change.

A few examples:

    /data/mysql-data -> mysql-data-4.x
    /data/mysql-data-4.x
    /data/mysql-backup
    /data/apache

An application may be too complex to separate the data from the software. While data separation is an ideal goal, it is not a requirement for AID to work. Additionally, the data folder may be a symbolic link to a different mount point or location entirely.
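
Because configurations are meant to name the AID path rather than the real location, a small check can flag configured paths that have drifted outside /app and /data. The sketch below assumes the paths have already been read out of a configuration file into a dictionary; the function name and the keys shown in the usage note are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, List

AID_ROOTS = (PurePosixPath("/app"), PurePosixPath("/data"))


def outside_aid_roots(paths: Dict[str, str]) -> List[str]:
    """Return the config keys whose path is not under /app or /data."""
    offenders = []
    for key, value in paths.items():
        path = PurePosixPath(value)
        # A path qualifies if it is one of the roots or lies beneath one.
        if not any(root == path or root in path.parents for root in AID_ROOTS):
            offenders.append(key)
    return offenders
```

For example, a mapping of "datadir" to /data/mysql-data passes, while a key pointing directly at the underlying mount point would be reported.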

Security

Security can be the bane of any system or application administrator. There is a constant stream of patches and advisories recommending changes to the software configuration baseline. These require incessant testing and evaluations, and it is not uncommon that an OS security baseline change will break an installed application!

The AID methodology takes an alternative approach, which is fairly simple. Start with a highly secured baseline, and reduce only the security postures necessary during application installation for the app to work.

An easy way to think about this is to picture a sound technician's mixing board, with fifty or so sliders that can go up or down. Think of these sliders as security settings on an operating system. The typical OS begins with most of these sliders pushed clear down to the bottom, or if they are lifted up, it is only marginally so. This is the blacklist approach to security that most people use by default. They turn on only point security solutions, leaving open holes across the board. With AID the idea is to push all of the sliders up to the top, and then lower only the sliders required to get the application to work.
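
The slider analogy can be sketched in code: begin with every setting at its hardened value and lower only those the application needs, recording a reason for each. The setting names below are hypothetical placeholders, not taken from any actual STIG, and relax is an illustrative helper rather than part of a real tool.

```python
from typing import Dict

# Hypothetical setting names; a real baseline would come from the STIG in use.
HARDENED_BASELINE: Dict[str, bool] = {
    "firewall_default_deny": True,
    "disable_root_ssh_login": True,
    "mount_tmp_noexec": True,
    "restrict_outbound_traffic": True,
}


def relax(baseline: Dict[str, bool], exceptions: Dict[str, str]) -> Dict[str, bool]:
    """Return the effective settings after lowering only the listed sliders.

    exceptions maps a setting name to the reason the application needs it
    lowered. Unknown names are rejected so a typo cannot pass silently.
    """
    unknown = set(exceptions) - set(baseline)
    if unknown:
        raise KeyError(f"not in baseline: {sorted(unknown)}")
    return {name: value and name not in exceptions
            for name, value in baseline.items()}
```

The exceptions dictionary doubles as the record of what was lowered and why, which is the knowledge of "what works and what breaks your application" that this approach is meant to produce.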

The resulting solution is much more secure, and you know what works and what breaks your application. The effort involved during installation is greater, but not by much, especially compared to the long-term security changes otherwise required over time. There are fewer patches and changes to be made down the road, making for less security pain on an ongoing basis.