This document describes the reasoning behind Ensemble’s use of ZooKeeper, and also the structure and semantics used by Ensemble in the ZooKeeper filesystem.
ZooKeeper offers a virtual filesystem of so-called znodes (we’ll refer to them simply as nodes in this document). The state stored in the filesystem is fully introspectable and observable, and the changes performed on it are atomic and globally ordered. Ensemble uses these features to maintain its distributed runtime state in a reliable and fault-tolerant fashion.
When some part of Ensemble wants to modify the runtime state in any way, it should not enqueue a message to a specific agent. Instead, it should perform the modification in the ZooKeeper representation of the state. The agents responsible for enforcing the requested modification should be watching the relevant nodes, so that they notice the changes and can act on them.
When compared to traditional message queueing, this kind of behavior enables easier global analysis, fault tolerance (through redundant agents which watch the same states), introspection, and so on.
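The difference from message queueing can be sketched with a small in-memory stand-in for the shared state. This is only an illustration: `StateTree`, its methods, and the `/example/config` path are hypothetical names, not Ensemble’s or ZooKeeper’s API. For brevity the watches here are persistent, whereas real ZooKeeper watches fire once and must be re-registered.

```python
from collections import defaultdict


class StateTree:
    """In-memory stand-in for the shared ZooKeeper state (hypothetical)."""

    def __init__(self):
        self._data = {}
        self._watchers = defaultdict(list)

    def watch(self, path, callback):
        # Any number of agents may observe the same node.
        self._watchers[path].append(callback)

    def set(self, path, value):
        # The writer changes the state; it never addresses a specific agent.
        self._data[path] = value
        for callback in list(self._watchers[path]):
            callback(path, value)

    def get(self, path):
        return self._data.get(path)


tree = StateTree()
seen = []

# Two redundant agents watch the same node.
tree.watch("/example/config", lambda path, value: seen.append(("agent-a", value)))
tree.watch("/example/config", lambda path, value: seen.append(("agent-b", value)))

# A tool requests a change by modifying the state itself.
tree.set("/example/config", "port: 8080")
```

Both agents observe the same change, and the requested state stays readable in the tree afterwards, which is what makes introspection and redundancy straightforward in this model.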
The semantics and structures of all nodes used by Ensemble in its ZooKeeper filesystem are described below. Each entry maps to a node, and the semantics of that node are described right below it.
Note that, unlike a traditional filesystem, nodes in ZooKeeper may hold data, while still being a parent of other nodes. In some cases, information is stored as content for the node itself, in YAML format. These are noted in the tree below under a bulleted list and italics. In other cases, data is stored inside a child node, noted in the tree below as indented /bold. The decision around whether to use a child node or content in the parent node revolves around use cases.
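The two storage options can be pictured with a minimal node model. This is a sketch under assumptions: the `Node` class is hypothetical, and the YAML strings are placeholder values that only reuse the key names listed for the machine info node.

```python
class Node:
    """A node that can hold content and child nodes at the same time."""

    def __init__(self, data=""):
        self.data = data
        self.children = {}

    def child(self, name, data=""):
        # Create the child on first use, then return the existing one.
        if name not in self.children:
            self.children[name] = Node(data)
        return self.children[name]


# Option 1: store the information as YAML content of the node itself.
machine = Node(data="public-dns-name: host.example.com\n")

# Option 2: store the information in a child node.
# The parent keeps its own content while also being a parent.
machine.child("info", data="machine-provider-id: i-00000000\n")
```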
Each formula used in this environment must have one entry inside this node.
Readable by: Everyone
Represents a formula available in this environment. The node name includes the formula namespace (ubuntu, ~user, etc), the formula name, and the formula revision.
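As an illustration of packing the three parts into one node name, here is a minimal sketch. The text does not say how the parts are joined, so the `namespace:name-revision` layout below is purely an assumption, and both function names are hypothetical.

```python
def format_formula_node(namespace, name, revision):
    """Build a node name from namespace, formula name, and revision.

    The 'namespace:name-revision' layout is an assumption for illustration.
    """
    return f"{namespace}:{name}-{revision}"


def parse_formula_node(node_name):
    """Split a node name produced by format_formula_node back into parts."""
    namespace, _, rest = node_name.partition(":")
    # The revision is the part after the last hyphen, so formula names
    # that contain hyphens themselves still parse correctly.
    name, _, revision = rest.rpartition("-")
    return namespace, name, int(revision)
```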
Each formula to be deployed must be included under an entry in this tree.
Readable by: Everyone
Node with details about the configuration for one formula, which can be used to deploy one or more formula instances for this specific formula.
Options for the formula provided by the user, stored internally in YAML format.
- Readable by: Formula Agent
- Writable by: Admin tools
Each node under this parent reflects an actual service agent which should be running to manage a formula.
One running service.
- Readable by: Formula Agent
- Writable by: Formula Agent
/machines
/machine-<0..N>
- /provisioning-lock
- The Machine Provisioning Agent
- /machine-agent-connected
- Ephemeral node created when the Machine Agent is connected.
- /info
Basic information about this machine.
- public-dns-name: The public DNS name of this machine.
- machine-provider-id: The machine provider’s identifier for this machine, e.g. the EC2 instance ID.
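The machine subtree can be expressed as a small path helper. This sketch assumes that the three entries listed under a machine are children of `/machines/machine-<N>`; the function names are hypothetical, and the YAML is written by hand, which is only safe for simple string values.

```python
def machine_paths(machine_id):
    """Return the node paths for one machine, following the tree above."""
    base = f"/machines/machine-{machine_id}"
    return {
        "machine": base,
        "provisioning_lock": f"{base}/provisioning-lock",
        "agent_connected": f"{base}/machine-agent-connected",
        "info": f"{base}/info",
    }


def render_info(public_dns_name, machine_provider_id):
    """Render the content of the info node as simple YAML."""
    return (
        f"public-dns-name: {public_dns_name}\n"
        f"machine-provider-id: {machine_provider_id}\n"
    )
```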
When the need for a new machine is determined, the following sequence of events happens inside the ZooKeeper filesystem to deploy the new machine:
This verification of the connected machine agent helps us guard against any transient errors that may exist on a given virtual node due to provider vagaries.
When a machine provisioning agent comes up, it must scan the entire instance tree to verify that all nodes are running. We need to keep some state that distinguishes a node that has never come up from a node whose machine agent connection has died, so that a new provisioning agent can tell a new machine bootstrap failure apart from a running machine failure.
Use a one-time password (OTP), passed via user data, to guard the machine agent’s permanent principal credentials.
TODO... we should keep a counter of how many times we’ve attempted to launch a single instance.
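One way to picture the state needed for this distinction is a persistent marker next to the ephemeral connection node. The sketch below is an assumption, not the documented design: `machine-agent-connected` comes from the tree above, while the `agent-has-connected` marker, the `MachineState` enum, and `classify` are hypothetical.

```python
from enum import Enum


class MachineState(Enum):
    RUNNING = "running"
    NEVER_CONNECTED = "never connected"  # possible bootstrap failure
    AGENT_LOST = "agent lost"            # a running machine has failed


def classify(children):
    """Classify a machine from the names of its child nodes."""
    if "machine-agent-connected" in children:
        # The ephemeral node exists, so the agent's session is alive.
        return MachineState.RUNNING
    if "agent-has-connected" in children:
        # Hypothetical persistent marker written on the first connection.
        return MachineState.AGENT_LOST
    return MachineState.NEVER_CONNECTED
```

A provisioning agent that has just started could apply `classify` to every machine it finds during its scan and handle the two failure cases differently.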
When a machine is launched, we utilize cloud-init to install the requisite packages to run a machine agent (libzookeeper, twisted) and launch the machine agent.
The machine agent reads its one-time password from EC2 user-data and connects to ZooKeeper. It then reads its permanent principal info and role information, which it adds to its connection.
The machine agent reads and sets a watch on /machines/instances/<N>/services/. When a service is placed there, the agent resolves its formula, downloads the formula, creates an LXC container, and launches a formula agent within the container, passing it the formula path.
The formula agent connects to ZooKeeper using principal information provided by the machine agent. It reads the formula metadata, installs any package dependencies, and then starts invoking formula hooks.
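The one-time password exchange might look roughly like the following in-memory sketch. How the credentials are actually stored and protected is not described here, so `CredentialStore` and its methods are hypothetical; the point is only that the password handed out via user data can be redeemed exactly once.

```python
import secrets


class CredentialStore:
    """Maps one-time passwords to permanent credentials (hypothetical)."""

    def __init__(self):
        self._pending = {}

    def issue_otp(self, principal, password):
        # Generated at launch time and handed to the machine via user data.
        otp = secrets.token_hex(16)
        self._pending[otp] = (principal, password)
        return otp

    def redeem(self, otp):
        # Removing the entry makes any second use of the same OTP fail.
        try:
            return self._pending.pop(otp)
        except KeyError:
            raise PermissionError("unknown or already used one-time password") from None
```

Because user data may be readable by others on the machine, the permanent credentials never appear there; only the short-lived password does.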
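The reaction to a changed services node can be sketched as a callback that compares the current children with what is already deployed. Everything here is illustrative: `MachineAgent`, its methods, and the recorded step names are hypothetical, and each step only logs what a real agent would do.

```python
class MachineAgent:
    """Reacts to services appearing under the watched node (hypothetical)."""

    def __init__(self):
        self.deployed = set()
        self.log = []

    def on_services_changed(self, children):
        # Watch callback: receives the current child names of the node.
        for service in sorted(set(children) - self.deployed):
            self.deploy(service)

    def resolve_formula(self, service):
        # Placeholder lookup; a real agent would read this from ZooKeeper.
        return f"formula-for-{service}"

    def deploy(self, service):
        formula = self.resolve_formula(service)
        self.log.append(("download", formula))
        self.log.append(("create-container", service))
        self.log.append(("launch-formula-agent", service, formula))
        self.deployed.add(service)
```

Comparing against `deployed` keeps the callback idempotent, so a watch that fires again for an unchanged child list does not redeploy anything.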
The formula agent creates the ephemeral node /services/<service name>/instances/<N>/formula-agent-connected.
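The lifetime of such an ephemeral node can be illustrated with a small in-memory model. This is a sketch only: `Session` and `connected_path` are hypothetical, and the set of paths stands in for the ZooKeeper tree. The behavior it mimics is that ZooKeeper removes a session’s ephemeral nodes when the session ends.

```python
def connected_path(service_name, instance):
    """Path of the connection marker for one formula agent."""
    return f"/services/{service_name}/instances/{instance}/formula-agent-connected"


class Session:
    """Stand-in for a ZooKeeper session (hypothetical, in-memory)."""

    def __init__(self, nodes):
        self._nodes = nodes  # shared set of existing node paths
        self._ephemeral = set()

    def create_ephemeral(self, path):
        self._nodes.add(path)
        self._ephemeral.add(path)

    def close(self):
        # Ending the session removes every ephemeral node it created.
        self._nodes.difference_update(self._ephemeral)
        self._ephemeral.clear()
```

Other agents can therefore treat the presence of the node as a sign that the formula agent is connected, without any explicit heartbeat messages.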
The formula is running when....