I have recently been working with Russ Miles on coding microservices that follow the principles he has laid out in the Antifragile Software book. As a simple demo, we decided to use Kafka as an event store to implement an event sourcing pattern.
The main focus of this article is how to deploy a Kafka cluster and manage its lifecycle via Docker containers and Kubernetes on AWS.
Initially, I wanted to quickly see how to get one instance of Kafka available from outside the AWS world so that I could interact with it. My first move is always to look at the main Docker image repository for official or popular images. Interestingly, as of this writing, there is no official Kafka image. The most popular is wurstmeister/kafka, which is what I decided to use.
However, this was not enough: Kafka relies on Zookeeper to work. Spotify offers both services bundled in a single image, but I don’t think that’s a good idea in production, so I decided to forgo it. For Zookeeper, I didn’t use the most popular image because its documentation didn’t indicate any way to pass parameters to the service. Instead, I went with digitalwonderland/zookeeper, which supports some basic parameters like setting the Zookeeper server id.
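If you simply want to confirm both images are available before going anywhere near Kubernetes, pulling them locally is enough:

$ docker pull wurstmeister/kafka
$ docker pull digitalwonderland/zookeeper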
Setting up a single instance of each service is rather straightforward and can be handled by a simple Kubernetes replication controller like:
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: kafka-controller
spec:
  replicas: 1
  selector:
    app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - name: kafka
        image: wurstmeister/kafka
        ports:
        - containerPort: 9092
        env:
        - name: KAFKA_ADVERTISED_HOST_NAME
          value: [AWS_LB_DNS_or_YOUR_DNS_POINTING_AT_IT]
        - name: KAFKA_ZOOKEEPER_CONNECT
          value: zook:2181
      - name: zookeeper
        image: digitalwonderland/zookeeper
        ports:
        - containerPort: 2181
This could be exposed using the following Kubernetes service:
---
apiVersion: v1
kind: Service
metadata:
  name: zook
  labels:
    app: kafka
spec:
  ports:
  - port: 2181
    name: zookeeper-port
    targetPort: 2181
    protocol: TCP
  selector:
    app: kafka
---
apiVersion: v1
kind: Service
metadata:
  name: kafka-service
  labels:
    app: kafka
spec:
  ports:
  - port: 9092
    name: kafka-port
    targetPort: 9092
    protocol: TCP
  selector:
    app: kafka
  type: LoadBalancer
Notice the LoadBalancer type is used here because we need to create an AWS load-balancer to access those services from the outside world. Kubernetes is clever enough to achieve this for us.
In the replication controller specification, we can see that Kafka needs to advertise its hostname. For this to work, it must be the actual domain of the AWS load-balancer. This means you must create the Kubernetes service first (which is good practice anyway) and then, once it is up, copy its domain into the replication controller spec as the value of the KAFKA_ADVERTISED_HOST_NAME environment variable.
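Concretely, the order of operations looks like this; the file names are only examples, and the grep assumes kubectl describe reports the ELB hostname under a LoadBalancer Ingress entry, which it did at the time of writing:

$ kubectl create -f kafka-zookeeper-services.yaml
$ kubectl describe service kafka-service | grep Ingress
# paste the reported ELB hostname into KAFKA_ADVERTISED_HOST_NAME, then:
$ kubectl create -f kafka-zookeeper-controller.yaml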
This is all good, but it is not a cluster; it is merely a single instance for development purposes. Even though Kubernetes promises to look after your pods, it’s not a bad idea to run both the Zookeeper and Kafka services as clusters. This wasn’t as trivial as I expected.
The reason is mostly the way clusters are configured. In Zookeeper’s case, each instance must be statically identified within the cluster, which means you cannot just increase the number of pod instances in the replication controller: they would all end up with the same identifier. This will change in Zookeeper 3.5. Kafka no longer has this limitation; it will happily generate a broker id for you if none is provided explicitly (though this requires Kafka 0.9+).
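To see why Zookeeper is the awkward one, this is roughly what an ensemble configuration looks like under the hood: every node carries the same static list of its peers, plus its own id in a myid file, so each instance really does have to be addressed individually (the values below are purely illustrative):

# zoo.cfg (identical on every node)
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888

# myid (differs per node: 1, 2 or 3)
1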
What this means is that we now have two specifications: one for Zookeeper and one for Kafka.

Let’s start with the simpler one, Kafka:
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: kafka-controller
spec:
  replicas: 1
  selector:
    app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - name: kafka
        image: wurstmeister/kafka
        ports:
        - containerPort: 9092
        env:
        - name: KAFKA_ADVERTISED_PORT
          value: "9092"
        - name: KAFKA_ADVERTISED_HOST_NAME
          value: [AWS_LB_DNS_or_YOUR_DNS_POINTING_AT_IT]
        - name: KAFKA_ZOOKEEPER_CONNECT
          value: zoo1:2181,zoo2:2181,zoo3:2181
        - name: KAFKA_CREATE_TOPICS
          value: mytopic:2:1
Nothing really odd here: we create a single Kafka broker, connected to our Zookeeper cluster, which is defined below:
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: zookeeper-controller-1
spec:
  replicas: 1
  selector:
    app: zookeeper-1
  template:
    metadata:
      labels:
        app: zookeeper-1
    spec:
      containers:
      - name: zoo1
        image: digitalwonderland/zookeeper
        ports:
        - containerPort: 2181
        env:
        - name: ZOOKEEPER_ID
          value: "1"
        - name: ZOOKEEPER_SERVER_1
          value: zoo1
        - name: ZOOKEEPER_SERVER_2
          value: zoo2
        - name: ZOOKEEPER_SERVER_3
          value: zoo3
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: zookeeper-controller-2
spec:
  replicas: 1
  selector:
    app: zookeeper-2
  template:
    metadata:
      labels:
        app: zookeeper-2
    spec:
      containers:
      - name: zoo2
        image: digitalwonderland/zookeeper
        ports:
        - containerPort: 2181
        env:
        - name: ZOOKEEPER_ID
          value: "2"
        - name: ZOOKEEPER_SERVER_1
          value: zoo1
        - name: ZOOKEEPER_SERVER_2
          value: zoo2
        - name: ZOOKEEPER_SERVER_3
          value: zoo3
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: zookeeper-controller-3
spec:
  replicas: 1
  selector:
    app: zookeeper-3
  template:
    metadata:
      labels:
        app: zookeeper-3
    spec:
      containers:
      - name: zoo3
        image: digitalwonderland/zookeeper
        ports:
        - containerPort: 2181
        env:
        - name: ZOOKEEPER_ID
          value: "3"
        - name: ZOOKEEPER_SERVER_1
          value: zoo1
        - name: ZOOKEEPER_SERVER_2
          value: zoo2
        - name: ZOOKEEPER_SERVER_3
          value: zoo3
As you can see, we unfortunately cannot rely on a single replication controller with three replicas, as you would for Kafka brokers. Instead, we run three distinct replication controllers, so that we can specify the Zookeeper id of each instance, as well as the list of all servers in the ensemble.
This is a bit of an annoyance, because it means we also rely on three distinct services:
---
apiVersion: v1
kind: Service
metadata:
  name: zoo1
  labels:
    app: zookeeper-1
spec:
  ports:
  - name: client
    port: 2181
    protocol: TCP
  - name: follower
    port: 2888
    protocol: TCP
  - name: leader
    port: 3888
    protocol: TCP
  selector:
    app: zookeeper-1
---
apiVersion: v1
kind: Service
metadata:
  name: zoo2
  labels:
    app: zookeeper-2
spec:
  ports:
  - name: client
    port: 2181
    protocol: TCP
  - name: follower
    port: 2888
    protocol: TCP
  - name: leader
    port: 3888
    protocol: TCP
  selector:
    app: zookeeper-2
---
apiVersion: v1
kind: Service
metadata:
  name: zoo3
  labels:
    app: zookeeper-3
spec:
  ports:
  - name: client
    port: 2181
    protocol: TCP
  - name: follower
    port: 2888
    protocol: TCP
  - name: leader
    port: 3888
    protocol: TCP
  selector:
    app: zookeeper-3
Doing so means traffic is routed correctly to each Zookeeper instance via its service name (internally to the Kubernetes network, that is).
Finally, we have our Kafka service:
---
apiVersion: v1
kind: Service
metadata:
  name: kafka-service
  labels:
    app: kafka
spec:
  ports:
  - port: 9092
    name: kafka-port
    targetPort: 9092
    protocol: TCP
  selector:
    app: kafka
  type: LoadBalancer
That one is simple because we only have a single kafka application to expose.
Now running these in order is the easy part. First, let’s start with the zookeeper cluster:
$ kubectl create -f zookeeper-services.yaml
$ kubectl create -f zookeeper-cluster.yaml
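You can confirm the three Zookeeper services were registered (and are therefore resolvable by name from other pods) with:

$ kubectl get services

zoo1, zoo2 and zoo3 should each show up with their own cluster IP.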
Once the cluster is up, you can check they are all happy bunnies via:
$ kubectl get pods
zookeeper-controller-1-reeww   1/1       Running   0          2h
zookeeper-controller-2-t4zzx   1/1       Running   0          2h
zookeeper-controller-3-e4zmo   1/1       Running   0          2h
$ kubectl logs zookeeper-controller-1-reeww
...
One of them should be LEADING, the other two FOLLOWING.
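If you don’t want to scan the full log, something like this should surface the election state (the pod names will obviously differ in your cluster):

$ kubectl logs zookeeper-controller-1-reeww | grep -E "LEADING|FOLLOWING"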
Now, you can start your Kafka broker, first its service:
$ kubectl create -f kafka-service.yaml
On AWS, you will need to wait for the actual EC2 load-balancer to be created. Once that’s done, take its DNS name and edit the kafka-cluster.yaml spec to set it as the value of KAFKA_ADVERTISED_HOST_NAME.
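One way to fetch that DNS name directly, assuming your kubectl version supports jsonpath output, is:

$ kubectl get service kafka-service -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'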
Obviously, if you have set up a DNS record pointing at your load-balancer, simply use that domain instead of the load-balancer’s. In that case, you can set its value once and for all in the spec.
Then, run the following command:
$ kubectl create -f kafka-cluster.yaml
This will start the broker and automatically create the topic “mytopic” with two partitions and a single replica, as requested by the mytopic:2:1 value of KAFKA_CREATE_TOPICS.
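You can verify the broker came up properly by looking at its pod and logs; the pod name suffix is generated, so yours will differ:

$ kubectl get pods -l app=kafka
$ kubectl logs kafka-controller-xxxxx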
At this stage, you should be able to connect to the broker and produce and consume messages. You might want to try kafkacat as a simple tool to play with your broker.
For instance, listing the topics on your broker:
$ kafkacat -b [AWS_LB_DNS_or_YOUR_DNS_POINTING_AT_IT]:9092 -L
You can also produce messages:
$ kafkacat -b [AWS_LB_DNS_or_YOUR_DNS_POINTING_AT_IT]:9092 -P -t mytopic
Consuming messages is as simple as:
$ kafkacat -b [AWS_LB_DNS_or_YOUR_DNS_POINTING_AT_IT]:9092 -C -t mytopic ...
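Putting the two together, a quick end-to-end smoke test could look like this; kafkacat reads the messages to produce from stdin, and -o beginning with -e makes the consumer read from the start of the topic and exit when done:

$ echo "hello" | kafkacat -b [AWS_LB_DNS_or_YOUR_DNS_POINTING_AT_IT]:9092 -P -t mytopic
$ kafkacat -b [AWS_LB_DNS_or_YOUR_DNS_POINTING_AT_IT]:9092 -C -t mytopic -o beginning -e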
At this stage, we still don’t have a Kafka cluster. One might expect that running something like this would be enough:
$ kubectl scale --replicas=3 rc/kafka-controller
But unfortunately, this will only create new Kafka instances; it will not automatically start replicating data to the new brokers. That has to be done out of band, as I will explain in a follow-up article.
As a conclusion, I would say that using existing images is not ideal because they don’t always provide the level of integration you’d hope for. What I would rather do is build specific images that initially talk to an external configuration server to retrieve the information they need to run. This would likely make things a little smoother. In the case of Zookeeper, though, I am looking forward to its next release, which should support dynamic cluster scaling.