Ok, yeah, I probably don’t really need a Zookeeper cluster running in the house, but there are a few Raspberry Pis around that are mostly idle — we use them for streaming music from our various devices to the speakers in some of the rooms.
But, since I had some spare time, and wanted to do something technical, small and relaxing, I thought I would give it a try. After all, you never know when a Zookeeper cluster would come in handy.
The first adventure was discovering that the Pis — all of various ages — did not have the same versions of Java and some other bits and pieces installed. So the first thing was to bring them up to date… SSH onto them and:
$ sudo apt update
$ sudo apt upgrade
$ sudo apt autoremove
Turns out that Meat Loaf was correct, and 2 out of 3 ain’t bad. One of the Pis turned out to have a dodgy micro-SD card, so there was a lengthy detour while I replaced that and rebuilt the Pi. Back on track though, it was all fairly straightforward to get everything up and running reliably.
Big caveat, this is not something you want to run your critical production infrastructure on. For one thing, rather than 3 nodes, you really want 5 or 7 for redundancy, and it’s unlikely that this more-or-less default configuration would support requests at internet scale.
First step was to verify that I really did have Java installed:
$ java --version
openjdk 11.0.16 2022-07-19
OpenJDK Runtime Environment (build 11.0.16+8-post-Raspbian-1deb10u1)
OpenJDK Server VM (build 11.0.16+8-post-Raspbian-1deb10u1, mixed mode)
If you don’t have Java installed, it’s pretty straightforward now. At the time of writing in late 2022, this installs OpenJDK 11, but in the future that may change:
$ sudo apt install default-jdk
Second step was to grab the software and install it:
$ cd /tmp
$ wget https://dlcdn.apache.org/zookeeper/zookeeper-3.8.0/apache-zookeeper-3.8.0-bin.tar.gz
$ tar xvf apache-zookeeper-3.8.0-bin.tar.gz
$ sudo mv apache-zookeeper-3.8.0-bin /usr/local/zookeeper
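It’s also worth a quick integrity check on the download — Apache publishes a SHA-512 checksum alongside each release tarball (the exact checksum filename and location are an assumption here; check the ZooKeeper downloads page if it has moved):

$ sha512sum apache-zookeeper-3.8.0-bin.tar.gz
# compare the output against the published apache-zookeeper-3.8.0-bin.tar.gz.sha512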
Hmm. Not quite right — that’s going to leave the software in /usr/local owned by the default pi user. Not great. Let’s create a user that will run and own this (and incidentally create a directory for data):
$ sudo addgroup --system zookeeper
$ sudo adduser --system --ingroup zookeeper \
    --no-create-home --disabled-password zookeeper
$ sudo mkdir /var/zookeeper
$ sudo chgrp -R zookeeper /usr/local/zookeeper /var/zookeeper
$ sudo chown -R zookeeper /usr/local/zookeeper /var/zookeeper
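A quick sanity check that the ownership ended up where I wanted it:

$ ls -ld /usr/local/zookeeper /var/zookeeper

Both directories should now show zookeeper as the owner and group.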
Ok! All the basic bits ready to roll. There are three things I want to do now:
- configure the Zookeeper software
- configure logging
- make sure the service will restart when the Pi is rebooted
The configuration is easy to find — we want to create a zoo.cfg file in /usr/local/zookeeper/conf. The documentation covers a lot of different use cases and options for configuration, but we really only need to worry about two things, and use the conventional defaults for everything else:
tickTime=2000
dataDir=/var/zookeeper
clientPort=2181
initLimit=10
syncLimit=5
server.1=192.168.1.4:2888:3888
server.2=192.168.1.10:2888:3888
server.3=192.168.1.11:2888:3888
The first key thing is to specify where the data is going to be kept, which you can see is the /var/zookeeper location I created above. The second is the list of Pis that will take part in the cluster, and the two ports they listen on. You will notice the servers are all numbered 1, 2, 3… that’s another small gotcha that is not obvious. In the data directory for (e.g.) node “2”, I need to write “2” into /var/zookeeper/myid.
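For example, on the Pi that is server.2 in zoo.cfg (192.168.1.10 here), that looks something like this — repeat on each node with its own number:

$ echo 2 | sudo tee /var/zookeeper/myid
$ sudo chown zookeeper:zookeeper /var/zookeeper/myid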
Logging next — Zookeeper makes use of Logback, which has a configuration file /usr/local/zookeeper/conf/logback.xml. Yep, XML time, which always seems so very 1990s. There are several changes to make in the default Zookeeper 3.8.0 logback.xml:
- update the zookeeper.log.dir property to be logs
- update zookeeper.log.maxfilesize to 50MB
- update zookeeper.log.maxbackupindex to 3
- uncomment the ROLLINGFILE appender section
- change the root logger from CONSOLE to ROLLINGFILE
The overall effect (sorry, wall of XML coming) is this (I’ve omitted all the comments in the default file):
<configuration>
  <property name="zookeeper.console.threshold" value="INFO" />
  <property name="zookeeper.log.dir" value="logs" />
  <property name="zookeeper.log.file" value="zookeeper.log" />
  <property name="zookeeper.log.threshold" value="INFO" />
  <property name="zookeeper.log.maxfilesize" value="50MB" />
  <property name="zookeeper.log.maxbackupindex" value="3" />

  <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n</pattern>
    </encoder>
    <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
      <level>${zookeeper.console.threshold}</level>
    </filter>
  </appender>

  <appender name="ROLLINGFILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <File>${zookeeper.log.dir}/${zookeeper.log.file}</File>
    <encoder>
      <pattern>%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n</pattern>
    </encoder>
    <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
      <level>${zookeeper.log.threshold}</level>
    </filter>
    <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
      <maxIndex>${zookeeper.log.maxbackupindex}</maxIndex>
      <FileNamePattern>${zookeeper.log.dir}/${zookeeper.log.file}.%i</FileNamePattern>
    </rollingPolicy>
    <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
      <MaxFileSize>${zookeeper.log.maxfilesize}</MaxFileSize>
    </triggeringPolicy>
  </appender>

  <root level="INFO">
    <appender-ref ref="ROLLINGFILE" />
  </root>
</configuration>
The result of this will be that log files will show up in /var/zookeeper/logs, and the number of logs will be capped at 3, each no more than 50MB in size.
Final step, wrapping Zookeeper up as a systemd service so that Zookeeper will restart when the Pi is rebooted. Again, not too horrible:
$ sudo vi /lib/systemd/system/zookeeper.service

[Unit]
Description=ZooKeeper Service
Documentation=https://zookeeper.apache.org
Requires=network.target
After=network.target

[Service]
Type=forking
User=zookeeper
Group=zookeeper
ExecStart=/usr/local/zookeeper/bin/zkServer.sh start /usr/local/zookeeper/conf/zoo.cfg
ExecStop=/usr/local/zookeeper/bin/zkServer.sh stop /usr/local/zookeeper/conf/zoo.cfg
ExecReload=/usr/local/zookeeper/bin/zkServer.sh restart /usr/local/zookeeper/conf/zoo.cfg
WorkingDirectory=/var/zookeeper

[Install]
WantedBy=default.target
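One thing the unit file alone doesn’t do is arrange for ZooKeeper to come back after a reboot — for that, systemd needs to pick up the new unit and the unit needs to be enabled:

$ sudo systemctl daemon-reload
$ sudo systemctl enable zookeeper.service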
So let’s start it (note of course that the cluster won’t be happy until all three nodes are up and running):
$ sudo systemctl start zookeeper.service
and… silence. That’s one of the hassles of systemd, you don’t always get feedback. Let’s see what the status is:
$ sudo systemctl status zookeeper.service
● zookeeper.service - ZooKeeper Service
   Loaded: loaded (/lib/systemd/system/zookeeper.service; disabled; vendor preset: enabled)
   Active: active (running) since Sun 2022-11-20 13:38:34 GMT; 1min 25s ago
     Docs: https://zookeeper.apache.org
  Process: 3397 ExecStart=/usr/local/zookeeper/bin/zkServer.sh start /usr/local/zookeeper/conf/zoo.cfg (code=exited, status=0/SUCCESS)
 Main PID: 3412 (java)
    Tasks: 49 (limit: 3720)
   CGroup: /system.slice/zookeeper.service
           └─3412 java -Dzookeeper.log.dir=/usr/local/zookeeper/bin/../logs -Dzookeeper.log.file=zookeeper-zookeeper-server-loungepi.log -XX:+HeapDumpOnOutOfMemoryError -XX:OnOutOfMemoryError=ki

Nov 20 13:38:33 loungepi systemd[1]: Starting ZooKeeper Service...
Nov 20 13:38:33 loungepi zkServer.sh[3397]: /usr/bin/java
Nov 20 13:38:33 loungepi zkServer.sh[3397]: ZooKeeper JMX enabled by default
Nov 20 13:38:33 loungepi zkServer.sh[3397]: Using config: /usr/local/zookeeper/conf/zoo.cfg
Nov 20 13:38:34 loungepi zkServer.sh[3397]: Starting zookeeper ... STARTED
Nov 20 13:38:34 loungepi systemd[1]: Started ZooKeeper Service.
That’s looking healthy! We can dig a bit further. What does Zookeeper believe?
$ /usr/local/zookeeper/bin/zkServer.sh status
/usr/bin/java
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader
Even better! Looks like we are up and running, and the logs are not showing any issues:
$ tail /var/zookeeper/logs/zookeeper.log
2022-11-20 13:40:50,269 [myid:] - INFO [NIOWorkerThread-1:o.a.z.s.c.FourLetterCommands@223] - The list of known four letter word commands is : [{1936881266=srvr, 1937006964=stat, 2003003491=wchc, 1685417328=dump, 1668445044=crst, 1936880500=srst, 1701738089=envi, 1668247142=conf, -720899=telnet close, 1751217000=hash, 2003003507=wchs, 2003003504=wchp, 1684632179=dirs, 1668247155=cons, 1835955314=mntr, 1769173615=isro, 1920298859=ruok, 1735683435=gtmk, 1937010027=stmk}]
2022-11-20 13:40:50,270 [myid:] - INFO [NIOWorkerThread-1:o.a.z.s.c.FourLetterCommands@224] - The list of enabled four letter word commands is : [[srvr]]
2022-11-20 13:40:50,272 [myid:] - INFO [NIOWorkerThread-1:o.a.z.s.NIOServerCnxn@514] - Processing srvr command from /127.0.0.1:36056
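Those srvr entries are just the status check above — srvr is the only four-letter command enabled by default, and (assuming netcat is installed on the Pi) it can be poked directly, which reports the server version, connection counts and whether this node is the leader or a follower:

$ echo srvr | nc localhost 2181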
And as a final test, there’s a simple command line interface we can play with:
$ /usr/local/zookeeper/bin/zkCli.sh -server localhost:2181
/usr/bin/java
Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is enabled

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0]
and I can execute commands against the cluster, like fetching the config:
[zk: localhost:2181(CONNECTED) 1] config
server.1=192.168.1.4:2888:3888:participant
server.2=192.168.1.10:2888:3888:participant
server.3=192.168.1.11:2888:3888:participant
version=0
[zk: localhost:2181(CONNECTED) 2] quit

WATCHER::

WatchedEvent state:Closed type:None path:null
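As one last smoke test (the /hello znode is just a made-up example), we can create a znode from this node and read it back:

[zk: localhost:2181(CONNECTED) 0] create /hello greetings-from-the-lounge-pi
[zk: localhost:2181(CONNECTED) 1] get /hello

Running the same get from zkCli.sh on either of the other two Pis should return the same value, which confirms the data really is being replicated around the cluster.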
That’s pretty well it really. I’ve now got a three-node Zookeeper cluster quietly chugging away in the house, waiting for whatever I want to do with it.
While it’s handy to be able to prototype things on cloud servers, running EC2 or CloudEngine instances just for small services can get pretty expensive. Running a cluster like this with Docker can be quite a good option as well, but getting the networking to resemble what you would see in a production environment can be tricky. If you have got Pis lying about, not being used much, they provide a relatively cheap and easy alternative, and installing and configuring services like this is good practice for doing more critical IT work on real servers.