Technology – Page 3 – The Occasional Masthead

Smoke testing Kafka in HDP

Assuming that you have a vanilla HDP, or the HDP sandbox, or have installed a cluster with Ambari and added Kafka, then the following may help you to smoke test the behaviour of Kafka. Obviously if you’ve configured Kafka or Zookeeper to be running on different ports, this isn’t going to help you much, and it also assumes that you are testing on one of the cluster boxes, and a ton of other assumptions.

The following assumes that you have found and changed to the Kafka installation directory – for default Ambari or HDP installations, this is probably under /usr/hdp, but your mileage may vary. To begin with, you might need to pre-create a testing topic:

bin/kafka-topics.sh \
    --zookeeper localhost:2181 \
    --create --replication-factor 1 \
    --partitions 1 \
    --topic test

then in one terminal window, run a simple consumer:

bin/kafka-console-consumer.sh \
     --zookeeper localhost:2181 \
     --topic test \
     --from-beginning

Note that this reads from the beginning of the topic; if you just want to tail new entries, omit the --from-beginning option. Finally, in another terminal window, open a dummy producer:

bin/kafka-console-producer.sh \
    --broker-list localhost:6667 \
    --topic test

There is an annoying asymmetry here – the consumer and most other utilities look to ZooKeeper to find the brokers, but the dummy producer requires an explicit pointer to one or more of the brokers. In the producer window, type stuff, and you should see it echoed in real time in the consumer window. When finished, ^C out of the producer and consumer, and consider your work done.
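
If juggling two terminals is a nuisance, the same round trip can be scripted. What follows is only a sketch under the same assumptions as above (run from the Kafka installation directory, default HDP ports, a topic that already exists). The function name, the marker text and the 15 second limit are my own inventions, and it leans on the coreutils timeout command to stop the consumer, which would otherwise sit waiting for new messages.

```shell
#!/bin/sh
# Sketch of a scripted smoke test. Run from the Kafka installation directory.
# Hosts, ports and flags mirror the manual steps; the marker format and the
# 15 second limit are illustrative assumptions.
kafka_smoke_test() {
    topic=${1:-test}
    marker="smoke-$(date +%s)-$$"

    # Publish one uniquely identifiable message.
    printf '%s\n' "$marker" | bin/kafka-console-producer.sh \
        --broker-list localhost:6667 \
        --topic "$topic" || return 1

    # Read the topic from the start. timeout(1) ends the consumer, which
    # would otherwise block waiting for further messages.
    seen=$(timeout 15 bin/kafka-console-consumer.sh \
        --zookeeper localhost:2181 \
        --topic "$topic" \
        --from-beginning 2>/dev/null)

    # Succeed only if the marker made the round trip.
    printf '%s\n' "$seen" | grep -qF "$marker"
}
```

Something like `kafka_smoke_test test && echo "Kafka looks alive"` then gives a yes or no answer. Reading from the beginning and searching for the marker is slower on a busy topic, but it avoids racing the consumer against the producer.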

Lies, Damned Lies and Programmers

I recently came across a really nice set – not directly related – of articles dealing with various profound errors that programmers and system designers fall into when dealing with names and addresses.

The TL;DR if you don’t read these: names and addresses are hard and most things you believe about them are wrong.

Let’s start with Falsehoods Programmers Believe About Names. Without even trying, the author lists 40 things we believe about names that are just plain wrong.

In a similar vein is Falsehoods programmers believe about addresses, which particularly speaks to me. One of the fundamental errors about addresses is to think they identify a location. This is incorrect. An address might identify a location, but it is fundamentally a description which instructs a postman how to deliver a letter or parcel. Substitute pizza operative, Amazon driver or writ server as desired.

Even without getting into the weirdness around the actual shape of the planet, Falsehoods programmers believe about geography touches on place names.

And as a bonus: Falsehoods programmers believe about time – computers prove to be pretty bad clocks, and working out a calendar is very complicated.

A Demonstration NiFi Cluster

In order to explore NiFi clustering and the NiFi site-to-site protocol, I decided that I could use a minimal installation – as I’m really just exploring the behaviour of NiFi itself, I don’t need to have any Hadoop environment running as well. To this end, my thought was that I could get the flexibility I need to just play around by building a minimal CentOS 7 virtual machine, running in VirtualBox. The plan was to have little more than a Java 8 SDK and NiFi installed on this, and then I would clone copies of it which would be modified to be independent nodes in a cluster. At the time of writing this is still in progress, but I thought it was worth capturing some information about how I proceeded to get my VM prepared.

There are a handful of requirements for this VM:

It needs a static IP (so that I can assign different static IPs to the clones, later)
It needs to be able to reach out to the broader internet, in order to pull down OS updates and similar
I need to be able to ssh to it from my desktop
Different instances of the VM need to be able to reach each other easily
A Java 8 JVM is needed
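
Most of these requirements can be checked from inside the VM once it is built. The sketch below is one way of doing that rather than a record of what I actually ran: the function names are made up, and the peer and external hosts are arguments you would have to supply yourself. It relies on Java 8 reporting itself as version 1.8 in the `java -version` banner.

```shell
#!/bin/sh
# Succeeds if the given `java -version` banner reports a 1.8 (Java 8) runtime.
is_java8_banner() {
    case $1 in
        *'version "1.8.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Run inside the VM. $1 is the address of another clone, $2 an external
# host; both are placeholders chosen by whoever runs the check.
check_vm() {
    peer=$1
    outside=$2
    # java -version prints its banner on stderr, hence the redirect.
    is_java8_banner "$(java -version 2>&1)" \
        || { echo "Java 8 not found"; return 1; }
    ping -c 1 "$peer" >/dev/null 2>&1 \
        || { echo "cannot reach peer $peer"; return 1; }
    ping -c 1 "$outside" >/dev/null 2>&1 \
        || { echo "cannot reach $outside"; return 1; }
    echo "VM checks passed"
}
```

The ssh requirement is easiest to confirm from the desktop side, simply by connecting.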

List-o-mania

I have, once again, felt stuck, spinning my wheels in the mud. There is an unpleasant, and possibly vicious, cycle at play here in my head: my planning falls apart, I feel like I am not getting anything done, my anxiety spikes, I cannot plan cogently. Repeat and repeat and repeat like some damned overwrought Philip Glass piece. I am trying to look at this dispassionately, because if I can understand how this happens, maybe I can head it off next time.

There are a few factors – health, political chaos, and too many months of uncertainty at work. Having a work and personal phone, and a work and personal computer, and disconnected accounts across both is really not helping either – I keep dropping things between the various calendars and todo lists, which has been exacerbated in the last few months by traveling. You would think that separating work and non-work would be easy. I can partition off my 37.5 hours and leave it at work, can’t I? Well, no. Because I’m trying to juggle calendars and waking hours and mental effort between work and non-work, and I cannot just turn off my brain at the end of the working day. Increasingly I feel like I would do very well if I cloned myself at least twice, so that different instances of myself could live full and uncomplicated lives. And I really resent the 3+ hours tied up each day in commuting, even while I know other people are doing the same or worse.


Two-factor in the middle of the night

Wherever possible I have been enabling two-factor authentication and similar protections. Not that I am paranoid, it’s just that I am paranoid. One of these I have had in play for a long time is protection on my Google account. So it’s somewhat comforting to get an unexpected SMS message from Google in the middle of the night with an authorisation code I did not ask for. Because it means whoever just tried to access my account could not.

Lock your doors, people. A simple username and password combination, particularly on anything critical, is effectively useless.

Henry VI, Part 2, Act 4, Scene 2

The first thing we do, let’s kill all the lawyers.

I’ve heard back from my contact at Blue Point regarding the defunct electric car chargers. On 4th of August I was assured by Berkeley’s “Head of Estates” that matters were going to be soon resolved, and that a quick trip through Legal would see it all sorted. The ball is still in Berkeley’s court, and appears to have been left to lie. Because Brexit. Or Trump. Or the wrong sort of rain. Or something.

It is now 280 days since we first lodged a formal report of these units being dead, and I know they were dead for at least 6 months prior to that. That is 9 months and 6 days. Had the date of our first formal contact been the date of an imagined conception, then mother and child would now be safely at home and posting cute pictures to Facebook. I do hope that these units can be repaired before this imagined child is taking its first steps, or indeed entering school.

I am left with the inescapable and pervasive sense that Berkeley group would prefer to appear to be taking action on community facilities and environmentally sound improvements rather than actually taking action.

On Git Submodules…

Git Submodules. Just Say No. Not Even Once.

Docker and Consul and DNS, oh my

I’m still trying to wrap my head around networking when it comes to Docker and related technologies – I think because a lot of the documentation and examples around are either not quite correct, or are subtly out of date. I’ve noticed too that a lot of the writing out there around setting up Docker and/or Consul hand-waves away the trickiness of the networking. Particularly egregious is the blithe insistence on just specifying host networking for all containers, something that the Docker project itself frowns upon.

Lovely Rita Meter Maid…

The saga of the non-functioning SourceLondon / PodPoint units in Woolwich Arsenal continues, with the lightning pace usually associated with continental drift, and the rise and fall of mountain ranges.


SSL Made Easy

Time for a shout-out to DreamHost, who have partnered with LetsEncrypt to make using SSL with this website very, very easy. DreamHost have always aimed to make many actions against the site push-button, with sensible defaults, and clear documentation, and generating and attaching the certificate was a walk in the park.

I was a little surprised to see the certificate expiring so soon, but LetsEncrypt’s rationale is very sound: re-rolling certificates can and should be automated, and limiting the lifetime of a certificate automatically limits the exposure if the certificate is subverted. It is very much in line with a core idea that they have: the default for HTTP traffic should be across SSL, or in some other way encrypted.
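
With short-lived certificates it is worth being able to see how long one has left. The following is a rough sketch using the openssl command line, not anything DreamHost or LetsEncrypt provide. The function names are mine and any host name you pass in is a placeholder.

```shell
#!/bin/sh
# Succeeds if the PEM certificate in file $1 is still valid $2 days from now.
cert_valid_for_days() {
    openssl x509 -checkend $(( $2 * 86400 )) -noout -in "$1" >/dev/null
}

# Print the certificate that host $1 presents on port 443, in PEM form.
fetch_cert() {
    openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null \
        | openssl x509 -outform PEM
}
```

For example, `fetch_cert example.com > site.pem` followed by `cert_valid_for_days site.pem 30 || echo "renewal due"` would flag a certificate with less than a month to run.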

For me, the process was as simple as pushing the buttons on the DreamHost control panel, then doing a bulk find-and-replace on my site to update any http links to be https. I will probably have to chase around the interwebs to find where I’ve published the old URL, but I’m pretty sure I’ve found and updated the important ones already.
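
For a site made of static HTML files on disk, the find-and-replace step might look something like the sketch below. That is an assumption for the sake of illustration: a site backed by a database would need the equivalent done there instead. The function name is made up, and the directory and domain are whatever you pass in.

```shell
#!/bin/sh
# Rewrite http://DOMAIN links to https://DOMAIN in every .html file
# under a directory, leaving a .bak copy of each file alongside it.
rewrite_http_links() {
    site_dir=$1
    domain=$2
    # Escape dots so sed matches the domain literally.
    pattern=$(printf '%s' "$domain" | sed 's/\./\\./g')
    find "$site_dir" -type f -name '*.html' \
        -exec sed -i.bak "s|http://$pattern|https://$domain|g" {} +
}
```

Calling `rewrite_http_links ./public example.com` would update the files in place. The .bak copies make it easy to diff or roll back before deleting them.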