Like any other code-worrier, I have a ton of applications on my (i)Phone, ranging from “things that look shiny but are useless”, through “things that I use once a year”, up to “indispensable and every-day”. Out of interest I’ve tried to work out which apps are the ones that fall into the last category, apps that are essential to getting my work done and which contribute strongly to the sense of never being out of the office.
Category: Technology
Things Technological
ORM?
It’s rather annoying that in 2015 the ORM (Object-Relational-Mapping) problem is still tedious to deal with. While in general terms it is a solved problem – JPA and Hibernate and similar frameworks do the heavy lifting of doing the SQL queries for you and getting stuff in and out of the JDBC transport objects – there does not seem to be any way to remove the grinding grunt work of making a bunch of beans to transport things from the data layer up to the “display” layer. It remains an annoying fact that database tables tend to be wide, so you wind up with beans with potentially dozens of attributes, and even with the best aid of the IDE you wind up fiddling with a brain-numbing set of getters, setters, hash and equals methods and more-or-less identical tests.
I would love to suggest an alternative – or build an alternative – but this remains a space where it feels like for non-trivial use there are enough niggling edge cases that the best tool is a human brain.
Doing More With Less (Part 1 of N)
In recent weeks I have been massively overhauling the monitoring and alerting infrastructure. Most of the low-level box checks are easily handled by CloudWatch, and some of the more sophisticated trip wires can be handled by looking for patterns in our logs, collated by LogStash and exported to Loggly. In either case, I have trip wires handing off to PagerDuty to do the actual alerting. This appeals to my preference for strong separation of concerns – LogStash/Loggly are good at collating logs, CloudWatch is good at triggering events off metrics, and PagerDuty knows how to navigate escalation paths and how to send outgoing messages to whichever poor benighted bastard – generally and almost always me – has to be woken at 1:00 AM.
One hole in the new scheme was a simple reachability test for some of our web end points. These are mostly simple enough that a positive response is a reliable indicator that the service is working, so sophisticated monitoring is not needed (yet). I looked around at the various offerings akin to Pingdom, and wondered if there was a cheaper way of doing it. Half an hour with the (excellent) API documentation from PagerDuty, and I’ve got a series of tiny shell scripts being executed via RunDeck.
#!/bin/bash
# Reachability check: raise a PagerDuty alert unless the status endpoint answers with HTTP 200.
# curl prints only the final HTTP status code; the response body is discarded.
status=$(curl -sL -w "%{http_code}" "http://some.host.com/api/status" -o /dev/null)
if [ "$status" -ne 200 ]
then
    echo "Service not responding, raising PagerDuty alert"
    # Trigger an incident through the PagerDuty generic events API.
    curl -H "Content-type: application/json" -X POST \
        -d '{
            "service_key": "66c69479d8b4a00c609245f656d443f1",
            "event_type": "trigger",
            "description": "Service on http://some.host.com/api/status is not responding with HTTP 200",
            "client": "Infra RunDeck",
            "client_url": "http://our.rundeck.com"
        }' https://events.pagerduty.com/generic/2010-04-15/create_event.json
fi
This weekend I hope to replace the remaining staff with a series of cunning shell scripts. Meanwhile the above script saves us potentially hundreds of pounds a year in monitoring costs.
Elephants and Pigs
Since I did have HomeBrew installed, I went ahead with this set of instructions, with some variation. Note that at the time I did this HomeBrew installed Hadoop 2.6.0, not 2.4.x as described at this site:
https://www.getblueshift.com/setting-up-hadoop-2-4-and-pig-0-12-on-osx-locally
Ruby Tuesday
Except it’s Monday. So today I am working from home in order to bootstrap my brain quickly up into a better understanding of Pig. First order of business being to install it locally. A quick Google and I find a number of resources talking about how to install Hadoop and Pig, two of the top three involving using HomeBrew:
First World Problems
So we have installed Rocki units in three rooms in the flat – the lounge, the bedroom, and the library. I just started playing a Clannad album from my laptop to the speakers in the lounge. Much to my bewilderment, a moment later a different album started playing in the bedroom.
I thought it may have been coming from my phone, so shut that down. Kept going. From my iPad? Shut that down. From my partner’s phone? Shut that down. Something bizarrely broken with AirFoil on my laptop? Shut down the laptop. Has someone managed to hack our network and is pranking us for lulz? Is it the NSA? MI5?
Nope. One of the cats had sat on the stereo remote control in the bedroom and started playing a CD.
I need a simpler life.
A Java Development Manifesto
I wrote this some years ago, mainly aimed at our java devs, but I think it comes close to my personal manifesto for coding in general.
Passwords definitely considered broken
So we have news of yet another major slurping-up of poorly secured credential sets. A column at the Guardian talks about all the usual measures that can be taken to more-or-less protect your multiple identities, but once again misses the two subtle and deeply geeky issues that underlie this breach.
Singletons considered harmful
Ok, I know it’s not a new observation, but the Singleton pattern must be one of the most overused, and abused, patterns that the Gang Of Four described.
This is on my mind this week as I’m working on a body of code that has way too many Singletons. I must emphasise that ultimately it’s my problem, not the original author’s, as I dropped the ball over a year ago and did not review the design and implementation. The problem has come home to haunt me as I introduced just one change too many and all the tests began to fail.
Particularly in this case, while looking at test coverage I wondered why a pretty important piece of life cycle management wasn’t being traversed in tests. Which led me to have a close look and realise that it was buggy, and failing outright at the start of execution during tests. So I fixed that, and all the tests threw up because the Singleton in question was no longer in the expected state.
My main gripe with Singletons is that they run headlong into one of the cardinal rules of unit testing: all tests should be entirely independent of each other. The problem with a Singleton – particularly one that has some sort of lifecycle – is that suddenly tests are connected by the internal state of an object that may not even be the unit under test. Which leads to unstable tests prone to mystery failures. And unstable tests lead to a lack of confidence in the validity of the code.
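To make that coupling concrete, here is a rough sketch rather than the code I was actually fighting with. The Service and CouplingDemo names are invented for illustration, and the two "tests" are plain methods standing in for real unit tests.

```java
// Hypothetical classic Singleton with a one-way lifecycle.
final class Service {
    private static final Service INSTANCE = new Service();
    private boolean started;

    private Service() {}

    static Service getInstance() { return INSTANCE; }

    void start() {
        if (started) throw new IllegalStateException("already started");
        started = true;
    }

    boolean isStarted() { return started; }
}

class CouplingDemo {
    // Each "test" quietly assumes it is looking at a pristine Service.
    static boolean testInitiallyStopped() {
        return !Service.getInstance().isStarted();
    }

    static boolean testStart() {
        try {
            Service.getInstance().start();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(testInitiallyStopped()); // true on a fresh JVM
        System.out.println(testStart());            // true, but it mutates the shared instance
        System.out.println(testInitiallyStopped()); // false: the same test now fails
    }
}
```

Neither test is wrong on its own. The result depends purely on the order they run in, which is exactly the kind of mystery failure described above.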
Now, I’m going to need to articulate this to other coders to head off any repeat of this problem, so it’s worth my while to hand wave about when Singletons are appropriate, and when other techniques are better.
To begin with, I often see Singletons introduced to provide static pieces of code. I strongly suspect that this is because the coder does not understand how static methods and attributes work, or simply forgets. Probably the biggest single clue that these cases should not be implemented as Singletons is that they have no persistent state.
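As a minimal sketch of what I mean, assuming a made-up helper called PathUtil: when there is no persistent state, there is nothing for an instance to hold, so plain static methods do the job and the whole "only one" machinery disappears.

```java
// Hypothetical stateless helper: no fields, so no instance is needed.
final class PathUtil {
    private PathUtil() {} // not instantiable; there is no state to hold

    /** Joins two path segments with exactly one slash between them. */
    static String join(String base, String child) {
        String left = base.endsWith("/") ? base.substring(0, base.length() - 1) : base;
        String right = child.startsWith("/") ? child.substring(1) : child;
        return left + "/" + right;
    }
}

class PathUtilDemo {
    public static void main(String[] args) {
        System.out.println(PathUtil.join("/api/", "/status")); // prints /api/status
    }
}
```

Because join depends only on its arguments, tests that call it cannot affect each other.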
When I talked it through with the team, both zoomed in on the same idea from two different directions with little prompting: by thinking about the code construct (the Singleton pattern) instead of thinking about the data, it is way too easy to miss that the Singleton pattern gives the data state a different scope and a different life cycle from the rest of the code.
In the space I’m mainly playing in, it’s fairly common to have a bunch of threads handling incoming requests from some external agency, all in kind of similar ways. This transactional model, if inverted to be data centric, can be summarised as: accept data, map it onto an output state, and throw away any working state in preparation for the next request. In Java terms the scope of all data is local to the thread. The data state of the Singleton, however, is at a higher level – an application or service level. Thus objection one: Singletons cause data states at different levels of abstraction or different levels of management to be promiscuously mixed.
This immediately leads to objection two: Singletons easily cause cross-thread side effects, as they bind threads together in non-obvious ways. This problem can be lessened if the Singleton provides read-only state, in which case it might be better done using static attributes, and if the potential side effects are well documented and described.
Objection three is somewhat more of an aesthetic gripe. The common ways in which Singletons are usually implemented in Java, apart from not being as thread-safe as they appear to the naive eye, break the doctrine of Separation of Concerns. The Singleton class has two responsibilities, not just one, which is a very bad smell: it is responsible for whatever its purpose in life is, and it’s responsible for making sure it’s alone in the universe.
There are a variety of ways of getting around this bad smell. A lot of runtime containers – be it simply the JVM firing up with some single instance of a class providing main(), or Spring or a web application server taking care of the “only one” behaviour behind the scenes – provide a trustable context for which you can say “if I make just one of these objects, and put it in that context, there will only be one of them”. In the case of the examples above, as well, it means that we have the instance of the object in some sort of “application” or “service” scope, with a life cycle that can be tied to the broader context.
At a bare minimum, if you cannot identify or obtain access to the application context, you should aim to separate out the two concerns – provide a class that does stuff, and a class that holds a single instance of that do-stuff class. While adding a little bit of extra boiler plate code, this simple change suddenly means you can test the two behaviours independently and that you can have thread-local instances injected in the scope of your independent unit tests.
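One way that split might look, as a sketch only. RequestCounter, RequestCounterHolder and RequestHandler are names I have made up for the purpose, and here a test simply builds its own fresh instance rather than doing anything cleverer with thread-locals.

```java
import java.util.concurrent.atomic.AtomicLong;

// Concern one: does the actual work, and knows nothing about being unique.
class RequestCounter {
    private final AtomicLong count = new AtomicLong();

    long record() { return count.incrementAndGet(); }

    long total() { return count.get(); }
}

// Concern two: holds the one shared instance for the application scope.
final class RequestCounterHolder {
    private static final RequestCounter INSTANCE = new RequestCounter();

    private RequestCounterHolder() {}

    static RequestCounter get() { return INSTANCE; }
}

// Collaborators are handed a counter instead of dialling home for it.
class RequestHandler {
    private final RequestCounter counter;

    RequestHandler(RequestCounter counter) { this.counter = counter; }

    long handle() { return counter.record(); }
}

class HolderDemo {
    public static void main(String[] args) {
        // Production wiring uses the shared instance.
        RequestHandler shared = new RequestHandler(RequestCounterHolder.get());
        shared.handle();

        // A test injects its own private counter, so no state leaks between tests.
        RequestHandler isolated = new RequestHandler(new RequestCounter());
        System.out.println(isolated.handle()); // prints 1
    }
}
```

The do-stuff class can now be tested like any other object, and the holder is small enough that there is very little left in it to get wrong.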
And a final objection, primarily aesthetic. There are a bunch of different ways to build a Singleton in Java. Not all of them are thread safe, and it’s annoyingly difficult to do lazy instantiation in a thread safe manner, particularly if you want there to be exactly one run through a costly process. The ugliness arises because generally the methods to be thread safe are clunky kinds of fiddles that require the coder to think about the behaviour of the JVM instead of the behaviour of their code. There’s that separation of concerns biting us in the arse again.
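For illustration, one of the better-known fiddles is the initialization-on-demand holder idiom, sketched below. ExpensiveConfig and its construction counter are invented for the example; the counter exists only so the "exactly one run" claim can be observed.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ExpensiveConfig {
    // Counts constructor runs; a stand-in for noticing a costly setup step.
    static final AtomicInteger CONSTRUCTIONS = new AtomicInteger();

    private ExpensiveConfig() {
        CONSTRUCTIONS.incrementAndGet();
    }

    // The JVM initialises Holder only when getInstance() first touches it,
    // and class initialisation is guaranteed to happen once, thread safely.
    private static final class Holder {
        static final ExpensiveConfig INSTANCE = new ExpensiveConfig();
    }

    static ExpensiveConfig getInstance() { return Holder.INSTANCE; }
}

class LazyDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(ExpensiveConfig::getInstance);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(ExpensiveConfig.CONSTRUCTIONS.get()); // prints 1
    }
}
```

It works, but notice that the correctness argument lives entirely in the comment about class initialisation: you have to reason about the JVM, not about your own code, to trust it.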
I do not think the pattern is to be universally avoided though. It’s highly probable that the application or service scope is stateful, and has a well defined life cycle. Like it or not, the life cycle state is a single piece of information that needs to exist at a different level of abstraction to the per-thread state (unless you are fortunate enough to be able to think entirely at a thread level, and there genuinely is no application level state).
As an example, I’ve fallen into the habit of using a roughly MVC architectural pattern. Sometime I will go into this in detail, but for now simply accept that it’s a handy simple framework to hang more complex behaviour off, while encouraging the decomposition of the code into easily testable parts. In my case, the ‘view’ is often provided as servlets, often with a RESTful design, and not necessarily provided by a single class. It’s pretty common for me thus to not have an accessible application level context without using Spring or similar. In these instances, I tend to use the Controller layer to hold the application-level state, and manage the application-level lifecycle. Of course, this is easily abused as well, as without paying attention you can find all sorts of pieces of code dialling home to the controller layer or object, but at least by separating the singleton aspects from the controller aspects, you can make the opportunity to not bind tests together.
Let me leave you with a thought experiment: if I have a simple web application with just a single servlet class, does that servlet class provide a single-instance application level context?
Journalled Out
I’ve been thinking in recent days that I could use something journal-ish. There are two aspects to this thinking. For one, I tend to accumulate documents and links to things that will probably be useful someday, or I want to remember short-term, but they get smeared everywhere. Bookmarks across several machines and browsers, text documents tucked into folders optimistically labelled ‘to-do’ or ‘in progress’, stuff in various note-taking applications. All of which leads to a definite sense of mental clutter which I really want to eliminate. I have identified that one of the things that makes me anxious is physical and mental clutter, a sense of being overwhelmed by Stuff To Take Care Of Right Now.
It would be nice just to declare mental bankruptcy, throw all this in the bin, tear off my clothes, and run naked into the woods to live as a wild man, feeding on berries and roots. Regrettably while this simple life has certain attractions – not the least being an opportunity to dispense gnomic wisdom and entirely fabricated home-spun philosophy to unsuspecting passers-by – it does not appear to be paid particularly well anymore. Besides, brambles, briars and badgers are not a good match for running naked through the woods at my age.
Initially I’ve been thinking about something like Day One, which has the attraction of being somewhat insulated against future obsolescence (as far as I can tell, the data is stored in individual PLIST files), as well as having a frictionless interface. That’s important. The benefit of pencil and paper is that it’s always on. The disadvantages for me are that I cannot read my own handwriting, and generally cannot fit a usefully large notebook in my pocket. Also, so much of what I need to refer to comes with a URL or an image associated with it, there’s friction arising from needing to manually link together disparate data repositories.
The elephant in the room for all of this (see what I did there) is of course Evernote. I was startled to discover how many apps I already have on phone, iPad and desktop natively link to Evernote, and the environment Evernote occupies is rich and varied. Which makes me a little nervous: if I went this way, would I then still have different bits of data scattered across multiple interfaces? Additionally, even though they appear to be an honest and reliable company, the product still revolves around having my data on servers for a ‘free’ service.
Sigh. Thinking is in progress.