So we have news of yet another major slurping-up of poorly secured credential sets. A column at the Guardian talks about all the usual measures that can be taken to more or less protect your multiple identities, but once again misses the two subtle and deeply geeky issues that underlie this breach.
First off, we can guess that the Russians swept up a combination of user name / password pairs from various sources where the user name was in plain text, and the password was either in plain text, or hashed. At best they were salted hashes, but in all probability they were simple MD5 hashes. The real risk that comes out of this is that this volume of data feeds into the construction of rainbow tables and the more sophisticated techniques that entirely circumvent mindless brute force attacks. The more often this happens, the more likely it is that hashed passwords can be converted back to plain text in the future. Having super long complicated passwords, as the article at the Guardian suggests, is not going to protect you if the site you have entered them on is storing them as plain text or unsalted hashes, or even as simplistically salted hashes. You will get some protection by using different passwords in different places, but if someone chooses to attack you and subverts a key account like Google or Facebook, there is every likelihood they could amplify their attack using those core accounts.
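To make the unsalted-hash problem concrete, here is a minimal sketch in Python. It uses a plain precomputed lookup table rather than a true rainbow table (which trades storage for computation using hash chains), and the user names, passwords and helper names are all made up for illustration.

```python
import hashlib


def md5_hex(password: str) -> str:
    """Unsalted MD5 of a password, as a naive site might store it."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def build_lookup(candidates):
    """Precompute hash -> password for a list of candidate passwords."""
    return {md5_hex(p): p for p in candidates}


def crack(leaked_hashes, lookup):
    """Return the plain text for every leaked hash found in the table."""
    return {user: lookup[h] for user, h in leaked_hashes.items() if h in lookup}


# Hypothetical leaked table: user name -> unsalted MD5 hash.
leaked = {
    "alice@example.com": md5_hex("password1"),
    "bob@example.com": md5_hex("correct horse battery staple"),
    "carol@example.com": md5_hex("password1"),
}

# Passwords harvested from earlier breaches feed the candidate list.
lookup = build_lookup(["123456", "password1", "letmein"])
recovered = crack(leaked, lookup)
print(recovered)
```

Two things fall out of this. Alice and Carol share a password, so their stored hashes are identical, and one table built once works against every site that stores unsalted MD5. Bob survives only because his password is not yet in anyone's candidate list, which is exactly what each new breach erodes.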
(By the way, if all the talk of hashes and salted hashes and plain text and cypher text is bewildering you, I can strongly recommend Simon Singh’s “The Code Book” as an entertaining and very readable history of cryptography and survey of most of the crypto methods now available. Also, Wikipedia.)
The real core problem remains that user name / password pairs are a terrible security mechanism. Even before time-stressed and semi-competent coders do something with user name / password pairs, it’s a lousy mechanism. And the things that these code monkeys will do with the credentials are very predictable: chuck them in a database table named something like ‘account’ or ‘password’ or ‘user’. If you are lucky they will store an MD5 hash of the password rather than leaving it in plain text, but they probably won’t. (MD5 is a hash, not encryption, and a fast one, which is precisely what makes it a poor choice here.) A slightly more experienced coder will salt the hash, but if the salt is something guessable, such as the email address they’ve used as your user name, it’s not real hard to reverse that. A smarter coder will use the underlying Unix password mechanisms, and cross their fingers that the system administrator has got the box protected sufficiently (this is actually reasonably secure, as long as the system administrator has not left all the doors wide open, and we’re now getting into the realm of sophisticated hacks). If they’re really thinking, they will use something like LDAP, but that’s hard, and relies on good infrastructure, and good system administrators, and starts costing money, and oh my god we’ve got to get this site live by Thursday, we will fix it later…
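For contrast, here is roughly what the less lazy version of the database approach looks like: a random per-user salt and a deliberately slow key-derivation function from Python's standard library. This is a sketch, not a recommendation of specific settings. The function names and the iteration count are my own assumptions, and a real site should take its work factor from current guidance.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed work factor for illustration only; tune it for real hardware.
ITERATIONS = 200_000


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) using a fresh random salt for every user."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)


salt, digest = hash_password("hunter2")
print(verify_password("hunter2", salt, digest))
print(verify_password("hunter3", salt, digest))
```

Because the salt is random rather than derived from the email address, two users with the same password get different stored values, and a precomputed table is useless. The slow derivation means each guess costs the attacker real time.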
Yeah, good luck with that. You have to hope that the company who hired some cheap code monkey to bang out the website you are now setting up an account on didn’t go for the lowest bidder.
The real solution is that we technologists have to tear down this whole lazy, half-arsed default assumption that we will have a user-name/password pair. For a start, could we please start separating ‘identity’ from ‘authorisation’. And for God’s sake, when you are in a meeting with the client and discussion turns to the user-name/password entry required to order a pizza online, leap across the table and throttle whoever is insisting that they really really really need that. Virtually none of the places I have had to create accounts actually need accounts in order to allow me to do business with them. It’s just lazy habit.
For authorisation, we’ve got any number of alternatives, such as OAuth and OpenID, or even Facebook’s horrible and intrusive federated authorisation system. If you really, really want to have some sort of login, please outsource it to someone who knows what they are doing. Longer term, let’s get away from user names and passwords altogether. What you are really trying to do is ask two questions: who is on the other side of the keyboard, and are they allowed to do what they are trying to do. Let’s go with biometrics. Let’s go with two-factor authentication. Let’s go with anything other than freaking username/password pairs.
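Two-factor authentication is less exotic than it sounds. The time-based one-time codes produced by most authenticator apps follow RFC 6238 (TOTP), which is built on RFC 4226 (HOTP), and the core fits in a few lines of standard-library Python. This is a sketch for illustration: the function names and the one-step tolerance window are my own choices, and the secret is the published RFC test key, not something to use for real.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over a time counter."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify_totp(secret: bytes, submitted: str, at=None, step: int = 30, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps either side."""
    now = time.time() if at is None else at
    counter = int(now // step)
    return any(
        hmac.compare_digest(hotp(secret, counter + d), submitted)
        for d in range(-window, window + 1)
        if counter + d >= 0
    )


# Test key from the RFCs; a real secret would be random and per-user.
secret = b"12345678901234567890"
print(totp(secret, at=59))  # RFC 6238 test vector, truncated to 6 digits: 287082
```

The server and the user's device share the secret once, at enrolment. After that, a stolen password on its own is not enough, because the attacker also needs the code that is only valid for the next thirty seconds or so.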
And in all seriousness, if you are in the position of having any input to the design of (particularly) web-based systems, push back strongly on the requirement for the user to have identified themselves to transact or read the site. A good model is Amazon’s actually – you can faff around endlessly on their site, and you only need to eventually identify yourself when it comes time to provide payment details. An even better option that I’ve seen in a very few places is to ask at that point whether the user wants to ‘checkout as guest’ and allow them to provide whatever account details they want. Seriously, if the user is happy to type in their name and shipping address every time rather than having to commit to creating a user-name/password protected account, why stop them?