Let’s play CPU!

white ferris wheel against a blue sky with a few wisps of white clouds
Unrelated image from to make this post look nicer in social media shares.

I think one of many reasons it’s so hard to learn to code is that the people doing the teaching have all forgotten what it was like not to know how to code, so we skip over some really important basic steps because we think they’re inherently obvious. Case in point, this poor redditor who nobody ever taught how to work through a piece of code with a pencil and paper. I had to read the question a couple of times to understand what they were even asking, it just didn’t occur to me that someone wouldn’t know how to trace through code manually.

The example that user posted was “x=0; y=0 While (x<7) { x=y-x; y=x+x; x= x+ y + 1;}”

First of all, let’s clean that up a little:

while (x < 7) {
    x = y - x;
    y = x + x;
    x = x + y + 1;

In the beginning x is 0, which is definitely less than 7 so we enter the loop.

Inside the loop, take each statement and swap out the variables for their current values and then do the math and put that value in the variable:

x = y – x becomes x = 0 – 0 becomes x = 0
y = x + x becomes y = 0 + 0 becomes y = 0
x = x + y + 1 becomes x = 0 + 0 + 1 becomes x = 1
At the end of the loop x = 1 and y = 0

Now we check the condition at the top of the loop again and x is still less than 7 so we go through the loop again

x = y – x becomes x = 0 – 1 becomes x = -1
y = x + x becomes y = -1 + -1 becomes y = -2
x = x + y + 1 becomes x = -1 + -2 + 1 becomes x = -2
At the end of the loop x = -2 and y = -2

At this point, I’m getting very suspicious this loop will never exit. I don’t think x is ever going to be greater than or equal to 7, but just to be sure let’s go through the code one more time

x is still less than 7 so we go through the loop again

x = y – x becomes x = -2 – -2 becomes x = 0
y = x + x becomes y = 0 + 0 becomes y = 0
x = x + y + 1 becomes x = 0 + 0 + 1 becomes x = 1

At the end of the loop x = 1 and y = 0. These are the same values we had after we went through the loop the first time, so now we can be completely sure the loop never ends. After each time through the loop x is either 1 or -2, there’s no way for it ever to be 7 or more.

That’s it. Working through code with a pencil and paper is just keeping track of what values all of your variables have and swapping out variables for their values. It’s really tedious, that’s why we make computers do it :) And to be clear, you’re never expected to be able to work through complicated code like that purely in your head. Everybody else needs a pencil and paper too, there are just too many details to keep them all in your working memory.

Okay, so if working through code with a pencil and paper sucks so much, why does anyone do it? Because it gives you a really solid understanding of what the computer is doing, which you’ll need later when you start building more complicated things. For a simple loop like the one above you can just throw that snippet of code into an IDE, run it, and see what happens, but you can’t even start building bigger things without a foundation of knowing what the computer is doing when it runs your code.

Repetition is underrated

a brown-skinned woman wearing a dark blue top and a white skirt plays an electronic keyboard
Loosely related image from to make this post look nicer in social media shares.

The more I read about the struggles beginner programmers have and the more I mentor at Ladies Learning Code workshops, the more I think that repetition is seriously underrated when it comes to teaching anything technical.

When you’re learning a language or an instrument or a sport or a physical skill like plumbing or carpentry or even math, which is similar to programming in a lot of ways, you expect to have to repeat the same thing multiple times before you remember all the details and get it right every time, but somehow when you’re learning to code we assume you should only need to be told how a for loop works once and you’ll understand it perfectly right away and remember it forever. That’s just not how our brains work and that’s why I think repetition is so underrated.

It is completely normal to need to write a lot of for loops before you get it right every time. Some people do pick up programming concepts very quickly but that doesn’t mean you’re dumb or abnormal if it takes you longer. Something like a for loop (or anything programming construct) seems simple once you understand it, but there are actually an enormous number of fiddly little details you have to keep track of to get your loop to work. You know what’s a great way to learn all of those details so thoroughly you hardly even think about them anymore? That’s right, repetition!

Programming isn’t just a matter of understanding the concepts, you’ve also got to learn them so thoroughly you don’t have to think about how to write a loop, you just write one when you need it. Without that thorough of an understanding, programming is like trying to speak another language by looking up each word individually in your [other language] to English dictionary. That’s one way to learn, but you’ll probably just be miserable and frustrated and forget what you were trying to say in the first place before you’re even halfway through your sentence.

Come to think of it, maybe the kind of drills you do when you’re learning a language could help people learn to program without spending so much time worrying that they’re just not smart enough.

Anyway, repetition is seriously underrated and I wish we collectively talked more about how much practice it takes just to pick up the basics of programming, let alone the complicated stuff like deciding how to structure your code or what to name things (still one of the hardest problems in programming :) ) I think a lot of programming tutorials assume that you know you need to practice without explicitly saying it or without making it clear just how much practice you need. I have to admit I wouldn’t want to write pages and pages of very similar programming exercises either, but it’s not so hard just to say that programming takes practice and you’ll probably need to go through the tutorial a few times before it sticks.

If you did a tutorial or a workshop just once and didn’t instantly become a good programmer, congratulations, you’re normal! If you keep at it, you’ll get there.

Let the computer do it for you

Loosely related image from to make this post look nicer in social media shares.

Many years ago when I was in college, a friend taught me something that’s been really useful for my whole career. That lesson was to let the computer do it for you. Unless there’s a compelling reason to do things the hard way, just let the computer make things easier for you.

Learning lisp? Get a plugin for your text editor or a simple IDE like Dr Racket that matches your parens for you.

Working with fixed width files? Get Record Editor, it’ll save you so much time!

Fighting with javascript? Get WebStorm! Seriously, it’s freaking amazing.

If you have to do something tedious and fiddly, get the computer to help you. That’s what it’s good at. There is a tool out there that will make your life easier, you just have to go and look for it. The hard part is remembering to go look for it :)

It’s normal to think that if there was a tool you could use your teacher or your boss or someone on your team would have told you about it, but thinking to look for tools like that is surprisingly unusual. And to be fair, it can be a real timesink to try tool after tool just to find out none of them are better than what you’re already using.

That said, if someone on your team hasn’t already told you that they’ve tried everything they could find (and searched for tools recently), it’s worth doing at least a little Googling and seeing if there’s already a tool that would make your life easier. Other people use lisp, you can safely assume there are tools out there for it. Other people use fixed width files, you can safely assume there are tools for those too. And tons of other people use javascript, tools for that are getting better all the time.

If there isn’t already a tool, you might be able to build one. That’s exactly what bash or powershell scripting is for – fiddly things that humans are bad at and tedious but simple things that humans just don’t wanna do. If you’ve ever run a build script, you’ve used a custom tool that someone wrote to automate building your application so it gets done the exact same way every single time. No matter what OS you’re running, there will be other scripts out on the internet that you can take apart to build your own custom tools.

Whether you find tools or build your own, remember that boring stuff is the computer’s job!

Rich Hickey – Hammock Driven Development

Somewhat related image from to make this post look nicer in social media shares.

A little while ago a friend of mine shared this list of Talks that changed the way I think about programming in one of the many Slacks I’m in and I finally started watching them.

Here’s my recap of one of the talks from that list, Hammock Driven Development by Rich Hickey. In it he describes his process for solving hard problems, which is obviously a huge part of what programmers do and is probably useful for just about anyone else too. There are a couple of technical examples in the talk but almost all of it should make sense to a non-programmer.

I put numbers on these steps to make this blog post easier to follow, but that doesn’t mean you have to follow this process in this exact order. The first couple of steps really do need to happen first, but after that you may want to iterate over the remaining steps a few times and possibly even go back to the beginning to make sure that you really do understand your problem in the context of all the new stuff you’ve learned while you were trying to solve it.

Step 1: state the problem. Say it out loud, talk with team member about it, write it down if nothing else.

Step 2: understand the problem. What do you know already (facts, context, constraints)? What don’t you know? For example, if you know you will need input data, where does it come from? if the primary source goes down, is there a secondary? Are there related problems? Hint: there are. Your problem is almost certainly not unique in the history of the world. And if you’re going to bother to figure all that stuff out, write it down!

Step 3: be discerning. Not everything is awesome! Look for problems in your proposed solutions and try to solve those too right away up front. Do you think there aren’t any tradeoffs in your proposed solution? Not likely, it’s much more likely you’re missing something. You should have questions – nobody knows everything! If you don’t have questions, you’re missing something. Write all this stuff down too.

Step 4: more input better output. You can’t connect things you don’t know about. Read in and around your space (not just your exact problem, also similar stuff so that you can let that simmer and let your brain make connections). Look (critically!) at other solutions too. Great solutions can come from tearing apart someone else’s idea.

Step 5: tradeoffs. Everybody says design is about tradeoffs, but you need to look at at least two possible solutions and figure out what’s good and bad about those things before you can really say you made a tradeoff. Write this down too! You may be seeing a theme here :)

Step 6: focus. This is where the hammock comes in. One of the great things about the hammock is that if you lie down with your eyes closed, no one knows you’re not sleeping and they won’t bother you because you might be sleeping so you can focus completely on your problem without interruptions. The computer is a prime source of distractions – you need to get away from the computer if you want to focus. You can’t do everything – if you really focus on something you’ll drop other balls like returning phone calls or emails, or remembering to run that errand or fix that bug you said would be really quick. Let your loved ones know what you’re doing and why you’re not going to be super responsive for a while (as an aside I really liked this point about treating people who matter to you as if they, you know, matter to you). An interesting detail Rich includes in this step is that you should close your eyes to really focus – at this point you’re not looking for any external input at all

Rich has a bunch of interesting stuff to say about what he calls your waking mind and your background mind (in the Leaning How to Learn course they call it focused and diffuse modes of attention, if you’ve taken that course this will all be pretty familiar). He uses his waking mind to assign tasks to his background mind and to analyze it’s products. Your background mind is where you solve most non-trivial problems but you can only feed it, not direct it. You need to spend enough time thinking about something for it to become an agenda item for your background mind to get anything useful from it.

Step 6 has two substeps, loading up your background mind and then letting it work. To load up your background mind you review everything you wrote down in the previous steps and really focus on it. Because your waking mind can only keep track of so many pieces of information at once (7 +- 2, as far as I know), you need to have written everything you know about your problem down so you can swap those pieces back into your working memory when you need them. To load all of this into your background mind you need to survey everything you’ve written down to give your background mind the chance to make connections between different things that your working mind didn’t realize could be connected. Rich stresses that you can’t do this at the computer, you need to go somewhere and have no external input. He recommends going somewhere you can close your eyes but not go to sleep so you can sort of switch your brain from external to internal input(all that information about your problem that you just reviewed).

The other substep is to leave the problem alone for a while and let your background mind do its thing. Sleep is a great way to do this, or going for a walk, or anything that gets your waking mind off of the problem. If you’re working on a hard problem you should sleep on it for at least one night, more if it’s a really hard problem. Just because you have an idea right away doesn’t mean you won’t have a better one in the morning. And once you have an idea, analyze it, don’t just run with it. Just because you slept on it doesn’t mean that solution that popped into your head in the morning is perfect.

Optional step 7: don’t stay stuck. If you have another problem to work on, do that when you feel stuck instead of just sitting there and staying stuck. If you don’t have another problem or really need to stay focused on your current one, gather more input, talk with more people if you can, and go through the whole process again.

A caveat to this whole process is that if you do have more than one problem you’re working on, you may spend hours loading up all the information about one problem and wake up the next morning with the answer to a different problem. That’s a thing you just have to accept, brains are funny sometimes. If you can’t switch to the other problem, at least write down the solution you came up with before going back into gathering information and loading it up for the problem you really want to solve.

Rich’s talk is really good and I highly recommend watching it yourself, but if you don’t have 40 minutes to dedicate to sitting in front of your computer then I hope this recap was helpful.

What does being an experienced developer actually mean?

Unrelated image from to make this post look nicer in social media shares.

I was poking around in /r/javahelp, saw a question about something I haven’t personally done, and found what I strongly suspect was the answer right away.  I try to explain how I found the answer when I answer questions like that because “oh it’s _____” isn’t really that helpful in the long term. I mean, what is the question asker supposed to do next time if I’m busy or sick or on vacation?

But how do you explain “well obviously that’s a_____ problem”? I can explain that I googled a thing and didn’t get very helpful results so I googled another thing and found what I needed, but not how I knew what to search for in the first place.

The closest I can come to explaining how I recognize problems is a term I recently learned from Katrina Owen‘s talk Cultivating Instinct. That term is perceptual learning, which according to Wikipedia is:

learning better perception skills such as differentiating two musical tones from one another or categorizations of spatial and temporal patterns relevant to real-world expertise as in reading, seeing relations among chess pieces, knowing whether or not an X-ray image shows a tumor.

After you’ve worked with enough code and seen enough different kinds of problems, you just kind of know what the answer is (or at least a few things it could be) when you see a new problem. And that, finally, is what being an experienced programmer really means. It’s not about some magical level of skill or about instantly knowing the perfect solution for every problem or even finally feeling like you know what you’re doing, it’s just about having seen enough problems that you end up with a library of them in your head.

I can’t tell you how long that will take but on the upside I can tell you it will definitely happen if you just keep coding. Okay, there are people with “ten years of experience” who really just have one year ten times, but if you care enough about programming to bother reading my blog, you’re not one of them.

Another part of being an experienced programmer is that once you feel confident about getting your code to do what you wanted, you sort of move up a level and start worrying about whether it’s a good idea to do it that way. That’s where people with one year of experience ten times really fall down, they can make things work but can’t make them maintainable. Getting to the point where you worry about the right way to do things takes a little more deliberate effort, but the most important step is just to care about whether it’s possible to change your code.

Once you’re past struggling to get your for loops to work and once you’re past feeling totally overwhelmed by starting a new project, it’s really tempting to think that you’re done, you’re a “real” programmer now and there’s nothing else you need to learn. But if you stop there, not only will you never be a very good programmer, but more importantly you’ll miss out on one of my favourite parts of programming, which is figuring out how to design things so you won’t curse your name in six months when you need to change them. For me that’s way more fun than Strings and Arrays.

All this is to say that if you’re a beginner programmer and you see a job posting looking for developers with x years of experience, all they really want is someone who can reliably get things to work (not necessarily immediately) and who knows a bad idea when they see one. If you can do that, go ahead and apply. It’s not unusual for job postings to be more of a wishlist than a list of what the job actually requires, anyway :)

Be a better programmer while still having a life: part 8

The big tip for post #8 in the be a better programmer while still having a life series is to become a witch. A Terry Pratchett style witch, to be precise. Terry Pratchett’s witch characters are really great at two things: first sight and second thoughts. To quote him directly:

First Sight and Second Thoughts, that’s what a witch had to rely on: First Sight to see what’s really there, and Second Thoughts to watch the First Thoughts to check that they were thinking right.

And no, I’m by no means the first person to connect Pratchett-style witchery with programming or design. Hint: go read that blog post, it’s really good.

Back at my original point, first sight is seeing what’s actually there, not what you wish was there or what you thought was there or what you meant to put there. Does that remind anyone else of debugging?

Fortunately for programmers, we have tools like debuggers and IDEs to help us see what’s actually there. We also have techniques like simply getting up and taking a walk, or explaining our problem to a rubber duck (or maybe another programmer if it’s a really hard problem), or commenting out half of our code and then half of that half and so on until we find the problem line. Let’s just not think about how much programming must have sucked in the days before friendly IDEs that highlight mistakes for you :)

Unrelated image from to make this post look nicer in social media shares.

Another part of first sight for programmers is also your attitude. If you don’t want to see the problem, you’re just not going to no matter how observant you are normally. I’m by no means perfect at it myself, but I’m convinced the most useful attitude you can bring to debugging is the simple acceptance that you got at least one thing wrong. The longer you spend insisting that your code should work, the longer it takes to figure out what’s actually wrong with it.

Moving on, second thoughts are thoughts about your thoughts. When you think you know the best way to build something, why do you think that? How do you know you’re right? Is that actually the best way or is it just the first way you thought of? How would you know either way? What constitutes the “best” way to do something? Is “best” the most performant, the easiest to read, the easiest to change, the quickest to write, the easiest to test? If “best” for your project meant quickest to write yesterday, does it still mean that today? How would you know when that changes?

Checking up on yourself like that is really hard to do and that’s why this post is more for me than for you – I’m trying to remind myself to question my assumptions.

One of the traps I fall into most often is looking for an example of what I want to do in our existing code and then assuming the first thing I find is the right way to do it. Shockingly enough, codebases change over time. Just because something worked well when it was written doesn’t mean nobody has thought of a better way since then or that the rest of the app hasn’t changed enough to make the old “right way” completely different from today’s “right way.” Just like you look for a couple of sources that agree with each other when you’re Googling what an error message means, look for a couple of examples in your codebase and if they’re different, check which one is newer.

Getting into the habit of thinking about how you think is not easy (at all!), but it’s useful and, like the other installments of this series, not something that you have to devote all of your free time to. It’s also useful in pretty much every area of your life. When you have any problem to solve, how do you know you’re right about how to solve it? For that matter, how do you know you’re right about what the problem is?

When I’m stressed out, every little thing drives me absolutely crazy. I can end up convinced that what’s bothering me is that this stupid freaking feature won’t work no matter what I do when the real problem is that I’m trying to hit a tight deadline and marketing keeps changing their minds about what’s important and half the QA team is sick so they need extra time to test everything and that means I need to deliver even sooner and everything is terrible!

Okay, so what do you do about that? For starters you really should read that blog post I linked earlier, Amy Hoy goes into a lot of detail about learning to notice yourself thinking. My big tip is just to get into the habit of asking yourself “Why? Why did I decide that? Why is that the best way? Why is that bothering me so much?” Sometimes the answer is going to be stupid simple: I decided to go to cafe at the front of my building for lunch because the weather was hideous and I didn’t want to go outside. Sometimes the answer will lead to more questions, like when you ask yourself “Why did I decide to put that config file in that directory?” In my case the answer was “Because that’s where the other config file lives” which leads to another question: “How do I know both config files should go in the same directory?” From there I learned all sorts of stuff about which files were supposed to go in the original directory and why, and where the other file that was related but not the same type of config ought to live.

This is the kind of thing that takes a lifetime to master, so don’t feel bad if you don’t get it right away. Asking yourself those questions is still worth it even if you only remember to do it sometimes.

The wall

Unrelated image from to make this post look nicer in social media shares.

I was talking with a friend who’s learning to code the other day, and the subject of the wall came up. Not the one that keeps the wildlings out, the one everybody slams into when they’re learning to code. Learning to code starts out great, there are so many tutorials that break things down really clearly, but once you’ve got a handle on the basics and want to move on to building your own projects, that’s where you hit the wall. It’s a huge leap from following tutorials where all the hard decisions are made for you to building your own projects with no one standing by to tell you where to start or what database to use or what your UI should look like.

The single thing I most want you all to know is that hitting the wall is normal. It happens to everyone. It absolutely does not mean that you’re dumb or not meant to be a programmer or that you’ll never get over the wall.

As a bit of an aside, I think the number of beginners who hit the wall and assume they’re just not smart enough to be programmers says more about how bad we collectively are at teaching programming than about the intelligence of anyone who hit the wall and walked away. I’m suspicious there’s a connection between how easy it is to write total beginner tutorials and how many of them there are, and how much harder it is to teach people to break down a problem and how few tutorials there are for that.

But anyway, I have some ideas for people who have hit the wall or who can see it in the distance and are getting worried.

One of the coolest things about programming is how many open source projects there are. Find one that you like and take it apart to see how it works. Search for text from the UI in the code and see if you can change it. Throw log messages all over the place to make the code show you what it’s doing. See if you can find some constants in there you can mess with. And don’t feel left out if you want to make games, those can be open source too.

Once you’ve found a project you like and have some idea how it works, see if you can change how it works. Let’s say you found a simple todo list app. Can you add due dates to your list items? Or subtasks? Could it play a sound and/or an animation to congratulate you when you check something off? Could you add a new feature like recurring tasks or email reminders? If you don’t have an open source app handy, just take a tutorial and mess with that.

Speaking of tutorials, if you do enough of them you’re going to start seeing similarities. Most apps have some sort of UI, some sort of data model, maybe a way to save that data for the next time you open the app (that’s pretty advanced, though, don’t worry about it right away), some logic about what users are allowed to do (like due dates can’t be in the past or players can’t have more than x hitpoints no matter how many health packs they use), maybe some communication with other APIs (but again, that’s advanced, don’t worry about it right away), and honestly, that’s pretty much it.

You can try mashing up different tutorials or open source projects too. Let’s say you have a tutorial for a driving game and one for a game where you run around and collect coins or stars or whatever. What if you could drive around and collect stars? What if you had tutorials for a todo list app and a weather app and mashed them up to make a little morning dashboard for yourself?

Don’t forget, you don’t have to do it all yourself. There are great communities like CodeNewbie, /r/learnprogramming, CodeRanch, (and lots more if you do a little Googling) full of people who will help you out.

Dev tool of the day

Exercism is a tool that lets you download and solve practice problems in over 30 different languages. I mentioned it in passing before, but let’s talk more about why it’s great.

First of all, each problem in Exercism has a set of unit tests, so you don’t have to wonder if you’re doing it right, you can just run the tests and know for sure. The tests are also great for experimenting with your code and seeing if you can make it easier to read or easier to change without breaking it.

The problems are also carefully chosen to help you learn concepts that are important to each language. Just because you can solve a problem in a certain language on a coding challenge site like HackerRank doesn’t mean you’re learning anything interesting about that language in particular. I know that’s not really what challenge sites are meant for, but I’ve seen them recommended to a lot of people who are learning and think it’s important to be clear about what challenge sites are good for (general programming concepts) and what they aren’t necessarily good for (learning individual languages).

And finally, Exercism directly incorporates both giving and receiving feedback. Obviously getting feedback is helpful – to directly quote the Exercism site: “You can write FORTRAN in any language, as the saying famously goes, but with enough feedback, you’ll quickly find yourself writing the language the way it wants to be written.” – but giving feedback is seriously underrated. To tell someone what you think of their code, you have to read it carefully and then think seriously about what makes code good or bad. That’s enormously helpful when you’re new and don’t really understand what “good code” means yet, or when you’re new to a language and just don’t know the best way to do things in that particular language.

Give Exercism a try!

Be a better programmer while still having a life: part 7

Unrelated image from to make this post look nicer in social media shares.

Back in part 1 I talked about how important it is to make sure you understand the problem you’re trying to solve. Today I want to expand on that because there’s much more to problem solving. Having a great understanding of the problem you’re trying to solve is great, but it’s not always enough. Sometimes you’re wrong about what the problem actually is. No matter how well you understand the problem you think you have, it’s not going to do you much good if you’re trying to solve the wrong problem.

Telling people to make sure they’re solving the right problem is all well and good, but an actual example always makes things a lot clearer. Conveniently enough, I saw a great example of this problem on the other day. To summarize the question quickly in case it disappears someday, the questioner wants to know if there are any alternatives to doing code reviews because not everyone likes doing code reviews. To quote part of the question:

Are there any alternative processes that could replace the code review for the goal of improving the code quality? Would it be possible to have something else instead of this process? While review may be required where software bugs kill humans, could some weaker method be sufficient where the situation is far from that critical?

An edit clarified that the reason the question asker is looking for an alternative to code reviews is because people in their organization use them to play power games and prevent other team members from contributing to the project. At this point you may be developing a theory about why I think “what can we do instead of code reviews?” is the wrong question :)

This particular question did happen to contain a great clue – there’s really no substitute for reviewing your code if you want to improve it. That’s kind of like saying you want to be a better writer but you don’t want anyone to proofread your work. When your solution goes directly against your stated goal, there’s almost certainly a deeper problem. Sometimes that problem is fixable and sometimes it’s not, but there’s definitely something there you need to look into.

Given that the reason the question asker wants to find an alternative to code reviews is because team members are using them to jerk their colleagues around, I don’t think it’s too much of a leap to the conclusion that the real problem is that people are being jerks and playing power games when they’re supposed to be working as a team and that trying to avoid code reviews is just a workaround for a serious culture problem.

To be clear I don’t blame the question asker for trying to solve the wrong problem. I’m assuming they aren’t a manager and/or don’t have the authority to tell the power game players to knock that shit off and start acting like grownups, so finding some way to avoid code reviews without completely ignoring code quality is about all they can do to work around the real problem. But if you’re going to do that, and sometimes finding a workaround/bandaid solution is the only thing you actually can do, I still think it’s important to be honest with yourself that what you’re doing is putting a bandaid on the real problem. If you forget that, you’re going to get a nasty surprise later when it turns out the real problem has popped up again in a different form.

To keep harping on the code review example, just because you’ve removed one avenue for for jerks to play power games doesn’t mean everyone is going to start playing nice. If you do retrospectives or post-mortems of any sort, jerks are going to use those to throw their colleagues under the bus and/or to take credit for their work. Whatever system you use to assign work, jerks will try to abuse it to keep the interesting/fun/easy/politically valuable tasks for themselves and leave the dregs for someone else. And no matter what you try to do to control bad behaviour in your development process, you can’t prevent someone malicious from going to lunch with their dev manager buddy and complaining that that one feature sales keeps pushing for has to be postponed again because so-and-so just isn’t contributing anything (of course they’ll leave out the fact that the malicious dev won’t approve any of their pull requests), they’re such a drain on the team.

This particular problem is especially difficult to actually solve because the real solution is for management to do their jobs and enforce consequences for sabotaging team mates and otherwise refusing to act like a professional. Making anyone, especially someone who outranks you, do their job is never an easy task, so I completely sympathize with the urge to “fix” the symptom rather than the root cause. Some problems are simply above your pay grade, others may be so complicated or expensive to fix that it’s better for the business to keep working around them.

In other cases, fixing the root cause of the problem actually is doable and cheaper or more efficient than keeping a clumsy workaround. Even then, you can’t fix the root problem without knowing what it is, so keep asking why until you get down to a bedrock answer like “Because that’s how this company makes money.”


How does a hash function work anyway?

A while ago I wrote about how hash maps work, but something’s been bugging me. How does the hash function do its thing? I know hash functions make variable length data into fixed length data but how do they do that? To be clear I’m interested in the kind of hash you would use for a hash map, you would definitely want a more secure hash to keep your passwords safe.

Thanks to the magic of the internets, it’s really easy to find the function java uses to calculate a String’s hashcode.

/* Returns a hash code for this string. The hash code for a String 
object is computed as s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
using int arithmetic, where s[i] is the ith character of the string, 
n is the length of the string, and ^ indicates exponentiation. 
(The hash value of the empty string is zero.)
Returns: a hash code value for this object. */

public int hashCode() {
  int h = hash;
  if (h == 0) {
    int off = offset;
    char val[] = value;
    int len = count;

    for (int i = 0; i < len; i++) {
      h = 31*h + val[off++];
    hash = h;
  return h;

Okay great, that’s totally clear, right? ;)

Yeah, I have no idea what it’s actually doing either. But I can find out!

First of all, where are the values of hash, offset, and count coming from? They must be instance variables because they weren’t passed in as parameters. I poked around in the String code a little more and it turns out hash is defaulted to 0 when it’s declared, offset is set to 0 in the constructor, and count is set to the size of the string when it’s created.

Unrelated image from to make this post look nicer in social media shares.

The first thing hashCode actually does is checks if hash is 0. If it’s not, then we know we already computed the hash and we can just return it and go on with our day. Makes sense, why do the same calculation over and over again when we can just do it once and store the result? I think that’s the same reason count is stored separately instead of just calling value.length() when you need it. We know the length will never change because Strings are immutable, so why not save ourselves a lookup?

The next weird thing is how the method is adding a number to a char. Chars are characters, not numbers, aren’t they? Well, yes and no. According to the docs, a char is “a single 16-bit Unicode character. It has a minimum value of '\u0000' (or 0) and a maximum value of '\uffff' (or 65,535 inclusive).” That 0 to 65,535 part seems suspiciously like a number :) You can also test that out yourself in the Java REPL. It turns out Java will happily treat a char like an int if you ask it to.

The rest of it is pretty simple, we’re just looping through every character in the string and adding (31 * current hashcode) + current character to the existing hashcode.

Okay, but how does that map a string of any length to a hash code of fixed length? Shouldn’t a longer String always have a larger hash code? Not if your hashcode is an integer! Those just roll over into negative numbers if you add too large of a number to them. And because 2 and -821785444 are both integers they take up the same amount of memory, which means that no matter what size String you start with, the hashcode is always the same size.

Another interesting little detail of how hashmaps actually use those hashcodes is that they rehash your hashes. If everyone used random Strings for keys then they wouldn’t need to, but because keys are usually Strings with some kind of meaning, that means the hashes for those keys won’t be evenly distributed. That is, a hashcode doesn’t have an equal chance of being any number from -231 to 231-1, you’re going to get clumps of hashes around some numbers because you’re more likely to use some Strings than others.

Great, but why does that matter? Performance! The more collisions you have (different Strings that happened to work out to the same hashCode), the more elements you need to look at to find the one you wanted and the worse your performance is. To get around that, java does some bitwise operations on the hashcode to reduce the number of collisions.

Now we all have some idea what actually happens when you use a HashMap :)