Mel Reams

Nerrrrd

Be a better programmer while still having a life: part 8

The big tip for post #8 in the be a better programmer while still having a life series is to become a witch. A Terry Pratchett style witch, to be precise. Terry Pratchett’s witch characters are really great at two things: first sight and second thoughts. To quote him directly:

First Sight and Second Thoughts, that’s what a witch had to rely on: First Sight to see what’s really there, and Second Thoughts to watch the First Thoughts to check that they were thinking right.

And no, I’m by no means the first person to connect Pratchett-style witchery with programming or design. Hint: go read that blog post, it’s really good.

Back to my original point, first sight is seeing what’s actually there, not what you wish was there or what you thought was there or what you meant to put there. Does that remind anyone else of debugging?

Fortunately for programmers, we have tools like debuggers and IDEs to help us see what’s actually there. We also have techniques like simply getting up and taking a walk, or explaining our problem to a rubber duck (or maybe another programmer if it’s a really hard problem), or commenting out half of our code and then half of that half and so on until we find the problem line. Let’s just not think about how much programming must have sucked in the days before friendly IDEs that highlight mistakes for you :)


Another part of first sight for programmers is your attitude. If you don’t want to see the problem, you’re just not going to, no matter how observant you normally are. I’m by no means perfect at it myself, but I’m convinced the most useful attitude you can bring to debugging is the simple acceptance that you got at least one thing wrong. The longer you spend insisting that your code should work, the longer it takes to figure out what’s actually wrong with it.

Moving on, second thoughts are thoughts about your thoughts. When you think you know the best way to build something, why do you think that? How do you know you’re right? Is that actually the best way or is it just the first way you thought of? How would you know either way? What constitutes the “best” way to do something? Is “best” the most performant, the easiest to read, the easiest to change, the quickest to write, the easiest to test? If “best” for your project meant quickest to write yesterday, does it still mean that today? How would you know when that changes?

Checking up on yourself like that is really hard to do and that’s why this post is more for me than for you – I’m trying to remind myself to question my assumptions.

One of the traps I fall into most often is looking for an example of what I want to do in our existing code and then assuming the first thing I find is the right way to do it. Shockingly enough, codebases change over time. Just because something worked well when it was written doesn’t mean nobody has thought of a better way since then or that the rest of the app hasn’t changed enough to make the old “right way” completely different from today’s “right way.” Just like you look for a couple of sources that agree with each other when you’re Googling what an error message means, look for a couple of examples in your codebase and if they’re different, check which one is newer.

Getting into the habit of thinking about how you think is not easy (at all!), but it’s useful and, like the other installments of this series, not something that you have to devote all of your free time to. It’s also useful in pretty much every area of your life. When you have any problem to solve, how do you know you’re right about how to solve it? For that matter, how do you know you’re right about what the problem is?

When I’m stressed out, every little thing drives me absolutely crazy. I can end up convinced that what’s bothering me is that this stupid freaking feature won’t work no matter what I do when the real problem is that I’m trying to hit a tight deadline and marketing keeps changing their minds about what’s important and half the QA team is sick so they need extra time to test everything and that means I need to deliver even sooner and everything is terrible!

Okay, so what do you do about that? For starters you really should read that blog post I linked earlier; Amy Hoy goes into a lot of detail about learning to notice yourself thinking. My big tip is just to get into the habit of asking yourself “Why? Why did I decide that? Why is that the best way? Why is that bothering me so much?” Sometimes the answer is going to be stupid simple: I decided to go to the cafe at the front of my building for lunch because the weather was hideous and I didn’t want to go outside. Sometimes the answer will lead to more questions, like when you ask yourself “Why did I decide to put that config file in that directory?” In my case the answer was “Because that’s where the other config file lives” which leads to another question: “How do I know both config files should go in the same directory?” From there I learned all sorts of stuff about which files were supposed to go in the original directory and why, and where the other file that was related but not the same type of config ought to live.

This is the kind of thing that takes a lifetime to master, so don’t feel bad if you don’t get it right away. Asking yourself those questions is still worth it even if you only remember to do it sometimes.

Quick guide to Chef + OpsWorks for Java devs who have other things to do

First, a caveat: I learned this on EC2 instances and make no promises that it will work in any other setup. That said:

  • The recipe runlist is not anywhere in your custom cookbooks, it’s in the layer settings for each layer in your stack.
  • If you add a new cookbook, recipes absolutely have to go in the recipes folder under the [cookbookname] folder.
  • Data that your recipe expects, such as node['datadog']['jmx']['instances'], must exist in an attributes file or your stack settings or something, but doesn’t necessarily have to have a value. The structure just needs to exist with the expected names.
  • Cookbooks have metadata (metadata.rb) where you define the version number of your cookbook and which other cookbooks it depends on. You can optionally list the recipes inside the cookbook but you don’t have to.
  • If you update your cookbook you can update that cookbook on your instances without redeploying them by running the Update Custom Cookbooks command from your stack (the run command button is beside the stack settings button on the “Stack” page for your stack).
  • Once you’ve updated your cookbook you can run your updated recipes almost the same way, just select “Execute Recipes” instead of “Update Custom Cookbooks” in the dropdown and put a list of recipes to execute in the field below the dropdown.

 

The wall


I was talking with a friend who’s learning to code the other day, and the subject of the wall came up. Not the one that keeps the wildlings out, the one everybody slams into when they’re learning to code. Learning to code starts out great, there are so many tutorials that break things down really clearly, but once you’ve got a handle on the basics and want to move on to building your own projects, that’s where you hit the wall. It’s a huge leap from following tutorials where all the hard decisions are made for you to building your own projects with no one standing by to tell you where to start or what database to use or what your UI should look like.

The single thing I most want you all to know is that hitting the wall is normal. It happens to everyone. It absolutely does not mean that you’re dumb or not meant to be a programmer or that you’ll never get over the wall.

As a bit of an aside, I think the number of beginners who hit the wall and assume they’re just not smart enough to be programmers says more about how bad we collectively are at teaching programming than about the intelligence of anyone who hit the wall and walked away. I’m suspicious there’s a connection between how easy it is to write total beginner tutorials and how many of them there are, and how much harder it is to teach people to break down a problem and how few tutorials there are for that.

But anyway, I have some ideas for people who have hit the wall or who can see it in the distance and are getting worried.

One of the coolest things about programming is how many open source projects there are. Find one that you like and take it apart to see how it works. Search for text from the UI in the code and see if you can change it. Throw log messages all over the place to make the code show you what it’s doing. See if you can find some constants in there you can mess with. And don’t feel left out if you want to make games, those can be open source too.

Once you’ve found a project you like and have some idea how it works, see if you can change how it works. Let’s say you found a simple todo list app. Can you add due dates to your list items? Or subtasks? Could it play a sound and/or an animation to congratulate you when you check something off? Could you add a new feature like recurring tasks or email reminders? If you don’t have an open source app handy, just take a tutorial and mess with that.

Speaking of tutorials, if you do enough of them you’re going to start seeing similarities. Most apps have some sort of UI, some sort of data model, maybe a way to save that data for the next time you open the app (that’s pretty advanced, though, don’t worry about it right away), some logic about what users are allowed to do (like due dates can’t be in the past or players can’t have more than x hitpoints no matter how many health packs they use), maybe some communication with other APIs (but again, that’s advanced, don’t worry about it right away), and honestly, that’s pretty much it.

You can try mashing up different tutorials or open source projects too. Let’s say you have a tutorial for a driving game and one for a game where you run around and collect coins or stars or whatever. What if you could drive around and collect stars? What if you had tutorials for a todo list app and a weather app and mashed them up to make a little morning dashboard for yourself?

Don’t forget, you don’t have to do it all yourself. There are great communities like CodeNewbie, /r/learnprogramming, CodeRanch, (and lots more if you do a little Googling) full of people who will help you out.

Linux tip of the day

If you want to see when you ran a particular command on a Linux machine, you just need to run

HISTTIMEFORMAT="%d/%m/%y %T "

at the command prompt, then the next time you run

history

you'll have handy timestamps! Thanks as usual to Stack Overflow for that answer.

So that's cool and all, but why should you care? Because being systematic is extremely important when you're trying to solve a problem. If you don't know exactly when you changed something, you'll have a rough time figuring out which results were caused by which change. If you don't know which change caused which results, you're effectively stumbling around in a dark room at random, hoping you run into a light switch. Not only is that frustrating, but it's a huge waste of time. Systematically changing one thing, checking on the results, then changing one more thing and checking the results again can seem slow, but in the long run it's much faster than stumbling around and hoping.

Using myself as an example, knowing exactly when I ran a particular command let me compare what I had done with what was showing up in the logs and showed me that it was most likely a combination of two commands I had run rather than a weird delay after the first command that gave me the results I wanted. If I hadn't been able to figure out exactly when I ran each command I would still have no idea which one of them helped.

Dev tool of the day

Exercism is a tool that lets you download and solve practice problems in over 30 different languages. I mentioned it in passing before, but let’s talk more about why it’s great.

First of all, each problem in Exercism has a set of unit tests, so you don’t have to wonder if you’re doing it right, you can just run the tests and know for sure. The tests are also great for experimenting with your code and seeing if you can make it easier to read or easier to change without breaking it.

The problems are also carefully chosen to help you learn concepts that are important to each language. Just because you can solve a problem in a certain language on a coding challenge site like HackerRank doesn’t mean you’re learning anything interesting about that language in particular. I know that’s not really what challenge sites are meant for, but I’ve seen them recommended to a lot of people who are learning and think it’s important to be clear about what challenge sites are good for (general programming concepts) and what they aren’t necessarily good for (learning individual languages).

And finally, Exercism directly incorporates both giving and receiving feedback. Obviously getting feedback is helpful – to directly quote the Exercism site: “You can write FORTRAN in any language, as the saying famously goes, but with enough feedback, you’ll quickly find yourself writing the language the way it wants to be written.” – but giving feedback is seriously underrated. To tell someone what you think of their code, you have to read it carefully and then think seriously about what makes code good or bad. That’s enormously helpful when you’re new and don’t really understand what “good code” means yet, or when you’re new to a language and just don’t know the best way to do things in that particular language.

Give Exercism a try!

Be a better programmer while still having a life: part 8


Testing! Getting better at testing will make you seem like a better programmer even if your coding style doesn’t change at all. No matter how beautiful and clear your code is, if it’s full of bugs it’s not good code.

It’s kind of ironic that I’m writing a post about testing because honestly I’m not very good at it. Better than I used to be, especially since I started working at a company that actually has unit tests and insists they all pass before you push anything to production, but testing is still not one of my strengths. You don’t have to be amazing for it to be worth doing more testing, though. Some improvement is always better than none.

The thing most programmers, including me, seem to struggle with the most is not being able to think of anything but the happy path through our code. It’s like how when you’re trying to proofread your own writing you see what you meant, not what’s actually there in terms of typos and missing or repeated words. We test the way we meant our code to work instead of thinking of how it could break, and then we decide testing our own code is a waste of time because QA always finds more bugs anyway.

One of the best strategies I’ve found for avoiding getting stuck in the happy path is to plan out how you would test your code before you write it. You can’t get stuck only testing the way you meant your code to work if you haven’t written it yet :) You do need to have some idea how the feature as a whole is going to work, but if you don’t have that then you shouldn’t be worrying about testing anyway. Figure out what it’s supposed to do and then you can think about testing.

This works even if you’ve gotten as far as defining interfaces. Just take a minute and jot down some notes about what values could possibly get passed into those interfaces. Not what should be passed in, not what would ever be passed in by a reasonable human being who doesn’t personally hate you, but everything that the language itself would ever allow. This blog post On Testing is an extension of a joke tweet but is actually a great place to start if you’re not sure what sort of input you should be testing. And if you work in Java like me, make sure you handle nulls. Just because that parameter should never ever ever be null doesn’t mean you don’t have some messed up data somewhere in your system that will produce a nothing where there should be a something. And don’t forget to test with bad data so you can make sure errors are displayed when they should be and are spelled correctly.
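To make that concrete, here’s a minimal sketch of what those jotted-down nasty inputs can look like once they turn into checks in plain Java. The GreetingFormatter class, its greet method, and the error message are all made up for illustration. The point is just that null, empty, and whitespace-only strings each get tried on purpose, and that the error message itself gets checked too.

```java
import java.util.Arrays;
import java.util.List;

class GreetingFormatter {
    // Hypothetical method under test: builds a greeting from a user-supplied name.
    static String greet(String name) {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("Name is required");
        }
        return "Hello, " + name.trim() + "!";
    }

    // Returns true if greet() rejects the input with the expected error message.
    static boolean rejects(String input) {
        try {
            greet(input);
            return false;
        } catch (IllegalArgumentException e) {
            return "Name is required".equals(e.getMessage());
        }
    }

    public static void main(String[] args) {
        // Inputs nobody should send, but that the language happily allows.
        List<String> badInputs = Arrays.asList(null, "", "   ", "\t\n");
        for (String input : badInputs) {
            System.out.println("rejected: " + rejects(input)); // rejected: true
        }
        // And one happy-path case, with stray whitespace for good measure.
        System.out.println(greet("  Ada  ")); // Hello, Ada!
    }
}
```

If your company uses a unit testing framework, each of those inputs would become its own test case instead of a line in main.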

If you do unit tests at your company, testing thoroughly is a lot easier, but even if you don’t you can still manually test at least a few different cases. It’s just embarrassing when you go to demo your new feature to someone and it immediately blows up.

Another part of testing, and for me the hardest part, is testing for my changes affecting existing code in non-obvious ways. It’s really easy to fix “surprise that parameter can be null” bugs and much harder to figure out why on earth adding a new feature would break an existing one that didn’t seem to be related. On the upside for developers like me, the entire reason regression testing exists is to catch bugs like this. Unfortunately, full regression tests aren’t feasible at every company for every release.

All I can really recommend to prevent the weird bugs is to isolate functionality as much as you can, which is good coding practice anyway. That is, if your app formats emails, all of the email formatting code should be in the same class if possible or the same package if you need more than one class. The less different features interact with each other, the less chance you have of those features getting in a fight :)

To bring this back to becoming a better developer, the fewer times QA (or god forbid, your customers) have to kick back a feature because it has bugs, the better a developer you are. Taking the time up front to make sure your code works is absolutely worth it for the time savings later and the increase in quality. Even if you deliver more slowly than programmers who do less testing, QA/your project manager/your team lead will notice whose features zip right through QA and whose get sent back over and over. And if they don’t, you should remind them repeatedly :) Going to QA first means nothing when it takes try after try after try to get approved.

One of the best things about testing thoroughly is that it’s a work thing you can do at work that doesn’t affect your personal time. It’ll even save you time in the long run!

Be a better programmer while still having a life: part 7


Back in part 1 I talked about how important it is to make sure you understand the problem you’re trying to solve. Today I want to expand on that because there’s much more to problem solving. Having a great understanding of the problem you’re trying to solve is great, but it’s not always enough. Sometimes you’re wrong about what the problem actually is. No matter how well you understand the problem you think you have, it’s not going to do you much good if you’re trying to solve the wrong problem.

Telling people to make sure they’re solving the right problem is all well and good, but an actual example always makes things a lot clearer. Conveniently enough, I saw a great example of this problem on workplace.stackexchange.com the other day. To summarize the question quickly in case it disappears someday, the questioner wants to know if there are any alternatives to doing code reviews because not everyone likes doing code reviews. To quote part of the question:

Are there any alternative processes that could replace the code review for the goal of improving the code quality? Would it be possible to have something else instead of this process? While review may be required where software bugs kill humans, could some weaker method be sufficient where the situation is far from that critical?

An edit clarified that the reason the question asker is looking for an alternative to code reviews is because people in their organization use them to play power games and prevent other team members from contributing to the project. At this point you may be developing a theory about why I think “what can we do instead of code reviews?” is the wrong question :)

This particular question did happen to contain a great clue – there’s really no substitute for reviewing your code if you want to improve it. That’s kind of like saying you want to be a better writer but you don’t want anyone to proofread your work. When your solution goes directly against your stated goal, there’s almost certainly a deeper problem. Sometimes that problem is fixable and sometimes it’s not, but there’s definitely something there you need to look into.

Given that the reason the question asker wants to find an alternative to code reviews is because team members are using them to jerk their colleagues around, I don’t think it’s too much of a leap to the conclusion that the real problem is that people are being jerks and playing power games when they’re supposed to be working as a team and that trying to avoid code reviews is just a workaround for a serious culture problem.

To be clear I don’t blame the question asker for trying to solve the wrong problem. I’m assuming they aren’t a manager and/or don’t have the authority to tell the power game players to knock that shit off and start acting like grownups, so finding some way to avoid code reviews without completely ignoring code quality is about all they can do to work around the real problem. But if you’re going to do that, and sometimes finding a workaround/bandaid solution is the only thing you actually can do, I still think it’s important to be honest with yourself that what you’re doing is putting a bandaid on the real problem. If you forget that, you’re going to get a nasty surprise later when it turns out the real problem has popped up again in a different form.

To keep harping on the code review example, just because you’ve removed one avenue for jerks to play power games doesn’t mean everyone is going to start playing nice. If you do retrospectives or post-mortems of any sort, jerks are going to use those to throw their colleagues under the bus and/or to take credit for their work. Whatever system you use to assign work, jerks will try to abuse it to keep the interesting/fun/easy/politically valuable tasks for themselves and leave the dregs for someone else. And no matter what you try to do to control bad behaviour in your development process, you can’t prevent someone malicious from going to lunch with their dev manager buddy and complaining that that one feature sales keeps pushing for has to be postponed again because so-and-so just isn’t contributing anything and is such a drain on the team (of course they’ll leave out the fact that they themselves won’t approve any of so-and-so’s pull requests).

This particular problem is especially difficult to actually solve because the real solution is for management to do their jobs and enforce consequences for sabotaging team mates and otherwise refusing to act like a professional. Making anyone, especially someone who outranks you, do their job is never an easy task, so I completely sympathize with the urge to “fix” the symptom rather than the root cause. Some problems are simply above your pay grade, others may be so complicated or expensive to fix that it’s better for the business to keep working around them.

In other cases, fixing the root cause of the problem actually is doable and cheaper or more efficient than keeping a clumsy workaround. Even then, you can’t fix the root problem without knowing what it is, so keep asking why until you get down to a bedrock answer like “Because that’s how this company makes money.”

 

Link of the day

Julia Evans has a great guide to asking good questions, you should read it :) Asking good questions is such a useful skill, I wish programming education spent more time on it.

Getting really good at asking questions is also a great hack for looking like a better developer (it’ll also help you actually become better, but in the short term it’s a good hack). When you ask a bad question, like “My code isn’t working, can you help me?” people have to wonder what you’ve tried already or if you tried at all before giving up and asking someone else. If you ask the exact same question with more detail, especially about what you tried already, like “I’m trying to send an email but I’m getting an error message I don’t understand. I tried googling it but I got a bunch of different answers and I don’t know which one applies to my problem. Can you help me sort through them?” then it’s obvious that not only did you not immediately give up, but you also respect your answerer’s time enough to make it as easy as possible for them to help you. Telling them what you’ve tried already means they can skip suggesting things you already did, and asking a specific question means they don’t have to do the work of figuring out what the question actually is before they can even start thinking about how to answer it.

People love it when you make things easier for them, and when you show them you’ve put some effort into doing so, they’ll think you’re a better programmer than the person who makes getting the real question out of them like pulling teeth even if both of you are around the same skill level. That’s the hack part :) The becoming a better programmer part is that stating a question really clearly (yay rubber ducks!) and listing everything you’ve tried already may trigger that flash of insight about what you haven’t tried but should or what assumption you made that could be wrong. If you get into the habit of reflecting on what you’re doing, you’ll learn a lot faster than someone who sits around and waits for help.

How does a hash function work anyway?

A while ago I wrote about how hash maps work, but something’s been bugging me. How does the hash function do its thing? I know hash functions make variable length data into fixed length data but how do they do that? To be clear I’m interested in the kind of hash you would use for a hash map, you would definitely want a more secure hash to keep your passwords safe.

Thanks to the magic of the internets, it’s really easy to find the function Java uses to calculate a String’s hashcode.

/* Returns a hash code for this string. The hash code for a String 
object is computed as s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
using int arithmetic, where s[i] is the ith character of the string, 
n is the length of the string, and ^ indicates exponentiation. 
(The hash value of the empty string is zero.)
Returns: a hash code value for this object. */

public int hashCode() {
  int h = hash;
  if (h == 0) {
    int off = offset;
    char val[] = value;
    int len = count;

    for (int i = 0; i < len; i++) {
      h = 31*h + val[off++];
    }
    hash = h;
  }
  return h;
}

Okay great, that’s totally clear, right? ;)

Yeah, I have no idea what it’s actually doing either. But I can find out!

First of all, where are the values of hash, offset, and count coming from? They must be instance variables because they weren’t passed in as parameters. I poked around in the String code a little more and it turns out hash is defaulted to 0 when it’s declared, offset is set to 0 in the constructor, and count is set to the size of the string when it’s created.


The first thing hashCode actually does is check if hash is 0. If it’s not, then we know we already computed the hash and we can just return it and go on with our day. Makes sense, why do the same calculation over and over again when we can just do it once and store the result? I think that’s the same reason count is stored separately instead of just reading value.length when you need it. We know the length will never change because Strings are immutable, so why not save ourselves a lookup?

The next weird thing is how the method is adding a number to a char. Chars are characters, not numbers, aren’t they? Well, yes and no. According to the docs, a char is “a single 16-bit Unicode character. It has a minimum value of '\u0000' (or 0) and a maximum value of '\uffff' (or 65,535 inclusive).” That 0 to 65,535 part seems suspiciously like a number :) You can also test that out yourself in the Java REPL. It turns out Java will happily treat a char like an int if you ask it to.
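If you don’t have a REPL handy, a tiny program shows the same thing. This is just a sketch, and the CharAsInt class and codeOf method are names made up for the example.

```java
class CharAsInt {
    // Assigning a char to an int widens it to its numeric (UTF-16 code unit) value.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('a'));               // 97
        System.out.println('a' + 1);                   // 98, the char is promoted to int
        System.out.println((char) ('a' + 1));          // b, cast back to a char
        System.out.println((int) Character.MAX_VALUE); // 65535
    }
}
```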

The rest of it is pretty simple: we’re just looping through every character in the string and, for each one, replacing the hashcode with (31 * current hashcode) + current character.

Okay, but how does that map a string of any length to a hash code of fixed length? Shouldn’t a longer String always have a larger hash code? Not if your hashcode is an integer! Those just roll over into negative numbers if you add too large of a number to them. And because 2 and -821785444 are both integers they take up the same amount of memory, which means that no matter what size String you start with, the hashcode is always the same size.
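Here’s a small sketch that redoes the loop by hand, without the caching or the offset and count fields, so you can watch it agree with the built-in hashCode and see an int roll over. StringHashSketch and its hash method are hypothetical names for this example, not anything from the JDK.

```java
class StringHashSketch {
    // Same arithmetic as the loop in String.hashCode: h = 31 * h + next character.
    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(hash("ab"));                    // 3105, which is 31 * 97 + 98
        System.out.println(hash("ab") == "ab".hashCode()); // true
        System.out.println(hash(""));                      // 0, the loop never runs

        // int arithmetic wraps around instead of growing without bound.
        System.out.println(Integer.MAX_VALUE + 1);         // -2147483648

        // However long the String is, the result is still a single 32-bit int.
        String longer = "Be a better programmer while still having a life";
        System.out.println(hash(longer) == longer.hashCode()); // true
    }
}
```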

Another interesting little detail of how hashmaps actually use those hashcodes is that they rehash your hashes. If everyone used random Strings for keys then they wouldn’t need to, but because keys are usually Strings with some kind of meaning, that means the hashes for those keys won’t be evenly distributed. That is, a hashcode doesn’t have an equal chance of being any number from -2^31 to 2^31-1, you’re going to get clumps of hashes around some numbers because you’re more likely to use some Strings than others.

Great, but why does that matter? Performance! The more collisions you have (different Strings that happened to work out to the same hashCode), the more elements you need to look at to find the one you wanted and the worse your performance is. To get around that, Java does some bitwise operations on the hashcode to reduce the number of collisions.
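As a rough illustration, here’s a sketch modeled on the spreading step in OpenJDK 8’s HashMap, which XORs the top 16 bits of the hashcode into the bottom 16 before picking a bucket. Older Java versions used a different mix of shifts, so treat this as one example of the idea rather than exactly what your JDK does. HashSpreadSketch, spread, and bucketIndex are names made up for this example.

```java
class HashSpreadSketch {
    // Fold the high 16 bits into the low 16 bits so they influence the bucket choice.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Pick a bucket by masking off the low bits; assumes tableSize is a power of two.
    static int bucketIndex(int hash, int tableSize) {
        return (tableSize - 1) & hash;
    }

    public static void main(String[] args) {
        int first = 0x10000;  // these two hashcodes differ only in their high bits
        int second = 0x20000;

        // Without spreading, both land in bucket 0 of a 16-bucket table: a collision.
        System.out.println(bucketIndex(first, 16));          // 0
        System.out.println(bucketIndex(second, 16));         // 0

        // With spreading, the high bits push them into different buckets.
        System.out.println(bucketIndex(spread(first), 16));  // 1
        System.out.println(bucketIndex(spread(second), 16)); // 2
    }
}
```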

Now we all have some idea what actually happens when you use a HashMap :)
