melreams.com

Nerrrrd

Programming is actually a creative field

There’s this stereotype that programming isn’t a creative field, that programmers do nothing but mechanically assemble code all day. I find that really sad, I think if we did a better job of explaining how creative programming actually is many more people would be willing to give it a shot.

If you’ve only ever had a total beginner intro to programming, it might be really hard to see where the creativity comes in. Honestly, variables are pretty boring and conditionals aren’t much better. Programming gets a lot more interesting once you’ve mastered the basics, I swear :)

Programming is a bit like building things with lego blocks, except that you have to make all the blocks yourself. Building the blocks – that is, writing an if statement or creating a variable – isn’t that interesting, I’ll be honest. But once you get good at creating those blocks, that’s when you get to be creative. Just like you can build a castle or a siege engine or an entire lego Westeros out of simple little blocks, you can build amazing things out of variables and loops and conditionals if you’re patient.

It’s also a bit like pottery or wood working. Just because you’re constrained to making a usable cup or chair doesn’t mean you can’t be incredibly creative within those constraints. Even in visual art or writing you need sentences that make sense and a combination of shapes and colours that work. Jokes about modern art aside, you can’t just throw paint in the direction of the canvas and expect anyone to care what you’ve done.

There are always constraints on anything you make, programming just has particularly rigid ones. An imperfect sentence is still intelligible, but a missing semicolon will keep your code from compiling at all. For some people that’s more frustration than they care to deal with, for others it’s an interesting challenge.

Building your own project that does anything you want it to is pretty obviously creative, but what about programming at work where you’re given assignments?

Even the least interesting internal application still requires creative problem solving to add new features or fix bugs. There have been times when I’ve had to be very creative to change an application in a way that meets the new requirements without breaking anything that used to work and without making a horrifying mess of the code. There are often quick and dirty ways to make a change that just leave you with more trouble in the long run, and sometimes (in an emergency, for example) they’re the least bad option, but usually you think about not just how you can make the change that’s necessary right now but how you can set things up so that you can make more changes in the future without tearing your hair out.

There’s never just one way to do that, either. When you’re working through a beginner tutorial it will probably look like there’s one right way to build any application and you can’t be creative at all once you’ve decided what you’re going to build and what the user interface should look like. That’s totally untrue. Once you’re working on anything larger than a simple assignment to write a for loop, you enter the world of trade-offs. There’s never only one way to solve a problem and each solution has its own pros and cons.

For example, optimizing code so it runs faster involves a lot of decisions about trade-offs and a lot of creative problem solving. Optimizing code usually makes it harder to read, which makes it harder to update if you need a new feature or find a bug. This matters a whole lot when you’re running a business because programmer time is so expensive. On the other hand, if your program runs so slowly that no one wants to use it (and pay for it!), it doesn’t matter how nice the code is to work with. Sometimes you absolutely need your code to run as fast as possible and unclear code is worth it to get your game to run at a decent frame rate. Other times performance isn’t your highest priority and what you really need is to be able to read and change your code quickly because you get requests for new features all the time.

In short, I build things all day. How is that not creative?

Development is maintenance

Professional programming and the kind of programming you learn in college/university/bootcamp/etc are actually very different things. Despite what you learned in school, development is really maintenance. In other words, I’m here to crush your dreams :)

So, you know how in school you started new projects from the ground up all the time? Yeah, you’ll hardly ever do that at work.

Now, sometimes you will need to research new technologies and/or frameworks and starting a new prototype project is usually a part of that, and sometimes even the largest and slowest moving organization needs to start something completely new, but that’s generally a very small part of the job.

What you’ll actually spend most of your time doing as a professional developer is adding features and fixing bugs in an existing product. That’s such a large part of the job that I wish we’d done more (or any) of it in school. For anyone reading this who is learning to code, I strongly recommend taking sample projects or open source projects or whatever you can get your hands on and adding new features or fixing bugs. Learning to read other people’s code is hugely important and you may only barely touch on it in your studies.

To be fair, just learning to code takes a lot of time and you can only cram so much into any one program without keeping students there for years and years, but I wish we put a little more emphasis on what software developers actually do at work most of the time. I also wish we’d spent more time on why design is such a big deal.

One of the consequences of hardly ever starting completely new projects at work is that the few projects that do get started are extremely long lived. Instead of a tiny throw-away project that you spend maybe a week building and then never touch again, you’ll work with applications that live for years or even decades. This can be really weird to adjust to since the lifespan of those projects means every tiny decision you made in five minutes can come back to haunt you for years to come :)

On the other hand, long lived projects have a much greater impact than tiny little throw-aways. If you do a good job, the code you write can make people’s lives a little bit easier for years and years. You can also build much larger things, whether they’re applications, games, frameworks, or something else entirely, when you have years to work on something. Corporate software development isn’t all bad, you get to work on things that you could never build on your own.

Another way professional development is different from school projects is that requirements always, always change. Even if every feature you add is perfect and bug free (ha!), your users are going to ask for new things and/or discover that the feature they asked for isn’t actually what they needed and the business might expand into new areas and the laws that your business has to follow might change. Sometimes technical requirements even drive changes: if a new version of your database or framework or a library you depend on comes out, eventually you’re going to want to switch to that.

The requirements changes can be infuriating, I’m not going to sugar-coat that. But at least you get to work on something that people care enough about to ask for changes, even if sometimes it seems like they have no idea what they actually want. If you never had to change a piece of software, all that would mean was that absolutely nobody was using it. I  don’t know about you but I think it’s pretty cool that people actually use the stuff I build.

Real world software development is very, very different from what you do in school, so don’t be surprised if it takes you a little while to find your feet. As much as it can be frustrating sometimes, there are some really cool upsides too.

Dev tool of the day

You know what’s incredibly helpful? RequestBin! Why is it so great? Because testing webhooks sucks and RequestBin makes it easy. Logging your output is a good start but that can’t tell you which IP your request is actually coming from. RequestBin can, which is awesome when you’re trying to figure out whether the Elastic IP you set up in AWS is working correctly. It also shows you all of your output (in a nice human-readable format, no less), which is handy if you really want to know exactly what your client receives.

It’s also free! On the downside your destination url only lasts for 48 hours and your data may be wiped out at any time (if you need a permanent solution look at Runscope’s Request Captures – seems only fair to plug the paid version when I’m talking about how helpful the free one is), but the price is right :)

If it’s hard to explain, it’s probably a bad idea

One of the things I struggled with when I was new to programming was how to tell whether a given piece of code is good or not. When everything is new and confusing, how do you tell bad confusing from normal confusing?

One thing that will give you a very helpful hint is if you code is hard to explain. Like this Python style guide puts it:

If the implementation is hard to explain, it’s a bad idea. This is a general software engineering principle — but applies very well to Python code. Most Python functions and objects can have an easy-to-explain implementation. If it’s hard to explain, it’s probably a bad idea. Usually you can make a hard-to-explain function easier-to-explain via “divide and conquer” — split it into several functions.

Basically, if something you’re working on is hard to explain, that’s a sign that it needs to be re-thought. Some problems are just unavoidably complex, but it should still be possible to explain what you’re doing at a high level.

That applies to higher level application logic just as much as to individual functions. At a previous job I worked on a multiplayer game project that involved putting groups of players into rooms together for each round, then closing down that room when the round was over and creating a new one. Our first implementation seemed like a good idea at the time, but when we got the game to a point where the team could play together, we had a terrible time explaining how players were sorted into rooms the to the artists and project manager.

The non-programmers on the team were by no means stupid people and had been working on the game for quite a while by that point. The fact that we couldn’t explain our room selection scheme to them was a very strong sign that what we were doing just didn’t make sense. As we kept playing our in-progress game, it also turned out that it was extremely difficult to get the whole team into the same room. There were less than a dozen of us working on that game, so there was really no good reason for it to be that hard to play together.

In the end, we admitted our room selection logic wasn’t working and rewrote it to be much simpler. Players sometimes had to wait a bit longer for a round to start, but they could play with their friends more easily and stay in the same room with the people they played with last round. The simpler logic that was easier to explain was also a better experience for the players.

I’m not going to pretend you’ll never run into anyone who is invested in not understanding what you’re trying to explain to them, but if you give someone an overview of what you’re up to and they don’t follow it, think about whether you’re doing something overly complicated before you assume the person you’re trying to explain it to is just dumb. Complicated code is harder to test, harder to debug, and harder to change when you get a new feature request. It’s worth paying attention to seemingly unimportant signs like having a hard time explaining your code to someone else because it can save you so much time in the long run.

What does SOLID really mean? Part 5

First, a quick recap: the SOLID principles of object oriented design are, to quote Uncle Bob:

The Single Responsibility Principle A class should have one, and only one, reason to change.
The Open Closed Principle You should be able to extend a classes behavior, without modifying it.
The Liskov Substitution Principle Derived classes must be substitutable for their base classes.
The Interface Segregation Principle Make fine grained interfaces that are client specific.
The Dependency Inversion Principle Depend on abstractions, not on concretions.

Last time I talked about the fourth letter in SOLID, the Interface Segregation Principle. Now I’m moving on to the Dependency Inversion Principle.

The Dependency Inversion Principle says you should depend on abstractions, not concrete classes. Great, what does that mean? Basically that you want to hide the details of what you’re doing not just behind a separate class but behind an interface so you don’t even have to know which class is actually doing the work.

If you only have one class that actually does the work, this probably seems like a total waste of time. Honestly, for some situations it probably is. If there are business reasons that you’re never, ever going to change database vendors, then don’t worry too much about hiding which database driver you’re using. In other situations where things might change or will definitely change (which is most situations, if requirements would just stay put software would be easy), dependency inversion can really help you out.

Unrelated photo from Pexels
Unrelated photo from Pexels

Let’s take sending email as an example. In a web app that you sell to other businesses, you often need to notify their customers of things directly – if you sell an appointment reminder system the entire point is that your customer doesn’t have to manually send emails to their customers, your app takes care of that for them. Sending email sounds simple enough, right? Either you set up your own SMTP server and send emails directly or you use a service like MailChimp or SendGrid or Amazon SES or Mailgun or ___ and you leave it alone.

Not so fast! What if some of your customers want to send email through their own SendGrid account so they can customize their own emails without going through you and see all their stats and everything? What if other customers already have their own SMTP server and want to send email through that? Now you’ve really got to hide all the details so that your code can trigger an email without even knowing whether that email is going to be send directly to a mailserver or to a service like Mailgun.

If you built your app following the dependency inversion principle from the get-to, this is going to be really simple. All you have to do is add another implementation of the email handler interface you already have and you’re set. Best of all, you know you didn’t break your existing email handling because it’s in a separate class that you don’t have to mess with.

If you let your app depend directly on one email service, though, you’ve got a mess to deal with. Not only do you have to add another email handler, but you have to make pretty major changes to your code to pull your existing email handling into a separate class. This can really suck if you let your code deal with too many implementation details, like how to react to different error codes. It also makes the change riskier and more expensive (in both time and money) because any time you change existing code you might introduce new bugs and because you’ll need to retest all of the existing email handling as well as the new feature to make sure everything still works.

Even if you doubt a certain feature is going to change, it’s still worth thinking about dependency inversion. If the code that triggers an email can only talk to an interface, that’s going to change the way you pass along data like the to and from addresses. It’s also going to change how you report and recover from errors. You might still decide to let your code depend directly on your email service, which is perfectly fine if you’ve thought that decision through. The Dependency Inversion Principle isn’t mean to be an ironclad rule, it’s just a principle to help you avoid painting yourself into a corner.

That’s it for SOLID! If there’s a particular design principle you’d like me to cover next, let me know in the comments.

What does SOLID really mean? Part 4

First, a quick recap: the SOLID principles of object oriented design are, to quote Uncle Bob:

The Single Responsibility Principle A class should have one, and only one, reason to change.
The Open Closed Principle You should be able to extend a classes behavior, without modifying it.
The Liskov Substitution Principle Derived classes must be substitutable for their base classes.
The Interface Segregation Principle Make fine grained interfaces that are client specific.
The Dependency Inversion Principle Depend on abstractions, not on concretions.

Last time I talked about the third letter in SOLID, the Liskov Substitution Principle. Now I’m moving on the the Interface Segregation Principle.

Another way to state the Interface Segregation Principle is that no client should be forced to depend on methods it does not use (thanks wikipedia). That is, if you have methods in your interface that are different enough that no single client would use both of them, those methods probably belong in separate interfaces.This is similar to but not quite the same as the Single Responsibility Principle – a class can have a single responsibility and still have public methods that will be used by some clients but not others.

Take this example of a job class from the wikipedia page on the Interface Segregation Principle.

The ISP was first used and formulated by Robert C. Martin while consulting for Xerox. Xerox had created a new printer system that could perform a variety of tasks such as stapling and faxing. The software for this system was created from the ground up. As the software grew, making modifications became more and more difficult so that even the smallest change would take a redeployment cycle of an hour, which made development nearly impossible.

The design problem was that a single Job class was used by almost all of the tasks. Whenever a print job or a stapling job needed to be performed, a call was made to the Job class. This resulted in a ‘fat’ class with multitudes of methods specific to a variety of different clients. Because of this design, a staple job would know about all the methods of the print job, even though there was no use for them.

The solution suggested by Martin utilized what is called the Interface Segregation Principle today. Applied to the Xerox software, an interface layer between the Job class and its clients was added using the Dependency Inversion Principle. Instead of having one large Job class, a Staple Job interface or a Print Job interface was created that would be used by the Staple or Print classes, respectively, calling methods of the Job class. Therefore, one interface was created for each job type, which were all implemented by the Job class.

Just because the Job class only changes when when we have a new or different type of job doesn’t mean the interface isn’t a mess. Of course, you could also argue that “Job” is too broad and that the Job class does have multiple responsibilities because a a staple job and a print job are separate things, but I think there’s still something to be gained from looking at the breadth of your interface and thinking about whether it needs to be broken up into separate interfaces.

Even if you have a single class that implements all of those interfaces, it’s still cleaner for the clients of that class only to know about the methods they actually need. The more things your interface does, the more likely that separate clients accidentally get tangled up because it’s so easy to just call another method on an interface you already have access to. Splitting your interface into separate pieces forces you to think about what each client really needs to have access to and whether you’ve split your clients up the right way.

In most cases, it’s probably better to let separate subclasses implement the different parts of each single purpose interface. If two clients are different enough to use completely separate interfaces, then a single change probably should not affect them both. Sometimes the change you need to make is at such a fundamental level that it is reasonable for all clients to be affected, but that’s something you should avoid if at all possible. Programming: where there’s never a simple right answer.

Another reason to have smaller interfaces is to make life easier for maintenance programmers :) The more methods you have in an interface, the harder the maintenance programmer has to work to figure out which one is actually right for what they’re doing. That might sound silly, but take a look at the Java Collections API. Collections are meant to be generic so they do need a pretty broad interface but that’s still a lot of stuff to dig through when you just want to know which method to use to update some of the elements in your collection.

Next up, the last letter in SOLID: the Dependency Inversion Principle.

What does SOLID really mean? Part 3

First, a quick recap: the SOLID principles of object oriented design are, to quote Uncle Bob:

The Single Responsibility Principle A class should have one, and only one, reason to change.
The Open Closed Principle You should be able to extend a classes behavior, without modifying it.
The Liskov Substitution Principle Derived classes must be substitutable for their base classes.
The Interface Segregation Principle Make fine grained interfaces that are client specific.
The Dependency Inversion Principle Depend on abstractions, not on concretions.

Last time I talked about the second letter in SOLID, the Open Closed Principle. Now I’m moving on to the Liskov Substitution principle.

The Liskov Substitution Principle is named that because it was created by Barbara Liskov, currently an institute professor at the Massachusetts Institute of Technology and Ford Professor of Engineering in its School of Engineering‘s electrical engineering and computer science department. The principle also doesn’t lend itself well to a short description, so it’s just easier to name it after the person who invented it.

The Liskov substitution principle says that it must be possible to substitute a subclass for the base class without changing anything. Basically, your object hierarchy

Unrelated photo by Zukiman Mohamad
Unrelated photo by Zukiman Mohamad

should make sense :) If you have a subclass that requires special handling and can’t just be dropped in where the base class is used, something is wrong with your design. Why is that so bad? Because it means every time you use that subclass you have to remember to add the special handling bit and/or remember which subclass has which side effects.

If you need some of the functionality of the base class but you have to do some special stuff that means you can’t just create a subclass that is substitutable, create a new class and give it an instance of the base class to use. If it can’t behave like a real subclass, don’t try to force it to be one, it’s just going to cause trouble in the long run.

The typical example of a Liskov Substitution Principle violation is a Square class and a Rectangle class. If they both have setters for width and height, then you can get yourself into trouble if the calling code got a rectangle when it expected a square or vice versa. Say you’re trying to lay out a screen and you know you have a space left that’s x by y so you set your screen object’s width to x and its height to y. If your screen object is a rectangle, everything is cool. But if your object is a square, suddenly its width also got set to y when you set its height to y. Now your layout is all messed up and you’re frustrated because your code looks perfectly reasonable even though it’s clearly not working correctly.

Another way to state the Liskov Substitution Principle is that your code shouldn’t contain surprises. No matter how sensible and obvious something seems while you’re writing it, in six months when you come back to add a new feature you will have forgotten all the details. If your code doesn’t have surprise side effects or special handling, then you’re much more likely to be able to add that new feature quickly and move on. If you run into a surprise, you could spend ages figuring out why the code behaves that way.

If you have class hierarchies in your code, be nice to your future self and obey the Liskov Substitution Principle.

What does SOLID really mean? Part 2

First, a quick recap: the SOLID principles of object oriented design are, to quote Uncle Bob:

The Single Responsibility Principle A class should have one, and only one, reason to change.
The Open Closed Principle You should be able to extend a classes behavior, without modifying it.
The Liskov Substitution Principle Derived classes must be substitutable for their base classes.
The Interface Segregation Principle Make fine grained interfaces that are client specific.
The Dependency Inversion Principle Depend on abstractions, not on concretions.

Last time I talked about the first letter in SOLID, the Single Responsibility Principle. Now I’m moving on to the Open Closed Principle.

The Open Closed principle says that you should be able to extend a class’s behaviour without modifying that class directly. In other words, the class should be open for extension and closed for modification. Okay cool, but what does that really mean? That you should design your classes in a way that when you need to add new features to your system you can do it by adding new code without messing with existing code that already works. Remember, most of software development is accepting that it’s hard and trying not to completely screw it up. When you change code that already works, you risk breaking everything that depends on it. It’s safer to add new code to a child class (for example) that’s separate from the existing code so you can break your new feature without wrecking everyone else’s day.

Photo so this blog post looks nicer in social media shares. Photo by CreateHER Stock shared under a Creative Commons Attribution-ShareAlike 4.0 International License.
Photo so this blog post looks nicer in social media shares. Photo by CreateHER Stock shared under a Creative Commons Attribution-ShareAlike 4.0 International License.

Open/closed is about how you arrange your abstractions. The name Open Closed Principle can be kind of confusing, it’s not about somehow preventing other people from changing existing code with stern comments threatening to replace their good chair with the crappy broken one that got abandoned in the conference room, it’s about writing your code so other people (or you in six months) don’t have to change the existing code. It’s the not needing to change your code part that makes your class closed to modification.

For example, suppose you have some classes for employees and contractors, and a report building class that calculates everyone’s pay for the month. If that report building class has an if or case statement that checks whether the current person to calculate pay for is an employee or a contractor and handles them differently, then your class has to be updated every time you add a new type. Maybe you need to handle interns now, or both salaried and hourly employees, or full and part time employees, or sales people who get paid different commissions, or or or. The more employee types you add the bigger and uglier your case statement gets and the more chance you’ll forget one or mess it up.

Instead, you need an interface named PayableIndividual with a method called calculatePay(Date start, Date end). Then your concrete classes like FullTimeEmployee and Contractor can implement that interface. If your report generating class only uses PayableIndividuals, not the concrete classes that implement that interface, you an add all the subtypes you want without ever having to mess with the report generator because all it has to do is call calculatePay and let the concrete class do the work.

The open for extension part of the Open Closed Principle is about keeping processing that’s specific to an individual piece of code separate. It might be tempting to make that calculatePay method I mentioned above a method in an abstract PayableIndividual class and let the subclasses just add what they need to after they call the base calculatePay method. If you really do have a base hourly rate everyone gets paid that could work, but if, say, some contractors get paid by the hour and some sell blocks of work for a fixed price then you’ll have to tear out the base calculatePay method that doesn’t make sense anymore. If it can vary, separate it out so that you can simply override it in a base class without reworking your entire design.

Of course, you can’t have code that perfectly adheres to the open closed principle if it actually does anything useful. Sooner or later you’ll run into a problem that just doesn’t bit perfectly into your nice tidy design. For example, you might want your salary report ordered by employee type and then name. Whatever code does the ordering has to change when you add new employee types – all you can do is keep the mess as contained as possible. If you use a comparator with a table that contains the order all the employee types should be printed in then only that table has to change when you add a new employee type. Not perfect but it could be much worse.

So why is it so bad to modify a base class? It’s a risk. Every time you modify a class, you risk breaking everything that depends on it. That’s bad enough if it’s a class used only by one project, but it’s really scary if it’s a class in a shared library. If you could break every project that uses that library, then every project needs to be retested and that can be massively expensive in terms of both time and money. It can also make other developers hate you, which is exactly what we’re trying to avoid with these principles :)

Next up, the Liskov Substitution principle!

What does SOLID really mean? Part 1

Not so long ago I read an article about SOLID design principles in Clojure and started thinking it would be interesting to talk about those principles more generally. I don’t know about you, but I have a terrible habit of skimming over stuff like that thinking “oh sure, SOLID, that sounds like a good idea” and then promptly forgetting all about it.

According to Wikipedia, Robert C Martin (aka Uncle Bob) came up with the SOLID design principles in the early 2000s and Michael Feathers came up with the mnemonic acronym to help people remember them. The SOLID principles are meant to help people design code that’s easy to maintain and extend, which keeps future maintenance programmers from wanting to throw chairs at them :)

Uncle Bob has an excellent summary of what SOLID stands for on his site, so I’ll just quote him here:

The Single Responsibility Principle A class should have one, and only one, reason to change.
The Open Closed Principle You should be able to extend a classes behavior, without modifying it.
The Liskov Substitution Principle Derived classes must be substitutable for their base classes.
The Interface Segregation Principle Make fine grained interfaces that are client specific.
The Dependency Inversion Principle Depend on abstractions, not on concretions.

Let’s start with the S, the single responsibility principle. Why is it so important that a class only have one reason to change? Because that means it’s only responsible for one thing. Classes that are responsible for only one thing have conceptual integrity, which is an enormous part of writing code than anyone else can ever use and one of the most important things The Mythical Man Month talks about. Conceptual integrity is kind of a tough concept to nail down, though. I would say that a project that has a high level of conceptual integrity is consistent, it has a predictable design. All of the classes at any given layer of your architecture should behave largely the same way so another programmer doesn’t have to spend hours figuring out how each individual class behaves so she can add her feature.

A surprising number of best practices in software development are about accepting that development is incredibly easy to screw up and attempting to make it harder to make a complete hash of things. If you’re working on a very small project like an assignment in college, it really doesn’t matter whether your project has conceptual integrity. If it’s simple enough that you can hold the whole thing in your head or if it’s something you spend a week building before you hand it in and then never look at it again, you can get away with whatever ridiculous mish-mash of design concepts you want. Where conceptual integrity becomes a huge fucking deal is when you’re maintaining a large system over multiple years.

College/university/bootcamps are great and you’ll learn lots, but one thing that’s very difficult to teach in a limited amount of time is what it’s actually like to maintain an existing system. All of the things you can get away with in tiny throwaway projects just do not work on larger projects that might survive for ten years or more. Every tiny mistake that you make in the beginning gets magnified by years and years of decisions built on top of it.

For example, at a previous job I chose a convoluted hash structure to store the data I needed to build an interface. It seemed like a good idea when I built it, but as the requirements changed – requirements always change, don’t kid yourself on that front – the cracks started to show. I had come up with a brittle design that was extremely difficult to extend. As I tried to force my data structure to support more and more features it just became more and more of a mess. I ended up having to apologize a whole lot to the dev who ended up maintaining that monstrosity after I left and I wouldn’t have been surprised if he had to scrap the whole thing and rebuild it sensibly.

Even if conceptual integrity doesn’t seem like a big deal at this point in your studies or career, trust me, it will come back to bite you soon enough. The more things your class does the longer it takes another dev to figure out how to use it. That other dev could also be you six months from now when you’ve forgotten all the fiddly little details of how you built that class. The less predictable your codebase is, the more time you waste re-learning things every time you use a given class.

It probably doesn’t sound that bad to have to relearn one class before you use it. It only takes a few minutes, right? Where things get ugly is where you have to relearn “just one class” five times to get a feature done or even worse, when you didn’t realize you needed to relearn that class because the whole rest of that layer worked the same way so you reasonably assumed that class did too. You get some really interesting bugs when just one of these things isn’t like the others and what’s worse, they’re especially difficult to catch with unit tests because hardly anyone thinks to test for the possibility that this one class doesn’t behave like all of the other classes like it.

The more areas a class has responsibility for, the less predictable it is. If I have a service that retrieves data from a store (maybe it’s a relational db, maybe it’s a nosql db or a cache), I can pretty safely make assumptions about what the get and save methods do. But if I have a service that retrieves data and formats it for display, I can’t know it formats things the same way as all the other controller classes or if it does any special business logic until I go and look.

The more areas a class has responsibility for, the more other classes have to change if it changes. A class that just represents database data only changes if we add new fields to the database. If we add a new field, it’s completely logical and predictable that other classes that use that class might need to change to handle the new field. If a class handles retrieval and some business logic, then it’s much harder to find all of the classes that need to change if that class changes and it’s much more likely that we might miss something that needs to change too. It sounds stupid if you’re used to teeny tiny projects, but this is stuff that actually happens when you have a codebase of over a thousand classes. The more your project does, the more careful you have to be about keeping everything tidy. Think about that poor maintenance programmer and don’t make her wish she could hunt you down and pelt you with tomatoes.

As much as we like to think programming is a solitary activity, a huge amount of professional programming is actually about not being a complete asshole to the devs who will come after you :)

The open/closed principle is coming soon!

Java best practices: synchronization

Java’s synchronization can be really helpful, but it can also get you into plenty of trouble. Synchonization is in no way a magic wand that you can wave around to get rid of multi-threading issues, you have to understand how to use it.

In java (and many other languages, but java’s what I’m familiar with), synchronization prevents threads from accessing the same data at the same time. Concurrency (multiple threads sharing access to the same variables) is a gigantic subject, so I’m going to gloss over it here by saying that things can go wrong in deeply weird ways when threads accidentally overwrite each other’s updates to a variable or work from different copies of the shared variable. Synchronization can stop that from happening if you use it correctly, but at the cost of a hit to performance and the need to be very very careful that you don’t introduce deadlocks.

public synchronized void example() {
   //do things
}

Using the synchronization keyword on a method (like in the example above) synchronizes access to that entire method, which is generally pretty safe but unless you have a very small method you may be synchronizing a bigger chunk of code than you absolutely need to, which is more of a performance hit than necessary. Because synchronized blocks/methods can only be accessed by one thread at a time, they really slow down processing. The larger a chunk of code you synchronize, the worse the performance hit is.

class Example {
   Message m;

   public Example(Message m) {
       this.m = m;
   }

   public void doThings() {
       String name = Thread.currentThread().getName();
       synchronized(m) {
           //actually do things with m
       }
   }
} 

The synchronization method, while it makes it easier to synchronize only the part you need, also makes it easier to mess things up by introducing a deadlock. A deadlock happens if thread A needs locks on objects Y and Z and thread B needs locks on objects Z and Y in that order. If A locks Y and waits for Z to be unlocked, and B locks Z and waits for Y to be unlocked, both threads wait forever and nothing happens until you restart your program. If you lock on multiple objects (which you should definitely do if you need to update multiple shared objects in the same block of code), make sure that you absolutely always lock on those objects in the same order. The same problem applies to mysql deadlocks, which can really suck to debug if your codebase is large enough.

While we’re at it, according to stack overflow, synchronized(this) can be dangerous because it synchronizes on the entire instance. If you have another block that synchronizes on this, it can’t run until the other lock on this unlocks. It also means any external locks on that object can’t run until it’s unlocked, which can cause serious performance problems if you do it enough.

Aside from being very careful when you do use synchronized, the best advice I can give you is to use it as little as possible. If you can, just don’t have shared state. Particularly in web programming, you generally shouldn’t keep state around for longer than it takes to process a request.

Finally, if you use synchronized and mess it up, don’t waste time beating yourself up about it. Concurrency is even worse than timezones and everyone messes it up sometimes.