As you learn to code and pick up new programming languages, you’ll often hear that different languages are good for different things. Technically you can do just about anything in any language, so for a long time that never meant much to me. But once you get past basic conditionals and loops, there actually are pretty major differences in how easy it is to do different things in different languages.
Here’s a handy example: the other day I wanted to figure out how much I spend on average each month so I could work out how much I can reliably throw into my RRSP. Okay, use Mint, you say. Not so fast there! I only wanted to know about my expenses NOT including RRSP and TFSA contributions, and I wanted to leave out the month I got married because it’s a huge outlier and screws up my average :) If you can get Mint to do that, I’d love to hear how.
What I ended up doing was downloading my transaction history as a csv from my bank and manually removing the stuff I didn’t want to include. Then I needed to create monthly totals (so I could see if those looked reasonable) and an overall average somehow. I was hoping I could do that with a simple formula in a spreadsheet, but after fiddling with it for a bit I decided I’d rather poke myself in the eye than stick with that idea.
Python to the rescue! Not so long ago I was a mentor at a Ladies Learning Code workshop about data processing with Python. At the end of that workshop we ended up with a little script that read in a csv, did some processing, and output the results, which is exactly what I needed. I started with that script and ended up with this:
# Import the csv library
import csv
import datetime

# Open the statement file
statement_file = open('./statement.csv')
# Convert it to a csv_data structure
statement_data = csv.DictReader(statement_file)

current_month = -1
current_year = -1
months = 0
grand_total = 0.0
running_total = 0.0

# Loop through each of the rows
for transaction in statement_data:
    # deposits have a blank in the withdrawal field, we only want withdrawals
    # (compare with != here, "is not" checks identity rather than equality)
    if transaction['withdrawal'] != '':
        # convert the string date to a date object so we can get the month
        date = datetime.datetime.strptime(transaction["date"], '%d-%b-%Y')
        # every time we hit a row where the month doesn't match the month from
        # the last row we know it's a new month and we need to update current
        # month & year and increment the month count
        if date.month != current_month:
            # count every month we see, including the first one, so the
            # last month in the file is part of the average too
            months += 1
            if current_month > -1:
                # print current_month instead of date.month because date.month
                # is the new month
                print(str(current_month) + "-" + str(current_year)
                      + " monthly total: " + str(running_total))
            current_month = date.month
            current_year = date.year
            running_total = 0.0
        running_total += float(transaction["withdrawal"])
        grand_total += float(transaction["withdrawal"])

# one more print statement for the last month in the file
print(str(current_month) + "-" + str(current_year) + " monthly total: "
      + str(running_total))
average = grand_total / months
print("avg: " + str(average) + " over " + str(months) + " months")
Then I started thinking: that was weirdly easy, considering that since college I’ve touched Python twice – once while preparing for that Ladies Learning Code workshop and once while actually mentoring at the workshop. That made me wonder how Java, the language I’ve used just about every day at work for the last nine years, would compare. So I ported my Python script to Java and this is what I ended up with:
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVRecord;

public class Calc {

    public static void main(String[] args) {
        try {
            int monthCount = 0;
            int currentMonth = -1;
            int currentYear = -1;
            // double matches the precision of python's float
            double grandTotal = 0;
            double runningTotal = 0;

            // open the statement csv
            Reader in = new FileReader("statement.csv");

            // parse it into CSVRecords so we can get values out more easily
            // unlike python this CSV library doesn't pick up the header row
            // by default. I set the header names manually, so the header
            // row in the file has to be skipped or it gets parsed as a
            // transaction
            Iterable<CSVRecord> records = CSVFormat.DEFAULT.withHeader(
                    "account", "date", "desc", "num", "withdrawal", "deposit",
                    "balance").withSkipHeaderRecord().parse(in);

            // loop through each of the rows
            for (CSVRecord record : records) {
                String dateStr = record.get("date");
                String withdrawalStr = record.get("withdrawal");

                // deposits have a blank in the withdrawal field, we only want
                // withdrawals
                if (withdrawalStr != null && !withdrawalStr.equals("")) {
                    // java requires a lot of boilerplate around parsing a
                    // string into a date that we can get a month out of
                    DateFormat df = new SimpleDateFormat("d-MMM-yyyy");
                    Date transactionDate = df.parse(dateStr);
                    Calendar cal = Calendar.getInstance();
                    cal.setTime(transactionDate);

                    // every time we hit a row where the month doesn't match
                    // the month from the last row we know it's a new month
                    // and we need to update the current month and increment
                    // the month count. technically we can get the month
                    // using transactionDate.getMonth() but that method is
                    // deprecated and I'm trying to set a good example
                    if (cal.get(Calendar.MONTH) != currentMonth) {
                        monthCount++;
                        if (currentMonth > -1) {
                            // in java months start from 0, add 1 so we get
                            // nicer looking output
                            System.out.println((currentMonth + 1) + "-"
                                    + currentYear + " monthly total: "
                                    + runningTotal);
                        }
                        currentMonth = cal.get(Calendar.MONTH);
                        currentYear = cal.get(Calendar.YEAR);
                        runningTotal = 0;
                    }

                    double withdrawal = Double.parseDouble(withdrawalStr);
                    grandTotal += withdrawal;
                    runningTotal += withdrawal;
                }
            }

            // one more print statement for the last month in the file
            System.out.println((currentMonth + 1) + "-" + currentYear
                    + " monthly total: " + runningTotal);

            double average = grandTotal / monthCount;
            // the one convenient thing java does here is string conversion -
            // it automatically converts non-strings into strings when you
            // try to add them to a string (this isn't autoboxing, which is
            // the conversion between primitives and their wrapper objects)
            System.out.println("avg: " + average + " over " + monthCount
                    + " months");
        } catch (IOException | ParseException e) {
            e.printStackTrace();
        }
    }
}
In a word, ugh. File processing scripts are not even slightly what Java is good at. Everything I needed for the Python script was part of Python’s standard library. For the Java version, I had to go hunt down a library and add it to my project, which required knowing that there probably was a library, knowing how to add it to my build path, and figuring out how to use it.
Even the least terrible csv library I was able to find for Java inside of five minutes of googling (Apache Commons CSV, if you’re curious) was much harder to use than Python’s built-in csv handling. Java’s date parsing also requires way more steps than Python’s does. And to run this in Java you have to know about main methods and all the boilerplate around them. Even if you just let your IDE generate that for you, you still need to know it exists and what it’s for.
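To make the csv and date parsing comparison concrete, here is a minimal sketch of just those two pieces in Python. It is not the script above. It keys the totals by (year, month) in a dictionary instead of watching for the month to change, which is a different technique that also keeps the same month in two different years apart. The function name monthly_totals and the sample rows are made up for illustration, and the sketch assumes the same "date" and "withdrawal" column names and the same date format as my statement file.

```python
import csv
import datetime
import io
from collections import defaultdict

# Made-up sample rows that reuse the "date" and "withdrawal" column names
# from the script above. A real bank export will have more columns.
SAMPLE = """date,withdrawal,deposit
03-Jan-2014,20.00,
15-Jan-2014,,500.00
28-Jan-2014,30.50,
02-Feb-2014,10.25,
"""


def monthly_totals(csv_text):
    """Return {(year, month): total withdrawals} for the given CSV text."""
    totals = defaultdict(float)
    # DictReader takes the column names from the first row on its own
    for row in csv.DictReader(io.StringIO(csv_text)):
        # deposits have a blank in the withdrawal field, skip them
        if row["withdrawal"] != "":
            # one call turns the string into a date we can get a month from
            date = datetime.datetime.strptime(row["date"], "%d-%b-%Y")
            totals[(date.year, date.month)] += float(row["withdrawal"])
    return dict(totals)


totals = monthly_totals(SAMPLE)
for (year, month), total in sorted(totals.items()):
    print(str(month) + "-" + str(year) + " monthly total: " + str(total))
print("avg: " + str(sum(totals.values()) / len(totals))
      + " over " + str(len(totals)) + " months")
```

Header detection is the DictReader call and date parsing is the strptime call, one line each. Note that %b matches month abbreviations like "Jan" according to the current locale, so this assumes an English locale.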
Basically you have to fight Java to do something like my average monthly spending script. You can still do it, but it’s much more work than it has to be. Java is great for big enterprisey systems with APIs and multiple programmers working on different pieces, but it’s kind of painful for little scripts to parse a csv and do some processing. Python, on the other hand, rocks at stuff like that. I hope this helps you understand what people actually mean when they say different languages are good for different things.