Backward Thinking


Perhaps this entry should be titled Rant #2.

Okay, someone explain to me why people write queries with the following syntax:

SELECT Employee.Name, Department.Manager
FROM Department
JOIN Employee ON Employee.Department = Department.DeptName 
WHERE Employee.Name = 'John Smith';

The FROM/JOIN seems all backwards to me. To me, this reads as ‘find all the departments and then see if any of them match up with the employee we are interested in’. It just doesn’t make sense. If we want to find the department information for John Smith, then start by getting John’s info and then link it to the appropriate department – not the other way around.

Unless you can demonstrate some kind of savings in the execution plan, the above example is just… well… backwards.

And don’t say ‘if it makes no difference to execution plan, then what’s the problem?’ The problem is readability. Every second someone has to take to decipher what this means is 1000 milliseconds of wasted time. Admittedly, in a simple example like this one, the total wasted time may only be 1000 ms, but using this syntax in larger, more complicated queries adds up to millions upon millions of microseconds!

Just flipping the tables in the FROM/JOIN makes this query soooo much more readable:

SELECT Employee.Name, Department.Manager
FROM Employee
JOIN Department ON Employee.Department = Department.DeptName  
WHERE Employee.Name = 'John Smith';

Don’t believe me? Find someone you know who knows a little about programming but doesn’t know SQL. Give them a 10 second rundown on what FROM and JOIN do and then show them the two examples and ask them which makes more sense.

Or just try telling someone what the query is for. I’m sure your explanation will include something to the effect of “it uses John Smith’s department code to go get the name of his department manager”. And that is a reflection of the way you think about resolving the problem of getting John’s manager’s name.

If that is the way you think of things, then that is the way you should write it. That makes it far easier for the next guy to figure out what you are doing and why.

But the change to the query has resulted in another style of backward thinking of a sort that I see even more frequently than the first example. Look at the ON clause. The format is now:

FROM a JOIN b ON a.x = b.x

If you don’t see anything wrong with that then try thinking about this: when you shop for new shoes do you test if your foot matches the shoe size?

SELECT  Person.Name, Shoe.Model
FROM Person
JOIN Shoe ON Person.FootSize = Shoe.Size

Or do you test if the shoe size matches your foot?

SELECT  Person.Name, Shoe.Model
FROM Person
JOIN Shoe ON Shoe.Size = Person.FootSize

The first implies gathering every shoe in the store and then finding the ones that match your foot. The second implies checking your foot and then selecting the shoes that match with it.

Yeah, I thought so: you shop the second way. If that is the way you think of things, then that is the way you should write it. That makes it far easier for the next guy to figure out what you are doing and why.

FROM a JOIN b ON b.x = a.x

Another reason for using this order can be seen if we replace a.x with a constant. Which reads more sensibly:

JOIN Shoe ON 7 = Shoe.Size

or

JOIN Shoe ON Shoe.Size = 7

Lastly, I want to rail a bit on something I see in code quite often:

if (!product.IsAvailable) 
{
  order.PutOnBackOrder();
}
else
{
  order.Process();
}

Why? Just why? Why would you write a negative condition and then follow it with an else? There is absolutely no logical rationalization for this.

Would you tell a child
“If you are not a boy then you will grow up to be a woman. Otherwise you will grow up to be a man”?
It is logically correct, but you are going to get a blank stare of confusion from the average 3-year old.
Try this instead:
“If you are a boy then you will grow up to be a man, Otherwise you will grow up to be a woman”
I bet you your kid will understand that much more easily (I bet you did too… admit it).

Soooooo.. if that is the way you think of things, then it is the way you should write it. That makes it far easier for the next guy to figure out what you are doing and why.

if (product.IsAvailable) 
{
  order.Process();
}
else
{
  order.PutOnBackOrder();
}

Sometimes it does make sense to maintain the order of the method calls. In those cases:

var needBackOrder = !product.available;
if (needBackOrder) 
{
  order.PutOnBackOrder();
}
else
{
  order.Process();
}

The general rule I follow is “if you need an else then the if condition must be a positive condition”.

As a creator of code, I know that sometimes a programmer’s mind is so occupied with the task of solving the issue at hand that is easy to write the code, mentally check that it works, and then move on to the next section. Programmers have to deal with time pressure, problem solving, juggling 82 variables nested 7 calls deep inside a coding framework that has conform to 3 design patterns at once. At that point, issues like FROM b JOIN a and if not true then this else that seems trivial.

But as a maintainer of code, I know that, over the life of a program, far more time is spent reading that code than was ever spent writing it. I also know how much I appreciate it when I hit a section of code that is properly laid out with good indenting, sectioning of concepts, well thought out identifiers… all the little things that allow me to quickly scan the code, understand the task it accomplishes, and find the problem. Finding the problem is often much harder than creating the code since I have to deduce everything the coder went through to write it plus interpret its function in a way that the coder didn’t (if he or she had, there wouldn’t be a bug).

That is where those seconds, milliseconds and microseconds start to add up. And they can add up to an awful lot of time, frustration and mistakes. Well formatted code is important. That’s why companies often have documented coding standards. That’s why books devote pages to explaining the conventions of how example code will be shown. Because when you are reading code, formatting matters.

I am sure all programmers do it: once the crux of the problem is solved, they review and clean up their code. That’s the point at which I hope programmers will consider the ‘next guy’: the poor slob who has to wade into the sea of statements, often sight unseen, and make sense of what the coder left behind. Have pity on the bloke: don’t make his or her job harder with backward thinking.

Advertisement

One thought on “Backward Thinking”

  1. INNER JOIN is commutative, so changing the order doesn’t matter… and I also prefer the order you do. However, keep in mind that if you’re using a left, right, or outer join things might not work that way.

    Like in a left join, start with department implies that you only get employees that have a department association, where as if you start with employees you’ll potentially get a larger result set. This could still be written “employee first”, though, and have a conditional stating that department != null… which might be more readable.

    Personally, I hate query syntax. I do my best to avoid raw SQL as much as possible. A lot of it is backwards to my way of thinking, and it’s not just cosmetic issues like this one.

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s