Sunday, September 4, 2011

The pitfalls of compound assignment operators

Compound assignment operators

You may well know, that in Java (assuming x and y are int variables), instead of writing
    x = x + y;
you can use
    x += y;

The shortcut form is called compound assignment operator and it exists for a lot of operators. They seem harmless enough at first sight.

Now look at this:
    int x = 5, y = 3;
    x *= y + 1;
    System.out.println(x);
You may think that the second line is equivalent to x = x * y + 1, but it is not. It is equivalent to x = x * (y + 1). Therefore x becomes 5 * (3 + 1), i.e. 20.

Short circuit evaluation

Short circuit evaluation describes the fact, that in Java (and many other languages) a boolean expression is only evaluated as long as its final result is not yet known.
    boolean result = true && foo(); // foo gets called
vs.
    boolean result = false && foo(); // foo gets not called
You can (and should) use that to your advantage: Cheap operations to the left, expensive ones to the right (in case of &&, it is the other way round for ||). It also common to use short circuit evaluation for precondition checking:
    boolean result = x != null && x.getMyBool();
If x is really null, we must not call the getMyBool method as it would throw a NullPointerException.

The misleading &= operator

Back to compound assignment operators, there is a &= operator. At first sight, it looks like you could abbreviate
    keep = keep && foo();
by
    keep &= foo();
But, alas, you loose short circuit evaluation, because that really means
    keep = keep & foo();
And the & operator has no short circuit evaluation.

This can be detrimental to the performance of your code, e.g. in cases like this:
   keep &= expensiveOperation1(); 
   keep &= expensiveOperation2(); 
   keep &= expensiveOperation3(); 
   keep &= expensiveOperation4(); 
Or outright fatal, if an operation must not be called
   Object x = null;
   boolean keep = x != null;
   keep &= x.toString().isEmpty();
And there is no &&= compound assignment operator. 

Conclusion

It makes me wonder whether the comfort of saving a few keystrokes and having slightly more compact code to read outweighs these pitfalls and the increased complexity of the language.




Wednesday, July 20, 2011

Removing entries from a Java Collection

Let's look at some simple code:
public static void main(String[] args) {
  Map<String, String> m = new HashMap<String, String>();
  m.put("A", "B"); m.put("C", "D"); m.put("E", "F");
  for (Map.Entry<String, String> e : m.entrySet()) {
    if (e.getValue().equals("D")) {
      m.remove(e.getKey());
    }
  }
} 
After setting up a map which contains three entries (A->B, C->D, E->F), it iterates over its entries and removes an entry, if it matches a certain condition.
Compile, Run, works perfectly (on my machine, with my JDK, etc.)

Now we just change the condition to:
if (e.getValue().equals("B")) {

Compile, Run, oops - we get a ConcurrentModificationException like this:
Exception in thread "main" java.util.ConcurrentModificationException
  at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
  at java.util.HashMap$EntryIterator.next(HashMap.java:834)
  at java.util.HashMap$EntryIterator.next(HashMap.java:832)
  at prv.rli.codetest.CodeTest.main(CodeTest.java:11)

Why?
  • The for (x : y) statement is syntactic sugar for the old school collection iteration with an Iterator:
    Iterator<Map.Entry<String, String>> it = m.entrySet().iterator();
    while (it.hasNext()) {
    Map.Entry<String, String> e = it.next();
    }
  • An Iterator may throw a ConcurrentModificationException (a RuntimeException), if you change the collection it iterates over.



But why only with "B"?
  • Pure luck. The HashMap happens to be ordered like this: {E=F, A=B, C=D}, therefore the entry with value D is the last one and since the iterator will return false for hasNext(), we leave the loop, next() is no longer called and nothing happens.
So what is the correct way of removing an element from a map? There are several ways, which we will look at: Using the remove method of the iterator
Iterator<Map.Entry<String, String>> it = m.entrySet().iterator();
  while (it.hasNext()) {
    Map.Entry<String, String> e = it.next();
    if (e.getValue().equals("B")) {
      it.remove();
    }
  }
The Iterator interface provides a remove method, which will remove the element the last call to next() returned. Unfortunately the remove method may not be supported by some iterators. In this case the call to remove() would throw an UnsupportedOperationException. If we cannot use it.remove() we can also remember the keys of elements to remove and remove them after the iteration has completed:
  Set<String> toRemove = new HashSet<String>();
  for (Map.Entry<String, String> e : m.entrySet()) {
    if (e.getValue().equals("B")) {
      toRemove.add(e.getKey());
    }
  }
  m.keySet().removeAll(toRemove);
Note that we can use the for (x : y) notation here. Finally, if you remove most elements of a collection you can also create a new collection and only copy those elements you want to keep:
  Map<String, String> m2 = new HashMap<String, String>();
  for (Map.Entry<String, String> e : m.entrySet()) {
    if (!e.getValue().equals("B")) {
      m2.entrySet().add(e);
    }
  }
  m = m2;

This may have advantages in terms of memory consumption, because the new collection will not waste space for deleted entries.

Tuesday, June 28, 2011

Code for interview

If we hire a software engineer in the company I work for, we always want to see some code of the applicant. This can be some part of an open source project the applicant is contributing, something from a course the person attends (in case of graduates) or they solve a short task we tell them.
I always find it extremely enlightening to see how people code because it tells you a lot about the way they think and work. Most of the time I am crestfallen however, about the quality of the code presented. So here is my list of things I look for in such code:
  • Short and concise. Don't try to show how much complexity you can handle, instead demonstrate that you avoid it as much as reasonably possible. Remove unneeded lines of code.
    If you are using code from an open source project, pick a single good source file which demonstrates how smart you are. Make it easy to access that file (e.g. send it by e-mail, don't tell the employer to download the latest 1.5 GB tarball and look for your name in the version history...).
  • Good names. Classes, methods, variables should have names that are as short as possible but telling.
  • Avoid gotos. All Gotos. Write your methods so that they are Dijkstra-Diagrams. Code is much easier to read if all the methods are exited at the last line of code. This means to avoid things like return-statements in the middle of the method, breaks in loops, fallthrough case-statements.
    Even exceptions are a kind of goto but you cannot usually avoid them, as they are given by the environment.
  • Don't comment what is obvious from the code. Some applicants try to demonstrate how well they are documenting their code by writing things like this:
    x = x + 1; // increase x by 1
    or
    /** Get name. */
    void String getName() {
      if (name == null) {
        name = createName();
      }
      return name;
    }
    In the first case, just remove the comment. In the second one could better comment that the method creates the name, if this has not yet happened. But it would be better yet to call that method getOrCreateName() because that would be a more fitting name.
  • Avoid long methods. If one of your method is longer than 20 lines you can possible split it up. Depending on the situation you can do so by inserting "section comments" or introducing helper methods. Using helper methods is usually the better method as it fosters reuse.