observable collections and performance penalties

When you think about observable collections, you run into at least one real (and defensible) performance penalty, and it applies to a much larger design paradigm: the price of eventing. In certain situations that price can bloom quickly. I can only imagine that this, along with the complexity of choosing a good eventing system, is the reason Java has no observable collections (as I wrote about here).

I believe it is important for every good programmer to understand, in broad strokes, the performance implications of the code he writes and the patterns he uses. Be careful, though, not to take this as a prescription to actually optimize all of them, as there are many performance penalties that are acceptable, for a number of reasons:

  • the penalty is minimal or has no noticeable/show-stopping effect on the system or the program
  • the penalty occurs rarely (amortize the cost of the penalty over the lifetime of the program)
  • (debatable) the penalty is expected by the user at that time
  • the penalty is an inherent part of the problem you are solving (e.g. if you are writing an application that simulates the birth of a galaxy, expect some heavy computation)
  • the penalty is better than the alternatives

So, from this, we can surmise rules for when we should try to optimize for performance:

  • penalty is severe – unexpected adverse effects on either the program or the system
  • penalty is avoidable
  • penalty is in a common piece of code – the total performance cost of a piece of code is a combination of how much it costs each time it runs and how often it is executed

And, the most important of all:

  • these qualities must be tested, not guessed – as a rule, programmers suck at optimizing by ear, hence the “no premature optimization” rule

In this light, the performance penalties of eventing can be defended and/or mitigated in various ways, by convention or by contract. First, the penalty of eventing is defensible in that it is not, by nature, wasteful: events are only sent when changes occur, and only if someone is registered as a listener for those changes. By nature, that performs better than polling for changes. Either solution (polling or pushing) pragmatically requires some type of multi-threading, and, in that sense, you can trade off the priority of the event threads (and the guarantees as to when events are sent) against the performance penalty. Also, by convention, event handlers are meant to be short and simple, which lessens the impact of sending events.
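
To make the "only pay for what you use" point concrete, here is a minimal sketch of a push-style collection. The `ObservableList` class and its `onAdd`/`removeOnAdd` methods are names I made up for illustration (nothing like this ships in the JDK), and I'm assuming synchronous delivery on the caller's thread just to keep it short:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical minimal observable list; not a JDK class.
final class ObservableList<T> {
    private final List<T> items = new ArrayList<>();
    // CopyOnWriteArrayList lets a listener unregister itself while an event is being delivered.
    private final List<Consumer<T>> addListeners = new CopyOnWriteArrayList<>();

    void onAdd(Consumer<T> listener) {
        addListeners.add(listener);
    }

    void removeOnAdd(Consumer<T> listener) {
        addListeners.remove(listener);
    }

    void add(T item) {
        items.add(item);
        // Assumption: synchronous delivery on the calling thread.
        // With no registered listeners, this loop does no work at all.
        for (Consumer<T> listener : addListeners) {
            listener.accept(item);
        }
    }

    int size() {
        return items.size();
    }
}
```

The cost of an `add` here grows with the number of registered listeners and with how long each handler takes, which is exactly why the "keep handlers short" convention matters.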

(Side note: With polling, however, the code that polls the collection gets to choose exactly what threading model to use, perhaps creating a dedicated thread to watch for changes, or even folding the check for changes into a larger loop of data checking. Perhaps this is not an incredibly important freedom, but we should always be slightly wary of taking choice away from people writing code against our frameworks. In fact, as I find time and again, I always end up writing some type of framework-ish component first, writing the code that uses it second, and then finding out that I’m missing functionality that would either make it easier to use or allow for beneficial configuration of the framework.)

Arguments Against

The most common arguments against are at least agreeable in and of themselves. On collections that change often, you end up sending many events, and possibly to many listeners (event bloom). Also, you can get by without generalized observable collections, perhaps by making the controller observable, or even, in the case of Swing, by physically embedding the collection inside of the view (for a later post: this is why Swing makes a true model/view/controller split impossible). In addition, there are many hard questions to answer about eventing, some of which can make certain, possibly desired, uses either difficult or impossible.

For the most part, these are not show-stopping issues. For the first, again, you only pay for what you use. Most of the ways of solving this are by convention, not by contract:

  • Design the collection to have bulk modification methods and events, so I can subscribe to an event that gets sent once per addition, whether that addition is a single element or many elements. Most languages have a collections framework with bulk change methods, so this is really just a matter of designing the events accordingly.
  • Only register for the events you want, and unregister as soon as you have finished caring
  • Deal with events quickly
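
As a rough sketch of the first two conventions, here is one way a collection could fire a single event per call and hand back an unsubscribe handle. `BulkObservableList`, `subscribe`, and the `Runnable` handle are all hypothetical names of mine, and delivery is again assumed to be synchronous:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical collection with one "elements added" event, fired once per call.
final class BulkObservableList<T> {
    private final List<T> items = new ArrayList<>();
    private final List<Consumer<List<T>>> listeners = new CopyOnWriteArrayList<>();

    // Returns a handle; running it unregisters the listener.
    Runnable subscribe(Consumer<List<T>> listener) {
        listeners.add(listener);
        return () -> listeners.remove(listener);
    }

    // A single-element add is just a bulk add of one element.
    void add(T item) {
        addAll(Collections.singletonList(item));
    }

    void addAll(Collection<? extends T> added) {
        if (added.isEmpty()) {
            return; // nothing changed, so no event is sent
        }
        items.addAll(added);
        // Listeners get a read-only copy of what was added, one event per call.
        List<T> snapshot = Collections.unmodifiableList(new ArrayList<T>(added));
        for (Consumer<List<T>> listener : listeners) {
            listener.accept(snapshot);
        }
    }
}
```

Adding a thousand elements through `addAll` costs each listener one callback instead of a thousand, and running the handle returned by `subscribe` as soon as you stop caring keeps the listener list short.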

As to the second argument (putting the logic elsewhere), this is both not scalable and not clean. The scalability issue seems relatively obvious: if I have to write logic to manage each of my collections, and the logic sits outside of my collections, is there any good way of abstracting that logic (besides the obvious: “Yes – put it inside the collection!”)? And, like I mentioned above, wrapping the collection inside the view (Swing component) puts an ownership relationship into place that makes the view/model split disappear. Besides, what if you have multiple components listening to the same collection, or things outside of Swing that need to know when things change? For the latter, you’ll see other entities listen for changes from the GUI. This is close to intended behavior (eventing is also useful for registering for GUI changes), but the fact is that, in this case, you are indirectly listening to the collection by listening for GUI changes, and this convolutes your design in ways you don’t want. At that point, you can’t really separate out the model/view/controller, which is one of the most important benefits of using that design pattern in the first place.

Things to Remember

Even having said all of this, I’m not going to make the claim that designing event frameworks is trivial. There are a lot of questions to take into account:

  • Are events delivered synchronously or asynchronously?
  • Do you guarantee an order of events?
  • What mechanism do you use to dispatch events?
  • What types of events do you supply/support?
  • What information do you send out with events?
  • What are the usage patterns of your events?
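
To show how a couple of these questions might turn into code, here is a small sketch, not a recommendation. `ChangeEvent` and `EventDispatcher` are hypothetical names of mine; the only real API involved is `java.util.concurrent.Executor`, which I'm using to leave the synchronous-versus-asynchronous decision to whoever builds the dispatcher:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical event payload: what kind of change, where it happened, which elements.
final class ChangeEvent<T> {
    enum Kind { ADDED, REMOVED }

    final Kind kind;
    final int index;
    final List<T> elements;

    ChangeEvent(Kind kind, int index, List<T> elements) {
        this.kind = kind;
        this.index = index;
        this.elements = elements;
    }
}

// Hypothetical dispatcher: the Executor decides how and when listeners run.
final class EventDispatcher<T> {
    private final List<Consumer<ChangeEvent<T>>> listeners = new CopyOnWriteArrayList<>();
    private final Executor executor;

    EventDispatcher(Executor executor) {
        this.executor = executor;
    }

    void register(Consumer<ChangeEvent<T>> listener) {
        listeners.add(listener);
    }

    void dispatch(ChangeEvent<T> event) {
        for (Consumer<ChangeEvent<T>> listener : listeners) {
            // Each delivery is handed to the executor as its own task.
            executor.execute(() -> listener.accept(event));
        }
    }
}
```

Passing `Runnable::run` as the executor gives synchronous delivery on the calling thread, while something like `Executors.newSingleThreadExecutor()` gives asynchronous delivery that still runs tasks in the order they were submitted. A thread pool with several workers would give up that ordering, which is the kind of guarantee the list above asks you to decide on up front.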

I’m sure there are others I’m forgetting. Even so, these are all solvable problems, and you should be able to create a solution that works for most people, most of the time (which is generally the best you can hope for in a framework). More to the point, Sun should have been able to find an agreeable solution for Java, rather than the bullpoop cop-out that they gave us.

So, if you are keeping tally (which I just decided to do), in the C# vs. Java battle, I now read the score as:

C#: 1 (observable collections)
Java: 1 (optional checked exceptions)
