LINQ’s query syntax was created to make me angry

Before you big C# fellaters out there get all pissed off, let me start by saying how cool some of the new C# features are. For those of you who aren’t tuned in to the Microsoft Kool-Aid bandwagon, C# is becoming, well, more Lispy (by which I mean it is getting language features that will let it sit at the Adult Table next Thanksgiving). These are features that C#, along with some other languages (Java, I’m looking at you, you red-headed stepchild of programming languages), has been needing.

Before I start this rant, I also wanted to share a quick hallway conversation today. A friend was complaining about how most frameworks and libraries let him down in terms of their ability to be reused, modified, and extended. I commented that I think the primary problem is the languages and paradigms we use, namely, when all your language really offers is Java-esque OOP, sucking in these areas is a terminal problem. I said that we should be using more languages like Lisp or Smalltalk or Ruby or Python or what-have-you. He reminded me of his oft-stated views, along the lines of, “I’m not just talking about syntax stuff.”

As much as I love this guy, his opinion is informed by never having used languages other than C, C++, VB, C#, and ASM (presumably Intel). Granted, this guy is smarter (and even more HR-inappropriate, imagine that) than I will ever be, but he still doesn’t understand the underlying issue: your solution to any given problem is set within, and thus limited by, the language you use to describe and solve it. This applies just as well to any scenario to which the term “language” applies: natural language, mathematics, the sciences, and programming languages.

For that reason, I take seriously the features a language offers in the arena of what most would call “syntactic sugar”, a term that is oft overused and more often misunderstood. For example, having lambda expressions (or whatever your language chooses to call them) is almost always more than just syntactic sugar.

Which is why I’m extremely pleased that the next version of C# will have, drum roll please, lambda expressions. Ooo, and not just your everyday, run of the mill lambda expressions. Along with this, we get implicitly typed variables (var), which can be used to reference anonymous types. If you look closely enough, you’ll realize that the new anonymous types are really just tuples and dictionaries (or hashes, pick your terminology), because they can be arbitrary collections of named properties.

All these features are exciting and offer more than just syntactic sugar over delegates. Now, don’t get me wrong, delegates were a step in the right direction (for those curious, I’m a bigger fan of delegates than anonymous classes, especially since the most common use of the anonymous class is really just to be a lambda expression). Unfortunately, up until this point, delegates have really just been used as a sort of function template for methods (either declared or anonymous delegates) that handle events.

Now, the one feature that I abhor with all my being is the new query syntax that LINQ has introduced. For those who haven’t seen it, you can write something like this:

var squares =
    from n in someListOfNumbers
    select n * n;

foreach (var i in squares)
    Console.WriteLine(i);

(YAY! Let’s add SQL to every language!!!!! AND PONIES TOO!!!!!)

Anyways.

What the compiler does with this strange syntax is one of two things. Either it compiles the query into code that applies the transformation directly (which is what happens in this example), or it hands you the syntax tree from the super-duuuper special LINQ parser thangy. For things like LINQ to SQL (or DLINQ or whatever it’s called), the translation layer gets the syntax tree and uses it to generate the SQL expression. For situations like this, where someListOfNumbers is literally an int[], this does just what you think it does: it prints out the squares of the numbers found in the array.

Now, it sure seems handy to be able to get the syntax tree for this little expression, as this perhaps brings us, finally, back to where we were a good 30 years ago (Lisp, anyone?). However, I have a few issues:

First, our language understands how to parse this strange and awkward sub-language of C#. We don’t know how to parse C# yet (except through reflection, which is about halfway there), and that would be the real carry-over.

Second, because this language is so awkward, C# now becomes an awful bitch to parse. In fact, we really now have multiple layers of language interaction. First, we have C#, and then we have this inner language, which, all things considered, should almost be a separate parser. Maybe you don’t care, but I think we’d all benefit just a little from languages that are a bit easier to parse by a machine (think: smarter tools).

Third, and most importantly, what will perhaps be the more common use of LINQ (i.e. anything other than LINQ to SQL), is just syntactic sugar.

In terms of C# code, this really parses down to something like:

var squares = someListOfNumbers.Select(n => n * n);

And other things, like “where”, or “group by”, or “orderby”, are just more of the same: functions that take lambda expressions, can be composed with each other (presumably by returning a similar type, like a ResultSet or something of that nature; for plain collections it is IEnumerable<T>), exist as extension methods (another new C# feature, where you can declare methods in your own classes that get added to existing types), and create anonymous types (really dictionaries or tuples or hashes).
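To make the "just more of the same" point concrete, here is a small sketch that writes one query twice, once in query syntax and once as chained method calls. The class and method names (QueryVsMethods, WithQuerySyntax, WithMethodCalls) are made up for illustration; Where, OrderByDescending, and Select are the standard System.Linq extension methods.

```csharp
using System;
using System.Linq;

public static class QueryVsMethods
{
    // Query syntax: squares of the odd numbers, largest first.
    public static int[] WithQuerySyntax(int[] numbers)
    {
        var query =
            from n in numbers
            where n % 2 == 1
            orderby n descending
            select n * n;
        return query.ToArray();
    }

    // The same query as plain method calls that take lambda expressions.
    public static int[] WithMethodCalls(int[] numbers)
    {
        return numbers
            .Where(n => n % 2 == 1)
            .OrderByDescending(n => n)
            .Select(n => n * n)
            .ToArray();
    }

    public static void Main()
    {
        var input = new[] { 1, 2, 3, 4, 5 };
        foreach (var i in WithQuerySyntax(input))
            Console.WriteLine(i); // prints 25, 9, 1
        foreach (var i in WithMethodCalls(input))
            Console.WriteLine(i); // prints 25, 9, 1
    }
}
```

Both versions return the same sequence, which is the whole complaint: the keywords buy you nothing the method calls didn't already give you.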

The real beauty is that, given what some people would call syntactic sugar, you can simply and quickly create very powerful abstractions.

The question becomes, can you still create LINQ to SQL without that super special syntax?

Well, I hope so. You see, if instead of making the super-dooooooooooooooooper special inner-language parse-able, you instead made lambda expressions parse-able and query-able (which they probably are anyways, since they get captured as a Func delegate type with some generic parameters [for return type and parameter types]), then you could ostensibly write something like:

var results = database.SomeTable.Select( blehbleh ).Where( blehblehbleh );

(spaces added so that your browser will wrap the line)

And since you are the one writing these methods, instead of making them apply the given lambda expressions, you can instead get some information about the expressions (i.e. store them), and then translate the whole thing to SQL, and apply the statement when you go to iterate over the results.
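Here is a rough sketch of that idea, assuming the method takes an Expression<Func<T, bool>> rather than a plain Func<T, bool>. With that parameter type the compiler hands you a tree describing the lambda instead of compiled code. Table<T>, Customer, and ToSql are hypothetical names invented for this example, the translator only understands == and > against constants, and the string quoting is not production-safe. It is nowhere near what LINQ to SQL actually does.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq.Expressions;

public class Customer
{
    public string City { get; set; }
    public int Age { get; set; }
}

// Hypothetical query object: it stores predicates instead of running them.
public sealed class Table<T>
{
    private readonly string _name;
    private readonly List<string> _predicates;

    public Table(string name) : this(name, new List<string>()) { }

    private Table(string name, List<string> predicates)
    {
        _name = name;
        _predicates = predicates;
    }

    // Expression<Func<...>> gives us the lambda as a syntax tree.
    public Table<T> Where(Expression<Func<T, bool>> predicate)
    {
        var next = new List<string>(_predicates) { Translate(predicate.Body) };
        return new Table<T>(_name, next);
    }

    // Nothing is translated into a full statement until we ask for it.
    public string ToSql()
    {
        var sql = "SELECT * FROM " + _name;
        if (_predicates.Count > 0)
            sql += " WHERE " + string.Join(" AND ", _predicates);
        return sql;
    }

    private static string Translate(Expression e)
    {
        switch (e)
        {
            case BinaryExpression b when b.NodeType == ExpressionType.Equal:
                return Translate(b.Left) + " = " + Translate(b.Right);
            case BinaryExpression b when b.NodeType == ExpressionType.GreaterThan:
                return Translate(b.Left) + " > " + Translate(b.Right);
            case MemberExpression m when m.Expression is ParameterExpression:
                return m.Member.Name; // c.City becomes the column name City
            case ConstantExpression c when c.Value is string s:
                return "'" + s.Replace("'", "''") + "'";
            case ConstantExpression c:
                return Convert.ToString(c.Value, CultureInfo.InvariantCulture);
            default:
                throw new NotSupportedException("Cannot translate: " + e);
        }
    }
}

public static class SqlDemo
{
    public static string Build()
    {
        return new Table<Customer>("Customers")
            .Where(c => c.City == "London")
            .Where(c => c.Age > 30)
            .ToSql();
    }

    public static void Main()
    {
        // prints: SELECT * FROM Customers WHERE City = 'London' AND Age > 30
        Console.WriteLine(Build());
    }
}
```

No query keywords anywhere, and the chained calls still end up as one SQL string.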

The truth is that I don’t see the gain from having SQL directly in C# or VB. Now, moving away from your comfortable SQL syntax might make some developers unhappy, but it really just shows you how incredibly powerful (and, I would argue, useful) good language features can be. The combination of these various improvements allows you to write stuff like LINQ to SQL (minus the dumb query syntax). In fact, I imagine that the astute reader could go write something similar to the non-LINQ to SQL stuff themselves. Seriously. Given all these language features, I’m guessing anyone could implement a simple version of all this (say, Where(), OrderBy(), GroupBy(), and Select(), assuming that Select() is the thing that creates the result) in a day minimum, week maximum.
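To back up the "anyone could write this" claim for the in-memory case, here is a minimal sketch of two of those operators as extension methods over IEnumerable<T>. The names MyWhere and MySelect are invented here so they don't collide with the real System.Linq methods, and this skips the argument checking a real library would do.

```csharp
using System;
using System.Collections.Generic;

public static class MiniLinq
{
    // Lazily yields only the items for which the predicate returns true.
    public static IEnumerable<T> MyWhere<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (var item in source)
        {
            if (predicate(item))
                yield return item;
        }
    }

    // Lazily yields the result of applying the selector to each item.
    public static IEnumerable<TResult> MySelect<T, TResult>(
        this IEnumerable<T> source, Func<T, TResult> selector)
    {
        foreach (var item in source)
            yield return selector(item);
    }
}

public static class MiniLinqDemo
{
    public static List<int> EvenSquares(IEnumerable<int> numbers)
    {
        // Because both methods return IEnumerable<T>, they chain.
        return new List<int>(
            numbers.MyWhere(n => n % 2 == 0).MySelect(n => n * n));
    }

    public static void Main()
    {
        foreach (var i in EvenSquares(new[] { 1, 2, 3, 4 }))
            Console.WriteLine(i); // prints 4, then 16
    }
}
```

That is two operators in about twenty lines. OrderBy() and GroupBy() take more work, but not a week's worth.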

So there you have it. As Scott Guthrie (ScottGu) notes in the first of his articles about LINQ to SQL, these features are all officially named:

  • Automatic Properties, Object Initializers and Collection Initializers (which translates to dictionaries as objects, named properties on object initialization, and first-class collection initialization [or, in other words, syntactic shortcuts for doing really common things, like creating a collection from a { } list, or creating an object by "new"ing the object and listing out the values for the properties as Name = Value pairs])
  • Extension Methods (the ability to place methods on already defined classes, even primitive types like strings)
  • Lambda Expressions
  • Query Syntax (the only bad apple of the lot)
  • Anonymous Types (a combination of having an implicitly typed var reference and being able to generate types, on the fly, via the semantics afforded by the first bullet point)
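For anyone who hasn't seen the four good features yet, here is a quick tour in one small sketch. Person, Shout, and FeatureTour are names made up for this example; nothing here comes from ScottGu's articles.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    // Automatic properties: no hand-written backing fields.
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class StringExtensions
{
    // Extension method: callable as if it were declared on string.
    public static string Shout(this string s)
    {
        return s.ToUpperInvariant() + "!";
    }
}

public static class FeatureTour
{
    public static string Describe()
    {
        // Collection initializer wrapping object initializers.
        var people = new List<Person>
        {
            new Person { Name = "Ada", Age = 36 },
            new Person { Name = "Alan", Age = 41 }
        };

        // Anonymous type, referenced through an implicitly typed var.
        var summary = new { Count = people.Count, First = people[0].Name };

        return summary.First.Shout() + " " + summary.Count;
    }

    public static void Main()
    {
        Console.WriteLine(Describe()); // prints: ADA! 2
    }
}
```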

So there you have it. The Query Syntax was, by far, the dumbest possible thing to put in a language. It’s syntactic sugar that, I would guess, would double the time it would take to write a fully-capable AND not horribly-designed parser for C# (read: your parser shouldn’t have a special line that says “if this line contains ‘=’ then ‘from’ and some stuff then ‘select’ and some stuff”). It’s almost like the C# language people got the point for 4 of the 5, and then, on the fifth, decided to implement somebody’s badly thought out wet-dream.

The end result? We all suffer. I’m sure the Mono people are a bit unhappy, and those of us acquainted with real programming languages just stand by and say, “Huh? Did the C# team pass around the bud one too many times?”

Conclusion – lambda = awesome, having a special sublanguage that bears no resemblance to the rest of your language = dropped on head as child. Lambda good, query syntax bad.

[[Edit: I found a copy of the new C# Language Specification, which shows examples like:

The example

from Customer c in customers
where c.City == "London"
select c

is translated into

from c in customers.Cast<Customer>()
where c.City == "London"
select c

the final translation of which is

customers.
Cast<Customer>().
Where(c => c.City == "London")

I also happened upon a video of Anders (recorded in Jan 07) here, where he talks about functional programming, full of typical strange Anders gesticulating (the Anders floppy hands). Interestingly, he uses the above code as an example (around 10:20 in the video, listen closely). Not as interestingly, he does a rather crappy job of explaining what a lambda expression is (he takes the typical C-ish approach: "Well, this is how it's implemented" type thing). Also, be sure not to mistake Anders' "composability" for anything resembling composability in the language, as he is really just talking about LINQ's composability, thanks to the age-old concept of having methods that operate on a certain type of object and return references to that same type of object, so you can chain things together. Also, he is referring to the fact that LINQ to SQL, under the covers, is lazily evaluated.

Oh, and kudos to Anders for mentioning relational algebra.]]

  • Query syntax looks moronic. Nice Thanksgiving metaphor there at the beginning.