I really enjoy the concept behind our DBSI (Database Systems Implementation) project: we are to take an existing, open-source database application (such as tinySQL, SQLite, hsqldb, and many others) and add three distinct features to it. The features are meant to show our understanding of the topics we learn in class. This is a wonderful idea, but it has a few (unintended?) side effects.
One of the first things you notice about the products we are choosing from is that each one is horribly deficient in at least one important area of database implementation. The databases written in Java are either badly designed or pay no attention to how the hardware is used. Most of these simpler databases are missing large chunks of SQL functionality or commonly used database features, most notably stored procedures and triggers. Many have only minimal support for indexing, and most have little or no documentation.
None of this is very surprising, and these databases are all very useful in their own right, but the problem is this: every single database breaks rules that we are taught are important to creating a well-functioning database. These people, who spend years writing their own database systems, can’t even implement the “minimal” features required for a relational database to be functional in the “common” areas.
What we’ve really discovered in our project is that it is way too hard to implement the minimal set of features. With the database we are working on (Axion, from tigris.org), we are attempting to add a new indexing structure, create a client/server architecture, and improve the file storage mechanism to be more efficient. You know what we have found? It is much harder than it looks in theory.
Part of the difficulty stems from trying to understand an existing system with very little documentation. The rest of the difficulty is just mapping the math to, in our case, Java. To be efficient, a database engine needs, at some level, an understanding of how the hardware operates and the ability to manipulate that hardware at a very low level. In this case, that kind of ability runs almost contrary to the point of Java (platform independence), and I assume it is almost easier to just ignore the efficiency concerns.
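To give a rough idea of what "paying attention to the hardware" can look like even from Java, here is a minimal sketch of page-oriented file access using the standard `java.nio` `FileChannel` API. This is not Axion's code or anything from our project; the `PageFile` class, its method names, and the 4096-byte page size are all assumptions made up for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: reads and writes a file in fixed-size pages so that
// every I/O request starts on a page boundary.
final class PageFile implements AutoCloseable {
    // Assumed page size; 4096 bytes is a common OS/disk block size.
    static final int PAGE_SIZE = 4096;

    private final FileChannel channel;

    PageFile(Path path) throws IOException {
        this.channel = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE);
    }

    // Writes one full page at offset pageNo * PAGE_SIZE, zero-padding the data.
    void writePage(long pageNo, byte[] data) throws IOException {
        if (data.length > PAGE_SIZE) {
            throw new IllegalArgumentException("data larger than one page");
        }
        ByteBuffer buf = ByteBuffer.allocate(PAGE_SIZE); // zero-filled
        buf.put(data);
        buf.rewind(); // position back to 0; limit stays at PAGE_SIZE
        long pos = pageNo * PAGE_SIZE;
        while (buf.hasRemaining()) {
            pos += channel.write(buf, pos);
        }
    }

    // Reads one full page; bytes past the end of the file are returned as zeros.
    byte[] readPage(long pageNo) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(PAGE_SIZE);
        long pos = pageNo * PAGE_SIZE;
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos);
            if (n < 0) {
                break; // reached end of file
            }
            pos += n;
        }
        return buf.array();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Something like `pageFile.writePage(2, bytes)` followed by `pageFile.readPage(2)` would round-trip a record through the third page of the file. Even a sketch this small hints at the tension: Java lets you pick a page size, but it hides most of what the disk and the operating system actually do with it.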
So the end result is that anybody who thought database implementation was simple, and wanted to work on databases, has had their hopes dashed. Personally, I was extremely excited about this project last quarter, when I wanted to add functionality to SQLite, which is a relatively simple, compact, and efficient database engine library. Like every other group, though, we ended up working on YAJD (Yet Another Java Database), a veritable monstrosity of classes, and it has left a pretty bad taste in my mouth for database implementation.