Vladimir Sedach

Have Emacs - Will Hack

August 24, 2009

The right way to use ORM: don't

Topic: Software engineering

A couple of days ago I came across Ted Neward's essay on the problems with object-relational mapping schemes, "The Vietnam of Computer Science". A really good examination of the problems with ORM, the essay wraps up with a list of six possible approaches to the problem of working with relational databases.

The database and the object-oriented application querying the database are clearly two different systems with two different domain models (SQL, and the model of the programming language the application is written in, respectively). From the perspective of domain-driven design, they represent two bounded contexts that are typically bridged using the repository pattern.

Barring the trivial ways out of the problem (using an OO database, hacking around the ORM, or "abandonment", which seems to be a nonsensical proposition given that the application language will need to have some kind of model different from SQL), there are three interesting alternatives: manual ORM, embedded relational DSLs, or building on your language's object model with relational-friendly classes.

I don't think embedded relational DSLs such as LINQ help solve the object-relational model mismatch in any but the most trivial ways. They are really just macros providing syntactic sugar for database access; all the domain mismatch problems are still left unsolved.

The manual ORM and object model extension methods have proven fruitful in my experience.

When I came on board the project, Matchcraft's web directory product was already using sql2java to generate abstract Java classes from a SQL schema hand-constructed from the domain model; those classes were then subclassed and given domain logic. This approach worked well: the database schema and the object model were clean, and sql2java's code generation provided efficient, cruft-free boilerplate, all without an ORM.

The only problem was that some behaviors were implemented with hand-written SQL queries right inside the derived classes. The affected parts of the code were difficult to modify and unit test. By refactoring the system to use the repository pattern, I was able to make those problems go away, while still preserving all the benefits of the "sql2java and some handwritten queries" approach that yielded both a clean database and a clean object model.
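To make the shape of that refactoring concrete, here is a minimal sketch in Python with SQLite rather than the Java and sql2java of the actual project. The `Listing` class, the `listing` table, and `ListingRepository` are hypothetical names invented for illustration; the only point is that the hand-written SQL moves out of the domain class and into one object that can be swapped out in tests.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass
class Listing:
    """Hypothetical domain object; stands in for a subclass of generated code."""
    id: int
    name: str
    city: str

    def display_name(self) -> str:
        # Domain logic lives here and needs no database to be tested.
        return f"{self.name} ({self.city})"


class ListingRepository:
    """Owns every hand-written query that touches the listing table."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def add(self, listing: Listing) -> None:
        self.conn.execute(
            "INSERT INTO listing (id, name, city) VALUES (?, ?, ?)",
            (listing.id, listing.name, listing.city),
        )

    def find_by_city(self, city: str) -> List[Listing]:
        rows = self.conn.execute(
            "SELECT id, name, city FROM listing WHERE city = ? ORDER BY name",
            (city,),
        )
        return [Listing(*row) for row in rows]


# Usage: the schema is written by hand, as in the approach described above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE listing (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
)
repo = ListingRepository(conn)
repo.add(Listing(1, "Zed's Diner", "Calgary"))
repo.add(Listing(2, "Acme Plumbing", "Calgary"))
repo.add(Listing(3, "Bob's Books", "Edmonton"))
calgary = repo.find_by_city("Calgary")
```

Code that depends on `ListingRepository` can be unit tested against an in-memory database, as here, or against a stub with the same two methods, while `Listing` itself never sees a connection.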

On another project (this one to build a flight reservation system for small tour and air charter operators, written in Common Lisp), I went the other way with code generation. I wrote a set of macros that took domain object definitions and generated a SQL schema and CLOS objects describing those domain objects. The object instances were just lists of fields as returned by CLSQL (with type conversions done automatically based on information in the CLOS objects). The behaviors were based on a mix of multimethods specialized on the CLOS objects as well as ones executing hand-written SQL queries directly.
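The idea can be sketched outside of Lisp as well. The following is a rough Python approximation, not the original macros: a single domain definition drives both the generated SQL schema and the type conversion of plain rows coming back from the database. The `Entity` class, the `TYPES` table, and the `flight` definition are all assumptions made up for this example, and the multimethod-based behaviors are left out.

```python
import sqlite3
from datetime import date

# Hypothetical type table: domain type -> (SQL column type, converter for stored values).
TYPES = {
    "integer": ("INTEGER", int),
    "text": ("TEXT", str),
    "date": ("TEXT", date.fromisoformat),
}


class Entity:
    """Describes a domain object; instances of the object stay plain lists."""

    def __init__(self, table, fields):
        self.table = table
        self.fields = fields  # list of (column name, domain type) pairs

    def create_table_sql(self):
        # Generate the schema from the domain definition.
        cols = ", ".join(f"{name} {TYPES[kind][0]}" for name, kind in self.fields)
        return f"CREATE TABLE {self.table} ({cols})"

    def convert(self, row):
        # Apply the type conversion for each field to a raw database row.
        return [TYPES[kind][1](value) for (_, kind), value in zip(self.fields, row)]


FLIGHT = Entity("flight", [("id", "integer"), ("origin", "text"), ("departs_on", "date")])

conn = sqlite3.connect(":memory:")
conn.execute(FLIGHT.create_table_sql())
conn.execute("INSERT INTO flight VALUES (1, 'YYC', '2009-08-24')")

# Hand-written SQL runs directly; the definition only handles conversion.
raw = conn.execute("SELECT id, origin, departs_on FROM flight").fetchone()
flight = FLIGHT.convert(raw)
```

Because the schema is generated from a plain description of the domain rather than from the language's object system, the resulting tables carry no ORM bookkeeping columns and stay readable to any other system that queries them.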

The above approach combined aspects of both the manual ORM and object model extension methods. While CLOS does come with an extensible meta-object protocol (and CLSQL already comes with a CLOS ORM based on the MOP), I wanted to prioritize a clean database schema above ease of integration with Lisp (as should any project where a SQL database will be queried by more than one system). While Common Lisp's macros and multimethods made this approach extremely fast and easy, it can also be used with other languages.