Too much abstraction: Doctrine and PHP


Personally — I feel like I'm really, really late to get on the Doctrine train.

I tried Doctrine 1 a few years back and found it mostly too complicated for my needs. To this day, I find PEAR::DB and PEAR::MDB2 the perfect database abstraction layers (DBAL), though of course they may not be fully up to date with current PHP. The overall concept of an ORM never stuck with me.

Of course Doctrine has a DBAL too and it would be unfair to compare its ORM to another DBAL, but it seems like almost no one uses the Doctrine DBAL by itself. I haven't either.

The primary use-case for Doctrine seems to be ORM and here is what I think about it.

Little steps

RTFM is especially hard with Doctrine: Google usually points to older documentation, there are a lot of outdated links, and the classic has not been written yet.

Where there is documentation, I compare the experience to RPM (Red Hat Package Manager):

  • I need to learn about X
  • But have to read about Y first.
  • To understand Y I need to dive into Z.

Sound familiar? Dependency resolution in documentation. Of course, there is always light at the end of the tunnel: the moment it all works, when I have conquered a problem and defeated the proxies and caches. ;-)

Entities

Entities can be incredibly simple — take the persistent object:

<?php
use Doctrine\Common\Persistence\PersistentObject;

/**
 * @Entity
 * @Table(
 *   name="db.users"
 * )
 */
class Row extends PersistentObject
{
  /**
   * @Id
   * @Column(type="integer")
   * @GeneratedValue
   */ 
  protected $id;
}

So what does this look like:

  • database "db"
  • table: "users"
  • columns: "id" (most likely integer and auto_increment with MySQL)

I feel like that feature is a hidden gem because I could not find much about it in the official documentation, but that's basically all you have to do.

Annotations are not my thing, and neither are XML or YAML, though I can live with any of the three. What seemed annoying is that people usually start off by writing half a dozen getFoo() and setFoo() methods, the exact number depending on how many columns the table has. With the persistent object, this is not necessary.
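To make the point about skipping getFoo() and setFoo() concrete, here is a minimal sketch of how a base class can answer such calls through PHP's magic __call() method. As far as I understand, the persistent object works along similar lines, but this is a simplified illustration and not Doctrine's code; MagicAccessors and UserRow are names made up for the example.

```php
<?php
// Simplified illustration only, NOT Doctrine's actual PersistentObject code.
// MagicAccessors and UserRow are hypothetical names.
abstract class MagicAccessors
{
    // Route undefined getFoo() / setFoo($value) calls to the property $foo.
    public function __call($method, array $args)
    {
        $prefix = substr($method, 0, 3);
        $field  = lcfirst((string) substr($method, 3));

        if ($field === '' || !property_exists($this, $field)) {
            throw new BadMethodCallException("No such method: $method()");
        }
        if ($prefix === 'get') {
            return $this->$field;
        }
        if ($prefix === 'set' && count($args) === 1) {
            $this->$field = $args[0];
            return $this; // allow chaining
        }
        throw new BadMethodCallException("No such method: $method()");
    }
}

class UserRow extends MagicAccessors
{
    protected $id;
    protected $email;
}

$row = new UserRow();
$row->setEmail('someone@example.org');
echo $row->getEmail(), PHP_EOL;
```

The trade-off is that these methods do not exist as far as an IDE or static analysis is concerned, so typos only surface at run-time.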

Abstraction

Abstraction is Doctrine's biggest strength and largest drawback in my opinion. Abstraction is nice within reason, but I think with Doctrine it's way over the top.

My personal opinion on SQL is that SELECT * FROM table WHERE id = 1 never hurt anyone. That is not to suggest doing this in the wrong layer of your application, but actually understanding the tools I work with has made me a better developer. A JOIN does not hurt either, because then you and I know how to write one and what it costs to execute the sucker. LEFT JOIN, STRAIGHT_JOIN and INNER JOIN: most developers don't know the difference. Doctrine also likes cascades, but I bet not many people have ever watched a busy database server trying to process all of them while a couple of thousand writes happen at the same time.

With Doctrine, SQL is so far away that developers really don't need to have any idea of what is going on, and that in particular is what I think is bad.
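For readers unsure about the join types, here is a small sketch of the difference between INNER JOIN and LEFT JOIN. It assumes the pdo_sqlite extension and uses an in-memory database with two made-up tables (users and orders) that have nothing to do with the project discussed later. MySQL's STRAIGHT_JOIN is left out because it is a MySQL-specific way of forcing the join order and SQLite has no equivalent keyword.

```php
<?php
// Hypothetical two-table schema in an in-memory SQLite database (needs pdo_sqlite).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)');
$pdo->exec('CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)');
$pdo->exec("INSERT INTO users (id, email) VALUES (1, 'a@example.org')");
$pdo->exec("INSERT INTO users (id, email) VALUES (2, 'b@example.org')");
$pdo->exec('INSERT INTO orders (id, user_id) VALUES (10, 1)');

// INNER JOIN: only users that have at least one matching order.
$inner = $pdo->query(
    'SELECT u.email, o.id AS order_id
       FROM users u
      INNER JOIN orders o ON o.user_id = u.id
      ORDER BY u.id'
)->fetchAll(PDO::FETCH_ASSOC);

// LEFT JOIN: every user; the order columns are NULL where nothing matches.
$left = $pdo->query(
    'SELECT u.email, o.id AS order_id
       FROM users u
       LEFT JOIN orders o ON o.user_id = u.id
      ORDER BY u.id'
)->fetchAll(PDO::FETCH_ASSOC);

printf("INNER JOIN: %d row(s), LEFT JOIN: %d row(s)\n", count($inner), count($left));
```

The user without an order disappears from the INNER JOIN result but shows up in the LEFT JOIN result with a NULL order_id.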

My problem is that most of the time, the developer does not see their SQL at all and also doesn't analyze what it is doing. I have yet to figure out how to make the application log all statements produced. It even looks impossible at first, because with prepared statements Doctrine itself never assembles the final SQL: it hands the statement with its placeholders and the values to the driver separately, and they typically only come together further down, in the driver or on the server. But sure, prepared statements are pretty awesome because we are all idiots and don't know how to escape data. But I should leave this for another blog post.

About abstraction: we had a similar discussion at this month's PHP Usergroup in Berlin, and usually someone will counter my argument by saying that the benefits of faster development outweigh all the potential screw-ups. And maybe they are right, for a while. And then they upgrade their database server to have more RAM, right? What if "for a while" is already past tense?
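Even with prepared statements, the statement template and its parameters can be captured before they are handed to the driver. As far as I know, Doctrine DBAL 2.x offers a hook for this via setSQLLogger() on the connection's configuration, with an EchoSQLLogger bundled, though I would check that against the installed version. The sketch below shows the same idea without Doctrine: LoggingConnection is a hypothetical helper wrapped around plain PDO, and it assumes the pdo_sqlite extension.

```php
<?php
// Hypothetical helper, not part of Doctrine or PDO: wraps a PDO handle and
// records each statement template together with its bound parameters.
final class LoggingConnection
{
    private $pdo;
    private $log = array();

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    // Prepare and execute $sql, remembering what was sent to the driver.
    public function run($sql, array $params = array())
    {
        $this->log[] = array('sql' => $sql, 'params' => $params);
        $stmt = $this->pdo->prepare($sql);
        $stmt->execute($params);
        return $stmt;
    }

    public function getLog()
    {
        return $this->log;
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db = new LoggingConnection($pdo);
$db->run('CREATE TABLE user (id INTEGER PRIMARY KEY, email TEXT, site_id INTEGER)');
$db->run(
    'INSERT INTO user (email, site_id) VALUES (:email, :site_id)',
    array('email' => 'a@example.org', 'site_id' => 1)
);
$db->run('SELECT * FROM user WHERE id = :id', array('id' => 1));

// Print every statement with its parameters, one per line.
foreach ($db->getLog() as $entry) {
    echo $entry['sql'], ' ', json_encode($entry['params']), PHP_EOL;
}
```

The log holds placeholders and values side by side rather than one finished SQL string, which is still enough to paste into EXPLAIN by hand.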

Example

I just found the following code in one of our projects and I think it's a good example of what I mean.

The basics (broken down):

  • SQL:
    • a user table
    • columns: id, email, password, site_id
    • indices: PRIMARY (on id), UNIQUE (on email and site_id)
  • an entity Doctrine\Model\User class describing the above
  • a repository Doctrine\Model\UserRepository class to add convenience calls to the entity

So here is what I found:

<?php
// in class UserRepository
public function getUserById($id, $site_id = 1)
{
    $qb = $this->_em->createQueryBuilder();
    $qb->select(array('u'))
        ->from('Doctrine\Model\User', 'u')
        ->where('u.id = :id')
        ->andWhere('u.site_id = :site_id')
        ->setParameter('id', $id)
        ->setParameter('site_id', $site_id);

        /* ... */
}

And in defense of all the people who worked on this (including me), it takes a while to see. :-)

Question to my reader: Take the setup into account, do you see the issue? Leave a comment if you figure out what the problem is.

So despite having an entity which defines the table setup, it's still possible to do something like that, and there is no error or warning whatsoever to let the developer know what they are doing. Not at run-time and not in the Doctrine CLI either. I'd argue that with the level of abstraction Doctrine adds to hide SQL from the developer, I would expect it to do enough analysis of what the developer is doing.

Or maybe the level of abstraction added here is actually counter-productive?

Fin

Food for thought. Hope it was not too ranty.

Disclaimer: This is with no offense meant to the people contributing to Doctrine or using Doctrine. I just had to raise my concerns somewhere.
