How to migrate to PDO without it hurting… much

As we saw in the previous article, conversion to MySQLi is an awful lot of work. So let’s move to PDO.

Step 0. Get PDO working with your database server

Somewhere along the line, the PHP and MySQL folks decided to not be friends, so even though 99.99% of all PHP scripts require MySQL, in most cases, PDO doesn’t have MySQL support enabled.

Check that PHP doesn’t already have PDO for MySQL ready to go:

% php -i | grep -i pdo
Configure Command =>  '[...snip...]'
PDO
PDO support => enabled
PDO drivers => mysql, sqlite, sqlite2
pdo_mysql
PDO Driver for MySQL => enabled
pdo_mysql.cache_size => 2000 => 2000
pdo_mysql.default_socket => /tmp/mysql.sock => /tmp/mysql.sock
pdo_sqlite
PDO Driver for SQLite 3.x => enabled
The default socket location is fine for a development workstation, but not so good for a production host. Change php.ini and my.cnf to make it safer, such as /var/run/mysql/mysql.sock
If your PHP installation doesn’t show the above, whip out the compiler – there’s plenty of articles on the Interweb on how to get PDO and MySQL going. For now, let’s assume PDO and MySQL are good to go.
Step 1. Conversion
As in our previous article, it’s very tempting to just go through each file and make a 1:1 change from MySQL to PDO. That is actually a losing scenario.
  1. My non-trivial app has 1853 SQL queries littered through the code. At about 15 minutes to 30 minutes per query, with no test case, that’s about a year’s work with no change to the overall complexity of the app nor any improvements to the overall data architecture. Changing from one to the other is sure to introduce bugs.
  2. GaiaBB only has 13 tables. Using a DAO approach, with a list and search helper method for each (i.e forum list, search forum), that’s 6 * 13 = 78 properly written, tested DAO methods that need to be written from scratch. This is a saving of over 10 months work compared to just getting in there and getting my hands dirty.
  3. I can add security business requirements once to the DAO, such as fine grained access control, input validation, audit, and dirty output sanitization, thus fixing all my access control issues and dirty data issues in the same small piece of code. So each file that is converted to DAO gets more secure with no additional coding

You’re probably wondering about “dirty output sanitization”. Sanitization is for the weak minded! No seriously, my database is 7 years old. There’s crud in there – I can’t trust it to be safe or good. There’s some nice tools out there like Arshan’s Scrubbr to scan and clean a database, but Scrubbr is not going to be able to decode BBCode and check to see if there’s a nasty [color] based XSS or CSRF attack in there. Additionally, some versions of my software and some buggy versions of MySQL == binary blob crappiness. Blobs coming out sometimes (but not always) need to be stripslashed(). Additionally, some versions of XMB used client side checks to prevent files of the wrong type to be uploaded. I’ve found a JavaScript malicious file in my production database. So it’s worth adding checks so that I don’t serve bad things to folks. You can’t do this if you have 100+ attachment related SELECT statements alone, not without lots of bugs and lots of code. Going the DAO way, means I can do it once, for every single use of the attachment sub-system.

There’s gotta be a downside.

It’s not all sunshine and roses. You should not open both a PDO and old style connection to the database for busy servers like mine – it more than doubles the time to serve the average script, the poor little database server gets whacked, and performance will drop.

In GaiaBB, there’s a significant 7240 lines of shared code read in every single time a script does anything – header.php, functions.php, db.php, validation.php, constants.php, config.php, cache.php, english.lang.php, mail.class.php. So to only open ONE database connection requires these files to be ported to PDO first. But if I convert these files, every remaining single file totalling about 80,000 lines of PHP will also need to be converted.

So my solution for GaiaBB is to create a side-by-side approach, using kernel.php for converted files instead of header.php. kernel.php includes PDO ready files rather than the old files. This temporarily makes GaiaBB approach the 90,000 line mark, but in the long run, it will allow me to make many of the scripts smaller and more object orientated. Once converted to DAO, I can simply eliminate many security checks within the code of each page, as the DAO will simply throw an exception if you’re not privileged to do something with the data you’re trying to work with.

So my main piece of advice for you contemplating converting to PDO is to consider your data architecture. The old MySQL way is awful, buggy and insecure. Let’s exterminate it, but instead of standing still for months at a time, add in freebies like access control, audit, input validation and output sanity checking.

Comments

One response to “How to migrate to PDO without it hurting… much”

  1. Benoit Avatar
    Benoit

    Great post, and great aim to rewrite your app.

    As of sanityzing output, once we had to do it for a really old CMS app, started in 99 with ASP/access then migrated to PHP in 2003. In 2008 we had to integarate the front office part to support IE7/8

    The HTML contents in database got a low risks of injection (no public comments or such), but it was so ugly generated by poor WYSIWYG editor, has MsWord metadata in it.As a consequence IE7 was crashing some times (really) and other browser mostly stop rendering content seeing as much of MS Word crappy css and proprietary tags.

    We used HTML purifier to clean up this, but after some testing it was too slow to run on the fly.
    While refactoring the app we made a data sanitization script to clean up the markup stored in DB, and we adjusted the back office app to do the same before saving data.

    This way we can trutly pull HTML content from database.

    All this to say, HTMLPurifier helped us cleaning up empty or invalid HTML tags, but it can also (and apparently does it very well) filter XSS attempts and various bunch of malicious code.

Leave a Reply

Your email address will not be published. Required fields are marked *