Skip navigation.

Squeezing code with xdebug

Every time we prepare to ship a new major version of phpLens (eg. 1.0, 2.0, 3.0 and soon 4.0), the PHP forms generator and app server we sell, we take pride in speeding things up. This is the time we all go into intensive code optimization sessions to find all the slow pieces of gunk we accidentally left in the code.

Unfortunately, things are not so easy. After so many versions, it gets progressively harder to find things to speed up. And even if we speed up one function by 30%, the fact that the function only consitutes perhaps 1% of the execution time means that we have speeded everything by 0.3%.

In our frustration, because we couldn't find much to optimize, we downloaded a fresh copy of xdebug, Derick Rethans code profiling module. Now the last time we tried xdebug, it was a total failure. There were a lot of bugs in xdebug 1.2 related to profiling with classes. To my pleasant surprise, the latest version of xdebug worked! A thousand thanks to Robert Beenen for submitting a patch.

Using xdebug

Here's some sample code we used for testing (after installing the extension and modifying php.ini), that outputs the profile data at the end of the page:

$xd = extension_loaded('xdebug');
if ($xd) xdebug_start_profiling();
ob_start();
include('testcode.php');
ob_end_flush();
if ($xd) xdebug_dump_function_profile(4); # 4=sort by total exec time

If you've ever been in a lecture where the teacher spent the whole session writing meaningless numbers of the blackboard, you will know how we felt. At first the profiling statistics were a numbing. Xdebug generates a list of function calls and their execution time. It looks like this:

Function Summary Profile (sorted by total execution time)
Total Time TakenAvg. Time TakenNumber of CallsFunction Name
7.56980705267.56980705261{include}
5.13778710375.13778710371*showphplens
3.91706514363.91706514361*phplens2->render
1.94814682011.94814682011{include_once}
1.88123989111.88123989111*phplens2->lensinit
1.46986889840.48995629953*adodb_mysql->cacheexecute
1.39454507831.39454507831{require_once}
1.18921899801.18921899801*phplens2->renderview
1.17148423190.29287105804*adodb_mysql->execute
1.15885615350.28971403844*adodb_mysql->_execute
1.15715098380.28928774604*adodb_mysql->_query
1.15705275540.28926318884mysql_query
1.06500005721.06500005721*phplenssession->setlens
1.01941585541.01941585541*phplenssession->get
0.70984005930.70984005931*phplens2->_joinfields

The problem with profiling is that you need an intimate feel of the code to make sense of it. You need to understand how the functions relate together, and whether the numbers make any sense. Why were there 4 function calls to mysql_query? How does that correlate with the higher level functions such as adodb_mysql->execute() and adodb_mysql->cacheexecute()?

After going through these figures, we realised that one of the initialization functions was being called several times. Sure enough the function was being accidentally called in several places. Totally unnecessary - and totally harmless - but definitely slowing down the application.

Supplementary optimization tips

The above analysis boosted our confidence in xdebug, and we continued searching, using the xdebug output as a guide. We had already applied most of the obvious optimizations mentioned in Optimizing PHP, so the following advice is supplementary:

  1. You need to know what you know. This may sound obvious, but keep an older version of your code, so you can compare timings.

  2. Measuring small things is hard. Page execution is so fast, typically less than 0.1 seconds, that any other background tasks can skew the results by up to 40%. We found that we had to run the same benchmark at least 10 times, and use the lowest time as the measure.

  3. Switch statements, like the opposite sex, are finicky. If you have very long lists of cases in your switch statements, place the most common cases at the top. Make sure all uncommon cases are at the bottom. This speeded up our date formating code by 30%.

  4. Use output buffering to concatenate large numbers of small strings. The most expensive part of combining strings is memory allocation, and output buffering reduces memory calls by allocating a huge 40K buffer initially (with 10K increments).
    ob_start();
    for ($i=0; $i < 10000; $i++) echo $i;
    $s = ob_get_contents();
    ob_end_flush();
    

    So the above is faster than:

    $s = '';
    for ($i=0; $i < 10000; $i++) $s .= $i;
    

    Doing this boosted the main program loop by 5%. Not that much, but relatively a lot when you are squeezing performance from a dry sponge.

  5. HTML and PHP is all about string manipulation. When using regular expressions for string manipulation don't forget about preg_replace_callback() and the /e modifier. Both allow you to call PHP code within a regular expression parser. Sometimes this is the fastest way of doing sophisticated string replacements such as "Find me all strings that match {pattern} and upper case all characters in pattern".

  6. One common operation that is tunable is the checking for zero-string lengths. This is very useful for table cell checking, because empty table cells look really yucky without &nbsp;. One fast way of checking zero-length strings is:
    if (empty($s) && strlen($s) == 0) IsEmpty($s);
    

    Paradoxically, this is faster than

    if (strlen($s) == 0) IsEmpty($s);
    

    because empty() is a language construct built into the Zend engine, while strlen is implemented as a standard extension function. The improved version still needs to call strlen() because empty() returns true when the string is "0" - nobody's perfect.

    The following elegant code might also work:

    if (!isset($s{0})) IsEmpty($s);
    

    But it is not portable, as some versions of PHP will emit a warning, complaining about the zero-offset.

    This optimization speeded up our cell formating functions by 10%. Does it sound silly - yes. It would be nice if some of the more frequently used standard library functions were moved into the Zend Engine.

    Update: Fixed a typo in the code examples. Also for this to work there is a presumption that most strings are non-empty.

  7. One of the things about highly object-oriented designs is that you end up with a lot of methods in your classes because you want to handle every situation extensibly. However calling functions in PHP is expensive. For performance reasons, you might want to reduce the number of functions (unroll them), and also reduce the number of parameters you pass into functions. Instead of passing 15 parameters, pass in an array (by reference of course), or use a global variable to hold the array. Or combine 10 functions into 1 big ugly fat but fast procedure.

  8. The toughest part of this tuning exercise is ensuring that the optimizations are idempotent. In other words, the code works the same way after the tuning. You can test, but you might miss out something. I still find the best way is to use source-code control, do a diff, and scrutinize the before and after changes carefully.

  9. Lastly, xdebug doesn't work well with Zend Optimizer and other extensions that manipulate PHP internals. Turn them off.

We finally achieved an average performance improvement of 15% by applying this and similar techniques to over 50 functions. And when the weather is fine and the conditions are favourable, a 30% increase in speed. I admit that I'm not so cheery after such a complicated exercise - in fact I'm totally fagged out. No more code-squeezing for a long time.

Update: Xdebug Prestentation from LinuxTag 2004. Requires Mozilla. Use left/right arrow keys to scroll slides. I had to read the javascript source to figure that out :-)

Comment viewing options

Select your preferred way to display the comments and click 'Save settings' to activate your changes.

hot sitehot sitehot site

nice

nice blog

nice blog