
PHP Everywhere Home


2003

 
Scripting in .NET is bullshit?
So what is the problem? Where is the procedural .NET!?! Where is the scripter's version of .NET!?! My colleagues are giving up on Microsoft after years of effort to consolidate on the MS platform. They do not want to be G. Andrew Duthie or Jesse Liberty. They want to create a few HTML pages with forms and write some procedural script on a server to process the data.

What I am starting to find are a number of PHP, Perl, and MySQL books appearing on the shelves of my coworkers. And they are not looking at Win2K or Win2003 as the platform or IIS as the server.

And please let's not waste time with the argument 'But you can basically script in .NET.' BS is all I can say to such a statement. -- Danny Boyd

Also see Richard Tallent's response.


Merry Christmas and a Happy New Year
At this time of the year, there are better things to do than blogging. This is a time for serious fun. Love ya all!

PS: PHP5 beta 3 has just been announced.


Protein and Silicon Bugs
I'm currently sick, down with a protein bug. Not too much energy today, but after a nap I felt energized enough to find a silicon bug in PHP5. Thanks to Jonas for the initial forensics.


Inappropriate Abstractions: A Conversation with Anders Hejlsberg et al
The trouble with the wrong abstraction is there's no way out of it. In practice, though, it's very hard for class designers to make reasonable guesses about even the scenarios in which their designs will be used, much less the relative frequency of each kind of use. You may think your users will want transparency, because it lets them do really cool things, so you implement transparency. But if it turns out 99% of your users never care, guess what? Those people pay the tax. -- Eric Gunnerson

Lots of good advice in this article. I see so many authors of PHP classes who try to do the right thing when the right thing to do is far from clear. Getting the design correct the first time is really, really difficult. The key is not to be too ambitious. Use simple abstractions, so even if we screw up, it's easily fixed.


.NET Rashomon Rudeness
One fact, multiple perspectives:

  .NET continues to surprise me. For example, we recently had to play a .WAV file on .NET. Luckily .NET allows you to call the underlying operating system to play the file. We really have the best of both worlds with .NET: platform and language portability, with hooks into the OS.

  .NET continues to surprise me. It still remains a thin layer between the OS and you. For example, we recently found out that playing a .WAV file in .NET requires making a Win32 call. It's frustrating that there is no .NET multimedia equivalent.

  .NET continues to surprise me. It still remains a thin hymen between the OS and you. Once you penetrate it, and do stuff like playing sexy music, you're no longer a virgin - you're hooked up to something better than .NET sex - the Win32 api.


Andrew Stopford's the man for Rotor latte
Andrew Stopford wrote a book on PHP on Windows, but he's addicted to Rotor latte nowadays. He's sipping on hot .NET and CLR espresso on his blog. I read it to find out what's happening in the .NET world. Great brewing, Andrew!


The "Simple" Art of Code Design
I just visited SitePoint's Advanced PHP Forums, and spotted a debate on cache design. I only stumbled upon this because I saw that Manuel Lemos posted to this thread. Manuel has a unique way of writing that almost always guarantees a heated response. But Manuel brings up a good point. Most cache classes out there could do with better designs.

Now one good way of designing things is to define the parameters the code must work under in the beginning, then work your code from there. For example, for caching it could be:

  1. Works in multi-user environment and is portable. I'm also assuming in this design that we are saving to a file.
  2. Data must have basic integrity checks. File IO can fail. Programs can crash.
  3. Solid concurrency and stress testing. Caching is typically most useful in overloaded environments.

Let's address these issues:

  1. Multi-user: You probably shouldn't be using PHP's file-locking, unless you're writing a private-label class only for yourself. PHP's flock doesn't work on many systems such as NFS and FAT, so that's a big no-no. In adodb (and newer versions of smarty too, I believe), content saving uses the following file trickery:

    a. save contents into temporary file.
    b. delete the original cache file, eg. unlink($filename).
    c. rename the temporary file, eg. rename($tempname, $filename).

    A file rename operation is guaranteed to be atomic by the operating system, and is very efficient as file functions are always highly optimized.

  2. Data integrity: You should store and check the file-size in the cache file to ensure that the process didn't crash while saving. If you want stronger integrity, use crc32. In adodb, we store the file-size and use PHP's serialize function, which has many built-in integrity checks. Using serialize internally also means you can automatically save objects and arrays, not merely strings.

  3. Testing: I've seen elaborate unit tests for caching. But without the most basic concurrency test you might as well not do any testing.

    One such simple testing recipe is to write a script to continuously poll a cache file, saving occasionally. Output or log all errors (having a debug mode makes it straightforward). Then execute the same script concurrently in separate processes and watch for problems. Let the testing cook for at least 10 minutes.
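
The three points above can be put together in one rough sketch. This is not the adodb or smarty code: the helper names (cache_save, cache_load, cache_stress) and the "length:crc:payload" file layout are assumptions made up for illustration, and the code is written for a current PHP rather than PHP4.

```php
<?php
// Sketch only: these helpers and the "length:crc:payload" layout are
// illustrative assumptions, not ADOdb's or Smarty's actual code.

// Points 1 and 2: save via a temporary file, with length and CRC stored up front.
function cache_save(string $filename, $value): bool
{
    $payload = serialize($value);
    $data = strlen($payload) . ':' . crc32($payload) . ':' . $payload;

    // a. Save into a temporary file in the same directory as the cache file.
    $tempname = tempnam(dirname($filename), 'tmp');
    if ($tempname === false) {
        return false;
    }
    $ok = file_put_contents($tempname, $data) === strlen($data);

    // b. Delete the original cache file, then c. rename the temporary file.
    if ($ok && file_exists($filename)) {
        @unlink($filename);
    }
    $ok = $ok && @rename($tempname, $filename);
    if (!$ok) {
        @unlink($tempname);
    }
    return $ok;
}

// Returns 'hit', 'miss' or 'corrupt'; on a hit, $value receives the data.
function cache_load(string $filename, &$value): string
{
    $raw = @file_get_contents($filename);
    if ($raw === false) {
        return 'miss';
    }
    $parts = explode(':', $raw, 3);
    if (count($parts) !== 3
        || (int) $parts[0] !== strlen($parts[2])
        || $parts[1] !== (string) crc32($parts[2])) {
        return 'corrupt';
    }
    $value = unserialize($parts[2]);
    return 'hit';
}

// Point 3: poll the cache file, saving occasionally, and count problems.
function cache_stress(string $filename, int $iterations): int
{
    $errors = 0;
    for ($i = 0; $i < $iterations; $i++) {
        $value = null;
        $status = cache_load($filename, $value);
        if ($status === 'corrupt') {
            $errors++;
            error_log("iteration $i: corrupt cache file");
        }
        // Save on a miss and on every tenth iteration.
        if ($status === 'miss' || $i % 10 === 0) {
            if (!cache_save($filename, ['iteration' => $i, 'pid' => getmypid()])) {
                $errors++;
                error_log("iteration $i: save failed");
            }
        }
    }
    return $errors;
}
```

To follow the testing recipe, call cache_stress() with a large iteration count on the same file from several processes at once and watch the error log. A reader that arrives between the delete and the rename simply sees a miss, which is why a miss is not counted as an error here.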

The above suggestions sound simple, but as I've learnt, simplicity is not easy to achieve. In youth, everything is oversimplified or too complicated - real simplicity comes from experience. Unfortunately, it's only when we get too old and senile that everything becomes truly simple ;-)


It takes a snake to prove that .NET is fast for scripting
Some language experts have said publicly that scripting languages built on .NET or Java bytecodes will always perform poorly due to limitations in bytecode design.

Jim Hugunin, contrarian to the core, has posted an email which suggests the reverse, by comparing Python written in C (CPython) with one Jim wrote using .NET (IronPython). Jim is the original author of Jython, the Java implementation of Python, so he has the skills to back his claim.

And if his claim is correct, I think that the push to move scripting languages (eg. PHP) to Parrot or similar bytecode systems which can be Just-In-Time compiled will accelerate. The downside is that the overhead of dynamic eval's is much higher - making scripts that use eval() to dynamically compile code prohibitively expensive.


Zend/Win Enabler - Running PHP on Windows
I've been advocating FastCGI as the most stable way to run PHP on Windows for a long time. Now I see Zend has decided to commercialize such a FastCGI solution.

Open source is an extremely incestuous activity. Everyone works together and competes together (I just hope they don't sleep together). The FastCGI work was done by Shane Caraveo, who works for ActiveState, a company that sells products that directly compete with Zend's. And in the fine tradition of open source, there is always a free solution when you don't want to pay anything - you can use the free FastCGI+PHP installer for Windows I wrote a while back.


Study Questions Companies' Ability to Handle J2EE
I'm back from my holidays, and am continuing to dive into Java. I'm trying out Eclipse, been using JBoss and am reading Thinking in Java.

Java remains an excellent replacement for coding in C or C++. But for many domains, Java is too complex and overkill. Here's a recent study that supports this view:

Wily Technology, a Brisbane, Calif. company that gauges how well Java performs, said it has completed a study on the quality of how Java 2 Platform, Enterprise Edition (J2EE) applications run in certain enterprise environments.

After interviewing more than 350 J2EE users who work outside the software industry, the research firm concluded that applications running on the J2EE platform were available to users 88 percent of the time. "That means on average, companies are losing over 20 hours, or one full day per week, of availability," said Lewis Cirne, founder and chief technology officer of Wily Technology.

The Wily study found that enterprises who rely on J2EE to run their applications must endure at least one full day of downtime per week. Downtime, a reviled word in the computing industry, could mean lost dollars for companies that rely on software to power their businesses.
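
The quoted "over 20 hours" figure is easy to check. A quick sketch, assuming the 88 percent applies to a full 24x7 week (the study excerpt does not say so explicitly):

```php
<?php
// Assumption: availability is measured over a full 24x7 week.
$hoursPerWeek = 24 * 7;      // 168 hours
$availability = 0.88;        // figure quoted in the study
$downtime = round($hoursPerWeek * (1 - $availability), 2);

echo $downtime, " hours of downtime per week\n";
```

That comes to a little over 20 hours, which matches the quote, though it is closer to "most of a day" than to "at least one full day".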


The Man's Away on Holidays
Back in 2 weeks' time, hopefully sane, zesty and healthy. And to all you Malaysians - happy Raya holidays!


PHP Benchmark
Sebastian is doing some neat work on testing performance. I found it hard to decipher the data, so I graphed it in Excel.

[Figure: PHP Sebastian Benchmark]

There seems to be some drop-off in performance in PHP 4.3.4. I guess the core developers are putting their energies into PHP 5.


21st Century Blues
I guess I've finally arrived in the 21st century and it feels good. I have submitted to the onslaught of new technology and just bought an Apple iPod. I liked it so much that I bought it even though I wasn't sure whether I had a FireWire connection on my Windows notebook. Luckily it did, or I would have felt really stupid.


ProFont for Windows and X-Windows
I remember one of my programming colleagues (Rajesh, a very talented coder, at that time just fresh from college) many years ago placing newlines after every line of code, using a lot of white space, making his source code look very artistic. I pointed out to him that when you can only see 10 lines of code on one screen, you need to scroll up and down all the time, and it takes a very long time to understand what's going on.

My advice to him was to chunk his code, making each dense chunk represent some complex computation, and to make each chunk fit one screen. I also gave him a second piece of advice. It was related to fonts: avoid Courier - use a professional font where zeros are slashed, and 1 and l are distinguishable (that's one and letter L).

That's where ProFont comes in. It's a great programming font for the Apple Macintosh, designed to look good in small font sizes. I remember using ProFont religiously on my Mac for software development many years ago, and looking for something similar when I moved to Windows. Recently I found that it has been ported to Windows (and X-Windows too). If you want a panoramic view of your code, but you don't want to go blind, this is the solution - highly recommended.

PS: Use the .FON version. It's better optimized for small sizes than the .TTF one.


PEAR2: The interface is the framework
PEAR1 is now suffering from the fact that it focused from a long period of building technical foundations without planning the community growth at the same time. -- Lukas Smith.

Now that the cat is out of the bag about PEAR's problems, I will give my 2 cents worth.

Firstly, I want to say that I am a PEAR user, but not a PEAR developer. I do use code from PEAR, every day. There is some first class code in there. But yes I do see problems, and I also have a radical solution at the very end of this essay.

The biggest problem IMHO with PEAR is the arbitrary way things are apparently run. Why is one class accepted, and another rejected? If it is because they do the same thing, then why are there so many classes that do the same thing in PEAR?

My perception is that

  • there is no planning except for flavour of the day
  • there are egos in the community, so people introduce new classes rather than rewrite existing ones to add new functionality
  • poor design in some of the original classes makes it impractical to extend the original classes
  • the quality of PEAR is uneven, brilliant gems combined with duds - and the problem is that the duds begin with the base classes (PEAR and PEAR_Error)

How can things improve? Obviously PEAR2 must be designed to grow with the community. One way to grow in an open source world (if there is no leader with the stature of Linus in the community) is to give everyone a chance to contribute their own code fairly, without the arbitrariness that we see today.

In the Perl world, we have CPAN, where there is no thought police telling you that your code is not acceptable - you just register as a developer, declare a namespace, and upload. Yes there is lots of code duplication, and yes there is no coding standard - but I do perceive that it is fair.

I have a suggestion for PEAR. I think they need to be more open, and allow multiple people to develop similar classes for the same category. What the PEAR group could do is standardize the APIs for specific categories. Each API defines a minimum interoperable subset of code for a specific category. Poor documentation will be less of an issue if all contributed code shares a common base API.

I can see some similar classes in PEAR share common APIs informally, but it would be better if it were standardized, so it is less arbitrary and more inclusive. In fact, PHP5 has a good way of enforcing this contract with the implements keyword. The launch of PHP5 is also a good time to start, because we can finally put PEAR exceptions to sleep and use built-in exceptions.

Now here comes the really radical suggestion - if your contributed class conforms to that category's unit tests, it should be accepted into PEAR. No ifs, no buts, no thought police. And how would you pick which class to use if there are 10 implementations? Let the PHP community decide! Votes or download statistics could be displayed to "fairly" quantify the best code contribution for a category.
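
To make the idea concrete, here is a rough sketch of what a standardized category API plus a category-wide conformance test could look like. Everything in it is hypothetical: Cache_Category, Array_Cache and conforms_to_cache_category() are names invented for this illustration, not anything in PEAR, and the type declarations assume a current PHP rather than the PHP5 betas.

```php
<?php
// Hypothetical category API; the names below are illustrative and not part of PEAR.
interface Cache_Category
{
    public function save(string $key, $value): bool;
    public function fetch(string $key);   // returns null on a miss
}

// One of possibly many competing implementations of the same category.
class Array_Cache implements Cache_Category
{
    private $data = [];

    public function save(string $key, $value): bool
    {
        $this->data[$key] = $value;
        return true;
    }

    public function fetch(string $key)
    {
        return $this->data[$key] ?? null;
    }
}

// A category-wide conformance test: any class that passes would qualify.
function conforms_to_cache_category(Cache_Category $cache): bool
{
    return $cache->save('k', 'v')
        && $cache->fetch('k') === 'v'
        && $cache->fetch('missing') === null;
}

var_dump(conforms_to_cache_category(new Array_Cache()));
```

The interface is the contract, and the conformance function plays the role of the category's unit tests: ten competing cache classes could all be accepted as long as each one passes it.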

Lastly, I want to remind everyone that standardizing on a common API does not inhibit creativity, provided that the API is sensibly chosen. For example, ADOdb and MDB both have a PEAR DB emulation layer. In fact common APIs do encourage innovation within a shared framework (eg. the BSD forks go in different directions but share code).

Thanks for listening.


Today I ran JBoss for the first time
Another personal milestone I just wanted to note before I forget.


Smash the Windows
As our society becomes ever more dependent on information technology, the gulf between those who understand computers and those who don't will get wider and wider. In 50 years, perhaps much less, the ability to read and write code will be as essential for professionals of every stripe as the ability to read and write a human language is today. If your children's children can't speak the language of the machines, they will have to get a manual job - if there are any left -- Dylan Evans.


A Turning point for PEAR?
A posting in the php.pear.dev newsgroup by one of PEAR's leading developers. I think this is a very honest assessment, and I'm hopeful about this.
> From: Xavier Noguer [mailto:xnoguer#xavier-noguer.com]
> Sent: Saturday, November 08, 2003 3:41 PM
 
> Martin Jansen <mj#php.net> escribió
> 
> > (http://pear.php.net/manual/en/developers.contributing.php)
> >
> > It is pretty funny to see how much developers have actually read the
> > "Developers Guide" ...
> 
> I've read that guide. I just don't seem to be able to take it seriously when
> it lists requirements that have never, as far as I know, been voted by the
> pear group or the developer community at large, such as regressions tests
> (http://cvs.php.net/diff.php/peardoc/en/guide/developers/contributing.xml?r1=1.8&r2=1.9&ty=h)

I agree here. While previous mistakes don't make a wrong a right, I think our whole manual lacks any consistent concept of what we really feel needs to happen when and where.

Anyways we have a bunch of messes that are a result of the long period of limited peer review, followed by a period of package inflation, followed by the today ruling confusion.

I think its time we fix our standards by starting from a clean slate with a PEAR2. There we can think about how to best deal with our developer and users base and how to great the best possible code in a PHP version which actually supports our needs for OOP.

PEAR1 should of course be maintained as we all have an interest to keep that code running and to use PEAR1 as a momentum towards a PEAR2 which build on the past experience.

<rant> PEAR1 is now suffering from the fact that it focused from a long period of building technical foundations without planning the community growth at the same time. This has lead to numerous problems and is making us very inefficient. Of course PEAR1 has a lot to offer, but I don't think we are scaling well and past mistakes seem to haunt us more and more, which we don't seem to be able to fix. So I think we need to recognize our past which we of course need to maintain to remain credible, but at the same time we should work to build a more scalable PEAR2 in which we can address our issues on a clean slate. </rant>

> Would you be so kind to point me to the pear group document or public discussion in which this requirement was approved?

There is no such decision I can remember. As George pointed out we discussed this point in Amsterdam, however I don't remember that anyone decided on requiring documentation at first commit. However our decisions there were mostly only concepts and not complete. Anyways maybe someone should check when this was commited anyways.

Regards, Lukas Smith


Is Novell-SuSE deal a brilliant Big Blue power play?
Perhaps the most interesting take on the Novell-SuSE deal is the above link. What has been missed by other commentators is that this is a tri-partite agreement with IBM. David Berlind clarifies a lot of things, even if it is still not the full picture.

Red Hat is also not standing still, abandoning its hobbyist roots to sell only to Enterprises. As Bruce Perens says, "The open source community is supposed to produce Fedora so Red Hat can put a stamp on it and charge lots of money for it."

Does anyone have any recommendations for free Linux distros? Ease of installation and use are more important to me than power (hey, the first computer I ever bought was a Mac.)

PS: Björn Schotte has some groovy pictures of the recent PHP Conference 2003 in Germany.


ADOdb 4.02 released with PHP5 support
Been playing around with PHP5. ADOdb 4.02 now works transparently with both PHP4 and PHP5. If PHP5 is detected then the following features will be automatically enabled:

Support for PHP5 iterator overloading

  $rs = $DB->Execute("select * from table");
  foreach($rs as $row => $fields) { var_dump($fields); }

Support for PHP5 exceptions

Just include adodb-exceptions.inc.php and you can now catch exceptions on connection and execute errors as they occur.

include("../adodb-exceptions.inc.php");
include("../adodb.inc.php");
try {
	$db = NewADOConnection("oci8");
	$db->Connect('','scott','bad-password');
} catch (Exception $e) {
	// connection and execute errors end up here
	var_dump($e);
}

I managed to surprise myself: the PHP5 iterator code is backward compatible with PHP4, even though the implements keyword is illegal in PHP4, thanks to the magic of includes.

And IMHO, the PHP5 iterator implementation with IteratorAggregate and Iterator, though powerful, is too complicated - certainly not in the spirit of PHP.
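
For readers who have not met these interfaces, here is about the smallest IteratorAggregate example I can think of. It is a sketch only: Sketch_RecordSet is a made-up class that wraps a plain array, not how ADOdb implements its recordsets, and it is written for a current PHP.

```php
<?php
// Sketch only: Sketch_RecordSet is a hypothetical class, not ADOdb's implementation.
class Sketch_RecordSet implements IteratorAggregate
{
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    // foreach calls this once, then walks the iterator it returns.
    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->rows);
    }
}

$rs = new Sketch_RecordSet([['id' => 1], ['id' => 2]]);
foreach ($rs as $row => $fields) {
    var_dump($fields);
}
```

Even in this toy case there are two interfaces and two objects involved before foreach sees a single row, which is the complication I am grumbling about.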


PHP 4.3.4
After a lengthy QA process, PHP 4.3.4 is finally out! This is a medium size maintenance release, with a fair number of bug fixes. All users are encouraged to upgrade to 4.3.4.

Bugfix release
PHP 4.3.4 contains, among others, following important fixes, additions and improvements:

Fixed disk_total_space() and disk_free_space() under FreeBSD.
Fixed FastCGI being unable to bind to a specific IP.
Fixed several bugs in mail() implementation on win32.
Fixed crashes in a number of functions.
Fixed compile failure on MacOSX 10.3 Panther.
Over 60 various bug fixes!

The bug fix that concerns me most is this one:

Fixed bug 25404 (ext/pgsql: open transactions not closed when script ends).

In our early days with PHP, we used MySQL a lot. Nowadays, most of our PHP work is with PostgreSQL and Oracle (with the occasional MSSQL project). MySQL is still a good database, but without triggers and views, it no longer meets our company's needs.


Sun, Zend integrate PHP with Sun's Web server
Zend oversees the development of PHP and also sells a commercial implementation of the technology. On Monday it released two products that integrate with Version 6.1 of Sun's Java System Web Server, allowing companies to deploy PHP on Sun's software. (The Java System Web Server was known previously as the Sun ONE Web Server.)

The two products are the PHP Enabler, which is intended to let PHP programs run smoothly on Sun's Web server, and the Zend Performance Suite, which uses code acceleration, content caching and other software tricks to improve the performance of PHP on the Sun platform, the companies said.


Icky Sticky Leaky PHPloat
After running PHP5 beta 2 for 2 days on Apache 1.3 (multi-threaded SAPI on Windows), I was surprised to find that the process was taking 500 MB. There must be lots of memory leaks. I'm pretty happy that most PHP code just runs, but obviously it's not production ready.

Now the hard work starts: how to integrate new PHP5 functionality without impacting old PHP4 code. After thinking a while, I realize there are only a few things I can do to write portable code:

  • In PHP5, we use __clone() to duplicate objects. To make this portable, simply check the PHP_VERSION:
      $obj2 = (PHP_VERSION >= 5) ? $this->__clone() : $this;
    
  • Error-handling can be encapsulated in a separate function, and conditionally included based on PHP version. Eg.
    if (PHP_VERSION >= 5) include("exceptions.inc.php");
    else include("error.inc.php");
    

Unfortunately, most other features require explicit use of keywords that are illegal in PHP4, eg. private, protected, implements, etc. These require maintaining a separate codebase for both versions of PHP, or some special pre-processing to be done on the code depending on the versions. Does anyone have a better suggestion?
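
The conditional include above can be wrapped in a small helper so that the version test lives in one place. A sketch, where error_layer_file() is a hypothetical name and the two file names are simply the ones used in the example above:

```php
<?php
// Sketch only: error_layer_file() is a hypothetical helper.
function error_layer_file(string $version): string
{
    // An explicit integer cast keeps only the leading major number,
    // so a beta string such as "5.0.0b2" still counts as 5.
    return ((int) $version >= 5)
        ? 'exceptions.inc.php'   // free to use try/catch and other PHP5-only syntax
        : 'error.inc.php';       // PHP4-safe fallback
}

echo error_layer_file(PHP_VERSION), "\n";
```

The point of the trick is that the PHP5-only keywords sit in a file that PHP4 never parses, so the calling code stays legal in both versions.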

PS: I will be releasing a new version of ADOdb soon, one that should be compatible with PHP5.


PHP 5 beta 2 is out
Just downloaded the Windows install and tested it.

To run on Apache, I copied php4apache.dll to the php5 root directory, and modified Apache's httpd.conf:

LoadModule php5_module c:/php5/php5b2/php4apache.dll
AddModule mod_php5.c
AddType application/x-httpd-php .php

For some reason, php_mysql.dll is not working. The error message is "The procedure entry point mysql_create_db could not be located in the dynamic link library LIBMYSQL.dll". I made sure that the LIBMYSQL.dll was the one that came with the PHP release. I'm using MySQL 4.0.12. Perhaps someone can comment on this.

However Oracle's oci8 extension is working fine. As most of our software runs on Oracle, it was easy for us to continue testing PHP5. 99% of all code ran fine. The only gotcha I found was that if your function returns a reference, you can no longer do this:

  return $this->function();

but have to change your code to this:

  $ret =& $this->function(); # & is not needed if you don't support PHP4
  return $ret;
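
Here is a fuller sketch of the same pattern, with a made-up class (Sketch_Registry) so it can be run on its own. It targets a current PHP, where details may differ from the beta I tested.

```php
<?php
// Sketch only: Sketch_Registry is a hypothetical class.
class Sketch_Registry
{
    private $items = ['a' => 1];

    // Returns a reference into the private array.
    public function &item($key)
    {
        return $this->items[$key];
    }

    // Forward the reference through a temporary variable.
    public function &itemViaTemp($key)
    {
        $ret =& $this->item($key);
        return $ret;
    }

    public function get($key)
    {
        return $this->items[$key];
    }
}

$registry = new Sketch_Registry();
$ref =& $registry->itemViaTemp('a');
$ref = 2;   // writes through to the registry's private array
var_dump($registry->get('a'));
```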

A very impressive beta release except for the above glitches.


A different take on PHP-Con
Before I came down I was worried about how I would be accepted by this crowd as a Microsoft representative. Would it be hostile? Would it be open?


Living La Vida Longhorn
Chris Sells kicks off his inaugural installment of the Longhorn Foghorn column by defining the pillars of "Longhorn," the next generation of the Windows operating system, and providing an overview of each pillar.

Also see Working with Data in ASP.NET Whidbey for an overview of the next version of ASP.NET.


Is Pharrot That Fast?
I have not seen the sources of the Pharrot compiler, but if it is just a proof of concept, it is not likely that much error-checking is done. I expect the figures for Pharrot below to increase by at least 50% when all the error-handling is put in.

The figures are still very impressive despite these reservations.

Generating a Mandelbrot fractal
 – PHP,                     2.4  seconds
 – PHP-Hacked,              1.2  seconds
 – Parrot without JIT,      0.5  seconds (x1.5 = 0.75 seconds)
 – Parrot with JIT (Intel), 0.08 seconds (x1.5 = 0.12 seconds)
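
To see what the adjusted figures mean relative to plain PHP, here is the arithmetic as a small sketch. The timings are the ones from the slides; the 1.5 factor is only my guess from above, and adjusted_speedups() is a throwaway helper name.

```php
<?php
// Timings come from the slides; the penalty factor is a guess, not a measurement.
function adjusted_speedups(float $baseline, array $measured, float $penalty): array
{
    $out = [];
    foreach ($measured as $name => $seconds) {
        $adjusted = round($seconds * $penalty, 2);
        $out[$name] = [
            'adjusted' => $adjusted,
            'speedup'  => round($baseline / $adjusted, 1),
        ];
    }
    return $out;
}

$results = adjusted_speedups(
    2.4,   // plain PHP, in seconds
    ['Parrot without JIT' => 0.5, 'Parrot with JIT (Intel)' => 0.08],
    1.5
);

foreach ($results as $name => $r) {
    echo "$name: {$r['adjusted']} s, about {$r['speedup']}x faster than plain PHP\n";
}
```

Even with the penalty applied, that is roughly a 3x gain without the JIT and about 20x with it.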

This is the best Parrot reference, next to the source of course.


The Shape of Pharrots to Come
John Coggeshall mentions that the PHP on Parrot project has been named "Pharrot" by the php-con conference attendees.

Here's my take on things. Now I don't have any inside info, so the following is entirely guesswork, and any resemblance to reality is entirely your imagination:

  • Although Sterling and Thies are very modest, given the fact that they were given the closing keynote and the amazing performance improvements - Pharrot will probably be PHP 6.

  • The speed of the JIT means that PHP will become a general programming language. A high performance application server written 100% in PHP becomes practical. A high performance anything becomes practical in PHP.

  • The tribes using Parrot will probably include Python, PHP and Perl. Code sharing between different programming tribes will become a reality. This does not mean that there will be full interoperability between all languages, because (a) there is no common runtime library (yet), and (b) there is no consensus on what will be the default PMCs (Parrot's language extensions) installed.

  • There will be battles fought over the run-time. In PHP4/5, after execution, we throw away the opcodes together with the bath water, or store them in shared memory. Parrot gives you more choices. See the end of Dan Sugalski's Parrot internals presentation (ppt).

  • The Zend API is dead - big deal. Parrot is a big opportunity for companies with skill and resolve. The tools market for open source programming languages suddenly becomes much larger because you are able to support so many more languages effectively.

  • My prediction: the first beta of Pharrot will be out in 2006.

PS: Selkirk was prescient about parrot. Smart chap.


PHP & Parrot (PDF)
Here is the PHP-Con Closing Keynote by Sterling Hughes and Thies Arntzen on running PHP on Parrot.

Parrot is a virtual machine used to efficiently execute bytecode for interpreted languages. Parrot will be the target platform to which Perl 6 code is compiled.

From their slides:

Parrot is FAST!

Generating a Mandelbrot fractal
 – PHP,                     2.4  seconds
 – PHP-Hacked,              1.2  seconds
 – Parrot without JIT,      0.5  seconds
 – Parrot with JIT (Intel), 0.08 seconds

I presume that PHP-Hacked is the patched PHP that Sterling and Thies released in August. I have been a sceptic about Perl 6 because it has taken so long, but this is impressive stuff.


Natural Born Killers of PHP
Recently I revised Optimizing PHP, an article that I wrote in 2002. I'm pleased to say that it hasn't aged much. The changes I had to make include a recommendation to use FastCGI with IIS, adding Turck MMCache to the opcode cache list, recommending Cache_Lite, replacing foreach with list/each for large arrays, and the realisation that arrays need to be passed by reference too.

Then I realized how much more I could have discussed. I started to think more about performance profiling after discussing APD in a previous post.

Now there is some logic to the fact that functions with many lines of code will run slower than ones which are short and brief. However sometimes it's the shortest code snippets that cause the real slowdowns. That's because these code snippets call external functions that hide a lot of complexity behind a deceptively simple and light exterior. With any of the following natural born killers, one thoughtless line of code can result in an unbearable 100x slowdown:

  • SQL statements - for example, forgetting to add an index to a large table will have a massive impact on performance.

  • Regular expressions - because regular expressions work by back-tracking when a match fails, it's common for a badly written regular expression to become exponentially expensive to compute. The longer the string, the worse it becomes.

  • Network calls - the increasing popularity of SOAP and similar inefficient but easy-to-use protocols opens up new vistas of unscalability for the unwary.

Classic performance tuning techniques used by XDebug and APD give you a summary of all functions with long execution times. This is useful, but we need tools to easily pinpoint, measure and tune the overhead of killers such as SQL statements, regular expressions and RPC calls. I think that we already have enough Open Source CMS projects out there. Tuning tools like this are a great idea for students and developers with time to kill, looking for an interesting Open Source project to start.
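
As a starting point, the crudest possible version of such a tool is a wrapper that times individual calls and logs them. The sketch below is exactly that and nothing more: time_call() is a hypothetical helper, not part of XDebug or APD, and the regular expression is just a stand-in for a real killer.

```php
<?php
// Sketch only: time_call() is a hypothetical helper, not part of XDebug or APD.
function time_call(string $label, callable $fn, array &$log)
{
    $start = microtime(true);
    $result = $fn();
    $log[] = ['label' => $label, 'seconds' => microtime(true) - $start];
    return $result;
}

$log = [];

// Wrap a suspect call: an SQL query, a regular expression, a remote call...
$matched = time_call('regex: digits only', function () {
    return preg_match('/^\d+$/', '12345');
}, $log);

// Slowest calls first.
usort($log, function ($a, $b) {
    return $b['seconds'] <=> $a['seconds'];
});

foreach ($log as $entry) {
    printf("%-25s %.6f s\n", $entry['label'], $entry['seconds']);
}
```

A real tool would hook the database, regex and network layers automatically instead of relying on the developer to wrap each call by hand.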


PHP Performance Profiling with APD
Good article on profiling PHP using APD, the Advanced PHP Debugger. This is a bit of a misnomer, because APD is not a debugger you use to step through your code, but is actually a diagnostic and profiling tool. I mostly use XDebug for profiling, but APD looks like a cool alternative.

And if you are interested in performance tuning, do have a look at ADOdb's database performance monitoring features. The dreadful thing about SQL is that it is an iceberg of complexity hidden in a deceptively simple query language. Bad PHP code can slow your code down by a factor of x2-5 perhaps. But one bad SQL statement can cause a x10-100 times slowdown.

Lastly, if you are using Windows, unless you have the ability and means to compile APD, you're out of luck. This is one area where PECL (which is the official repository for PHP extensions) could improve.

Update: George (APD's author) mentions that Shane Caraveo has ported APD to Windows, and Will and John add that pre-compiled PECL dll's for Windows are available from here and there. (added 25 Oct 2003).

PHP Compiler Cache Internals
The latest English issue of PHP Magazine has an interesting article about implementing a PHP Opcode Cache by George Schlossnagle, the author of APC.

If you're familiar with the English expression "don't throw the baby out with the bath water", then you will be amused to learn that that's exactly how the Zend Engine (PHP's compiler) works. It will compile the PHP into opcodes for a page request, and throw the opcodes away immediately after the code completes.

This may sound really weird and inefficient, but of course Zeev and Andi would not have been able to start their own company, Zend, without a business plan that involved fixing this "stupidity". And you thought Microsoft was evil ;-)

Now it is perfectly normal when developing a platform to leave gaps for commercial vendors to fill. That creates an ecosystem where we have companies willing to pay to maintain and promote PHP. So this isn't meant to be an attack against Zend, but an acknowledgement of business realities.

This omission in the Zend Engine prompted several open source developers to create their own opcode caches. APC is one of the earliest open source opcode caches.

In my benchmarks (yes, you see me benchmark a lot, because that's the only way to understand the performance profile of PHP software without spending a lot of time examining source code) I noticed that the overhead of PHP opcode caches was less for small scripts. Obviously there is some copying of instructions from the cache in shared memory during script execution. The question was how much? How did it affect performance?

Now we have the answer. George says restoration of the opcode info for script execution "involves only a so-called shallow copy of the op_array. A shallow copy means that only the structure itself is copied, but none of the elements it contains pointers to."

This means that the actual opcodes are not actually copied, only the pointers to the structures that contain the opcodes. Apart from that, the function and class metadata and any static variables are restored, and the inheritance hierarchy is dynamically resolved.
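PHP's own clone operator also makes a shallow copy, so it can be used to illustrate the idea George describes. This is only a userland analogy, not APC's C code: the OpArray class and its fields are invented for the illustration, and it assumes PHP 5 or later object semantics.

```php
<?php
// Invented stand-in for a cached op_array: a small structure that points at an opcode list.
class OpArray
{
    public $filename;
    public $opcodes;   // an object handle, so clone copies the handle, not the list

    public function __construct($filename, ArrayObject $opcodes)
    {
        $this->filename = $filename;
        $this->opcodes  = $opcodes;
    }
}

$cached   = new OpArray('index.php', new ArrayObject(['ZEND_ECHO', 'ZEND_RETURN']));
$restored = clone $cached;   // shallow copy: only the outer structure is duplicated

$restored->filename = 'index.php (request 1)';   // changes the copy only

var_dump($restored->opcodes === $cached->opcodes);  // bool(true): both point at the same opcode list
var_dump($cached->filename);                        // string(9) "index.php"
```

The cost of the copy depends on the size of the outer structure, not on how many opcodes hang off it.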

So the overhead of the opcode cache is O(n), where n is the number of functions+classes+inheritance levels+properties+PHP files. It is not proportional to the number of lines of code - that would be as worrying as throwing the baby out with the bath water.

Another excellent issue of PHP Magazine!

ADOdb 4.00 released
ADOdb 4.00 is out after a 3 month beta testing process.

The distinguishing feature of this release is the performance monitoring functionality. AFAIK, it is the first Open Source cross-platform, multi-database performance monitoring and health check software in the world.

It features:

  • A quick health check of your database server using $perf->HealthCheck() or $perf->HealthCheckCLI().
  • User interface for performance monitoring, $perf->UI(). This UI displays:
    • the health check,
    • all SQL logged and their query plans,
    • a list of all tables in the current database,
    • an interface to continuously poll the server for key performance indicators such as CPU, Hit Ratio, Disk I/O
  • An API to build database monitoring tools for a server farm. For example, calling $perf->DBParameter('data cache hit ratio') returns this very important statistic in a database independent manner.

ADOdb also has the ability to log all SQL executed, using ADOdb's LogSQL feature. All SQL logged can be analyzed through the performance monitor user interface. In the View SQL mode, we categorize the SQL into 3 types:

  • Suspicious SQL: queries with high average execution times, which are potential candidates for rewriting.
  • Expensive SQL: queries with high total execution times (#executions * avg execution time). Optimizing these queries will reduce your database server load.
  • Invalid SQL: queries that generate errors.

Each query is hyperlinked to a description of the query plan, and every PHP script that executed that query is also shown.

Databases that work with the performance monitoring features include MySQL, PostgreSQL, Oracle, Informix, MSSQL, DB2. Code contributions are very welcome.

Download: http://php.weblogs.com/adodb#downloads
Performance Monitoring Docs: http://phplens.com/lens/adodb/docs-perf.htm

PHP EasyWindows Installer 4.3.3.1 released
The latest release of this installer sets up PHP 4.3.3, FastCGI, Turck MMCache 2.4.1, PEAR and ADOdb for IIS, Apache 1.3 and 2.0. This is the installer that we use in-house for our Windows projects.

My recent benchmarks with PHP to test XML-RPC performance show that IIS is the fastest PHP web server on Windows. The XML-RPC benchmark was a PHP script with 1 select, 2 update queries to Oracle.

              XML-RPC
              Reqs/sec
Apache 1.3       44
Apache 2.0       42
IIS+FastCGI     108

If you want a PHP installer that is tuned for IIS with FastCGI, this is the one.

Formerly, the version numbers of the installer were not synchronized with PHP. Now the numbering system will match the PHP version for easy tracking.

The Philosophy of PHP
Every couple of months, someone requests the Perl style regular expression operator (=~) be implemented for PHP. The PHP internals group response to this is a good insight into their clarity-first design style:
I don't think there's a chance we'd agree to implement such regex operators (wether the singular or plural versions :) Except for it pushing PHP in Perl's direction of being unreadable it doesn't really give any added value. I don't see how it is a significant improvement over using a function such as preg_match(). (Actually I think the latter is more readable).

Here's a small quote from Mr. Ritchie: "A language that doesn't have everything is actually easier to program in than some that do."

I think Perl is a prime example of why the quote is correct. -- Andi Gutmans

and
You are pushing towards

$_~=/^\.*?\$$/;

This is not human-readable code and one of the basic characteristics that sets PHP apart from Perl. Every non-trivial line of PHP code has a decypherable keyword that you can plug into the manual to figure out what that line is doing. We make sure of this by keeping the number of operators to a minimum. As for your bitshifting example. It has nothing to do with the frequency of use, it has to do with readability. -- Rasmus
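For comparison, this is roughly what the function-call style defended above looks like in practice. The sample string and pattern are made up for the illustration.

```php
<?php
$line = 'Date: 2003-10-25';

// preg_match() returns 1 on a match and fills $m with the captured groups.
if (preg_match('/^Date: (\d{4})-(\d{2})-(\d{2})$/', $line, $m)) {
    list(, $year, $month, $day) = $m;   // skip $m[0], the full match
    echo "year=$year month=$month day=$day\n";   // prints "year=2003 month=10 day=25"
}
```

The keyword preg_match is exactly the kind of thing Rasmus means by something you can plug into the manual.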

We need barking dogs to fix "The PHP Scalability Myth"
The article linked above is a great read, but you must be an asphyxiating ostrich burrowed in the sand to say that PHP doesn't scale. It's not that slow a programming language, it has few bottlenecks, and perhaps most important - there are many case studies of large web sites using it for mission critical work (e.g. Lufthansa's online booking and Yahoo).

However I do feel something is still missing from PHP. Not in the language per se, but in the conceptual overview. What we really need are dogs barking and cats meowing in our very own virtual Pet Shop. For people who don't understand this reference, the Pet Shop is a famous web application created to demonstrate best practices in scalability and software design for J2EE and .NET.

In contrast, there is no reference PHP Pet Shop, and no accepted and well-documented methodology on how to create scalable web-sites with PHP. So you need to be very smart, or get the advice of technical gurus to keep everything scalable and running smoothly. It's no accident that Yahoo, perhaps the biggest web-site that has invested heavily in PHP, employs so many cool cats who have a deep knowledge of PHP such as Rasmus and Andrei.

Update: Slashdot discusses this article. I liked this comment best:

The sad reality is that so few developers know enough to fully exploit J2EE that they wind up doing little more than what PHP does better in the first place.

True wisdom is knowing your limitations (18 Oct 2003).

10,000 ways to Ni Hao with Unicode and PHP
Joel Spolsky has been cursing the lack of support for Unicode in PHP. So last week, he wrote this great article on The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).

Of course, that still doesn't answer his initial question of how to get Unicode working with PHP. Well Scott Reynen had the solution and wrote How to develop multilingual, Unicode applications with PHP in response to Joel's frustration.

Scott's technique works on all versions of PHP. Or you can just use the UTF-8 character set and the mbstring functions, which should run faster as they are coded in C. To use mbstring, you need PHP 4.3 or later (it was buggy pre-4.3). On Unix, you will need to compile the extension in. On Windows, you just need to modify your php.ini.
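A small sketch of the difference between byte-oriented and multibyte-aware string functions follows. It assumes the script file is saved as UTF-8 and that the mbstring extension is loaded (hence the function_exists() guard); the sample string is just the title of this post.

```php
<?php
// Assumes this file is saved as UTF-8.
$hello = '你好';   // "Ni Hao": two characters, three bytes each in UTF-8

echo strlen($hello), "\n";   // 6: strlen() counts bytes

if (function_exists('mb_strlen')) {
    echo mb_strlen($hello, 'UTF-8'), "\n";         // 2: mb_strlen() counts characters
    echo mb_substr($hello, 1, 1, 'UTF-8'), "\n";   // 好: the second character, not the second byte
}
```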

Update: l0t3k is working on a Unicode I18N extension based on IBM's open source International Components for Unicode. The extension includes a UnicodeString class with the requisite searching, replacing, casing, trimming, and classification methods. Get the CVS version (16 Oct 2003).

PS: Ni Hao means hello in Chinese.

Oracle ramping up its support for PHP?
Recently, Christopher Jones from Oracle Australia emailed me, asking me a few questions about ADOdb. Apparently Oracle is looking into ramping up its PHP support, and is starting to study what PHP developers are actually using to connect to Oracle databases.

It was a pleasant surprise to find out that he was 1 year behind me at the University of Melbourne, Australia. We never met, but probably passed each other in the dingy computer labs. Ah, youth - I was quite a skinny scruffy lad, and 22 kilos lighter too (50 lbs).

And for those of you who are having problems with Oracle and PHP, this forum at Oracle can be used to ask PHP questions. The nice thing about this forum is that you don't need to get an expensive metalink account (Oracle's support site) to get answers from real experts.

A quick dig through the forums revealed some useful nuggets of information that I didn't know, such as this guide on using DBMS_OUTPUT with PHP. DBMS_OUTPUT is Oracle's equivalent of echo.

Bridging PHP and .NET with XML-RPC
A couple of months ago, I mentioned that we are developing rich Windows clients running on .NET. We are still using PHP on the web server, and using XML-RPC for communication between .NET and PHP. We could have tried to use SOAP, but at that time there were too many deficiencies in NuSOAP's support of arrays and collections.

We used the following XML-RPC libraries: Keith Devens' for PHP, and the Cook Computing library for .NET. It turned out to be very simple to get everything working. The docs were clear, and with half a day of coding, we got basic connectivity up.
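For readers who have not seen XML-RPC on the wire, here is a hand-rolled sketch of a request body. It is not the API of either library mentioned above: the function names and the method name orders.update are invented, and only integers and strings are handled.

```php
<?php
// Sketch only: encode an int or a string as an XML-RPC <value> element.
function sketch_xmlrpc_value($v)
{
    if (is_int($v)) {
        return '<value><int>' . $v . '</int></value>';
    }
    return '<value><string>' . htmlspecialchars((string) $v, ENT_QUOTES) . '</string></value>';
}

// Sketch only: build a <methodCall> document for the given method and parameters.
function sketch_xmlrpc_request($method, array $params)
{
    $xml = '<?xml version="1.0"?>' . "\n"
         . '<methodCall><methodName>' . htmlspecialchars($method, ENT_QUOTES) . '</methodName><params>';
    foreach ($params as $p) {
        $xml .= '<param>' . sketch_xmlrpc_value($p) . '</param>';
    }
    return $xml . '</params></methodCall>';
}

// The resulting string would be sent as the body of an HTTP POST to the server.
$request = sketch_xmlrpc_request('orders.update', [42, 'shipped & billed']);
echo $request, "\n";
```

A real library also handles arrays, structs, dates, faults and response parsing, which is where most of the work is.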

We've been happy using Keith Devens' PHP library, but recently we decided to test using the epinions xmlrpc extension that comes bundled with PHP. We thought it would be significantly faster as it was coded in C. To our surprise, it made very little difference to performance. The improvement in speed was just 1%. The reason is simple: most of the overhead is in the networking, Apache processing, and actual computation. The breakdown of time is roughly:

  • 10% testing overhead
  • 50% networking/Apache overhead
  • 40% PHP computation (1 query and 2 updates to Oracle)

Given that the client/server testing was done on a single machine, that 50% networking/Apache overhead is quite high. This explains my interest in Nanoweb and simpler alternatives.

Update: On Windows, it appears IIS is still the high performance solution. Apache 2.0.46 and 1.3.28 did not exhibit good performance (with PHP running as a SAPI). But with IIS 5 running FastCGI, the networking/IIS overhead drops to about 10-15%.

In my tests, I was pounding the web server with 100 simulated clients, and I suspect that there are still some internal bottlenecks crippling PHP performance as a threaded Apache SAPI (either in the threads support or oci8 extension). I think that Apache with FastCGI would perform better, but I don't have the time to test this.

Twisting by the PHPool
Recently I posted a link to Twisted, a python library for developing socket servers. This generated a storm of interest as to whether PHP is suitable for developing similar custom networking applications. You should read the commentary in the above link for some interesting remarks by BDKR.

So I benchmarked Nanoweb, a webserver written in PHP, against Apache. The test repeatedly requested a 4K HTML file (adodb-session.htm), using Microsoft's WAST set at 10 concurrent threads. All software was running on a 2.6 GHz Win XP machine. PHP-CLI 4.3.3 was used to run Nanoweb.

                Requests/Sec
Nanoweb             176
Apache 1.3.28       300

So Nanoweb reached a little under 60% of Apache's throughput on the same machine. These figures suggest that a PHP networking app (Nanoweb) running on brand-new 2003 hardware (3 GHz PC) will be faster than a program written in C (Apache) on good year 2000 hardware (1 GHz PC).

Now if someone had told me in 2000 that in 3 years I would be able to run a web server written in PHP, and that it would run faster than the Apache of the year 2000, I wouldn't have believed the person. I think this sort of performance is a fantastic achievement for PHP and its developers.

PS: For those of you who read the Twisted docs, you will see that it does not use threads, but callbacks. It should be possible to do something similar in PHP too.
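To make the callback idea concrete, here is a minimal sketch of a select-based event loop in PHP. It is not how Twisted or Nanoweb is implemented: the Reactor class and its method names are invented, it only watches streams for readability, and the demo uses stream_socket_pair() with Unix-domain sockets, so it assumes a Unix-like system and a reasonably modern PHP.

```php
<?php
// Invented minimal "reactor": callbacks are registered per stream and fired when data arrives.
class Reactor
{
    private $streams   = [];
    private $callbacks = [];

    public function onReadable($stream, callable $callback)
    {
        $id = (int) $stream;               // resource id used as the lookup key
        $this->streams[$id]   = $stream;
        $this->callbacks[$id] = $callback;
    }

    // One loop iteration: wait until some stream is readable, then run its callback.
    public function runOnce($timeoutSec = 1)
    {
        if (!$this->streams) {
            return 0;
        }
        $read   = array_values($this->streams);
        $write  = null;
        $except = null;
        $ready  = stream_select($read, $write, $except, $timeoutSec);
        if ($ready === false) {
            return 0;
        }
        foreach ($read as $stream) {       // $read now holds only the readable streams
            call_user_func($this->callbacks[(int) $stream], $stream);
        }
        return $ready;
    }
}

// Demo: a connected socket pair stands in for a client connection.
list($client, $server) = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);

$received = [];
$reactor  = new Reactor();
$reactor->onReadable($server, function ($stream) use (&$received) {
    $received[] = fread($stream, 1024);
});

fwrite($client, 'ni hao');
$reactor->runOnce();
print_r($received);

fclose($client);
fclose($server);
```

A real server would call runOnce() in a loop, register the listening socket as well, and handle writes and disconnects.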

New: Kemar has posted some benchmarks on Linux in the discussion for this post, and the numbers are quite different from Windows XP (9 Oct 2003).

SAP interface to PHP
Recently we had a need to interface to SAP. To my pleasant surprise, there is a PHP extension that does this...

php-con West extends Early Bird to Oct. 10th!
Oops. Here's another announcement that slipped through the cracks while I was sick. Sorry Monica.

php-con West 2003's early bird deadline has been extended till Friday, October 10th. It's not too late to sign up for tutorial, technical session or full conference packages at reduced fees. Register online at http://www.php-con.com/ or download and fax one of our registration PDFs.

** Additional Savings and Special Promotions **

Do you work for a non-profit or university? Are you a student or a member of a PHP user group? You may be eligible for additional savings on registration fees.

* Students get 50% off registration when they include a current class schedule and photo ID with their faxed registration

* Employees of Yahoo!, Universities and Non-Profits can take an additional 10% off rates.

* Members of recognized PHP User Groups, phpclasses.org and PostNuke are also eligible for additional savings.

Want to find out if you qualify? Email monica#php-con.com for more information.

Tiki Wiki still enjoying phenomenal growth
Last week I got this email from Marc Laporte, informing me that Tiki Wiki has decided to switch to ADOdb as the database portability layer. I was sick at the time and have only just posted it. The statistics on Tiki Wiki are certainly impressive.

#4 most active project on Sourceforge
http://sourceforge.net/top/mostactive.php?type=week

#5 best rated app on Freshmeat.
http://freshmeat.net/stats/#rating

111 developers on SourceForge:
http://sf.net/project/memberlist.php?group_id=64258

1300+ members on
http://tikiwiki.org/

July 2003 Project of the Month on Sourceforge
http://sourceforge.net/potm/potm-2003-07.php

350 pages of fully illustrated documentation for Tiki 1.6
http://prdownloads.sf.net/tikiwiki/tiki16pdfmanual.zip?download

1.7 doc is in progress. Here is a list of "undocumented features":
http://tikiwiki.org/art25

An idea of what's ahead:
http://tikiwiki.org/ReleaseProcess18

Here are our current partners:
http://tikiwiki.org/TikiPartner

Here are future partners:
http://tikiwiki.org/FuturePartner

Tiki Wiki Discussion on database abstraction:
http://tikiwiki.org/DbAbstractionDev

I believe that Tiki Wiki was formerly using PEAR DB. People who are using PEAR DB and believe that switching to ADOdb is hard couldn't be more wrong! ADOdb has a PEAR DB emulation layer that is quite complete. I have ported PEAR DB applications by changing include files to ADOdb and they worked straight off.

Marc also mentions that they are looking for PHP and Java developers who would like to contribute to the project. Guys, keep up the good work!

Getting Twisted
Twisted is a framework, written in Python, for writing networked applications. It includes implementations of a number of commonly used network services such as a web server, an IRC chat server, a mail server, a relational database interface and an object broker. Developers can build applications using all of these services as well as custom services that they write themselves. Twisted also includes a user authentication system that controls access to services and provides services with user context information to implement their own security models.

This is certainly one area where PHP is lacking. Twisted looks really nice for developing specialized network servers.

PS: Also see David Mertz's multi-part tutorial on Twisted.

Tuning PHP Database Performance
In a typical web application, most of the time is spent querying the database, formatting the results and spitting the HTML back to the browser. This also means one of the biggest speed bottlenecks is your SQL queries.

I became interested in db tuning tools after using the excellent 3rd party Oracle development tool TOAD which has similar functionality. We have a team of PHP developers who use Oracle, and without a tool like TOAD, it would be impossible to identify bad SQL and database bottlenecks quickly. As our clients use a wide range of databases, I felt that a similar tuning tool that works across multiple databases would be useful. And to my surprise, popular tools such as PgAdmin, PHPMyAdmin and the like do not have equivalent functionality.

That is why the latest feature of ADOdb (download here) is performance monitoring. This allows you to log all SQL executed into a table called adodb_logsql (sometimes when it comes to naming, not being original is a virtue), together with the execution time and the name of the script that called it. This logging is done by calling

  $db->LogSQL(true);

ADOdb also provides tools to analyze bottlenecks in your SQL. After logging your SQL, ADOdb then classifies your SQL into

  • Suspicious SQL, which is SQL with high average execution times. It is suspicious because the SQL might be poorly written. Tuning these queries will improve the response times of your web application.

  • Expensive SQL, which is SQL with high total execution times (i.e. average execution time * # times executed). Tuning these queries will drop the load on your database server.

  • Invalid SQL, which is SQL that generates error messages.
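The difference between the first two categories is just the sort key, as this small sketch shows. It is not ADOdb's implementation: the log rows, the array layout and the function names are invented for the illustration, and the real data lives in the adodb_logsql table.

```php
<?php
// Invented sample log: one row per executed statement, with its time in seconds.
$log = [];
for ($i = 0; $i < 5; $i++) {
    $log[] = ['sql' => 'SELECT name FROM users WHERE id = ?', 'timer' => 0.3];
}
$log[] = ['sql' => 'SELECT * FROM big_report', 'timer' => 1.2];

// Group the log by SQL text and compute count, total and average time per statement.
function summarize_sql_log(array $log)
{
    $stats = [];
    foreach ($log as $row) {
        $sql = $row['sql'];
        if (!isset($stats[$sql])) {
            $stats[$sql] = ['sql' => $sql, 'count' => 0, 'total' => 0.0];
        }
        $stats[$sql]['count']++;
        $stats[$sql]['total'] += $row['timer'];
    }
    foreach ($stats as $sql => $s) {
        $stats[$sql]['avg'] = $s['total'] / $s['count'];
    }
    return array_values($stats);
}

// Return the statements with the largest value for $key ('avg' or 'total').
function top_by(array $stats, $key, $limit)
{
    usort($stats, function ($a, $b) use ($key) {
        return $b[$key] <=> $a[$key];
    });
    return array_slice($stats, 0, $limit);
}

$stats      = summarize_sql_log($log);
$suspicious = top_by($stats, 'avg', 1);     // highest average time
$expensive  = top_by($stats, 'total', 1);   // highest total time

echo 'Suspicious: ', $suspicious[0]['sql'], "\n";   // the slow one-off report query
echo 'Expensive:  ', $expensive[0]['sql'], "\n";    // the quick query that runs all the time
```

Here the report query is the suspicious one (1.2 seconds on average), while the user lookup is the expensive one (five runs of 0.3 seconds add up to more total time).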

We provide an HTML UI, which you can invoke with:

<?php
include_once('adodb.inc.php');
$db = NewADOConnection('mysql'); # driver
$db->Connect(...);
$perf = NewPerfMonitor($db);
$perf->UI();
?>

Click on "View SQL", and you will see your queries nicely classified. Clicking on an SQL query will bring up a new window that explains the execution plan, and lists the PHP scripts that call this SQL.

You can also control the number of SQL queries to view by entering a number in the input field at the top-right of the View SQL screen.

Field Test

This week I had my first successful field test of the system on a customer site. I have tuned several in-house systems already, but this was my first opportunity to test this on a system where I was not familiar with the code.

After gathering the statistics for half-an-hour, we opened the performance UI. Of course performance is relative: a query that returns 100,000 rows and takes 1 second is pretty fast, but any query that takes 0.1 seconds to retrieve one record is pretty slow.

We found several statements that process fewer than 10 records but were taking over a second to execute. A quick check revealed that the table in question was not properly indexed. After indexing, we saw a 1000% improvement in performance for those queries. Also some queries were pulling whole tables with thousands of rows from the database and performing joins in PHP; a simple query rewrite solved this.

We found all these problems within the first hour of using the new performance monitor. Before I had this tool, I would have had to interview the end-users on what pages were slow, manually benchmark the pages to check the response times, and then read the source code to try to identify the bottlenecks. This new feature saved me days of tuning! Good job, if I do say so myself.

PS: Thx for all the nice messages while I was sick.

Busy Sick Bumblebee
Busy with work pressures and sick (stomach bacterial infection) this week. Let the miserable rest.

PHP Component Model (PCOM) Wiki
Wow, Sebastian has put up a Wiki for his proposed PHP Component model. Feel free to join in.

Cyberinsecurity: The Cost of Monopoly
Technology analyst Daniel Geer, formerly the CTO of @Stake, was allegedly fired for co-authoring this report, which argues that Microsoft's monopoly on operating systems and its tight integration of software constitute a huge security risk to the world's technology infrastructure.

More details on the firing.

A Conversation with Joshua Bloch on Java 2, SE 1.5
The new language features all have one thing in common: they take some common idiom and provide linguistic support for it. In other words, they shift the responsibility for writing the boilerplate code from the programmer to the compiler.

An interesting release. From my C++ experience, the new generics feature is like the late Charles Bronson, an ugly as sin syntax but quite an attractive feature once you get used to the initial looks.

And here is feedback from some real Java programmers (read the comments for a plethora of views). Also see the Stephen Jungels article on 1.5.

PHP Component Model (PCOM)
The goal of this project is to develop a standard for the development of exchangable components that can be customized and plugged together using PHP Development Environments (PHP IDEs). -- Sebastian Bergmann

Very interesting. To make this succeed, we need to get through the hard part, which is agreeing on what common set of services everyone will support. To illustrate, here are some issues that come immediately to mind:

What exception mechanism will be supported if an invalid event or service occurs? I like PHP5's try/catch, but I dislike PEAR's error handling because it interferes with my in-house error handling mechanism.

Who is responsible for serialization and storage of component settings/properties? What serialization protocol will be used? For phpLens, we decided that we had no choice but store the properties as PHP code for speed.

Are multiple versions of the same component allowed like in .NET? PHP does not have an OOP model that allows multiple classes with the same name to co-exist.

There needs to be some dichotomy between edit and runtime modes. Different sets of services will need to be available. How will this work?

PS: I emailed Sebastian asking him to set up a Wiki/mailing-list on this. Let's see what happens.

Harry Fuecks and the Control Freaks
Harry explains his thoughts on processing logic for web pages. I like Harry's conclusion, but looking at the twisted way the conclusion was derived, all I can say is: Harry, I think your mind has been warped by too many design patterns. Throw them away and breathe freedom :-)

I also like this:

Put bluntly, what I'm saying here is I think most of the PHP frameworks out there today have got it "wrong". And yes this site is guilty of exactly these "crimes"...

In my opinion, there is too much mindless imitation of .NET and Java out there. Remember that PHP is a dynamic language, and many of the design choices in .NET and Java do not make sense for PHP.

Here's an example of something that is very hard to implement in VB.NET or C# or Java: PHP can dynamically recompile itself; the famous Smarty template compiler being just the tip of the iceberg. There is a world of PHP innovation out there beyond cloning PHPNuke or J2EE, just waiting to be discovered!
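To show what "dynamically recompile itself" can mean, here is a toy template compiler. It is not how Smarty works internally: the {name} placeholder syntax and both function names are invented, and a real engine would cache the generated file instead of deleting it.

```php
<?php
// Toy compiler: turn {name} placeholders into PHP echo statements.
function compile_template($template)
{
    return preg_replace(
        '/\{(\w+)\}/',
        '<?php echo htmlspecialchars($vars[\'$1\']); ?>',
        $template
    );
}

// Write the generated PHP to a temporary file and let PHP compile and run it.
function render_compiled($compiled, array $vars)
{
    $file = tempnam(sys_get_temp_dir(), 'tpl');
    file_put_contents($file, $compiled);
    ob_start();
    include $file;            // $vars is visible inside the included code
    $output = ob_get_clean();
    unlink($file);
    return $output;
}

$compiled = compile_template('Hello {name}!');
echo render_compiled($compiled, ['name' => 'Tiki & Wiki']), "\n";   // Hello Tiki &amp; Wiki!
```

The template is parsed once into ordinary PHP, and from then on PHP itself (and any opcode cache) does the work. Templates from untrusted sources would need escaping of any PHP tags they contain before being compiled this way.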

PS: Something different: Leendert Brouwer has posted an essay on template engines.

J2EE, .NET and PHP at MIT
A project done in Java will cost 5 times as much, take twice as long, and be harder to maintain than a project done in a scripting language such as PHP or Perl. People who are serious about getting the job done on time and under budget will use tools such as Visual Basic. -- Phil Greenspun

The nice thing about Java is that it is a good general purpose language. Unfortunately that strength also means that it is really good at nothing, and is a handicap in domain specific areas. You can argue that C# has the same defects as Java, so why is C# easier to use? The evidence shown here hints at why Java has failed in this area.

PS: The responses posted in comments are interesting also.

Code bloat and benchmarking database libraries, round 2
Recently, I came across this thread at SitePoint about code bloat. This is an issue because PHP in its default configuration has to recompile all scripts on execution. So comments and complex logic have to be reparsed on each request.

Now Shakespeare would have said, "Much ado about nothing." In Malaysia we say: "All talk, no action-lah!" If you really are interested in speed simply install a PHP accelerator which compiles your PHP once, and subsequent requests will use the cached compiled script. Then this is a non-issue.

Some of you might say you are in a shared hosting environment and your ISP doesn't provide an accelerator. Well, the fact that you're using shared hosting probably means the ability to operate under a heavy load is not really a requirement. And I have heard of