| Scripting
in .NET is bullshit? |
| So what is the problem? Where is the procedural .NET!?! Where
is the scripter's version of .NET!?! My colleagues are giving up on
Microsoft after years of effort to consolidate on the MS platform.
They do not want to be G. Andrew Duthie or Jesse Liberty. They want
to create a few HTML pages with forms and write some procedural
script on a server to process the data.
What I am starting to find are a number of PHP, Perl, and MySQL
books appearing on the shelves of my coworkers. And they are not
looking at Win2K or Win2003 as the platform or IIS as the server.
And please let's not waste time with the argument 'But you can
basically script in .NET.' BS is all I can say to such a
statement. -- Danny Boyd
Also see Richard
Tallent's response.
|
| Merry Christmas
and a Happy New Year |
| At this time of the year, there are better things to do than
blogging. This is a time for serious fun. Love ya all!
PS: PHP5 beta 3 has just been
announced.
|
| Protein
and Silicon Bugs |
| I'm currently sick, down with a protein bug. Not too much energy
today, but after a nap I felt energized enough to find a silicon
bug in PHP5. Thanks to Jonas for the initial forensics.
|
| Inappropriate
Abstractions: A Conversation with Anders Hejlsberg et
al |
| The trouble with the wrong abstraction is there's no way out
of it. In practice, though, it's very hard for class designers to
make reasonable guesses about even the scenarios in which their
designs will be used, much less the relative frequency of each kind
of use. You may think your users will want transparency, because it
lets them do really cool things, so you implement transparency. But
if it turns out 99% of your users never care, guess what? Those
people pay the tax. -- Eric Gunnerson
Lots of good advice in this article. I see so many authors of
PHP classes trying to do the right thing when the right thing to
do is far from clear. Getting the design right the first time is
really difficult. The key is not to be too ambitious.
Use simple abstractions, so that even if we screw up, it's easily fixed.
|
| .NET Rashomon
Rudeness |
| One fact, multiple perspectives:
- .NET continues to surprise me. For example, we recently had to play a .WAV
file on .NET. Luckily .NET allows you to call the underlying
operating system to play the file. We really have the best of both
worlds with .NET: platform and language portability, with hooks into
the OS.
- .NET continues to surprise me. It still remains a thin layer between the OS and you.
For example, we recently found out that playing a
.WAV file in .NET requires making a Win32 call. It's frustrating
that there is no .NET multimedia equivalent.
- .NET continues to surprise me. It still remains a thin hymen between the OS and you.
Once you penetrate it, and do stuff like playing sexy
music, you're no longer a virgin - you're hooked up to something
better than .NET sex - the Win32 api.
|
| Andrew
Stopford's the man for Rotor latte |
| Andrew Stopford wrote a book on PHP on Windows, but he's
addicted to Rotor latte nowadays. He's sipping on hot .NET and CLR
espresso on his blog. I read it to find out what's happening in the
.NET world. Great brewing, Andrew!
|
| The "Simple"
Art of Code Design |
| I just visited SitePoint's Advanced PHP Forums, and spotted a debate
on cache design. I only stumbled upon this because I saw that
Manuel Lemos posted to this thread. Manuel has a unique way of
writing that almost always guarantees a heated response. But Manuel
brings up a good point. Most cache classes out there could do with
better designs.
Now one good way of designing things is to define, at the beginning,
the constraints the code must work under, then work your code from
there. For example, for caching it could be:
- Works in multi-user environment and is portable. I'm also
assuming in this design that we are saving to a file.
- Data must have basic integrity checks. File IO can fail.
Programs can crash.
- Solid concurrency and stress testing. Caching is typically
most useful in overloaded environments.
Let's address these issues:
- Multi-user: You probably shouldn't be using PHP's
file locking, unless you're writing a private-label class only for
yourself. PHP's flock doesn't work on many file systems such as NFS and
FAT, so that's a big no-no. In adodb (and, I believe, newer versions of
smarty too), content saving uses the following file trickery:
a. save the contents into a temporary file.
b. delete the original cache file, eg. unlink($filename).
c. rename the temporary file, eg. rename($tempname, $filename).
A file rename is atomic on POSIX systems when both names are on the
same file system, and it is very efficient because file functions are
highly optimized. Between steps b and c the cache file briefly does
not exist, so readers must treat a missing file as a cache miss.
- Data integrity: You should store and check the file-size in
the cache file to ensure that the process didn't crash while
saving. If you want stronger integrity, use crc32. In adodb, we
store the file-size and use PHP's serialize function, which has
many built-in integrity checks. Using serialize internally also
means you can automatically save objects and arrays, not merely
strings.
- Testing: I've seen elaborate unit tests for caching. But
without the most basic concurrency test you might as well not do
any testing.
One simple testing recipe is to write a script that
continuously polls a cache file, saving occasionally. Output or log
all errors (having a debug mode makes it straightforward). Then
execute the same script concurrently in separate processes and
watch for problems. Let the testing cook for at least 10 minutes.
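The save-and-check steps above could be sketched roughly as follows. This is a minimal sketch, not the adodb or smarty code: cache_save() and cache_load() are made-up names, the header format (length and CRC32 on the first line) is an assumption, and the rename is tried first with the unlink step kept only as a fallback for platforms that refuse to replace an existing file.

```php
<?php
// Hypothetical helpers for illustration; not part of adodb or smarty.

function cache_save($filename, $value)
{
    $payload = serialize($value);
    // The header line stores the payload length and CRC32, so a
    // truncated or corrupted file can be detected when reading.
    $contents = strlen($payload) . ':' . crc32($payload) . "\n" . $payload;

    // a. Save into a temporary file in the same directory, so that the
    //    final rename stays on one file system.
    $tempname = tempnam(dirname($filename), 'cache');
    if ($tempname === false) {
        return false;
    }
    if (file_put_contents($tempname, $contents) !== strlen($contents)) {
        @unlink($tempname);
        return false;
    }

    // c. Rename over the cache file. If that fails, fall back to
    //    step b (delete the original) and retry once.
    if (!@rename($tempname, $filename)) {
        @unlink($filename);
        if (!@rename($tempname, $filename)) {
            @unlink($tempname);
            return false;
        }
    }
    return true;
}

function cache_load($filename, &$value)
{
    $contents = @file_get_contents($filename);
    if ($contents === false) {
        return false;                 // missing or unreadable: cache miss
    }
    $pos = strpos($contents, "\n");
    if ($pos === false) {
        return false;                 // no header line at all
    }
    $header  = explode(':', substr($contents, 0, $pos));
    $payload = substr($contents, $pos + 1);
    if (count($header) !== 2
        || (int) $header[0] !== strlen($payload)
        || (int) $header[1] !== crc32($payload)) {
        return false;                 // truncated or corrupted file
    }
    $value = unserialize($payload);
    return true;
}
```

For the stress test, a script that calls cache_load() in a loop and cache_save() every so often, started in several processes at once, should never log anything other than clean hits and clean misses.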
The above suggestions sound simple, but as I've learnt,
simplicity is not easy to achieve. In youth, everything is
oversimplified or too complicated - real simplicity comes from
experience. Unfortunately, it's only when we get too old and senile
that everything becomes truly simple ;-)
|
| It takes a
snake to prove that .NET is fast for scripting |
| Some language experts have said publicly that scripting
languages built on .NET or Java bytecodes will always perform poorly
due to limitations in bytecode design.
Jim Hugunin, contrarian to the core, has posted an email which
suggests the reverse, by comparing Python written in C (CPython)
with one Jim wrote using .NET (IronPython). Jim is the original author of Jython,
the Java implementation of Python, so he has the skills to back his
claim.
And if his claim is correct, I think that the push to move
scripting languages (eg. PHP) to Parrot or similar bytecode systems
which can be Just-In-Time compiled will accelerate. The downside is
that the overhead of dynamic evals is much higher, making scripts
that use eval() to dynamically compile code prohibitively expensive.
|
| Zend/Win
Enabler - Running PHP on Windows |
| I've been advocating
FastCGI as the most stable way to run PHP on Windows for a long
time. Now I see Zend has decided to commercialize such a FastCGI
solution.
Open source is an extremely incestuous activity. Everyone works
together and competes together (I just hope they don't sleep
together). The FastCGI work was done by Shane Caraveo, who works
for ActiveState, a company that sells products that directly
compete with Zend. And in the fine tradition of open source, there
is always a free solution when you don't want to pay anything - you
can use the free
FastCGI+PHP installer for Windows I wrote a while back.
|
| Study
Questions Companies' Ability to Handle J2EE |
| I'm back from my holidays, and am continuing to dive into Java.
I'm trying out Eclipse, been using
JBoss and am reading Thinking in Java.
Java remains an excellent replacement for coding in C or C++. But
for many domains, Java is too complex and overkill. Here's a recent
study that supports this view:
Wily Technology, a Brisbane, Calif. company that gauges how
well Java performs, said it has completed a study on the quality
of how Java 2 Platform, Enterprise Edition (J2EE) applications run
in certain enterprise environments.
After interviewing more than 350 J2EE users who work outside
the software industry, the research firm concluded that
applications running on the J2EE platform were available to users
88 percent of the time. "That means on average, companies are
losing over 20 hours, or one full day per week, of availability,"
said Lewis Cirne, founder and chief technology officer of Wily
Technology.
The Wiley study found that enterprises who rely on J2EE to run
their applications must endure at least one full day of downtime
per week. Downtime, a reviled word in the computing industry,
could mean lost dollars for companies that rely on software to
power their businesses.
|
| The Mans Aways
on Holidays |
| Back in 2 weeks' time, hopefully sane, zesty and healthy. And to
all you Malaysians - happy Raya holidays!
|
| PHP
Benchmark |
| Sebastian is doing some neat work on testing performance. I
found it hard to decipher the data, so I graphed it in Excel.
There seems to be some drop-off in performance in PHP 4.3.4. I
guess the core developers are putting their energies into PHP 5.
|
| 21st Century
Blues |
| I guess I've
finally arrived in the 21st century and it feels good. I have
submitted to the onslaught of new technology and just bought an
Apple iPod. I liked it so much
that I bought it even though I wasn't sure whether I had a FireWire
connection on my Windows notebook. Luckily it did, or I would have
felt really stupid.
|
| ProFont for Windows
and X-Windows |
| I remember one of my programming colleagues (Rajesh, a very
talented coder, at that time just fresh from college) many years ago
placing newlines after every line of code, using a lot of white
space, making his source code look very artistic. I pointed out to
him that when you can only see 10 lines of code on one screen, you
need to scroll up and down all the time, and it takes a very long
time to understand what's going on.
My advice to him was to chunk his code, making each dense chunk
represent some complex computation, and to make each chunk fit
one screen. I also gave him a second piece of advice. It was related
to fonts: avoid Courier - use a professional font where zeros are
slashed, and 1 and l are distinguishable (that's one and letter L).
That's where ProFont comes in. It's a great programming
font for the Apple Macintosh, designed to look good in small font
sizes. I remember using ProFont religiously on my Mac for software
development many years ago, and looking for something similar when I
moved to Windows. Recently I found that it has been ported to
Windows (and X-Windows too). If you want a panoramic view of your
code, but you don't want to go blind, this is the solution - highly
recommended.
PS: Use the .FON version. It's better optimized for small sizes
than the .TTF one.
|
| PEAR2: The
interface is the framework |
| PEAR1 is now suffering from the fact that it focused from a
long period of building technical foundations without planning the
community growth at the same time. -- Lukas Smith.
Now that the cat is out of the bag about PEAR's problems, I will
give my 2 cents worth.
Firstly, I want to say that I am a PEAR user, but not a PEAR
developer. I do use code from PEAR, every day. There is some
first-class code in there. But yes I do see problems, and I also have a
radical solution at the very end of this essay.
The biggest problem IMHO with PEAR is the arbitrary way things
are apparently run. Why is one class accepted, and another rejected?
If it is because they do the same thing, then why are there so many
classes that do the same thing in PEAR?
My perception is that
- there is no planning except for flavour of the day
- there are egos in the community, so people introduce new
classes rather than rewrite existing ones to add new functionality
- poor design in some of the original classes makes it
impractical to extend the original classes
- the quality of PEAR is uneven, brilliant gems combined with
duds - and the problem is that the duds begin with the base
classes (PEAR and PEAR_Error)
How can things improve? Obviously PEAR2 must be designed to grow
with the community. One way to grow in an open source world (if
there is no leader with the stature of Linus in the community) is to
give everyone a chance to contribute their own code fairly, without
the arbitrariness that we see today.
In the Perl world, we have CPAN, where there is no
thought police telling you that your code is not acceptable - you
just register as a developer, declare a namespace, and upload. Yes
there is lots of code duplication, and yes there is no coding
standard - but I do perceive that it is fair.
I have a suggestion for PEAR. I think they need to be more open,
and allow multiple people to develop similar classes for the same
category. What the PEAR group could do is standardize the APIs for
specific categories. Each API defines a minimum interoperable
subset of code for a specific category. Poor documentation will be
less of an issue if all contributed code shares a common base API.
I can see some similar classes in PEAR share common APIs
informally, but it would be better if it were standardized, so it is
less arbitrary and more inclusive. In fact, PHP5 has a good way of
enforcing this contract with the implements keyword. The
launch of PHP5 is also a good time to start, because we can finally
put PEAR exceptions to sleep and use built-in exceptions.
Now here comes the really radical suggestion - if your
contributed class conforms to that category's unit tests, it should
be accepted into PEAR. No ifs, no buts, no thought police. And how
would you pick which class to use if there are 10 implementations?
Let the PHP community decide! Votes or download statistics could be
displayed to "fairly" quantify the best code contribution for a
category.
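To make the idea concrete, here is a rough sketch of a category API enforced with implements, plus a shared acceptance test. Everything in it is hypothetical: Cache_Api, Array_Cache, Serialized_Cache and cache_category_test() are names invented for illustration and do not exist in PEAR.

```php
<?php
// Hypothetical category API; none of these names exist in PEAR.
interface Cache_Api
{
    public function save($key, $value);
    public function fetch($key, $default = null);
}

// First contribution: keeps values in a plain array.
class Array_Cache implements Cache_Api
{
    private $data = array();

    public function save($key, $value)
    {
        $this->data[$key] = $value;
    }

    public function fetch($key, $default = null)
    {
        return array_key_exists($key, $this->data) ? $this->data[$key] : $default;
    }
}

// Second contribution: a different design behind the same API.
class Serialized_Cache implements Cache_Api
{
    private $blob;

    public function __construct()
    {
        $this->blob = serialize(array());
    }

    public function save($key, $value)
    {
        $data = unserialize($this->blob);
        $data[$key] = $value;
        $this->blob = serialize($data);
    }

    public function fetch($key, $default = null)
    {
        $data = unserialize($this->blob);
        return array_key_exists($key, $data) ? $data[$key] : $default;
    }
}

// The category's acceptance test: any class that passes it qualifies.
function cache_category_test(Cache_Api $cache)
{
    $cache->save('answer', 42);
    return $cache->fetch('answer') === 42
        && $cache->fetch('missing', 'fallback') === 'fallback';
}
```

Both classes pass cache_category_test(), so under the proposal both would be accepted, and users would pick between them by votes or download counts.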
Lastly, I want to remind everyone that standardizing on a common
API does not inhibit creativity, provided that the API is sensibly
chosen. For example, ADOdb and MDB both have a PEAR DB emulation
layer. In fact common APIs do encourage innovation within a shared
framework (eg. the BSD forks go in different directions but share
code).
Thanks for listening.
|
| Today I ran
JBoss for the first time |
| Another personal milestone I just wanted to note before I
forget.
|
| Smash
the Windows |
| As our society becomes ever more dependent on information
technology, the gulf between those who understand computers and
those who don't will get wider and wider. In 50 years, perhaps much
less, the ability to read and write code will be as essential for
professionals of every stripe as the ability to read and write a
human language is today. If your children's children can't speak the
language of the machines, they will have to get a manual job - if
there are any left -- Dylan Evans.
|
| A Turning point
for PEAR? |
| A posting in the php.pear.dev newsgroup by one of PEAR's leading
developers. I think this is a very honest assessment, and I'm
hopeful about this.
> From: Xavier Noguer [mailto:xnoguer#xavier-noguer.com]
> Sent: Saturday, November 08, 2003 3:41 PM
> Martin Jansen <mj#php.net> wrote:
>
> > (http://pear.php.net/manual/en/developers.contributing.php)
> >
> > It is pretty funny to see how much developers have actually read the
> > "Developers Guide" ...
>
> I've read that guide. I just don't seem to be able to take it seriously
> when
> it lists requirements that have never, as far as I know, been voted by the
> pear group or the developer community at large, such as regressions tests
> (http://cvs.php.net/diff.php/peardoc/en/guide/developers/contributing.xml?
> r1=1.8&r2=1.9&ty=h)
I agree here.
While previous mistakes don't make a wrong a right, I think our whole manual
lacks any consistent concept of what we really feel needs to happen when and
where.
Anyways we have a bunch of messes that are a result of the long period of
limited peer review, followed by a period of package inflation, followed by
the today ruling confusion.
I think its time we fix our standards by starting from a clean slate with a
PEAR2. There we can think about how to best deal with our developer and
users base and how to great the best possible code in a PHP version which
actually supports our needs for OOP.
PEAR1 should of course be maintained as we all have an interest to keep that
code running and to use PEAR1 as a momentum towards a PEAR2 which build on
the past experience.
<rant>
PEAR1 is now suffering from the fact that it focused from a long period of
building technical foundations without planning the community growth at the
same time. This has lead to numerous problems and is making us very
inefficient. Of course PEAR1 has a lot to offer, but I don't think we are
scaling well and past mistakes seem to haunt us more and more, which we
don't seem to be able to fix. So I think we need to recognize our past which
we of course need to maintain to remain credible, but at the same time we
should work to build a more scalable PEAR2 in which we can address our
issues on a clean slate.
</rant>
> Would you be so kind to point me to the pear group document or public
> discussion in which this requirement was approved?
There is no such decision I can remember.
As George pointed out we discussed this point in Amsterdam, however I don't
remember that anyone decided on requiring documentation at first commit.
However our decisions there were mostly only concepts and not complete.
Anyways maybe someone should check when this was commited anyways.
Regards,
Lukas Smith
|
| Is
Novell-SuSE deal a brilliant Big Blue power play?
|
| Perhaps the most interesting take on the Novell-SuSE deal is the
above link. What has been missed by other commentators is that this
is a tri-partite agreement with IBM. David Berlind clarifies a lot
of things, even if it is still not the full picture.
Red Hat is also not standing still, abandoning its hobbyist roots
to sell only to Enterprises. As Bruce
Perens says, "The open source community is supposed to
produce Fedora so Red Hat can put a stamp on it and charge lots of
money for it."
Does anyone have any recommendations for free Linux distros? Ease
of installation and use are more important to me than power (hey,
the first computer I ever bought was a Mac.)
PS: Björn Schotte has some groovy
pictures of the recent PHP Conference 2003 in Germany.
|
| ADOdb 4.02 released
with PHP5 support |
| Been playing around with PHP5. ADOdb 4.02 now works
transparently with both PHP4 and PHP5. If PHP5 is detected then the
following features will be automatically enabled:
Support for PHP5 iterator overloading
    $rs = $DB->Execute("select * from table");
    foreach ($rs as $row => $fields) { var_dump($fields); }
Support for PHP5 exceptions
Just include adodb-exceptions.inc.php and you can now catch
exceptions on connection and execute errors as they occur.
    include("../adodb-exceptions.inc.php");
    include("../adodb.inc.php");
    try {
        $db = NewADOConnection("oci8");
        $db->Connect('','scott','bad-password');
    } catch (Exception $e) {
        var_dump($e);
    }
I managed to surprise myself: the PHP5 iterator code is backward
compatible with PHP4, even though the implements keyword is illegal
in PHP4, thanks to the magic of includes. The PHP5-only code sits in
a separate file that is included only when PHP5 is detected, so PHP4
never parses it.
And IMHO, the PHP5 iterator implementation with
IteratorAggregate and Iterator, though powerful, is
too complicated - certainly not in the spirit of PHP.
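For readers who have not met these interfaces, here is a minimal sketch of the IteratorAggregate route. It is not ADOdb's implementation: Demo_RecordSet is a made-up class, and the return type declaration assumes a current PHP version rather than the PHP5 beta discussed here.

```php
<?php
// Hypothetical record set for illustration; not ADOdb's code.
class Demo_RecordSet implements IteratorAggregate
{
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    // foreach calls getIterator() once, then walks the object it returns.
    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->rows);
    }
}

$rs = new Demo_RecordSet(array(
    array('id' => 1, 'name' => 'alice'),
    array('id' => 2, 'name' => 'bob'),
));

// Same loop shape as the ADOdb example: row number => fields.
$names = array();
foreach ($rs as $row => $fields) {
    $names[$row] = $fields['name'];
}
```

Delegating to ArrayIterator keeps this short. Implementing Iterator directly means writing five methods by hand (current, key, next, rewind and valid), which is where most of the complexity comes from.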
|
| PHP 4.3.4
|
| After a lengthy QA process, PHP 4.3.4 is finally out! This is
a medium size maintenance release, with a fair number of bug fixes.
All users are encouraged to upgrade to 4.3.4.
Bugfix release PHP 4.3.4 contains, among others, the
following important fixes, additions and improvements:
- Fixed disk_total_space() and disk_free_space() under FreeBSD.
- Fixed FastCGI being unable to bind to a specific IP.
- Fixed several bugs in mail() implementation on win32.
- Fixed crashes in a number of functions.
- Fixed compile failure on MacOSX 10.3 Panther.
- Over 60 various bug fixes!
The bug fix that concerns me most is this one:
Fixed bug 25404 (ext/pgsql: open transactions not closed
when script ends).
In our early days with PHP, we used MySQL a lot. Nowadays, most
of our PHP work is with PostgreSQL and Oracle (with the occasional
MSSQL project). MySQL is still a good database, but without triggers
and views, it no longer meets our company's needs.
|
| Sun,
Zend integrate PHP with Sun's Web server |
| Zend oversees the development of PHP and also sells a
commercial implementation of the technology. On Monday it released
two products that integrate with Version 6.1 of Sun's Java System
Web Server, allowing companies to deploy PHP on Sun's software. (The
Java System Web Server was known previously as the Sun ONE Web
Server.)
The two products are the PHP Enabler, which is intended to let
PHP programs run smoothly on Sun's Web server, and the Zend
Performance Suite, which uses code acceleration, content caching and
other software tricks to improve the performance of PHP on the Sun
platform, the companies said.
|
| Icky Sticky
Leaky PHPloat |
| After running PHP5 beta 2 for 2 days on Apache 1.3
(multi-threaded SAPI on Windows), I was surprised to find that the
process was taking 500 MB. There must be lots of memory leaks. I'm
pretty happy that most PHP code just runs, but obviously it's not
production ready.
Now the hard work starts: how to integrate new PHP5 functionality
without impacting old PHP4 code. After thinking a while, I realize
there are only a few things I can do to write portable code:
Unfortunately, most other features require explicit use of
keywords that are illegal in PHP4, eg. private, protected,
implements, etc. These require maintaining a separate codebase for
each version of PHP, or some special pre-processing to be done on
the code depending on the version. Does anyone have a better
suggestion?
PS: I will be releasing a new version of ADOdb soon, one that
should be compatible with PHP5.
|
| PHP 5 beta 2 is out |
| Just downloaded the Windows install and tested it.
To run on Apache, I copied php4apache.dll to the php5 root
directory, and modified Apache's httpd.conf:
    LoadModule php5_module c:/php5/php5b2/php4apache.dll
    AddModule mod_php5.c
    AddType application/x-httpd-php .php
For some reason, php_mysql.dll is not working. The error message
is "The procedure entry point mysql_create_db could not be located
in the dynamic link library LIBMYSQL.dll". I made sure that the
LIBMYSQL.dll was the one that came with the PHP release. I'm using
MySQL 4.0.12. Perhaps someone can comment on this.
However Oracle's oci8 extension is working fine. As most of our
software runs on Oracle, it was easy for us to continue testing
PHP5. 99% of all code ran fine. The only gotcha I found was that if
your function returns a reference, you can no longer do this:
    return $this->function();
but have to change your code to this:
    $ret =& $this->function(); # & is not needed if you don't support PHP4
    return $ret;
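A fuller sketch of the workaround, assuming a method declared to return by reference that forwards another such method. Registry, find(), get() and peek() are hypothetical names, and the sketch only demonstrates the bind-then-return form; how the direct form behaves varies by PHP version and is not tested here.

```php
<?php
// Hypothetical class for illustration only.
class Registry
{
    private $items = array('count' => 1);

    // Returns a reference to the stored element.
    public function &find($key)
    {
        return $this->items[$key];
    }

    // Forwards the reference: bind the callee's result to a local
    // variable first, then return that variable.
    public function &get($key)
    {
        $ret =& $this->find($key);
        return $ret;
    }

    // Plain by-value read, used to check the stored value.
    public function peek($key)
    {
        return $this->items[$key];
    }
}

$registry = new Registry();
$count =& $registry->get('count');
$count = 5;                          // writes through to the private array
$stored = $registry->peek('count');  // now 5
```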
A very impressive beta release except for the above glitches.
|
| Is Pharrot That
Fast? |
| I have not seen the sources of the Pharrot compiler, but if it
is just a proof of concept, it is not likely that much
error-checking is done. I expect the figures for Pharrot below to
increase by at least 50% when all the error-handling is put in.
The figures are still very impressive despite these reservations.
Generating a Mandelbrot fractal
– PHP, 2.4 seconds
– PHP-Hacked, 1.2 seconds
– Parrot without JIT, 0.5 seconds; x1.5 = 0.75 seconds
– Parrot with JIT (Intel), 0.08 seconds; x1.5 = 0.12 seconds
This
is the best Parrot reference, next to the source of course.
|
| The Shape of Pharrots to
Come |
| John Coggeshall mentions
that the PHP on Parrot project has been named "Pharrot" by the
php-con conference attendees.
Here's my take on things. Now I don't have any inside info, so
the following is entirely guesswork, and any resemblance to reality
is entirely your imagination:
- Although Sterling and Thies are very modest, given the fact
that they were given the closing keynote and the amazing
performance improvements - Pharrot will probably be PHP 6.
- The speed of the JIT means that PHP will become a general
programming language. A high performance application server
written 100% in PHP becomes practical. A high performance
anything becomes practical in PHP.
- The tribes using Parrot will probably include Python, PHP and
Perl. Code sharing between different programming tribes will
become a reality. This does not mean that there will be full
interoperability between all languages, because (a) there is no
common runtime library (yet), and (b) there is no consensus on what
will be the default PMCs
(Parrot's language extensions) installed.
- There will be battles fought over the run-time. In PHP4/5,
after execution, we throw away the opcodes together with the bath
water, or store them in shared memory. Parrot gives you more
choices. See the end of Dan Sugalski's Parrot
internals presentation (ppt).
- The Zend API is dead - big deal. Parrot is a big opportunity
for companies with skill and resolve. The tools market for open
source programming languages suddenly becomes much larger because
you are able to support so many more languages effectively.
- My prediction: the first beta of Pharrot will be out in 2006.
PS: Selkirk was prescient
about parrot. Smart chap.
|
| PHP & Parrot
(PDF) |
| Here is the PHP-Con Closing Keynote by Sterling Hughes and Thies
Arntzen on running PHP on Parrot.
Parrot is a virtual machine used to efficiently execute bytecode
for interpreted languages. Parrot will be the target platform to
which Perl 6 code is compiled.
From their slides: Parrot is FAST!
Generating a Mandelbrot fractal
– PHP, 2.4 seconds
– PHP-Hacked, 1.2 seconds
– Parrot without JIT, 0.5 seconds
– Parrot with JIT (Intel), 0.08 seconds
I presume that PHP-Hacked is the patched
PHP that Sterling and Thies released in August. I have been a
sceptic about Perl 6 because it has taken so long, but this is
impressive stuff.
|
| Natural Born
Killers of PHP |
| Recently I revised Optimizing
PHP, an article that I wrote in 2002. I'm pleased to say that it
hasn't aged much. The changes I had to make include a recommendation
to use FastCGI with IIS, adding Turck MMCache to the opcode cache
list, recommending Cache_Lite, replacing foreach with list/each for
large arrays, and the realisation that arrays need to be passed by
reference too.
Then I realized how much more I could have discussed. I started
to think more about performance profiling after discussing APD in a
previous
post.
Now there is some logic to the idea that functions with many
lines of code will run slower than short ones.
However sometimes it's the shortest code snippets that cause the real
slowdowns. That's because these snippets call external
functions that hide a lot of complexity behind a deceptively simple
and light exterior. With any of the following natural born killers,
one thoughtless line of code can result in an unbearable 100x
slowdown:
- SQL statements - for example, forgetting to add an index to a
large table will have a massive impact on performance.
- Regular expressions - because regular expressions work by
back-tracking when a match fails, it's easy to write a regular
expression that is exponentially expensive to compute. The longer
the string, the worse it becomes.
- Network calls - the increasing popularity of SOAP and similar
inefficient but easy-to-use protocols opens up new vistas of
unscalability for the unwary.
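The regular expression case is easy to see with a stopwatch. The sketch below is only illustrative: time_call() is a made-up helper (not part of XDebug or APD), and the pattern is a textbook example of nested quantifiers, chosen because it backtracks heavily on input that almost matches. Actual timings depend on the PCRE build and its limits, so none are quoted here.

```php
<?php
// Hypothetical timing helper; not part of XDebug or APD.
function time_call($callback, array $args = array())
{
    $start  = microtime(true);
    $result = call_user_func_array($callback, $args);
    return array($result, microtime(true) - $start);
}

// Nested quantifiers: on a subject that almost matches, the engine
// tries a huge number of ways to split the run of 'a's before failing.
$pattern = '/^(a+)+$/';

foreach (array(10, 16, 22) as $n) {
    $subject = str_repeat('a', $n) . 'b';
    list($matched, $seconds) = time_call('preg_match', array($pattern, $subject));
    // preg_match() returns 0 for "no match" and false when PCRE gives
    // up, for example after hitting pcre.backtrack_limit.
    printf("n=%d result=%s time=%.4fs\n", $n, var_export($matched, true), $seconds);
}
```

The same helper can wrap a database query or an RPC call, which is a crude way of measuring the other two killers on the list.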
The classic profiling techniques used by XDebug and APD give
you a summary of all functions with long execution times. This is
useful, but we need tools to easily pinpoint, measure and tune the
overhead of killers such as SQL statements, regular expressions and
RPC calls. I think that we already have enough Open Source CMS
projects out there. Tuning tools like this are a great idea for
students and developers with time to kill, looking for an
interesting Open Source project to start.
|
| PHP
Performance Profiling with APD |
| Good article on profiling PHP using APD, the Advanced PHP
Debugger. This is a bit of a misnomer, because APD is not a debugger
you use to step through your code, but is actually a diagnostic and
profiling tool. I mostly use XDebug for profiling, but
APD looks like a cool alternative.
And if you are interested in performance tuning, do have a look
at ADOdb's database
performance monitoring features. The dreadful thing about SQL is
that it is an iceberg of complexity hidden in a deceptively simple
query language. Bad PHP code can slow your application down by a
factor of 2-5 perhaps. But one bad SQL statement can cause a 10-100x
slowdown.
Lastly, if you are using Windows, unless you have the ability and
means to compile APD, you're out of luck. This is one area where
PECL (which is the official repository for PHP extensions) could
improve.
Update: George (APD's author)
mentions that Shane Caraveo has ported APD to Windows, and Will and
John add that pre-compiled PECL dll's for Windows are available from
here and there. (added 25
Oct 2003).
|
| PHP Compiler
Cache Internals |
| The latest English issue of PHP
Magazine has an interesting article about implementing a PHP
Opcode Cache by George Schlossnagle, the author of APC.
If you're familiar with the English expression, don't throw
the baby out with the bath water, then you will be amused to
learn that that's exactly how the Zend Engine (PHP's compiler)
works. It will compile the PHP into opcodes for a page request, and
throw the opcodes away immediately after the code completes.
This may sound really weird and inefficient, but of course Zeev
and Andi would not have been able to start their own company, Zend,
without a business plan that involved fixing this "stupidity". And
you thought Microsoft was evil ;-)
Now it is perfectly normal when developing a platform to leave
gaps for commercial vendors to fill. That creates an ecosystem where
we have companies willing to pay to maintain and promote PHP. So
this isn't meant to be an attack against Zend, but an
acknowledgement of business realities.
This omission in the Zend Engine prompted several
open source developers to create their own opcode caches. APC is one
of the earliest open source opcode caches.
In my benchmarks (yes, you see me benchmark a lot, because that's
the only way to understand the performance profile of PHP software
without spending a lot of time examining source code) I noticed that
the overhead of PHP opcode caches was less for small scripts.
Obviously there is some copying of instructions from the cache in
shared memory during script execution. The question was how much?
How did it affect performance?
Now we have the answer. George says restoration of the opcode
info for script execution "involves only a so-called shallow copy
of the op_array. A shallow copy means that only the structure itself
is copied, but none of the elements it contains pointers to."
This means that the opcodes themselves are not copied, only
the pointers to the structures that contain the opcodes. Apart from
that, the function and class metadata and any static variables are
restored, and the inheritance hierarchy is dynamically resolved.
So the overhead of the opcode cache is O(n), where n is the
number of functions + classes + inheritance levels + properties +
PHP files. It is not proportional to the number of lines of code -
that would be as worrying as throwing the baby out with the bath water.
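To make the shallow copy idea concrete, here is a minimal PHP sketch. The OpcodeList and OpArray classes and the restore_from_cache() function are hypothetical stand-ins invented for illustration; they are not the real Zend Engine or APC structures, which live in C and shared memory. The point is only that copying the outer structure is cheap because the opcodes it points to are shared, not duplicated.

```php
<?php
// Hypothetical stand-ins for the engine's structures (assumption, not real APC types).
final class OpcodeList
{
    public array $opcodes;

    public function __construct(array $opcodes)
    {
        $this->opcodes = $opcodes;
    }
}

final class OpArray
{
    public string $functionName;
    public OpcodeList $code; // held by object handle, which behaves like a pointer

    public function __construct(string $functionName, OpcodeList $code)
    {
        $this->functionName = $functionName;
        $this->code = $code;
    }
}

// "Restoring" from the cache: clone without a __clone() method is a shallow
// copy, so scalar fields are copied but $code still refers to the same object.
function restore_from_cache(OpArray $cached): OpArray
{
    return clone $cached;
}

$cached   = new OpArray('render_page', new OpcodeList(['ZEND_ECHO', 'ZEND_RETURN']));
$restored = restore_from_cache($cached);

// The outer structure is new, the opcode list is shared.
var_dump($restored !== $cached);
var_dump($restored->code === $cached->code);
```

The cost of the copy depends on how many such structures exist, not on how many opcodes each one points to, which matches the O(n) reasoning above.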
Another excellent issue of PHP Magazine!
|
| ADOdb 4.00
released |
| ADOdb 4.00 is out after a 3 month beta testing process.
The distinguishing feature of this release is the performance
monitoring functionality. AFAIK, it is the first Open Source
cross-platform, multi-database performance monitoring and health
check software in the world.
It features:
- A quick health check of your database server using
$perf->HealthCheck() or $perf->HealthCheckCLI().
- A user interface for performance monitoring, $perf->UI().
This UI displays:
  - the health check,
  - all SQL logged and their query plans,
  - a list of all tables in the current database,
  - an interface to continuously poll the server for key
  performance indicators such as CPU, Hit Ratio, Disk I/O.
- An API to build database monitoring tools for a server farm.
For example, calling $perf->DBParameter('data cache hit ratio')
returns this very important statistic in a database independent
manner.
ADOdb also has the ability to log all SQL executed, using ADOdb's
LogSQL feature. All SQL logged can be analyzed through the
performance monitor user interface. In the View SQL mode, we
categorize the SQL into 3 types:
- Suspicious SQL: queries with high average execution
times, which are potential candidates for rewriting.
- Expensive SQL: queries with high total execution times
(#executions * avg execution time). Optimizing these queries will
reduce your database server load.
- Invalid SQL: queries that generate errors.
Each query is hyperlinked to a description of the query plan, and
every PHP script that executed that query is also shown.
Databases that work with the performance monitoring features
include MySQL, PostgreSQL, Oracle, Informix, MSSQL, DB2. Code
contributions are very welcome.
Download: http://php.weblogs.com/adodb#downloads
Performance Monitoring Docs: http://phplens.com/lens/adodb/docs-perf.htm
|
| PHP EasyWindows Installer
4.3.3.1 released |
| The latest release of this installer sets up PHP 4.3.3,
FastCGI, Turck MMCache 2.4.1, PEAR and ADOdb for IIS, Apache 1.3 and
2.0. This is the installer that we use in-house for our Windows
projects.
My recent benchmarks with PHP to test XML-RPC performance show
that IIS is the fastest PHP web server on Windows. The XML-RPC
benchmark was a PHP script with 1 select and 2 update queries to
Oracle.
Web server     XML-RPC reqs/sec
Apache 1.3     44
Apache 2.0     42
IIS+FastCGI    108
If you want a PHP installer that is tuned for IIS with
FastCGI, this is the one.
Formerly, the version numbers of the installer were not
synchronized with PHP. Now the numbering system will match the PHP
version for easy tracking.
|
| The
Philosophy of PHP |
Every couple of months, someone requests that the Perl-style regular
expression operator (=~) be implemented for PHP. The PHP internals
group's response gives a good insight into their clarity-first
design style:
I don't think there's a chance we'd agree to
implement such regex operators (wether the singular or plural
versions :) Except for it pushing PHP in Perl's direction of being
unreadable it doesn't really give any added value. I don't see how
it is a significant improvement over using a function such as
preg_match(). (Actually I think the latter is more readable).
Here's a small quote from Mr. Ritchie: "A language that doesn't
have everything is actually easier to program in than some that
do."
I think Perl is a prime example of why the quote is
correct. -- Andi Gutmans
and
You are pushing towards
$_~=/^\.*?\$$/;
This is not human-readable code and one of the basic
characteristics that sets PHP apart from Perl. Every non-trivial
line of PHP code has a decypherable keyword that you can plug into
the manual to figure out what that line is doing. We make sure of
this by keeping the number of operators to a minimum. As for your
bitshifting example. It has nothing to do with the frequency of
use, it has to do with readability. -- Rasmus
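For comparison, here is roughly what the Perl one-liner quoted above looks like with preg_match(). This is only a sketch: the function name is_dots_then_dollar() is made up for illustration, and the pattern is copied from the quote as-is (optional literal dots followed by a literal dollar sign at the end of the string).

```php
<?php
// Hypothetical helper name; the pattern is the one from the quoted Perl line.
function is_dots_then_dollar(string $subject): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/^\.*?\$$/', $subject) === 1;
}

var_dump(is_dots_then_dollar('..$'));
var_dump(is_dots_then_dollar('a$'));
```

The regular expression itself is no prettier, but the line now carries a keyword, preg_match, that can be looked up in the manual, which is the point Rasmus makes.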
|
| We
need barking dogs to fix "The PHP Scalability Myth"
|
The article linked above is great, but you must be an
asphyxiating ostrich burrowed in the sand to say that PHP doesn't
scale. It's not that slow a programming language, it has few
bottlenecks, and perhaps most important - there are many case
studies of large web sites using it for mission critical work (e.g.
Lufthansa's online booking and Yahoo).
However I do feel something is still missing from PHP. Not in the
language per se, but in the conceptual overview. What we really need
are dogs barking and cats meowing in our very own virtual Pet Shop.
For people who don't understand this reference, the Pet Shop is a
famous web application created to demonstrate best practices in
scalability and software design for J2EE and .NET.
In contrast, there is no reference PHP Pet Shop, and no
accepted and well-documented methodology on how to create scalable
web-sites with PHP. So you need to be very smart, or get the advice
of technical gurus to keep everything scalable and running smoothly.
It's no accident that Yahoo, perhaps the biggest web-site that has
invested heavily in PHP, employs so many cool cats who have a deep
knowledge of PHP such as Rasmus and Andrei.
Update: Slashdot
discusses this article. I liked this
comment best:
The sad reality is that so few developers know
enough to fully exploit J2EE that they wind up doing little more
than what PHP does better in the first place.
True wisdom is knowing your limitations (18 Oct 2003).
|
| 10,000 ways to
Ni Hao with Unicode and PHP |
| Joel Spolsky has been cursing the lack of support for Unicode in
PHP. So last week, he wrote this great article on The
Absolute Minimum Every Software Developer Absolutely, Positively
Must Know About Unicode and Character Sets (No Excuses!).
Of course, that still doesn't answer his initial question of how
to get Unicode working with PHP. Well Scott Reynen had the solution
and wrote How
to develop multilingual, Unicode applications with PHP in
response to Joel's frustration.
Scott's technique works on all versions of PHP. Or you can just
use the UTF-8 character set and the mbstring
functions, which should run faster as they are coded in C. To use
mbstring, you need PHP 4.3 or later (it was buggy pre-4.3). On Unix,
you will need to compile the extension in. On Windows, you just need
to modify your php.ini.
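A small sketch of why the mbstring functions matter for UTF-8 text, assuming the mbstring extension is loaded. The greeting is written with Unicode escapes (which need PHP 7 or later) so the example does not depend on the encoding of the source file.

```php
<?php
// "Ni Hao" as UTF-8; each of the two characters takes 3 bytes.
$greeting = "\u{4F60}\u{597D}";

$byteCount = strlen($greeting);                  // counts bytes
$charCount = mb_strlen($greeting, 'UTF-8');      // counts characters
$firstChar = mb_substr($greeting, 0, 1, 'UTF-8'); // first character, not first byte

printf("%d bytes, %d characters\n", $byteCount, $charCount);
```

The byte-oriented functions such as strlen() and substr() would cut a multi-byte character in half here, which is the kind of problem Joel was complaining about.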
Update: l0t3k is working on a Unicode I18N extension
based on IBM's open source International Components for
Unicode. The extension includes a UnicodeString class with the
requisite searching, replacing, casing, trimming, and classification
methods. Get the CVS version (16 Oct 2003).
PS: Ni Hao
means hello in Chinese.
|
| Oracle
ramping up its support for PHP? |
| Recently, Christopher Jones from Oracle Australia emailed me,
asking me a few questions about ADOdb. Apparently Oracle is looking
into ramping up its PHP support, and is starting to study what PHP
developers are actually using to connect to Oracle databases.
It was a pleasant surprise to find out that he was 1 year behind
me at the University of Melbourne, Australia. We never met, but
probably passed each other in the dingy computer labs. Ah, youth - I
was quite a skinny scruffy lad, and 22 kilos lighter too (50 lbs).
And for those of you who are having problems with Oracle and PHP,
this forum at Oracle can be used to ask PHP questions. The nice thing
about this forum is that you don't need to get an expensive MetaLink
account (Oracle's support site) to get answers from real experts.
A quick dig through the forums revealed some useful nuggets of
information that I didn't know, such as this guide
on using DBMS_OUTPUT with PHP. DBMS_OUTPUT is Oracle's
equivalent of echo.
|
| Bridging PHP
and .NET with XML-RPC |
| A couple of months ago, I mentioned that we are developing rich
Windows clients running on .NET. We are still using PHP on the web
server, and using XML-RPC for communication between .NET and PHP. We
could have tried to use SOAP, but at that time there were too many
deficiencies in NuSOAP's support of arrays and collections.
We used the following XML-RPC libraries: Keith Devens's for
PHP, and the Cook Computing
library for .NET. It turned out to be very simple to get
everything working. The docs were clear, and with half a day of
coding, we got basic connectivity up.
We've been happy using Keith Devens's PHP library, but recently we
decided to test using the epinions xmlrpc
extension that comes bundled with PHP. We thought it would be
significantly faster as it was coded in C. To our surprise, it made
very little difference to performance. The improvement in speed was
just 1%. The reason is simple: most of the overhead is in the
networking, Apache processing, and actual computation. The breakdown
of time is roughly:
- 10% testing overhead
- 50% networking/Apache overhead
- 40% PHP computation (1 query and 2 updates to Oracle)
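For readers who have not seen XML-RPC on the wire, here is a hand-rolled sketch of the request body that a client posts to the PHP server. It uses neither of the libraries mentioned above; the function names build_xmlrpc_value() and build_xmlrpc_call(), and the method name orders.update, are hypothetical. Only scalar parameters are handled, which keeps it short.

```php
<?php
// Encode one scalar PHP value as an XML-RPC <value> element (sketch only:
// arrays, structs, dates and base64 are deliberately left out).
function build_xmlrpc_value($value): string
{
    if (is_bool($value)) {
        return '<value><boolean>' . ($value ? '1' : '0') . '</boolean></value>';
    }
    if (is_int($value)) {
        return '<value><int>' . $value . '</int></value>';
    }
    if (is_float($value)) {
        return '<value><double>' . $value . '</double></value>';
    }
    $escaped = htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    return '<value><string>' . $escaped . '</string></value>';
}

// Build the <methodCall> document for a method name and a list of parameters.
function build_xmlrpc_call(string $method, array $params): string
{
    $xml = '<?xml version="1.0"?><methodCall>';
    $xml .= '<methodName>' . htmlspecialchars($method, ENT_XML1, 'UTF-8') . '</methodName>';
    $xml .= '<params>';
    foreach ($params as $param) {
        $xml .= '<param>' . build_xmlrpc_value($param) . '</param>';
    }
    return $xml . '</params></methodCall>';
}

// Hypothetical method name and arguments.
$payload = build_xmlrpc_call('orders.update', [42, 'shipped & billed']);
echo $payload, "\n";
```

Building or parsing a document this small is cheap, which fits the finding that swapping the PHP library for the C extension changed so little.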
Given that the client/server testing was done on a single
machine, that 50% networking/Apache overhead is quite high. This
explains my interest in Nanoweb and simpler alternatives.
Update: On Windows, it appears IIS
is still the high performance solution. Apache 2.0.46 and 1.3.28 did
not exhibit good performance (with PHP running as a SAPI). But with
IIS 5 running FastCGI, the networking/IIS overhead drops to about
10-15%.
In my tests, I was pounding the web server with 100 simulated
clients, and I suspect that there are still some internal
bottlenecks crippling PHP performance as a threaded Apache SAPI
(either in the threads support or oci8 extension). I think that
Apache with FastCGI would perform better, but I don't have the time
to test this.
|
| Twisting by the
PHPool |
| Recently I posted a link to Twisted, a Python library for
developing socket servers. This generated a storm of interest as to
whether PHP is suitable for developing similar custom networking
applications. You should read the commentary in the above link for
some interesting remarks by BDKR.
So I benchmarked Nanoweb, a webserver written in PHP, against
Apache. The test repeatedly requested a 4K HTML file
(adodb-session.htm), using Microsoft's WAST set at 10 concurrent
threads. All software was running on a 2.6 GHz Win XP machine.
PHP-CLI 4.3.3 was used to run Nanoweb.
Web server       Requests/sec
Nanoweb          176
Apache 1.3.28    300
These figures suggest that a PHP networking app (Nanoweb)
running on brand-new 2003 hardware (3 GHz PC) will be faster than a
program written in C (Apache) on good year 2000 hardware (1 GHz PC).
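The arithmetic behind that claim can be sketched as follows. The big assumption, labeled in the code too, is that throughput scales linearly with CPU clock speed; real servers also depend on memory, disk and network, so treat the numbers as a rough extrapolation from the 176 and 300 requests/sec measured on the 2.6 GHz machine. The helper scale_throughput() is made up for this example.

```php
<?php
// ASSUMPTION: requests/sec scale linearly with clock speed. This is a crude
// model used only to illustrate the comparison, not a measured result.
function scale_throughput(float $reqsPerSec, float $measuredGhz, float $targetGhz): float
{
    return $reqsPerSec * $targetGhz / $measuredGhz;
}

$nanowebAt3Ghz = scale_throughput(176, 2.6, 3.0); // PHP server on 2003 hardware
$apacheAt1Ghz  = scale_throughput(300, 2.6, 1.0); // C server on 2000 hardware

printf("Nanoweb at 3 GHz: about %.0f reqs/sec\n", $nanowebAt3Ghz);
printf("Apache at 1 GHz:  about %.0f reqs/sec\n", $apacheAt1Ghz);
```

Under that assumption Nanoweb lands at a little over 200 requests/sec and Apache at roughly 115, so the PHP server on new hardware comes out ahead.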
Now if someone had told me in 2000 that in 3 years I would be
able to run a web server written in PHP, and that it would run
faster than the Apache of the year 2000, I wouldn't have believed
the person. I think this sort of performance is a fantastic
achievement for PHP and its developers.
PS: For those of you who read the Twisted docs, you will see that
it does not use threads, but callbacks. It should be possible to do
something similar in PHP too.
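Here is a minimal sketch of what callbacks instead of threads could look like in PHP, using stream_select() to wait on several streams in a single process. It is not how Twisted or Nanoweb are implemented; run_event_loop() is a hypothetical name, and the usage part assumes a Unix-like system where stream_socket_pair() with STREAM_PF_UNIX is available.

```php
<?php
// Watch a set of streams and invoke $onData whenever one has data to read.
// Returns the number of callbacks made. One process, no threads.
function run_event_loop(array $streams, callable $onData, int $timeoutSeconds = 1): int
{
    $callbacks = 0;
    while (count($streams) > 0) {
        $read = $streams;   // stream_select() modifies its arguments
        $write = null;
        $except = null;
        $ready = stream_select($read, $write, $except, $timeoutSeconds);
        if ($ready === false || $ready === 0) {
            break; // error, or nothing happened within the timeout
        }
        foreach ($read as $stream) {
            $data = fread($stream, 8192);
            if ($data === false || $data === '') {
                // End of stream: stop watching it.
                $key = array_search($stream, $streams, true);
                unset($streams[$key]);
                fclose($stream);
                continue;
            }
            $onData($stream, $data);
            $callbacks++;
        }
    }
    return $callbacks;
}

// Usage: a connected socket pair stands in for a client and a server.
[$client, $server] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
fwrite($client, 'ping');
fclose($client);

$received = [];
$count = run_event_loop([$server], function ($stream, string $data) use (&$received) {
    $received[] = $data;
});
```

A real server would also watch a listening socket and add each accepted connection to the list of streams, but the control flow stays the same: the loop waits, and your code runs only inside callbacks.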
New: If you click on Discuss,
you will see that Kemar has posted some benchmarks on Linux, and the
numbers are quite different from Windows XP (9 Oct 2003)
|
| php-con West extends Early Bird to
Oct. 10th! |
| Oops. Here's another announcement that slipped through the
cracks while I was sick. Sorry Monica.
php-con West 2003's early bird deadline has been extended till
Friday, October 10th. It's not too late to sign up for tutorial,
technical session or full conference packages at reduced fees.
Register online at http://www.php-con.com/ or
download and fax one of our registration PDFs.
** Additional Savings and Special Promotions **
Do you work for a non-profit or university? Are you a student or
a member of a PHP user group? You may be eligible for additional
savings on registration fees.
* Students get 50% off registration when they include a current
class schedule and photo ID with their faxed registration
* Employees of Yahoo!, Universities and Non-Profits can take an
additional 10% off rates.
* Members of recognized PHP User Groups, phpclasses.org and
PostNuke are also eligible for additional savings.
Want to find out if you qualify? Email monica#php-con.com for
more information.
|
| Getting Twisted
|
| Twisted is a framework, written in Python, for writing
networked applications. It includes implementations of a number of
commonly used network services such as a web server, an IRC chat
server, a mail server, a relational database interface and an object
broker. Developers can build applications using all of these
services as well as custom services that they write themselves.
Twisted also includes a user authentication system that controls
access to services and provides services with user context
information to implement their own security models.
This is certainly one area where PHP is lacking. Twisted looks
really nice for developing specialized network servers.
PS: Also see David Mertz's multi-part
tutorial on Twisted.
|
| Tuning PHP
Database Performance |
| In a typical web application, most of the time is spent querying
the database, formatting the results and spitting the html back to
the browser. This means your SQL queries are among the biggest
speed bottlenecks.
I became interested in db tuning tools after using the excellent 3rd
party Oracle development tool TOAD, which has similar
functionality. We have a team of PHP developers
who use Oracle, and without a tool like TOAD, it would be impossible
to identify bad SQL and database bottlenecks quickly. As our clients
use a wide range of databases, I felt that a similar tuning tool
that works across multiple databases would be useful. And to my
surprise, popular tools such as pgAdmin, phpMyAdmin and the like do
not have equivalent functionality.
That is why the latest feature of ADOdb (download here) is
performance monitoring. This allows you to log all SQL executed into
a table called adodb_logsql (sometimes when it comes to naming, not
being original is a virtue), together with the execution time and
the name of the script that called it. This logging is done by
calling $db->LogSQL(true);
ADOdb also provides tools to analyze bottlenecks in your SQL.
After logging your SQL, ADOdb then classifies your SQL into
- Suspicious SQL, which is SQL with high average
execution times. It is suspicious because the SQL might be poorly
written. Tuning these queries will improve the response times of
your web application.
- Expensive SQL, which is SQL with high total execution
times (i.e. average execution time * number of times executed).
Tuning these queries will reduce the load on your database server.
- Invalid SQL, which is SQL that generates error
messages.
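The difference between suspicious and expensive SQL is easy to see with a small sketch. This is not ADOdb's code, and the log format below (an array of rows with sql and seconds keys) is an assumption made for the example rather than the layout of the adodb_logsql table. The functions summarize_sql_log() and top_by() are hypothetical.

```php
<?php
// Group log rows by SQL text and compute count, total and average time.
function summarize_sql_log(array $log): array
{
    $stats = [];
    foreach ($log as $row) {
        $sql = $row['sql'];
        if (!isset($stats[$sql])) {
            $stats[$sql] = ['sql' => $sql, 'count' => 0, 'total' => 0.0, 'avg' => 0.0];
        }
        $stats[$sql]['count']++;
        $stats[$sql]['total'] += $row['seconds'];
    }
    foreach ($stats as $sql => $entry) {
        $stats[$sql]['avg'] = $entry['total'] / $entry['count'];
    }
    return array_values($stats);
}

// Return the $limit entries with the highest value in $field.
function top_by(array $stats, string $field, int $limit): array
{
    usort($stats, function (array $a, array $b) use ($field) {
        return $b[$field] <=> $a[$field];
    });
    return array_slice($stats, 0, $limit);
}

// Made-up sample: one slow query run once, one fast query run 100 times.
$log = array_merge(
    [['sql' => 'SELECT * FROM orders', 'seconds' => 2.0]],
    array_fill(0, 100, ['sql' => 'SELECT name FROM users WHERE id = ?', 'seconds' => 0.05])
);

$stats      = summarize_sql_log($log);
$suspicious = top_by($stats, 'avg', 1);   // highest average execution time
$expensive  = top_by($stats, 'total', 1); // highest total execution time
```

In this sample the slow query tops the suspicious list, while the fast but frequent query tops the expensive list because its 100 executions add up to more database time.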
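We provide an HTML UI, which you can invoke with:
<?php
include_once('adodb.inc.php');     # load the ADOdb library
$db = NewADOConnection('mysql');   # driver
$db->Connect(...);                 # connection parameters go here
$perf = NewPerfMonitor($db);       # performance monitor for this connection
$perf->UI();                       # display the HTML user interface
?>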
Click on "View SQL" and you will see your queries nicely
classified. Clicking on an SQL query will bring up a new window that
explains the execution plan, and lists the PHP scripts
that call this SQL.
You can also control the number of SQL queries to view by
entering a number in the input field at the top-right of the View
SQL screen.
Field Test
This week I had my first successful field test of the system on a
customer site. I have tuned several in-house systems already, but
this was my first opportunity to test this on a system where I was
not familiar with the code.
After gathering the statistics for half-an-hour, we opened the
performance UI. Of course performance is relative: a query that
returns 100,000 rows and takes 1 second is pretty fast, but any
query that takes 0.1 seconds to retrieve one record is pretty
slow.
We found several statements that process fewer than 10 records but
were taking over a second to execute. A quick check revealed that
the table in question was not properly indexed. After indexing, we
saw a 1000% improvement in performance for those queries. Also some
queries were pulling whole tables with thousands of rows from the
database and performing joins in PHP; a simple query rewrite solved
this.
We found all these problems within the first hour of using the
new performance monitor. Before I had this tool, I would have had to
interview the end-users on what pages were slow, manually benchmark
the pages to check the response times, and then read the source code
to try to identify the bottlenecks. This new feature saved me days
of tuning! Good job, if I do say so myself.
PS: Thx for all the nice messages while I was sick.
|
| Busy Sick
Bumblebee |
| Busy with work pressures and sick (stomach bacterial infection)
this week. Let the miserable rest.
|
| Cyberinsecurity:
The Cost of Monopoly |
| Technology analyst Daniel Geer, formerly the CTO of @Stake, was
allegedly fired for co-authoring this report, which argues that
Microsoft's monopoly on operating systems and its tight integration
of software constitute a huge security risk to the world's
technology infrastructure.
More details
on the firing.
|
| A
Conversation with Joshua Bloch on Java 2, SE 1.5
|
| The new language features all have one thing in common: they
take some common idiom and provide linguistic support for it. In
other words, they shift the responsibility for writing the
boilerplate code from the programmer to the compiler.
An interesting release. From my C++ experience, the new generics
feature is like the late Charles Bronson: an ugly-as-sin syntax, but
quite an attractive feature once you get used to the initial looks.
And here is feedback from some
real Java programmers (read the comments for a plethora of views).
Also see the Stephen
Jungels article on 1.5.
|
| PHP Component Model
(PCOM) |
| The goal of this project is to develop a standard for the
development of exchangeable components that can be customized and
plugged together using PHP Development Environments (PHP IDEs).
-- Sebastian Bergmann
Very interesting. To make this succeed, we need to get through
the hard part, which is agreeing on what common set of services
everyone will support. To illustrate, here are some issues that come
immediately to mind:
What exception mechanism will be supported if an invalid event or
service occurs? I like PHP5's try/catch, but I dislike PEAR's error
handling because it interferes with my in-house error handling
mechanism.
Who is responsible for serialization and storage of component
settings/properties? What serialization protocol will be used? For
phpLens, we decided that we had no choice but to store the properties
as PHP code for speed.
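To illustrate what storing properties as PHP code can mean, here is a minimal sketch built on var_export(). It is not the phpLens format; the function names and the sample settings are invented for the example, and eval() stands in for writing the code to a file and loading it with include.

```php
<?php
// Turn a settings array into a snippet of PHP source that rebuilds it.
function export_settings(array $settings): string
{
    return 'return ' . var_export($settings, true) . ';';
}

// Run the generated source to get the settings back. With a file on disk
// you would use include instead, so an opcode cache could keep it compiled.
function import_settings(string $phpCode): array
{
    return eval($phpCode);
}

// Hypothetical component settings.
$settings = ['title' => 'Customers', 'pageSize' => 20, 'columns' => ['id', 'name']];

$code     = export_settings($settings);
$restored = import_settings($code);
```

The appeal is that loading the settings is just running PHP, with no separate parser in the way; the price is that the stored file is executable code and must be trusted.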
Are multiple versions of the same component allowed like in .NET?
PHP does not have an OOP model that allows multiple classes with the
same name to co-exist.
There needs to be some dichotomy between edit and runtime modes.
Different sets of services will need to be available. How will this
work?
PS: I emailed Sebastian asking him to set up a Wiki/mailing-list
on this. Let's see what happens.
|
| Harry
Fuecks and the Control Freaks |
| Harry explains his thoughts on processing logic for web pages. I
like Harry's conclusion, but looking at the twisted way the
conclusion was derived, all I can say is: Harry, I think your mind
has been warped by too many design patterns. Throw them away and
breathe freedom :-)
I also like this:
Put bluntly, what I'm saying here is I think most of the PHP
frameworks out there today have got it "wrong". And yes this site is
guilty of exactly these "crimes"...
In my opinion, there is too much mindless imitation of .NET and
Java out there. Remember that PHP is a dynamic language, and many of
the design choices in .NET and Java do not make sense for PHP.
Here's an example of something that is very hard to implement in
VB.NET or C# or Java: PHP can dynamically recompile itself; the
famous Smarty template compiler
being just the tip of the iceberg. There is a world of PHP
innovation out there beyond cloning PHPNuke or J2EE, just waiting to
be discovered!
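As a toy illustration of code that generates and runs PHP at runtime, here is a tiny template compiler. It is far simpler than Smarty and does not use Smarty's syntax or API; the {$name} placeholder style and both function names are assumptions made for this sketch, and eval() stands in for writing the compiled script to disk and including it.

```php
<?php
// Compile a template with {$name} placeholders into PHP source code.
function compile_template(string $template): string
{
    // With PREG_SPLIT_DELIM_CAPTURE the result alternates between literal
    // text (even indexes) and captured variable names (odd indexes).
    $parts = preg_split('/\{\$(\w+)\}/', $template, -1, PREG_SPLIT_DELIM_CAPTURE);
    $code = '';
    foreach ($parts as $index => $part) {
        if ($index % 2 === 0) {
            $code .= 'echo ' . var_export($part, true) . ';';
        } else {
            $code .= 'echo htmlspecialchars((string) ($vars[' . var_export($part, true) . "] ?? ''));";
        }
    }
    return $code;
}

// Execute previously compiled template code against a set of variables.
function render_compiled(string $code, array $vars): string
{
    ob_start();
    eval($code); // the generated code reads from $vars
    return ob_get_clean();
}

$compiled = compile_template('Hello, {$name}!');
$output   = render_compiled($compiled, ['name' => 'World']);
```

The template is parsed once; after that, rendering is plain PHP execution, which is the trick that makes compiled templates fast.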
PS: Something different: Leendert Brouwer has posted an essay on
template
engines.
|
| J2EE,
.NET and PHP at MIT |
| A project done in Java will cost 5 times as much, take twice
as long, and be harder to maintain than a project done in a
scripting language such as PHP or Perl. People who are serious about
getting the job done on time and under budget will use tools such as
Visual Basic. -- Phil Greenspun
The nice thing about Java is that it is a good general purpose
language. Unfortunately that strength also means that it is really
good at nothing, and is a handicap in domain specific areas. You can
argue that C# has the same defects as Java, so why is C# easier
to use? The evidence shown here hints why Java has failed in this
area.
PS: The responses posted in comments
are interesting also.
|
| Code bloat and benchmarking
database libraries, round 2 |
| Recently, I came across this
thread at SitePoint about code bloat. This is an issue because
PHP in its default configuration has to recompile all scripts on
execution. So comments and complex logic have to be reparsed on each
request.
Now Shakespeare would have said, "Much ado about nothing." In
Malaysia we say: "All talk, no action-lah!" If you really are
interested in speed, simply install a PHP accelerator, which compiles
your PHP once so that subsequent requests use the cached compiled
script. Then this is a non-issue.
Some of you might say you are in a shared hosting environment and
your ISP doesn't provide an accelerator. Well, the fact that you're
using shared hosting probably means the ability to operate
under a heavy load is not really a requirement. And I have heard of | |