Updated Optimizing PHP Article

I have just updated my popular Optimizing PHP article with additional information on caching. I discuss memcache and Squid. I also updated the PHP Accelerators section and changed the tone of some parts of the article. I quote:

Perhaps the most significant change to PHP performance I have experienced since I first wrote this article is my use of Squid, a web accelerator that is able to take over the management of all static http files from Apache. You may be surprised to find that the overhead of using Apache to serve both dynamic PHP and static images, javascript, css, html is extremely high. From my experience, 40-50% of our Apache CPU utilisation is in handling these static files (the remaining CPU usage is taken up by PHP running in Apache).

It is better to offload downloading of these static files by using Squid in httpd-accelerator mode. In this scenario, you use Squid as the front web server on port 80, and set all .php requests to be dispatched to Apache on port 81, which I have running on the same server. The static files are served by Squid.

Malaysian FOSS Conference 2009 Opening Keynote

Last Saturday, I gave the opening keynote of the Malaysian Free & Open Source Software 2009 conference. The speech was prepared the day before, but as usual I improvised in places, so some parts have been amended from memory:

Ladies and gentlemen, honored guests, good morning!

Today the landscape of information technology has been transformed by the vision of free software and open source. The search engines of Google roar with the sounds of open source Linux. Our Malaysian government encourages the use of open source whenever possible. Sounds of PHP, MySQL, Apache, GPL have become familiar names in the tapestry of IT.

But that was not what it was like when I first started out as a young student in the mid-80s at the University of Melbourne, Australia. Things were different then. Concepts such as open source, GPL were still unknown. I still remember a fellow student was expelled from university for making copies of the source code of proprietary Unix software for his personal use.

I admit I was disturbed by this, because I too had an insatiable curiosity about how software worked, and it was impossible to learn more without access to the source code. I wanted to find and understand the wiring inside the software.

I remember fondly, and today with a bit of guilt, that I used to crack copy protected games, not for the pursuit of profit, but as an intellectual challenge – well ok, I have to admit I did it to play the games. The trick to cracking, metaphorically speaking, is finding the wiring behind the copy protection and reversing the wires so that instead of refusing to run, the program does the opposite and continues working.

Of course to quickly find the right wires to switch and crack a large program is not easy. Which brings me to the first piece of advice if you want to be successful in software design… You need to have good taste, which is kind of weird because nerdy programmers are notoriously bad dressers, fond of bad hair days and certainly not fussy about the finer points of fine dining.

What I’m talking about is, of course, a taste for good logic. The feel of a beautiful idea, the taste of a mighty logic, or the fun in a great hack.

Games designers and coders are a talented bunch of people, and if you understand their logical rhythms and designs, it becomes obvious where the wires you need to reverse to crack the software reside.

The other important element of success is being happy. You have to have passion and enjoy what you are doing. To me cracking games was like cracking walnuts, a fun thing to do, but after a while it got boring. You need to do something with others and share with others to become really passionate about something.

Social responsibility is another important element of life. You need to channel your life productively - only then will you find true happiness. Cracking games became boring and I found other better diversions.

It was around the time my fellow student was expelled that I learned about the international USENET community. To young people, you have to imagine a time before the World Wide Web, when people used the Internet primarily for email. USENET was a fantastic group of mailing lists with forums and archives. USENET was also used to disseminate programming ideas and knowledge in the form of source code.

So even before the concepts of Open Source and licenses such as GPL became well known, there was this thriving community of programmers who shared their source code and learnt from others. Which brings me to the next lesson: the typical image of the best programmers being lonely introverted hackers is misleading. People are only successful in a community. Open source software needs to be grown organically and for that you need social skills. The classic example here is of course Linus Torvalds, author of Linux, who has skillfully led the Linux community from its inception.

It was through the USENET that I released software that I had written, including the one that won runner-up for best Australian Macintosh software in 1988 while still a foreign student in Melbourne.

You know, while preparing this speech, at the back of my mind I have always wondered why Malaysia has not had a bigger role in contributing software to the open source community? Was what I achieved due to my overseas education? I was thinking about it last night while writing this speech, and I don't think so: I will tell you why...

Malaysians do not lack ability. I see many smart and interesting people around me here at the conference. And I have seen many sophisticated pieces of software in the commercial world developed by talented teams of Malaysians. English, the language of Science and the Internet, is widely spoken here. However in the open source world, we have many more consumers than contributors.

Is it our education system? Perhaps an over-emphasis on exams is a contributing factor, but I don’t think that is the main reason. I studied for 12 years in Malaysian state schools, and I survived sane and reasonably intelligent! And exposure to the Internet has made young people more worldly than any previous generation of Malaysians.

After reflecting, I suspect the reason is primarily economic. After college, it is difficult to sustain a living and have the time to contribute meaningfully to an open source project here in Malaysia. There are companies with strong support for open source here, but most companies here see little value in allowing their staff to contribute to open source.

So let’s flash forward from studying in Melbourne in the 80s to working in Malaysia in the year 2000. At that point in time, my company was planning on developing their next generation web application server, called PHPLens. An application server is a professional software framework which makes it easier for programmers to build high quality software modules.

We also wanted PHPLens to support as many databases as possible. That was the reason why we decided to open source our database abstraction library. Contributions from the programming community were encouraged so that we could support more databases.

And as this was the 3rd database abstraction library I had developed in my career, I had some meaningful experience in this area. Other developers liked it and today the library has become very popular world-wide and is in use by thousands of developers.

I have been working with and supporting the ADOdb abstraction library for over 9 years. I can tell you working on open source is sometimes not fun. You work for hours to implement some feature and then the feedback you get is that it’s not very useful. People will disagree with you. You also get cranky people emailing you in broken English to fix their problems urgently. And if you misunderstand them, it just gets worse. To survive, you need to be passionate about your work, really listen to people (which isn’t easy in an email exchange) and be committed to excellence.

I would like to show you now a presentation I did on ADOdb a few years ago. [presentation here]

In closing, I would like to ask: how can the Malaysian Free & Open Source Software movement advance further? Actually I think we are doing a good job. I see a lot of local companies have already switched to using Open Office or running Linux, Apache, MySQL, PHP for their web-sites.

As I mentioned before, the real factors we need to look into are still economic, your take-home pay. What we need is more demand for people with the right skills to support this open source infrastructure, and an ecosystem where the pay is attractive.

We need to transition from the idea that “free software is cheap” to “free software is cost-effective”. There is dignity in work, and people deserve to be rewarded. Thank you.

Monitoring and logging CPU Utilization of Virtual Machines in Xen

Oct 6 update: Added logging of disk [d] and network [n] info.
Oct 4 update: added availability option. Now uses xentop internally.
Oct 2 update: added graphing to xenstat.pl. Now xenstat.pl detects Guest VM start/shutdown and resets itself. Number of vcpus also shown. Misc bug fixes.

You can download xenstat.pl here.

Syntax

perl xenstat.pl [$mode] [$intervalsecs=5] [$nsamples=0] [$urlToPostStats]

Quick Guide

perl xenstat.pl          -- generate cpu stats every 5 secs
perl xenstat.pl 10       -- generate cpu stats every 10 secs
perl xenstat.pl 5 2      -- generate cpu stats every 5 secs, 2 samples

perl xenstat.pl d 3      -- generate disk stats every 3 secs
perl xenstat.pl n 3      -- generate network stats every 3 secs
perl xenstat.pl a 5      -- generate cpu avail (e.g. cpu idle) stats every 5 secs

perl xenstat.pl 3 1 http://server/log.php    -- gather 3 secs cpu stats and send to URL
perl xenstat.pl d 4 1 http://server/log.php    -- gather 4 secs disk stats and send to URL
perl xenstat.pl n 5 1 http://server/log.php    -- gather 5 secs network stats and send to URL

Requires the xentop from Xen 3.2 or later, or a newer xentop backported to Xen 3.1.

Usage

To use it, run "perl xenstat.pl" in domain 0. The following output will be generated, with a new line of statistics every 5 seconds:

[root@server ~]# perl xenstat.pl 
cpus=2
       40_falcon   2.67%    2.51 cpu hrs  in 1.96 days ( 2 vcpu,  2048 M)
       52_python   0.24%  747.57 cpu secs in 1.79 days ( 2 vcpu,  1500 M)
     54_garuda_0   0.44% 2252.32 cpu secs in 2.96 days ( 2 vcpu,   750 M)
           Dom-0   2.24%    9.24 cpu hrs  in 8.59 days ( 2 vcpu,   564 M)

                    40_falc 52_pyth 54_garu   Dom-0    Idle
2009-10-02 19:31:20     0.1     0.1    82.5    17.3     0.0 *****
2009-10-02 19:31:25     0.1     0.1    64.0     9.3    26.5 ****
2009-10-02 19:31:30     0.1     0.0    50.0    49.9     0.0 *****


In the above output, the first few lines summarise the CPUs and running domains. Then we have the statistics generated every 5 seconds. At the end of each line is a simple graph. 5 stars means 90% or over CPU utilisation, 4 stars is 70% or over, etc.
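The star graph can be sketched in a few lines. This is an illustration in Python, not the Perl code in xenstat.pl. Only the 90% and 70% thresholds are stated above; the 50, 30 and 10 thresholds are my assumption, chosen by continuing the 20-point steps, and they happen to agree with the sample output. The function name `star_graph` is hypothetical.

```python
def star_graph(utilisation_pct: float) -> str:
    """Return a row of stars for a total CPU utilisation percentage."""
    # 90 and 70 come from the description above; 50, 30 and 10 are
    # assumed by continuing the same 20-point steps.
    thresholds = (10, 30, 50, 70, 90)
    stars = sum(1 for t in thresholds if utilisation_pct >= t)
    return "*" * stars


if __name__ == "__main__":
    # Idle figures taken from the sample output; utilisation = 100 - idle.
    for idle in (0.0, 26.5, 92.5, 89.3):
        print(f"idle={idle:5.1f}  {star_graph(100.0 - idle)}")
```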

You can also define the interval to poll (in seconds) and the number of samples, just like vmstat:

[root@server ~]# perl xenstat.pl 3 2
cpus=2
       40_falcon   2.67%    2.51 cpu hrs  in 1.96 days ( 2 vcpu,  2048 M)
       52_python   0.24%  748.07 cpu secs in 1.79 days ( 2 vcpu,  1500 M)
     54_garuda_0   0.44% 2258.38 cpu secs in 2.96 days ( 2 vcpu,   750 M)
           Dom-0   2.24%    9.24 cpu hrs  in 8.59 days ( 2 vcpu,   564 M)

                    40_falc 52_pyth 54_garu   Dom-0    Idle
2009-10-01 12:14:59     0.0     0.0     1.7     5.7    92.5
2009-10-01 12:15:02     0.0     0.0     0.3    10.4    89.3 *

[root@server ~]#

Logging Using REST web service

To log the CPU utilisation using the Perl script, I didn't want to install a database client in Dom-0. So I added another parameter, a URL to a web server to call with the CPU info as GET parameters. I assume wget is installed in your Dom-0.

[root@server ~]# perl xenstat.pl 10 1  http://192.168.0.1/
cpus=2
     54_garuda_0  0.49%  165.81 cpu sec over 3.62 days (2 vcpu,   750 M)
    59_gyrfalcon  0.62%   69.03 cpu sec over 0.80 days (2 vcpu,  2000 M)
           Dom-0  1.57%    2.15 cpu hrs over 3.62 days (2 vcpu,   564 M)


--10:46:42--  http://192.168.0.1/?54_garuda_0=0.1&59_gyrfalcon=2.1&Dom%2D0=2.2&
Connecting to 192.168.0.1:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 498 [text/html]
Saving to: `STDOUT'

100%[============================================>] 498         --.-K/s   in 0s

10:46:42 (67.8 MB/s) - `-' saved [498/498]

2009-09-29 10:46:42  0.1  2.1  2.2 95.6

This will accumulate statistics for 10 seconds, then send them to the above URL in this format:

 http://192.168.0.1/?54_garuda_0=0.1&59_gyrfalcon=2.1&Dom%2D0=2.2&.

This allows you to log the data using a REST-ful web service.
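On the receiving side, the web service only has to decode the GET parameters before storing them. Here is a minimal sketch in Python of that decoding step (the examples above post to a PHP script, so this is an illustration only, and `parse_stats` is a hypothetical name). It uses the URL shown above, including the percent-encoded hyphen and the trailing `&`.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_stats(url: str) -> dict[str, float]:
    """Decode the GET parameters of a xenstat.pl request into {domain: value}."""
    query = urlsplit(url).query
    # parse_qsl percent-decodes names (Dom%2D0 -> Dom-0) and skips the
    # empty field left by the trailing '&'.
    return {name: float(value) for name, value in parse_qsl(query)}


if __name__ == "__main__":
    url = "http://192.168.0.1/?54_garuda_0=0.1&59_gyrfalcon=2.1&Dom%2D0=2.2&"
    print(parse_stats(url))
```

From here the dictionary can be written to whatever database the web server already has access to, which keeps Dom-0 free of database clients.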

Network mode [n]

Shows total network reads and writes in KBytes or MBytes for that time period.

perl xenstat.pl n

    Network I/O (K)  52_pyth 54_garu 59_gyrf   Dom-0
 2009-10-05 19:55:08       7     979       1       3
 2009-10-05 19:55:13       6    1.2M       1       1
 2009-10-05 19:55:18       5     600       2       3

Disk IO mode [d]

Shows total read and write requests for each domain for that time period.

 perl xenstat.pl d

  Disk I/O (Reqs)    52_pyth 54_garu 59_gyrf   Dom-0
 2009-10-05 19:51:02       4       0    1317       0
 2009-10-05 19:51:07      27       0    1140       0

Availability Option [a]

Shows CPU Availability % (which is the same as CPU Idle %) instead of CPU Utilisation %.

The problem with showing CPU Utilisation occurs when you have multiple Guest VMs with different numbers of vcpus. If the CPU Utilisation of a guest VM is 50%, can you tell whether it is already capped (vcpus = 50% of physical cpus), or can it go higher?

The solution is to reverse the CPU figures and view information in terms of Available CPU % left (100 - CPU Utilisation %). The advantage is that you know when the CPUs of a guest VM are exhausted as the figures approach zero. In the example below, note that garuda has only 1 vcpu which means that cpu available is capped at 50% for garuda.

[server~ ]# xenstat a

Output:
-------
cpus=2
     40_falcon   2.33%    2.53 cpu hrs  in 2.26 days (2 vcpu,  2048 M)
     52_python   0.26%  940.55 cpu secs in 2.08 days (2 vcpu,  1500 M)
   54_garuda_0   1.48%   18.47 cpu secs in 0.01 days (1 vcpu,   750 M)
         Dom-0   2.28%    9.73 cpu hrs  in 8.89 days (2 vcpu,   564 M)

    Available CPU %  40_falc 52_pyth 54_garu   Dom-0 CPU-free
2009-10-07 18:25:20   100.0    49.9    99.8    99.5    99.1
2009-10-07 18:25:22   100.0    48.2    42.1    91.7    32.0 ***
2009-10-07 18:25:24   100.0    45.2    25.5    79.3     0.0 *****
2009-10-07 18:25:26    99.9    50.0     0.3    99.8     0.0 *****
2009-10-07 18:25:28   100.0    50.0    16.7    87.7     4.3 *****
2009-10-07 18:25:30   100.0    50.0    73.7    99.8    73.3 *

Initially, in the first line of statistics below, everything is quiet. With CPU Availability as the statistic, we can immediately notice that garuda has 1 vcpu (50% of 2 physical cpus) and all the others have 2 vcpus:

    Available CPU %  40_falc 52_pyth 54_garu   Dom-0 CPU-free
 2009-10-07 18:25:20   100.0    49.9    99.8    99.5    99.1

In the 2nd line, we can see:

    Available CPU %  40_falc 52_pyth 54_garu   Dom-0 CPU-free
 2009-10-07 18:25:22   100.0    48.2    42.1    91.7    32.0 ***

Now the server is getting busy (with garuda being the busiest), and the amount of CPU-free is less than each of the domains. This means that the python domain has 48.2% virtual idle capacity, but at that point in time only 32% of that idle capacity can be serviced.

In the 3rd line, python is heavily loaded and there is no more spare CPU capacity.

    Available CPU %  40_falc 52_pyth 54_garu   Dom-0 CPU-free
 2009-10-07 18:25:26    99.9    0.03    50.0    99.8     0.0 *****

If we were looking at it in terms of CPU utilisation, it would not be obvious that python is overloaded, as you can see if we look only at CPU usage for the same statistics as the 3rd line:

[server~]# xenstat 
                     40_falc 52_pyth 54_garu   Dom-0    Idle
 2009-10-07 18:25:26     0.1   49.97    50.0     0.2     0.0 *****
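The relationship between the two views can be sketched as follows. This is my reading of the figures above, not code from xenstat.pl: I assume the ceiling for a guest is its share of the physical cpus (1 vcpu on a 2-cpu host gives 50%), and that availability is that ceiling minus utilisation. For a guest with as many vcpus as physical cpus this reduces to 100 - CPU Utilisation %. The name `available_pct` is hypothetical.

```python
def available_pct(utilisation_pct: float, vcpus: int, physical_cpus: int) -> float:
    """CPU still available to a guest, as a percentage of all physical CPUs."""
    # Assumed ceiling: a guest cannot use more physical CPUs than it has vcpus.
    cap = 100.0 * min(vcpus, physical_cpus) / physical_cpus
    # Round to 2 decimals and never report a negative availability.
    return max(0.0, round(cap - utilisation_pct, 2))


if __name__ == "__main__":
    # A 1-vcpu guest at 49.97% utilisation on a 2-cpu host is nearly exhausted.
    print(available_pct(49.97, vcpus=1, physical_cpus=2))
    # A 2-vcpu guest at 0.1% utilisation has almost everything left.
    print(available_pct(0.1, vcpus=2, physical_cpus=2))
```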

I hope this is useful for anyone using Xen. This has been a good trip down memory lane too, as I haven't coded in Perl for nearly 10 years!

Download xenstat.pl.

History

In Sept 2009, we started experimenting with the Xen hypervisor. In my testing, I have found that Linux performance is better on Xen than VMware and we are considering it for Linux rollouts.

Normally when we roll out a new server for a customer, we have a simple PHP script installed as a cron job that runs vmstat and logs the CPU utilization of the server into our database every 5 minutes. It's very useful for benchmarking, monitoring and troubleshooting mysterious performance problems. I needed a similar script for Xen.

A search in Google revealed a Perl script by Tom Brown to record the Xen domain CPU utilisation.

However the following limitations led me to modify it:

  • I want total CPU utilisation to be capped at 100%, which is the way "top" works, but not the way "xm top" works.
  • Does not work properly with multi-core CPUs. CPU utilisation can go over 100%.
  • Unfortunately sleep() does not sleep for precisely the number of seconds you define, causing the CPU utilization to go over 100% again. There is some perturbation, either because Dom-0 is still virtualised or some other reason.
  • No easy way of logging to a database.
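The first three points come down to one calculation. A minimal sketch in Python, under my own assumptions and not the actual xenstat.pl code: divide the cpu seconds a domain consumed by the measured wall-clock interval times the number of physical cpus, and cap the result at 100%. Measuring the interval instead of trusting the requested sleep time is what absorbs the sleep() imprecision. The function name and sample figures are hypothetical.

```python
import time


def utilisation_pct(cpu_secs_before: float, cpu_secs_after: float,
                    elapsed_secs: float, physical_cpus: int) -> float:
    """CPU utilisation as a share of all physical CPUs, capped at 100%."""
    if elapsed_secs <= 0:
        return 0.0
    used = cpu_secs_after - cpu_secs_before
    # Dividing by the number of physical CPUs keeps a multi-core host
    # on the same 0-100 scale that "top" uses.
    pct = 100.0 * used / (elapsed_secs * physical_cpus)
    return min(pct, 100.0)


if __name__ == "__main__":
    start = time.monotonic()
    time.sleep(0.2)  # may oversleep, so measure instead of assuming 0.2
    elapsed = time.monotonic() - start
    # Hypothetical sample: a domain used 0.3 cpu secs in the interval, 2 cpus.
    print(round(utilisation_pct(10.0, 10.3, elapsed, 2), 1))
```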

So I rewrote parts of the script and renamed it xenstat.pl (after vmstat).


Other tools: see xentop, which can run in batch mode, but cannot post to a web server.

The original script was written by Tom Brown.