
Easy Parallel Processing in PHP

The proliferation of multicore CPUs and the inability of our learned CPU vendors to squeeze many more GHz into their designs means that often the only way to get additional performance is by writing clever parallel software.

One problem we had was that some of our batch processing jobs were taking too long to run. To speed things up, we split the input file in half and let a separate PHP process handle each half. Since we were using a dual-core server, each process could run at close to full speed (subject to I/O constraints).

Here is our technique for running multiple parallel jobs in PHP. In this example, we have two job files, j1.php and j2.php, that we want to run. The sample jobs don't do anything fancy. The file j1.php looks like this:

<?php
$jobname = 'j1';
set_time_limit(0);  // let the job run as long as it needs
$secs = 60;

while ($secs) {
        echo $jobname,'::',$secs,"\n";
        @ob_flush(); flush();  // push PHP's output buffer down first, then send it to the client
        $secs -= 1;
        sleep(1); // pause
}

We call flush() and ob_flush() because PHP sometimes buffers the strings we echo or print and sends them later. Calling both functions pushes the data out immediately, provided the web server does not buffer the response itself.
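The flushing step can be wrapped in a small helper. This is only a sketch: emitLine() is a hypothetical name that does not appear in the original scripts. It calls ob_flush() before flush(), because ob_flush() hands the contents of PHP's topmost output buffer to the layer below, and flush() then asks that layer to send the data.

```php
<?php
// Hypothetical helper (not part of the original scripts):
// print one line and try to send it to the client right away.
function emitLine(string $line): void
{
    echo $line, "\n";

    // If an output buffer is active, pass its contents down one level ...
    if (ob_get_level() > 0) {
        ob_flush();
    }
    // ... then ask PHP and the backend to send whatever they are holding.
    flush();
}
```

Inside the job loop, emitLine($jobname . '::' . $secs) would then stand in for the echo and the two flush calls.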

We then have a third file, control.php, which coordinates jobs j1 and j2. This script requests j1.php and j2.php asynchronously using fsockopen in JobStartAsync(), so the two jobs run in parallel. The output from j1.php and j2.php is returned to control.php by JobPollAsync().

<?php
#
# control.php
#
function JobStartAsync($server, $url, $port=80,$conn_timeout=30, $rw_timeout=86400)
{
	$errno = '';
	$errstr = '';
	
	set_time_limit(0);
	
	$fp = fsockopen($server, $port, $errno, $errstr, $conn_timeout);
	if (!$fp) {
	   echo "$errstr ($errno)<br />\n";
	   return false;
	}
	$out = "GET $url HTTP/1.1\r\n";
	$out .= "Host: $server\r\n";
	$out .= "Connection: Close\r\n\r\n";
	
	stream_set_timeout($fp, $rw_timeout);
	fwrite($fp, $out);                // send the request while the socket is still blocking
	stream_set_blocking($fp, false);  // from here on, reads return immediately
	
	return $fp;
}

// returns false if HTTP disconnect (EOF), or a string (could be empty string) if still connected
function JobPollAsync(&$fp) 
{
	if ($fp === false) return false;
	
	if (feof($fp)) {
		fclose($fp);
		$fp = false;
		return false;
	}
	
	return fread($fp, 10000);
}

###########################################################################################

 
if (1) {  /* SAMPLE USAGE BELOW */

	$fp1 = JobStartAsync('localhost','/jobs/j1.php');
	$fp2 = JobStartAsync('localhost','/jobs/j2.php');
	
	
	while (true) {
		sleep(1);
		
		$r1 = JobPollAsync($fp1);
		$r2 = JobPollAsync($fp2);
		
		if ($r1 === false && $r2 === false) break;
		
		echo "<b>r1 = </b>$r1<br>";
		echo "<b>r2 = </b>$r2<hr>";
		@ob_flush(); flush();
	}
	
	echo "<h3>Jobs Complete</h3>";
}

And the output could look like this:

r1 = HTTP/1.1 200 OK
Date: Wed, 03 Sep 2008 07:20:20 GMT
Server: Apache/2.2.4 (Unix) mod_ssl/2.2.4 OpenSSL/0.9.8d
X-Powered-By: Zend Core/2.5.0 PHP/5.2.5
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html

7 j1::60


r2 = HTTP/1.1 200 OK
Date: Wed, 03 Sep 2008 07:20:20 GMT
Server: Apache/2.2.4 (Unix) mod_ssl/2.2.4 OpenSSL/0.9.8d
X-Powered-By: Zend Core/2.5.0 PHP/5.2.5
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html

7 j2::60
----
r1 = 7 j1::59

r2 = 7 j2::59
----
r1 = 7 j1::58

r2 = 7 j2::58

----

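Note that JobPollAsync() returns "7 j2::60" rather than just "j2::60". The reason is that the server replies with Transfer-Encoding: chunked (see the response headers above), and in HTTP/1.1 chunked encoding each chunk is preceded by a line giving its length in hexadecimal. Here the chunk is "j2::60" plus a trailing newline, which is 7 bytes.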
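If you want only the job's own output, the headers and chunk sizes have to be stripped. The following is a minimal sketch, not part of the original control.php: splitHeadersAndBody() and decodeChunkedBody() are hypothetical helper names, and the decoder assumes it is handed the complete response collected from all JobPollAsync() calls, not a single partial read.

```php
<?php
// Hypothetical helper: split a raw HTTP response into [headers, body].
function splitHeadersAndBody(string $response): array
{
    $parts = explode("\r\n\r\n", $response, 2);
    return [$parts[0], $parts[1] ?? ''];
}

// Hypothetical helper: decode a body sent with "Transfer-Encoding: chunked".
// Each chunk is "<size in hex>\r\n<data>\r\n"; a chunk of size 0 ends the body.
function decodeChunkedBody(string $body): string
{
    $decoded = '';
    $pos = 0;
    $len = strlen($body);

    while ($pos < $len) {
        $eol = strpos($body, "\r\n", $pos);
        if ($eol === false) {
            break; // size line is incomplete, stop here
        }
        $sizeLine = substr($body, $pos, $eol - $pos);

        // Ignore optional chunk extensions after a semicolon.
        $semi = strpos($sizeLine, ';');
        if ($semi !== false) {
            $sizeLine = substr($sizeLine, 0, $semi);
        }

        $size = (int) hexdec(trim($sizeLine));
        if ($size === 0) {
            break; // last chunk
        }

        $decoded .= substr($body, $eol + 2, $size);
        $pos = $eol + 2 + $size + 2; // skip the data and its trailing CRLF
    }

    return $decoded;
}
```

Because the size is hexadecimal, a 26-byte chunk would be announced as "1a", not "26".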

I hope this was helpful. Have fun!

PS: Also see Divide-and-conquer and parallel processing in PHP. Also see popen for an alternative technique.
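
For comparison, here is a rough sketch of what the popen approach might look like. It is not taken from the article linked above. JobStartPopen(), JobPollPopen() and RunJobsPopen() are hypothetical names modelled on the functions in control.php, and the sketch assumes a Unix-like system where stream_set_blocking() works on pipes opened by popen.

```php
<?php
// Hypothetical popen-based counterpart of JobStartAsync():
// start a shell command and return a non-blocking handle to its stdout.
function JobStartPopen(string $command)
{
    $handle = popen($command, 'r');
    if ($handle === false) {
        return false;
    }
    stream_set_blocking($handle, false); // assumption: supported for pipes on this OS
    return $handle;
}

// Same contract as JobPollAsync(): false once the job has finished,
// otherwise a string (possibly empty) with the output read so far.
function JobPollPopen(&$handle)
{
    if ($handle === false) {
        return false;
    }
    if (feof($handle)) {
        pclose($handle);
        $handle = false;
        return false;
    }
    $chunk = fread($handle, 10000);
    return $chunk === false ? '' : $chunk;
}

// Run all commands in parallel and collect each job's output by name.
function RunJobsPopen(array $commands): array
{
    $handles = [];
    $output = [];
    foreach ($commands as $name => $command) {
        $handles[$name] = JobStartPopen($command);
        $output[$name] = '';
    }

    do {
        $running = false;
        foreach (array_keys($handles) as $name) {
            $chunk = JobPollPopen($handles[$name]);
            if ($chunk !== false) {
                $running = true;
                $output[$name] .= $chunk;
            }
        }
        if ($running) {
            usleep(100000); // wait 0.1 s between polls
        }
    } while ($running);

    return $output;
}
```

A call such as RunJobsPopen(['j1' => 'php j1.php', 'j2' => 'php j2.php']) would start both jobs from the command line, with no web server or HTTP headers involved. The command strings here are only examples.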