Performance issues with Debian’s timezone database PHP patch

I have recently been debugging Apache2/PHP5 (running with mpm_prefork what turned out to be important later) server suffering for very high load spikes. Load was completely CPU related, there was almost no IO operations performed during peak times and surprisingly significant part of CPU time was spent in the kernel (about 20%) mode. I have examined Apache2 process using strace and found that there were three main groups of syscalls:

strace -p 28105 -f -c

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 50.15    8.867324         304     29128      3861 stat
 30.30    5.357779        5299      1011       194 access
 19.05    3.368963          18    184918      4641 lstat

Tunning FollowSymLinks and AllowOverride allowed to significantly cut down amount of lstat calls.

Before I’ve examined stat() closely, I’ve made wrong assumption that it APC (opcode/user cache that utilizes SHM or MMAP shared memory segments) is responsible for the syscalls. APC will usually issue stat() on PHP file prior to serving it’s opcode from cache. If file has been modified since APC cached it, it is sent to the parser again. This is done to allow changing code on the fly, without restarting the web server (or other means of flushing APC cache).

Since it was relatively safe to do I’ve tried reconfiguring APC to not check file’s modification dates by adding:

apc.stat = 0

Unfortunately it didn’t help much, so I’ve ended up analyzing details of syscalls. After quick analysis it turned out that occasionally I can find whole blocks (of over 600) calls to check the data in /usr/share/zoneinfo. This seemed like some sort of initialization for me, and it seemed that it’s only PHP content causing that.

[pid 10293]      0.000121 stat("/usr/share/zoneinfo/Pacific/Funafuti",
{st_mode=S_IFREG|0644, st_size=141, ...}) = 0
[pid 10293]      0.000121 stat("/usr/share/zoneinfo/Pacific/Fiji",
{st_mode=S_IFREG|0644, st_size=296, ...}) = 0
[pid 10293]      0.000122 stat("/usr/share/zoneinfo/Pacific/Fakaofo",
{st_mode=S_IFREG|0644, st_size=140, ...}) = 0
[pid 10293]      0.000121 stat("/usr/share/zoneinfo/Pacific/Enderbury",
{st_mode=S_IFREG|0644, st_size=204, ...}) = 0
[pid 10293]      0.000122 stat("/usr/share/zoneinfo/Pacific/Efate",
{st_mode=S_IFREG|0644, st_size=464, ...}) = 0
[pid 10293]      0.000122 stat("/usr/share/zoneinfo/Pacific/Easter",
{st_mode=S_IFREG|0644, st_size=8971, ...}) = 0
[pid 10293]      0.000123 stat("/usr/share/zoneinfo/Pacific/Chuuk",
{st_mode=S_IFREG|0644, st_size=144, ...}) = 0
[pid 10293]      0.000121 stat("/usr/share/zoneinfo/Pacific/Chatham",
{st_mode=S_IFREG|0644, st_size=2018,

I went to apache2.conf and found cause of the load – mpm_prefork was misconfigured:

 
   StartServers          8
   MinSpareServers      15
   MaxSpareServers      10
   MaxClients          150
   ServerLimit         300
   MaxClients          300
   MaxRequestsPerChild   4000

Because of MinSpareServers being bigger than MaxSpareServers (which is illogical itself), Apache kept forking off and terminating processes a lot – and every of short lived processes had to initialize PHP if it served any. Changing MaxSpareServers to more sensible value reduced both user and kernel CPU load (I should have read Apache’s config properly in the first place ;) ).

After that I have kept playing, trying to narrow down the reason for all these syscalls. I’ve got a copy of application on a spare box and set MaxRequestsPerChild to 1, what should make Apache to initialize PHP on every single request. Few die()’s later, it turned out that the timezone data is being scanned on:

date_default_timezone_set('GMT');

, and it’s easily replicable from CLI by running:

 strace -r php -n -r"date_default_timezone_set('GMT');" 2>&1 | grep zoneinfo

I have initially submitted bug for PHP, however it was quickly straightened that it’s Debian patching its PHP package for the convenience of utilizing system’s timezone updates (use_embedded_timezonedb.patch). Full discussion can be found here:

Eventually, the whole timezone initialization is not really of much importance (at least with PHP served of Apache), the real problem server I was looking at had was misconfigured Apache, however it seems that lack of cooperation between PHP authors and package maintainers leads to performance regressions (slight, but still).

Share
    • Kevin
    • January 20th, 2012 4:52pm

    Your post here, although the root cause was different in my situation, still helped me diagnose and understand what was causing my apache instance to spin out of control occasionally.

    Thanks for this!

  1. I’ve come across this and it annoys me too :-( so thanks for the URIs to PHP and Debian bugzillas. I’ve done microtiming on a couple of hosted services — one mod_php5 and one suPHP (so PHP image activation occurs 1 per child and 1 per req respectively) as well as my laptop and a couple of VMs.

    The hit seems to be ~15-20 mSec if not fully VFAT cached and ~2 mSec if cached. Strace’ing actually skews the timing because you are paying a per tracepoint cost-of tracing overhead.

    So tacky rather than a show stopper.

  1. No trackbacks yet.