Mantis Bugtracker
  

Viewing Issue Advanced Details Jump to Notes ] View Simple ] Issue History ] Print ]
ID Category Severity Reproducibility Date Submitted Last Update
0002981 [Quercus] minor always 09-30-08 08:40 10-08-08 18:48
Reporter ferg View Status public  
Assigned To ferg
Priority normal Resolution fixed Platform
Status closed   OS
Projection none   OS Version
ETA none Fixed in Version 3.2.1 Product Version 3.2.1
  Product Build
Summary 0002981: include performance issues
Description (rep by Steven Grimm)

Over the weekend I did a little bit of profiling of Quercus. There are two hotspots where Quercus is spending a lot of time when it goes through our big big list of include files. My test was just running "ab" to serially fetch a dummy .php file that contained nothing but an include_once of our top-level master include file.

First, Quercus is checking the modification times on each original PHP file on every run. I don't see a config option to only do that periodically (though I may have missed something obvious) -- it's actually quite expensive, accounting for nearly 1/4 of the wall time of my test. That's possibly not a huge deal in a development environment but in a production environment we'd probably want to only check mod times once a minute or so since on-the-fly edits are extremely rare.

Second, it's spending tons of time assembling the include file paths. To support multiple developer sandboxes, our include statements are all dynamic, like:

    include_once $_SERVER['PHP_ROOT'] . '/lib/display/pages.php';

and we set PHP_ROOT in the Apache virtual host config (on vanilla PHP) or in the Resin config. That ends up generating Java code like:

    env.includeOnce(_quercus_selfDirectory, v__SERVER.get(qv_PHP_ROOT_0).toStringBuilder(env).append(_qc_libdisplaypagesphp).toString(), false);

According to the profiling, a huge amount of time is being spent in the various parts of that chain of string operations, primarily in the append() method which appears to need to do a reallocation of its buffer for each and every one of these includeOnce() calls. This may be related to bug 2971.

Maybe this can instead do some compile-time evaluation of the paths, like:

if (v__SERVER.get(qv_PHP_ROOT_0).equals("/path/at/compile/time")) {
    env.includeOnce(_quercus_selfDirectory, "/path/at/compile/time/lib/display/pages.php", false);
    // and ideally all the other adjacent includes that depend only on $_SERVER['PHP_ROOT'],
    // though the get().equals() is probably not too expensive to do for each one
} else {
    // do the dynamic include as above, *and* trigger a background recompile of the page
}

That would basically eliminate the string construction cost of all these includes, of which we have zillions since each module "include_once"s its direct dependencies whether or not those dependencies were already included earlier in another module. (There are currently 18094 include_once statements in our code base, to give you some idea, though not all of them are hit on every request.) But maybe there's another way to make this fast; the above is just the first thing that comes to mind.

Once we have a fix for 2971 in hand, and hopefully for the issues above too, we'll do another round of benchmarks/profiling and see where things stand.

Steps To Reproduce
Additional Information
Attached Files  bench.php [^] (1,412 bytes) 10-05-08 12:25

- Relationships

- Notes
(0003498)
koreth
10-05-08 11:19

I tried setting <dependency-check-interval> to 5000 seconds in resin.xml and it had no effect on the number of modification time checks in my profile run (this is using the code from svn as of October 3).
 
(0003499)
koreth
10-05-08 12:23

I tried converting all the dynamic include_once paths to static ones in the .php files. The result is that my library load time is between 5% and 10% faster according to "ab" tests -- and the actual difference in CPU time is bigger than that because some of the load time is setup that requires communicating with remote hosts, so constant overhead is pretty large.

I wrote a little benchmark (which I'll attach) to quantify the difference without the constant overhead. In vanilla PHP the dynamic version is about 15% slower than the static version. In Quercus it's about 150% slower. And Quercus includes are MUCH slower than vanilla PHP, too; the static test runs in 1.05 seconds on Quercus but only takes .0006 seconds on vanilla PHP (where "vanilla" = "with APC installed").
 
(0003503)
ferg
10-08-08 18:48

The dependency-check-interval was not being used for compiled pages. Also, the string concatenation was unnecessary allocation calls.
 

- Issue History
Date Modified Username Field Change
09-30-08 08:40 ferg New Issue
10-02-08 10:42 koreth Issue Monitored: koreth
10-05-08 11:19 koreth Note Added: 0003498
10-05-08 12:23 koreth Note Added: 0003499
10-05-08 12:25 koreth File Added: bench.php
10-08-08 18:48 ferg Note Added: 0003503
10-08-08 18:48 ferg Assigned To  => ferg
10-08-08 18:48 ferg Status new => closed
10-08-08 18:48 ferg Resolution open => fixed
10-08-08 18:48 ferg Fixed in Version  => 3.2.1


Mantis 1.0.0rc3[^]
Copyright © 2000 - 2005 Mantis Group
35 total queries executed.
28 unique queries executed.
Powered by Mantis Bugtracker