Mantis Bugtracker
  

Viewing Issue Simple Details
ID: 0001562
Category: [Quercus]
Severity: major
Reproducibility: always
Date Submitted: 01-17-07 03:40
Last Update: 06-25-07 12:45
Reporter: obaltz
View Status: public
Assigned To: sam
Priority: normal
Resolution: fixed
Status: closed
Product Version: 3.1.0
Summary: 0001562: Problem with back references to subpatterns in preg_match_all
Description: When a pattern contains a back reference, preg_match_all behaves differently in Quercus than in the original PHP implementation. As a result, the PEAR template engine (class HTML_Template_IT) does not work. See Additional Information for a demo script; its pattern is the same one the PEAR class uses.

The demo script contains the same pattern twice: first as a single-quoted string, then (commented out) as a double-quoted string. The original PHP implementation treats the two differently; Quercus does not, and always behaves as if the pattern were double-quoted.
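
For context, here is a minimal sketch of why stock PHP treats the two quoting styles differently. It relies only on PHP's documented string escape rules; the variable names are made up for illustration and do not come from the PEAR class.

```php
<?php
// Single-quoted: the backslash survives, so PCRE receives the back reference \1.
$single = '\1';
// Double-quoted: PHP reads \1 as an octal escape and produces the byte 0x01.
$double = "\1";

var_dump( strlen( $single ) );    // int(2)
var_dump( strlen( $double ) );    // int(1)
var_dump( $double === "\x01" );   // bool(true)

// \s is not a double-quoted escape sequence, so it is identical either way.
var_dump( '\s' === "\s" );        // bool(true)
```

This is why, in stock PHP, the double-quoted variant of the pattern never contains a back reference at all.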
Additional Information:
Demo script:
<?php
$pattern = '@<!--\s+BEGIN\s+([0-9A-Za-z_-]+)\s+-->(.*)<!--\s+END\s+\1\s+-->@sm'; // this will work with original php interpreter ONLY
// $pattern = "@<!--\s+BEGIN\s+([0-9A-Za-z_-]+)\s+-->(.*)<!--\s+END\s+\1\s+-->@sm"; // this will never work
$string = "pre block <!-- BEGIN testblock --> inside block <!-- END testblock --> post block";
$regs = array();
$result = preg_match_all( $pattern, $string, $regs, PREG_SET_ORDER );
var_dump( $result );
var_dump( $regs );
?>

The original PHP interpreter outputs:

int(1)
array(1) {
  [0]=>
  array(3) {
    [0]=>
    string(60) "<!-- BEGIN testblock --> inside block <!-- END testblock -->"
    [1]=>
    string(9) "testblock"
    [2]=>
    string(14) " inside block "
  }
}

Quercus outputs:
int(0)
array(0) {
}

- Relationships
has duplicate 0001561 (closed, nam): Problem with back references to subpatterns in preg_match_all
has duplicate 0001560 (closed, nam): Problem with back references to subpatterns in preg_match_all

- Notes
(0001723)
obaltz
01-17-07 03:46

I'm sorry, the file upload didn't work but the rest of the bug was saved. Forget about 1560 and 1561.
 
(0001789)
obaltz
03-27-07 07:07

Today I found out that the back reference actually works. A different problem causes zero results on Quercus in the example above: the whitespace \s+ right AFTER the back reference!

Try this pattern instead:
$pattern = '@<!--\s+BEGIN\s+([0-9A-Za-z_-]+)\s+-->(.*)<!--\s+END\s+\1 \s*-->@sm';

The output will be:
int(1)
array(1) {
  [0]=>
  array(3) {
    [0]=>
    string(60) "<!-- BEGIN testblock --> inside block <!-- END testblock -->"
    [1]=>
    string(9) "testblock"
    [2]=>
    string(14) " inside block "
  }
}

However, the original PHP engine works with \1\s+, just as it should.
 
(0001834)
obaltz
04-11-07 07:59

Here are some simpler examples focusing more on the actual problem:

<?php
$pattern = '/F(O)\1\s+BAR/';
$result = preg_match( $pattern, "FOO BAR" );
var_dump( $result );
?>

Original PHP output: int(1)
Quercus output: int(0)

Those two patterns work fine:
$pattern = '/F(O)\1 \s*BAR/'; // back reference not followed by \s+
$pattern = '/FOO\s+BAR/'; // no back reference before \s+

In fact, it does not matter whether \s+, \., or anything else follows the back reference: if whatever follows starts with a backslash, the match fails on Quercus:

<?php
$pattern = '/F(O)\1\.BAR/';
$result = preg_match( $pattern, "FOO.BAR" );
var_dump( $result ); // Quercus outputs int(0); the original PHP outputs int(1)
?>

However, the first expression must be a back reference to reproduce the error; two backslash-escaped expressions in a row are not enough to trigger it:

<?php
$pattern = '/FOO\.\.BAR/';
$result = preg_match( $pattern, "FOO..BAR" );
var_dump( $result ); // outputs int(1)
?>
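
The cases from this note can be gathered into one script for checking an interpreter. This is only a sketch: the $cases table and the reporting format are made up for illustration, and the expected values are what the original PHP interpreter returns for these patterns.

```php
<?php
// Each entry: pattern, subject, result expected from the original PHP interpreter.
$cases = array(
    array( '/F(O)\1\s+BAR/',  'FOO BAR',  1 ), // back reference followed by \s+
    array( '/F(O)\1 \s*BAR/', 'FOO BAR',  1 ), // back reference followed by a literal space
    array( '/FOO\s+BAR/',     'FOO BAR',  1 ), // no back reference before \s+
    array( '/F(O)\1\.BAR/',   'FOO.BAR',  1 ), // back reference followed by \.
    array( '/FOO\.\.BAR/',    'FOO..BAR', 1 ), // two escaped expressions, no back reference
);

$failures = array();
foreach ( $cases as $case ) {
    list( $pattern, $subject, $expected ) = $case;
    $actual = preg_match( $pattern, $subject );
    if ( $actual !== $expected ) {
        $failures[] = $pattern; // remember every pattern that deviates
    }
    printf( "%-20s %-10s expected %d, got %d\n", $pattern, $subject, $expected, $actual );
}

echo count( $failures ) === 0
    ? "all cases match\n"
    : "mismatches: " . implode( ', ', $failures ) . "\n";
```

On an affected Quercus build, the first and fourth patterns would be expected to show up as mismatches, per the results reported above.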
 
(0001836)
nam
04-11-07 13:46

Thanks for the additional information. It appears to be a very involved issue and we are still deciding how and when to tackle it.
 
(0002085)
sam
06-25-07 12:45

php/1530
 

- Issue History
Date Modified Username Field Change
01-17-07 03:40 obaltz New Issue
01-17-07 03:46 obaltz Note Added: 0001723
01-17-07 12:41 nam Relationship added has duplicate 0001561
01-17-07 12:42 nam Relationship added has duplicate 0001560
01-17-07 14:57 obaltz Issue Monitored: obaltz
03-27-07 07:07 obaltz Note Added: 0001789
04-11-07 07:59 obaltz Note Added: 0001834
04-11-07 13:46 nam Note Added: 0001836
06-25-07 07:14 sam Status new => assigned
06-25-07 07:14 sam Assigned To  => sam
06-25-07 12:45 sam Status assigned => closed
06-25-07 12:45 sam Note Added: 0002085
06-25-07 12:45 sam Resolution open => fixed
06-25-07 12:45 sam Fixed in Version  => 3.1.2

