Mantis Bugtracker

Viewing Issue Simple Details
ID 0001898
Category [Quercus]
Severity major
Reproducibility always
Date Submitted 07-23-07 19:25
Last Update 09-13-07 12:58
Reporter rjc
View Status public
Assigned To ferg
Priority normal
Resolution fixed
Status closed
Product Version 3.1.2
Summary 0001898: BinaryBuilderValue and InternStringValue toKey() produce different results for identical strings
Description A value of 'sysop' fetched from a varbinary column does not match when it is used to index an array that contains an entry with the key 'sysop'.

In MediaWiki 1.10+, there is a function in User.php:
  static function getGroupPermissions( $groups ) {
                global $wgGroupPermissions;
                $rights = array();
                foreach( $groups as $group ) {
                        if( isset( $wgGroupPermissions[$group] ) ) {
                                $rights = array_merge( $rights,
                                        array_keys( array_filter( $wgGroupPermissions[$group] ) ) );
                        }
                }
                return $rights;
  }

MediaWiki stores group names in the user_groups table, with the ug_group column defined as a varbinary(16). Typically, the admin user has a group 'sysop' added.

The $wgGroupPermissions array is defined in DefaultSettings.php, which initializes it with string literals, e.g. $wgGroupPermissions['sysop'] = ...

The problem is that the value fetched from the database won't work as a key against this global array, because the two toKey() implementations produce incompatible keys for the same string. This problem does not occur under the real PHP5.

Additional Information The following hack "fixes" this problem, but who knows where else it is occurring.

$group = trim(' '.$group);

- Notes
07-23-07 19:27

Note: $group == 'sysop' will return true, however the toKey() on $group and 'sysop' will be different.
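
The mismatch described in this note can be reproduced in plain Java. The sketch below is not Quercus code: UnicodeKey and BinaryKey are hypothetical stand-ins for two string value classes that hold the same characters but never compare equal as hash keys, which is roughly the situation with the two toKey() results. The permission string is made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class KeyMismatchSketch {
  // Build a map keyed by the "literal" string type, like $wgGroupPermissions.
  static Map<Object, String> permissions() {
    Map<Object, String> map = new HashMap<>();
    map.put(new UnicodeKey("sysop"), "delete, protect");
    return map;
  }

  public static void main(String[] args) {
    BinaryKey fromDb = new BinaryKey("sysop");
    // Content comparison succeeds, like $group == 'sysop'.
    System.out.println(fromDb.looseEquals(new UnicodeKey("sysop")));
    // Key lookup misses, because the key types never compare equal.
    System.out.println(permissions().containsKey(fromDb));
    System.out.println(permissions().containsKey(new UnicodeKey("sysop")));
  }
}

// Hypothetical stand-in for a unicode string value; NOT a real Quercus class.
final class UnicodeKey {
  final String text;
  UnicodeKey(String text) { this.text = text; }

  @Override public boolean equals(Object o) {
    return o instanceof UnicodeKey && ((UnicodeKey) o).text.equals(text);
  }
  @Override public int hashCode() { return text.hashCode(); }
}

// Hypothetical stand-in for a binary string value; NOT a real Quercus class.
final class BinaryKey {
  final byte[] bytes;
  BinaryKey(String text) { this.bytes = text.getBytes(StandardCharsets.ISO_8859_1); }

  @Override public boolean equals(Object o) {
    return o instanceof BinaryKey && Arrays.equals(((BinaryKey) o).bytes, bytes);
  }
  @Override public int hashCode() { return Arrays.hashCode(bytes); }

  // Compare by decoded content only, ignoring the key type.
  boolean looseEquals(UnicodeKey other) {
    return new String(bytes, StandardCharsets.ISO_8859_1).equals(other.text);
  }
}
```

The lookup fails even though the characters are identical, because equality for key purposes also depends on the type of the key object.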
07-23-07 19:43

In BinaryBuilderValue.toKey(), would not the following fix the bug?

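  public Value toKey()
  {
    byte []buffer = _buffer;
    int len = _length;

    // An empty string is used as its own key.
    if (len == 0)
      return this;

    int sign = 1;
    long value = 0;

    int i = 0;
    int ch = buffer[i];
    if (ch == '-') {
      sign = -1;
      i++; // skip the sign character
    }

    for (; i < len; i++) {
      ch = buffer[i];

      if ('0' <= ch && ch <= '9')
        value = 10 * value + (char)(0xFF & ch) - '0';
      else
        return this; // not a pure integer string: keep the string key
    }

    // Integer-like strings become integer keys.
    return new LongValue(sign * value);
  }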

All I've done is cast the byte from signed to unsigned.
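
To illustrate what the cast in the note above changes: Java bytes are signed, so any byte above 0x7F widens to a negative int unless it is masked with 0xFF. A minimal sketch follows; the class and method names are made up for the example and are not part of Quercus.

```java
final class UnsignedByteSketch {
  // Default widening: the sign bit is extended, so high bytes go negative.
  static int widenSigned(byte b) {
    return b;
  }

  // Masking with 0xFF keeps the result in the range 0..255.
  static int widenUnsigned(byte b) {
    return 0xFF & b;
  }

  public static void main(String[] args) {
    byte high = (byte) 0xE9;
    System.out.println(widenSigned(high));    // -23
    System.out.println(widenUnsigned(high));  // 233
    // ASCII digits are below 0x80, so both conversions agree for them.
    System.out.println(widenSigned((byte) '7') == widenUnsigned((byte) '7')); // true
  }
}
```

For bytes in the ASCII range, including the digits tested in the loop, both conversions give the same value. The mask only changes the result for bytes above 0x7F.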
07-26-07 00:53

Test case:


$array = array("foo" => "123", b"foo" => "456");



This will give an array with two elements in both PHP6 and Quercus 3.1.2.
08-10-07 15:46

Whether or not this is "correct" behavior for PHP6, it is causing major issues for MediaWiki on Quercus with Mysql. I have two Resin servers, one running 3.1.1 and one running 3.1.2, both pointing at the same Mysql database.

3.1.1 works, 3.1.2 fails, in several functions where the result of parsed text that has come out of the database is used to index into arrays.

Not only does getGroupPermissions fail, but Parser.php's argSubstitution() function, which is fundamental to substituting {{{arg}}} parameters in MediaWiki Templates, is also broken.

Try the following on the latest version of MediaWiki:
Make a page called Testpage, in it call a template

{{Foo|xxx={{PAGENAME}} }}

then make Template:Foo with

XXX = {{{xxx}}}

The result on 3.1.2 is that it prints 'XXX ={{{xxx}}}' instead of 'XXX = Testpage'. The result on 3.1.1 is that it prints 'XXX = Testpage'

In my opinion, "abc" should equal b"abc", regardless of the text's original encoding, and therefore, they should have identical keys.

Changing the first line of the argSubstitution() function in Parser.php from

$arg = trim($matches['title'] );

to this

$arg = trim(" ".$matches['title'] );

fixes the problem. I ask you, how can this not be considered a major bug that has to be fixed, where any string data that gets converted into BinaryValue has to be converted back with liberal usage of trim()? This is a dynamically typed language after all, and the expectation is that strings will be silently converted.

In any case, Resin 3.1.2 breaks MediaWiki, and we had to go back to using Quercus on top of Tomcat so I could patch this problem.
08-10-07 16:43

The issue here is that Quercus supports PHP6, but Quercus did not allow unicode to be turned off. For 3.1.3, we are adding the option to turn off unicode semantics.

In PHP6, a binary string is different from a unicode string. So PHP6 will also have problems with this issue, when unicode semantics is on.
08-10-07 17:08

Yes, please add an option for PHP5 semantics. I for one view the PHP6 behavior as broken. The encoding of a string should not determine equality. The string "hello world" in UTF8 should be equal to the string "hello world" in ISO-8859-1, US-ASCII, UTF7, UCS-16, and any other encoding you can dream up, even ShiftJIS. I am not PHP knowledgeable enough to know why they made this design decision, but it seems bizarre to me.

In any case, do you have a recommendation on how to prevent BinaryValue strings from getting into the MediaWiki runtime? It seems the database is one source of them, as anything that is VARBINARY/LOB seems to become a binary string, but are there other functions which MediaWiki could be using which create binaries? How about data that comes from the request?

See, the problem with the "byte strings != char strings" approach is that it wreaks havoc on legacy apps, where data coming from different sources ends up implicitly as different types, and because PHP lacks static types, it is not obvious to someone looking at the source why $a != $b, when print $a and print $b show the same strings, with the same length.
08-10-07 17:33

The other major source of binary strings is from file functions. var_dump() is an excellent way of distinguishing between unicode and binary types. To convert a binary to unicode, simply typecast it with "(unicode)".
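
The same idea, converting binary data to text once at the boundary instead of patching each lookup, can be sketched in plain Java. This is only an analogy and not Quercus or PHP API: the class and method names are hypothetical, the permission string is made up, and UTF-8 is an assumed encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

final class NormalizeKeySketch {
  // Decode raw bytes to text once, where the data enters the program.
  // UTF-8 is an assumption; use whatever encoding the source really has.
  static String toText(byte[] raw) {
    return new String(raw, StandardCharsets.UTF_8);
  }

  // Look up a group whose name arrived as raw bytes (for example from a
  // VARBINARY column) in a map keyed by ordinary strings.
  static boolean hasGroup(Map<String, String> permissions, byte[] rawGroup) {
    return permissions.containsKey(toText(rawGroup));
  }

  public static void main(String[] args) {
    Map<String, String> permissions = new HashMap<>();
    permissions.put("sysop", "delete, protect");

    byte[] fromDb = "sysop".getBytes(StandardCharsets.UTF_8);
    System.out.println(hasGroup(permissions, fromDb));
  }
}
```

Decoding at the point where the data enters the program keeps every later array or map lookup working on a single string type.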
09-13-07 12:58

php/0i71, php/0j71

- Issue History
Date Modified Username Field Change
07-23-07 19:25 rjc New Issue
07-23-07 19:27 rjc Note Added: 0002126
07-23-07 19:43 rjc Note Added: 0002127
07-23-07 19:47 rjc Issue Monitored: rjc
07-26-07 00:53 nam Note Added: 0002137
08-10-07 15:46 rjc Note Added: 0002179
08-10-07 16:43 nam Note Added: 0002182
08-10-07 17:08 rjc Note Added: 0002184
08-10-07 17:33 nam Note Added: 0002186
09-13-07 12:58 ferg Note Added: 0002293
09-13-07 12:58 ferg Assigned To  => ferg
09-13-07 12:58 ferg Status new => closed
09-13-07 12:58 ferg Resolution open => fixed
09-13-07 12:58 ferg Fixed in Version  => 3.1.3
09-13-07 12:58 ferg Description Updated
09-13-07 12:58 ferg Additional Information Updated
