PHP 1.2.1 lib: bugs for ParseHTML.php in Yadis and detect.php in OpenID's examples

Jonathan Daugherty cygnus at janrain.com
Wed Feb 14 11:04:36 PST 2007


# Audra Johnson wrote:
# > #1 -- Services/Yadis/Yadis.php needs to be included in detect.php
# > Which can be fixed by including the line
# > 
# >   include 'Services/Yadis/Yadis.php';
# 
# The other files included by detect.php are checked for errors. While
# $_file3 could be added, a loop seems like the better solution.
# http://www.atrus.org/hosted/openid/php/1.2.1/fix-detect.diff

Actually, Services/Yadis/Yadis.php isn't listed because the Yadis code
is no longer distributed as a separate package.  Incidentally, it's
still a good idea to make sure it's installed, because if users don't
use PEAR to install the package, the Services/ directory might be
missed.  Another possibility is that the PEAR installer is broken,
which has happened frequently.

Thank you for the patch!

# I used the "shut up operator" to silence warnings. The lack of this
# in the original might have been so that warnings would be printed,
# but they shouldn't be relied upon... It might make sense to list the
# missing files.

Thanks again.

# > Upping the pcre.recursion_limit and pcre.backtrack_limit settings
# > from 100,000 to 160,000 seemed to do the trick on most of them,
# > although not all, but that solution made me uncomfortable--those
# > settings have never been too restrictive on me before, and so I
# > did some monkeying around with the file and ended up adding one
# > line and changing another.  I'm pretty sure one of these is a bug,
# > although the other may not be.  The diff is here:
# > http://audrajohnson.com/openid/ParseHTML.diff

Thanks for the detailed explanation.

# because the changes are so massive, here's the new file for easier
# viewing:
# http://www.atrus.org/hosted/openid/php/1.2.1/ParseHTML.txt
# http://www.atrus.org/hosted/openid/php/1.2.1/ParseHTML.phps
# 
# Without any alteration, ParseHTML.php works on
# http://missbabyblue.livejournal.com/
# with pcre.recursion_limit and pcre.backtrack_limit set 7000. With
# Audra's minor changes, it requires about 5000 (1.4x
# improvement). Using my overhaul, it works as low as 200 (35x
# improvement).
# 
# I did my tests with:
# PHP Version 5.2.1 (cli)
# PCRE Library Version 6.7 04-Jul-2006

That's great.  I'd be very happy to see these improvements integrated.
Thanks for your work on this!

I installed your ParseHTML code and ran our test suite.  The tests
failed when parsing the following four HTML documents, with expected
results indicated.  The code returned "found" in all four cases,
whereas the expected return was null.

  1. <meta http-equiv="X-XRDS-Location" content="found">

     Expected result: nothing, as there is no HTML or HEAD tag.

  2. <html><meta http-equiv="X-XRDS-Location" content="found">

     Expected result: nothing, as there is no HEAD tag.

  3. <head><html><meta http-equiv="X-XRDS-Location" content="found">

     Expected result: nothing, since the tags are reversed.

  4. <html><head/>
     <meta http-equiv="X-XRDS-Location" content="found">

     Expected result: nothing, as the head tag is empty.

I can eventually get around to debugging these, but in the mean time
you can run the test suite on the HTML parsing tests in particular by
running:

  openid$ php admin/texttest.php ParseHTML

You'll get some output like:

  ==========================================
  Test suite: Tests_Services_Yadis_Parse
  ------------------------------------------
  ..............FF.F..F..
  Ran 23 tests in 0.029 seconds with 0 errors, 4 failures
  ==========================================

(in addition to details about the tests which failed.)

You'll need PHPUnit 1.x to run the tests.  I currently use 1.3.2.
(That's ancient, but we've had trouble with newer versions.)

If there's anything else I can do, let me know.  Thanks again!

-- 
  Jonathan Daugherty
  JanRain, Inc.
  irc.freenode.net: cygnus in #openid
  cygnus.myopenid.com



More information about the Dev mailing list