Please add HTML support for http_xslt_module (there's an nginx fork which has it already)

Peter Halasz
March 09, 2012 01:22AM
Hi devs,

I work for an environmental not-for-profit organisation where we use
XSLT to theme our website. (The XSLT is generated by Diazo, and the
site largely runs on Plone).

Currently we use Nginx to do the XSLT transformation. The problem is
that our un-themed site doesn't come out as well-formed XML, so we
need an XSLT processor which can parse HTML input (not just XML).
Nginx's http_xslt_module does NOT currently support HTML parsing, and
I'd really like to see this feature added.

The problem isn't specific to Diazo, but the Diazo manual explains the
need for HTML parsing:

> In theory, any XSLT processor will do. In practice, however, most websites do not produce 100% well-formed XML (i.e. they do not conform to the XHTML “strict” doctype). For this reason, it is normally necessary to use an XSLT processor that will parse the content using a more lenient parser with some knowledge of HTML. libxml2, the most popular XML processing library on Linux and similar operating systems, contains such a parser.
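To illustrate the point in the quote, here is a minimal sketch using only the Python standard library (not libxml2 itself, and not anything from nginx). The sample page and the helper names `parses_as_xml`, `TagCollector` and `tags_via_html_parser` are made up for this example. It shows a strict XML parser rejecting ordinary HTML that a lenient HTML parser reads without complaint:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical page: valid-looking HTML, but not well-formed XML
# (the <p> and <br> elements are never closed).
PAGE = "<html><body><p>Unclosed paragraph<br>Line two</body></html>"


def parses_as_xml(text):
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Lenient HTML parser that records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tags_via_html_parser(text):
    """Return the start tags found by the lenient HTML parser."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print("strict XML parse succeeds:", parses_as_xml(PAGE))
    print("tags seen by HTML parser:", tags_via_html_parser(PAGE))
```

An XSLT processor that only has the strict parser available fails on the first kind of input, which is the situation described above for the stock http_xslt_module.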

Fortunately there's a fork of nginx which does use libxml2's HTML
parser: the html-xslt project http://code.google.com/p/html-xslt/.
Unfortunately, the project is not maintained, so it ties us to a
patched version of nginx 0.7.67 (circa June 2010). I'd like to upgrade
nginx -- I've hit nginx bugs that were fixed long ago. I'm sure there
are many other nginx users with the same needs, so I'm requesting the
fork's changes make their way into the mainline. I'm assuming it's
just been forgotten.

The Diazo documentation also explains deploying with this patched Nginx:

> To deploy a Diazo theme to the Nginx web server, you will need to compile Nginx with a special version of the XSLT module that can (optionally) use the HTML parser from libxml2.

> In the future, the necessary patches to enable HTML mode parsing will hopefully be part of the standard Nginx distribution. In the meantime, they are maintained in the html-xslt project.

We're using this html-xslt fork of nginx at my organisation. Since
the fork is unmaintained and its functionality hasn't made it into
the standard Nginx distribution, can we please include it?

The fork adds the directive: "xslt_html_parser on;" which causes the
http_xslt_module to parse in HTML mode.

I've just made a diff (http://pastebin.com/CP1P8Gzj) to see what the
fork changes, and it's 755 lines long, which is a bit longer than I
expected.

The files modified by the html-xslt fork are:

src/http/modules/ngx_http_xslt_filter_module.c
src/http/ngx_http_variables.c
auto/options
auto/lib/libxslt/conf

The diff is against nginx 0.7.67. Since then the
ngx_http_xslt_filter_module.c has seen about 300 lines removed and 20
lines added or changed, so obviously the diff can't be used as a patch
against the current version of nginx.

Hopefully that's more than enough info to get started if developers
are interested in folding the fork into nginx.

I know the other solution to our problem here is to move the XSLT to
another layer of the stack -- such as Varnish or Apache -- but I want
to make sure nginx devs know about the feature they're missing first.

Thanks for listening and I hope HTML parsing for XSLT can make it to
the mainline of nginx,

Peter Halasz.
