February 27, 2004

RelaxngValidatingReader improvements

I recently committed a batch of RELAX NG validating reader changes. My last commits to those classes were nine months ago, even though this is what I had really wanted to work on ;-)

The usage is very easy:


XmlReader r = new RelaxngValidatingReader (
    new XmlTextReader ("sample.xml"),
    new XmlTextReader ("sample.rng"));

Or you can specify RelaxngPattern instead of XmlReader:


// Wow, relaxng.rng is really self-describing, unlike XMLSchema.xsd ;-)
RelaxngPattern p = RelaxngPattern.Read (
    new XmlTextReader ("relaxng.rng"));
XmlReader r = new RelaxngValidatingReader (
    new XmlTextReader ("relaxng.rng"), p);
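
Constructing the reader does not by itself check anything; as with other validating XmlReader wrappers, the instance has to be read through. The following is a minimal sketch of that loop, not code from the library. It assumes that an invalid document makes Read() throw, and it catches the general Exception type rather than naming a specific one. The file names are the same placeholders used above.

```csharp
using System;
using System.Xml;
using Commons.Xml.Relaxng;

public class ValidateSample
{
    public static void Main ()
    {
        // Compile the grammar once, then hand the pattern to the reader.
        RelaxngPattern p = RelaxngPattern.Read (
            new XmlTextReader ("sample.rng"));
        XmlReader r = new RelaxngValidatingReader (
            new XmlTextReader ("sample.xml"), p);
        try {
            // Consume the whole document so every node passes through
            // the validating reader.
            while (r.Read ())
                ;
            Console.WriteLine ("valid");
        } catch (Exception ex) {
            // Assumption: a validation failure surfaces as an exception
            // thrown from Read(). I/O and XML well-formedness errors
            // would also end up here.
            Console.WriteLine ("invalid or unreadable: " + ex.Message);
        } finally {
            r.Close ();
        }
    }
}
```

Since the result is an ordinary XmlReader, it could equally be passed to anything that consumes one (XmlDocument.Load, for example) instead of being drained by hand.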

[The code is in mcs/class/Commons.Xml.Relaxng]

The first priority was to rename public classes and fix member signatures (mainly access modifiers). I don't want to expose extraneous public methods and fields (I think that was very bad design). If any of you have been using "RngPattern", it is now "RelaxngPattern" (and every RngXXX class has become RelaxngXXX as well).

I have been using (basically) James Clark's derivative algorithm, which is implemented by the classes in the Commons.Xml.Relaxng.Derivative namespace. Though those classes are public, they are not meant to be used directly right now, and in fact I changed them radically.

It is still not as stable as I want, but I think it has become more stable. I added standalone tests that use James Clark's test suite. I was able to bring the grammar compilation failures down from nearly 120 (out of 373 cases) to fewer than 40, and what was possibly a large number of instance validation errors down to nearly 20.

I also added datatype support to them. By default it supports the XML Schema datatypes ("http://www.w3.org/2001/XMLSchema-datatypes") as well as the default namespace datatypes (i.e. "string" and "token"). To support them, I had to change my derivative validation design.

You don't have to change a single line of your code to get XML Schema datatype support. Just reference the XML Schema datatypes URI in your grammar (as relaxng.rng does) and use the datatypes.
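
As a rough illustration of what "reference the URI and use it" can look like, here is a sketch with a small inline grammar. The age element and the class and method names are made up for this example. The datatypeLibrary attribute and the data pattern are standard RELAX NG. As before, the sketch assumes that a validation failure shows up as an exception during Read().

```csharp
using System;
using System.IO;
using System.Xml;
using Commons.Xml.Relaxng;

public class DatatypeSample
{
    // Hypothetical grammar: an <age> element whose content must be an
    // XML Schema "int". The datatypeLibrary attribute selects the library.
    const string Grammar =
        "<element name='age' xmlns='http://relaxng.org/ns/structure/1.0'" +
        " datatypeLibrary='http://www.w3.org/2001/XMLSchema-datatypes'>" +
        "<data type='int'/></element>";

    static bool IsValid (string instance)
    {
        // Same constructor as in the usage above: instance first, grammar second.
        XmlReader r = new RelaxngValidatingReader (
            new XmlTextReader (new StringReader (instance)),
            new XmlTextReader (new StringReader (Grammar)));
        try {
            while (r.Read ())
                ;
            return true;
        } catch (Exception) {
            // Assumption: an invalid instance makes Read() throw.
            return false;
        } finally {
            r.Close ();
        }
    }

    public static void Main ()
    {
        Console.WriteLine (IsValid ("<age>42</age>"));
        Console.WriteLine (IsValid ("<age>forty-two</age>"));
    }
}
```

The calling code is the same as for a grammar without datatypes; only the grammar text changes.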

Data type support is done by these classes:


  • RelaxngDatatype: it represents the actual data type

  • RelaxngDatatypeProvider: it provides the way to get RelaxngDatatype from QName and parameters

If you want to implement your own data type, you can do so by extending RelaxngDatatype (in particular, by implementing Parse(string text, XmlReader context)) and by extending RelaxngDatatypeProvider so that GetDatatype (string name, string ns, RelaxngParamList parameters) returns the new datatypes. There is already a similar datatype project by Kohsuke Kawaguchi, but I took another way: my RelaxngValidatingReader is not based on a separate validation context (mine is simply an XmlReader).
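
A custom datatype along those lines might be sketched as follows. Only Parse and GetDatatype, with the signatures quoted above, come from this post. Everything else is an assumption made for illustration: that Name and NamespaceURI are abstract properties to override, that Parse rejects a value by throwing, that returning null from GetDatatype means "unknown type", and the "upper" datatype and its URI themselves.

```csharp
using System;
using System.Xml;
using Commons.Xml.Relaxng;

// Hypothetical datatype: a string made only of upper-case ASCII letters.
public class UpperCaseDatatype : RelaxngDatatype
{
    // Hypothetical datatype library URI, invented for this sketch.
    public const string Library = "urn:example:datatypes";

    // Assumption: Name and NamespaceURI are abstract members of RelaxngDatatype.
    public override string Name {
        get { return "upper"; }
    }

    public override string NamespaceURI {
        get { return Library; }
    }

    public override object Parse (string text, XmlReader context)
    {
        foreach (char c in text) {
            if (c < 'A' || c > 'Z')
                // Assumption: an unacceptable value is signalled by throwing.
                throw new FormatException (
                    "not an upper-case string: " + text);
        }
        // The parsed value is the text itself.
        return text;
    }
}

public class ExampleDatatypeProvider : RelaxngDatatypeProvider
{
    public override RelaxngDatatype GetDatatype (
        string name, string ns, RelaxngParamList parameters)
    {
        // Parameters are ignored in this sketch.
        if (ns == UpperCaseDatatype.Library && name == "upper")
            return new UpperCaseDatatype ();
        // Assumption: null tells the caller this provider does not know the type.
        return null;
    }
}
```

How a provider gets attached to a grammar or a reader is left out here, since this post does not describe it; the current source is the place to check.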

There are many more things I want to add to them, but not this time.

Posted by atsushi at February 27, 2004 02:11 AM