monogatari stories by Atsushi Eno

2004 / 12 / 16: (2nd) Mono meeting in Tokyo (Permalink)

Since Duncan is coming to Tokyo this weekend, we are going to have the (2nd) Mono hackers meeting in Tokyo (yes, we had the first meeting two years ago). It is planned for lunch time on the 19th, at Umegaoka (close to Shimokitazawa). We'd welcome a few more people who would like to join us (sorry, but only a few; we have made no preparation for a large meeting). Please feel free to mail me (atsushi@ximian.com) if you are interested.

One of the reasons why XmlSchemaValidator does not rock

After hacking label collector functionality for RELAX NG, I noticed that .NET 2.0's XmlSchemaValidator is that kind of API (note that the MSDN documentation linked above is quite obsolete). So I decided to implement it nearly a week ago. And now it's mostly implemented and checked in to mono's svn (I think it is one of the hackiest xsd validators ;-).

Here is an example application of XmlSchemaValidator I wrote to test my implementation (it compiles with the 2.0 csc).

Some of you might think that implementing XmlSchemaValidator sounds weird, because there is no API documentation for this new face (well, at least not for the VS 2005 October CTP which I reference). But the functionality is mostly obvious (at least to me). For example, ValidateElement() is startTagOpenDeriv, ValidateAttribute() is attDeriv, and ValidateEndOfAttribute() is startTagCloseDeriv (btw I really don't like those method names) as described in James Clark's derivative algorithm for RELAX NG. One thing that mystified me was what the ValidateElement(string,string,XmlSchemaInfo,string,string,string,string) overload meant, but thanks to MS developers, it was solved.
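To make the mapping between the validator calls and the derivative steps concrete, here is a minimal sketch of push validation. It assumes the API shape that eventually shipped in .NET 2.0, which may differ from the October CTP discussed here (for instance, the shipped method is named ValidateEndOfAttributes). The schema, the PushValidationSketch class and the CountErrors method are made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class PushValidationSketch
{
    // Hypothetical schema: <item id="(int)">text</item>
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='item'>
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base='xs:string'>
          <xs:attribute name='id' type='xs:int' use='required' />
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Pushes <item id="idValue">hello</item> into the validator piece by piece
    // and returns how many validation events were raised.
    public static int CountErrors (string idValue)
    {
        XmlSchemaSet schemas = new XmlSchemaSet ();
        schemas.Add (null, XmlReader.Create (new StringReader (Xsd)));
        schemas.Compile ();

        int errors = 0;
        NameTable nt = new NameTable ();
        XmlSchemaValidator v = new XmlSchemaValidator (
            nt, schemas, new XmlNamespaceManager (nt),
            XmlSchemaValidationFlags.None);
        v.ValidationEventHandler +=
            delegate (object o, ValidationEventArgs e) { errors++; };

        XmlSchemaInfo info = new XmlSchemaInfo ();
        v.Initialize ();
        v.ValidateElement ("item", "", info);          // roughly startTagOpenDeriv
        v.ValidateAttribute ("id", "", idValue, info); // roughly attDeriv
        v.ValidateEndOfAttributes (info);              // roughly startTagCloseDeriv
        v.ValidateText ("hello");
        v.ValidateEndElement (info);
        v.EndValidation ();
        return errors;
    }
}
```

Calling PushValidationSketch.CountErrors with "42" should report no events, while a non-integer value such as "abc" should raise at least one. Note that the validator object itself carries the state between the calls.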

Actually XML Schema is much less useful than RELAX NG stuff. XmlSchemaValidator is fully stateful, thus you cannot go back to the previous state easily. A RELAX NG derivative implementation is stateless, so you can just attach those derivative instances to nodes in the editor. Oh, yes, XmlSchemaValidator could be used in the same way if it supported cloning. But I don't think that can be lightweight.

btw, you can also generate xsd from a DTD and use XmlSchemaValidator, if you want.

2004 / 12 / 07: rethinking element/attribute label collector (Permalink)

19:51 (alp) eno: dude, am never getting any expected attributes

... So, I missed the point that after RelaxngValidatingReader.Read(), my validation engine, which is based on James Clark's derivative algorithm, keeps only the state "after it closed the start tag" (i.e. startTagCloseDeriv in that paper) and thus no attributes can be allowed. Sigh. So, to implement attribute auto-completion, I had to expose a state transition object with which users can try "the state transition after an attribute occurred" (i.e. attDeriv). So, now the code became more complicated than yesterday (well, it is required complexity):

XmlTextReader xtr = new XmlTextReader ("relaxng.rng");
RelaxngPattern p = RelaxngPattern.Read (
    new XmlTextReader ("relaxng.rng"));
RelaxngValidatingReader rvr = new RelaxngValidatingReader (xtr, p);
TextWriter Out = Console.Out;
for (; !rvr.EOF; rvr.Read ()) {
    object state = rvr.GetCurrentState ();
    Out.WriteLine ("Current node: {0} ({1}) -> {2}",
        rvr.Name, rvr.NodeType,
        rvr.Emptiable (state) ? "Emptiable" : "not Emptiable");
    Out.WriteLine (" - expected elements -");
    foreach (XmlQualifiedName qn in rvr.GetElementLabels (state)) {
        Out.WriteLine (" " + qn);
        object astate = rvr.AfterOpenStartTag (
            state, qn.Name, qn.Namespace);
        Out.WriteLine (" - expected attributes -");
        foreach (XmlQualifiedName aqn in rvr.GetAttributeLabels (astate))
            Out.WriteLine (" " + aqn);
    }
}

I put the code example (above), and the updated result.

So now RelaxngValidatingReader implicitly expects the validating editor not to call .Read() until it closes the start tag. Instead, each of the elements and attributes can now hold the state at the node itself. (Am not sure it really works fine; I should consider cut/paste, insertion, and so on.)
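To illustrate why a derivative-based state can be held at each node, here is a heavily simplified sketch. It is not the Commons.Xml.Relaxng code: it only handles a flat sequence of child element names (no attributes, text or nesting), and every type and member name in it (Pat, Deriv, Expected, DerivativeDemo and so on) is invented for this example.

```csharp
using System.Collections.Generic;

// A state is an immutable pattern, so an editor could keep one per node.
public abstract class Pat
{
    public static readonly Pat Empty = new EmptyPat ();
    public static readonly Pat NotAllowed = new NotAllowedPat ();

    public abstract bool Emptiable { get; }                 // may the content end here?
    public abstract Pat Deriv (string name);                // state after one <name/> child
    public abstract void CollectLabels (List<string> acc);  // names acceptable next

    public List<string> Expected ()
    {
        List<string> acc = new List<string> ();
        CollectLabels (acc);
        return acc;
    }

    public static Pat Elem (string name) { return new ElemPat (name); }
    public static Pat Star (Pat p)
    {
        return (p == Empty || p == NotAllowed) ? Empty : new StarPat (p);
    }
    // Smart constructors prune notAllowed branches, so rejected labels vanish.
    public static Pat Seq (Pat a, Pat b)
    {
        if (a == NotAllowed || b == NotAllowed) return NotAllowed;
        if (a == Empty) return b;
        return b == Empty ? a : new SeqPat (a, b);
    }
    public static Pat Choice (Pat a, Pat b)
    {
        if (a == NotAllowed) return b;
        return b == NotAllowed ? a : new ChoicePat (a, b);
    }
}
sealed class EmptyPat : Pat
{
    public override bool Emptiable { get { return true; } }
    public override Pat Deriv (string name) { return NotAllowed; }
    public override void CollectLabels (List<string> acc) { }
}
sealed class NotAllowedPat : Pat
{
    public override bool Emptiable { get { return false; } }
    public override Pat Deriv (string name) { return this; }
    public override void CollectLabels (List<string> acc) { }
}
sealed class ElemPat : Pat
{
    readonly string name;
    public ElemPat (string name) { this.name = name; }
    public override bool Emptiable { get { return false; } }
    public override Pat Deriv (string n) { return n == name ? Empty : NotAllowed; }
    public override void CollectLabels (List<string> acc) { if (!acc.Contains (name)) acc.Add (name); }
}
sealed class SeqPat : Pat
{
    readonly Pat a, b;
    public SeqPat (Pat a, Pat b) { this.a = a; this.b = b; }
    public override bool Emptiable { get { return a.Emptiable && b.Emptiable; } }
    public override Pat Deriv (string n)
    {
        Pat first = Seq (a.Deriv (n), b);
        // If the first part may be skipped, the name may also start the second part.
        return a.Emptiable ? Choice (first, b.Deriv (n)) : first;
    }
    public override void CollectLabels (List<string> acc)
    {
        a.CollectLabels (acc);
        if (a.Emptiable) b.CollectLabels (acc);
    }
}
sealed class ChoicePat : Pat
{
    readonly Pat a, b;
    public ChoicePat (Pat a, Pat b) { this.a = a; this.b = b; }
    public override bool Emptiable { get { return a.Emptiable || b.Emptiable; } }
    public override Pat Deriv (string n) { return Choice (a.Deriv (n), b.Deriv (n)); }
    public override void CollectLabels (List<string> acc) { a.CollectLabels (acc); b.CollectLabels (acc); }
}
sealed class StarPat : Pat
{
    readonly Pat p;
    public StarPat (Pat p) { this.p = p; }
    public override bool Emptiable { get { return true; } }
    public override Pat Deriv (string n) { return Seq (p.Deriv (n), this); }
    public override void CollectLabels (List<string> acc) { p.CollectLabels (acc); }
}
public static class DerivativeDemo
{
    // Hypothetical content model: head, item*, foot
    public static Pat Grammar ()
    {
        return Pat.Seq (Pat.Elem ("head"),
            Pat.Seq (Pat.Star (Pat.Elem ("item")), Pat.Elem ("foot")));
    }
}
```

Starting from DerivativeDemo.Grammar (), each Deriv () call returns a new pattern and leaves the old one untouched, so the state stored at an earlier node can still be queried with Expected () after the user edits a later node.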

BTW, personally I don't want to expose such features and require "implementors" of RelaxngValidatingReader functionality to implement highly derivative-dependent features like this (that is bad for standardizing the API). I won't recommend learning this feature as long-lived, good-to-know stuff. I am, on the other hand, expecting System.Xml 2.0 to have such functionality for XML Schema (IF Microsoft people can provide it), but still don't think it's worthy of standardization.

On the next stage, I will have to implement some "error recovery" stuff so that users can enter invalid nodes and the implementation can still continue remaining validation.

Many thanks to Alp for trying it out with his experimental UI stuff and letting me improve this library (I also had the chance to fix bugs and to optimize Commons.Xml.Relaxng stuff).

2004 / 12 / 06: expecting the next element/attribute names w/ RelaxngValidatingReader (Permalink)

(5:00am JST: Updated the API and example that looks better.)

00:18 (alp) eno: do you have any thoughts on how i could
            use a DTD to hack together xml completion?
00:19 (eno) alp: you want to develop such functionality
            in your app?
00:20 (eno) mhm, actually I have no idea that supports
            something like nxml-mode
00:20 (alp) yeah, perhaps for monodevelop

Actually that is sort of what I wanted. However, for DTD and XSD, the implementation is not extensible (the validation implementation is hidden in System.Xml.XmlValidatingReader). So I (kinda) implemented something like that, using my RelaxngValidatingReader:

XmlTextReader xtr = new XmlTextReader ("relaxng.rng");
RelaxngPattern p = RelaxngPattern.Read (
    new XmlTextReader ("relaxng.rng"));
RelaxngValidatingReader rvr = new RelaxngValidatingReader (xtr, p);
rvr.MoveToContent ();
for (rvr.MoveToContent (); !rvr.EOF; rvr.Read ()) {
    Console.WriteLine ("Name: {0}, NodeType: {1} -> {2}",
        rvr.Name, rvr.NodeType,
        rvr.Emptiable () ? "Emptiable" : "not Emptiable");
    Console.WriteLine (" - expected attributes -");
    foreach (XmlQualifiedName qn in rvr.ExpectedAttributes)
        Console.WriteLine ("{0} in {1}", qn.Name, qn.Namespace);
    Console.WriteLine (" - expected elements -");
    foreach (XmlQualifiedName qn in rvr.ExpectedElements)
        Console.WriteLine ("{0} in {1}", qn.Name, qn.Namespace);
}

Here I put the output of the example above. It is hacky (written mostly in 2 hours) and it does not check rejection by notAllowed. It might be improved later. Also, it uses Hashtable right now, but it does not have to be a dictionary.

I also added Emptiable() (of type bool) that determines whether an end tag is acceptable or not in the current state. Actually, to complete an end element its name should be available, but due to the difference between a QName and an end tag name, that should be (and can be) implemented without RELAX NG validation stuff (to support such functionality, just keep start tag names in a stack). Similarly, you should also keep track of in-scope namespace declarations to fill in the proper prefix that is bound to the namespace of each QName contained in the results.
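The "keep start tag names in a stack" idea can be sketched without any RELAX NG machinery. The following is only an illustration using a plain XmlReader over the text typed so far; the EndTagCompletionSketch class and its ExpectedEndTag method are hypothetical names, not part of Commons.Xml.Relaxng.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class EndTagCompletionSketch
{
    // Returns the (prefixed) name of the innermost element that is still open
    // in the partial document, or null if nothing is open.
    public static string ExpectedEndTag (string partialXml)
    {
        Stack<string> open = new Stack<string> ();
        using (XmlReader r = XmlReader.Create (new StringReader (partialXml))) {
            try {
                while (r.Read ()) {
                    if (r.NodeType == XmlNodeType.Element && !r.IsEmptyElement)
                        open.Push (r.Name);   // r.Name keeps the prefix as written
                    else if (r.NodeType == XmlNodeType.EndElement)
                        open.Pop ();
                }
            } catch (XmlException) {
                // Expected for unfinished input: the reader complains at the
                // point where the text stops, and the stack holds what is open.
            }
        }
        return open.Count > 0 ? open.Peek () : null;
    }
}
```

Because the stack stores the names exactly as they were written in the start tags, the prefix problem mentioned above does not arise for end tags; it only matters when turning expected QNames into new start tags.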

Oh, BTW don't ask Alp about that "dream": he has many other tasks and interests ;-)

2004 / 12 / 03: mcs now supports /doc (Permalink)

Finally, I checked in /doc support patches in mcs. I remember the first patch was written in a day, nearly 7 months ago, and that worked mostly fine.

During the hacking on it, I found some problems around /doc feature:

... and more (I cannot remember any more right now). Well, some of them are not actually problems. Some just look like bugs.

Well, actually csc must be doing a better job than my hacky cref interpretation. It seems to recursively tokenize the attribute as a type name, just like the source itself, while I don't.

Anyways, it is the kind of job I did only because there are some users (originally it would have been used to examine our System.Xml implementation by using NDoc in practice). I think the monodoc format is much better and I don't think the C# doc feature is good, speaking as a translator who keeps track of changes in the original document, usually from the documents themselves, not from source code files.

Now I am so glad that I can fully go back to sys.xml hackings.

2004 / 11 / 28: RelaxngInference (Permalink)

Yesterday I started to write RELAX NG grammar inference. I hope this design won't **** you.

2004 / 11 / 24: document("") (Permalink)

Just voted my first 5 on this very important bug that shows W3C standard conformance breakage.

Such an XPathNavigator instance could be kept in memory only for stylesheets that contain document("") (that could be detected by static analysis). So the reason given, "by design", does not make sense.

Real developers could just implement a standard-conformant implementation in an easy way, instead of using casuistry on whether it is conformant or not, which just results in imposing annoyance on real users.

2004 / 11 / 19: Let's make System.Xml 2.0 not suck (Permalink)

On the suggestion on "infer elements always globally", am getting a positive feeling from Microsoft XML guys via the feedback center.

On the other hand, am getting a negative response to the suggestion on XmlSchemaSimpleType.ParseValue(), which validates a string considering facets. But I believe that XML Schema based developers will absolutely appreciate that feature. For example, it will be mandatory for the XQP project, which must support the user-defined type constructors defined in section 5 of the W3C XQuery Functions and Operators specification. Microsoft guys might want to help your development.

It depends on you, XML developers, whether Microsoft will improve their library or not. We could provide our own advantages, but it would be still better that your advanced code will run on MS.NET too.

(FYI: You can "vote" for the suggestions ;-)

2004 / 11 / 17: Useful codeblock (Permalink)

using QName = System.Xml.XmlQualifiedName;
using Form = System.Xml.Schema.XmlSchemaForm;
using Use = System.Xml.Schema.XmlSchemaUse;
using SOMList = System.Xml.Schema.XmlSchemaObjectCollection;
using SOMObject = System.Xml.Schema.XmlSchemaObject;
using Element = System.Xml.Schema.XmlSchemaElement;
using Attr = System.Xml.Schema.XmlSchemaAttribute;
using AttrGroup = System.Xml.Schema.XmlSchemaAttributeGroup;
using AttrGroupRef = System.Xml.Schema.XmlSchemaAttributeGroupRef;
using SimpleType = System.Xml.Schema.XmlSchemaSimpleType;
using ComplexType = System.Xml.Schema.XmlSchemaComplexType;
using SimpleModel = System.Xml.Schema.XmlSchemaSimpleContent;
using SimpleExt = System.Xml.Schema.XmlSchemaSimpleContentExtension;
using SimpleRst = System.Xml.Schema.XmlSchemaSimpleContentRestriction;
using ComplexModel = System.Xml.Schema.XmlSchemaComplexContent;
using ComplexExt = System.Xml.Schema.XmlSchemaComplexContentExtension;
using ComplexRst = System.Xml.Schema.XmlSchemaComplexContentRestriction;
using SimpleTypeRst = System.Xml.Schema.XmlSchemaSimpleTypeRestriction;
using SimpleList = System.Xml.Schema.XmlSchemaSimpleTypeList;
using SimpleUnion = System.Xml.Schema.XmlSchemaSimpleTypeUnion;
using SchemaFacet = System.Xml.Schema.XmlSchemaFacet;
using LengthFacet = System.Xml.Schema.XmlSchemaLengthFacet;
using MinLengthFacet = System.Xml.Schema.XmlSchemaMinLengthFacet;
using Particle = System.Xml.Schema.XmlSchemaParticle;
using Sequence = System.Xml.Schema.XmlSchemaSequence;
using Choice = System.Xml.Schema.XmlSchemaChoice;

I've 90% finished XmlSchemaInference. I implemented it only because .NET 2.0 contains it.

XmlSchemaInference is very useful. For example, if you have a document like:

<products>
  <category>
    <category>
      <product name="foo" />
      <product name="bar" />
      <product name="baz" />
    </category>
    <product name="hoge" />
    <product name="fuga" />
  </category>
</products>

It creates two different definitions of "product" elements. Here is the inferred schema and generated serializable class.
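For readers who want to try this themselves, a minimal sketch of driving the inference engine could look like the following. It assumes the XmlSchemaInference API as it shipped in .NET 2.0; the InferenceSketch class and its Infer method are names made up for this example.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class InferenceSketch
{
    // Infers a schema from one instance document and returns it as text.
    public static string Infer (string xml)
    {
        XmlSchemaInference inference = new XmlSchemaInference ();
        XmlSchemaSet schemas = inference.InferSchema (
            XmlReader.Create (new StringReader (xml)));

        StringWriter sw = new StringWriter ();
        foreach (XmlSchema schema in schemas.Schemas ())
            schema.Write (sw);
        return sw.ToString ();
    }
}
```

Passing the products document above to InferenceSketch.Infer () and reading the result is an easy way to see how the nested and the outer "product" elements end up declared separately.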

So now I wonder if I had better port the same feature to Commons.Xml.Relaxng. RELAX NG is not as sucky as XML Schema, so I might be able to provide a better XML structure inference engine. But XML structure inference itself is not so fun.

... after some thoughts, I decided to enter a new suggestion to the MS feedback center, which seems to be working again recently.

2004 / 11 / 13: xml:id and canonical XML (Permalink)

I found that the Last Call working draft of xml:id was out. But I think xml:id will be incompatible with Canonical XML (xml-c14n). Below is an excerpt from 2.4 Document Subsets in xml-c14n W3C REC:

The processing of an element node E MUST be modified slightly when an XPath node-set is given as input and the element's parent is omitted from the node-set. The method for processing the attribute axis of an element E in the node-set is enhanced. All element nodes along E's ancestor axis are examined for nearest occurrences of attributes in the xml namespace, such as xml:lang and xml:space (whether or not they are in the node-set). From this list of attributes, remove any that are in E's attribute axis (whether or not they are in the node-set). Then, lexicographically merge this attribute list with the nodes of E's attribute axis that are in the node-set. The result of visiting the attribute axis is computed by processing the attribute nodes in this merged attribute list.

Well, I don't think xml:id is wrong here. It is xml-c14n that is based on the uncommitted premise that all xml:* attributes must be inherited (yes, xml:lang, xml:space and xml:base were). Anyways, don't worry about that incompatibility. Canonical XML is already incompatible with XML Infoset with regard to namespace information items.

2004 / 11 / 11: A change of seasons (Permalink)

Many of my friends have been saying that they feel sorry for Rupert that he does not have any more clothes in this cold season. Today I was hanging around Shibuya (central Tokyo area) with my friends, and they were so kind to buy a new one for him (from my budget). Now he looks younger than before.

XmlSchemaInference

I was escaping from /doc stuff and looking into the xsd inference task (I cannot stand working only on that annoying task). I wrote some notes, but they are incomplete. Apparently the most difficult area is particle inference, but right now I don't have many ideas. My current idea is to support non-XmlSchema languages.

2004 / 11 / 10: the latest /doc patch (Permalink)

I've finally hacked all /doc support feature, including the related warnings. Am now working on testing stuff; I've already done warning tests, but need to compare results. Here's the latest patch and the set of the compiler sources.

monodoc-aspx on windows

For those who are interested, here are my local changes in monodoc to make monodoc-aspx runnable on windows.

2004 / 11 / 05: ResolveEntity() (Permalink)

XmlReader.ResolveEntity() is one of the biggest problems for custom XmlReader implementors, since it never provides XmlParserContext. Well, most of those implementations would just take an XmlReader as a constructor argument. In such cases, they could just invoke the argument XmlReader's ResolveEntity(). If you have DTD information (name / public ID / system ID / internal subset), then you're still lucky, because you can create an XmlDocumentType node and an XmlEntityReference that holds ChildNodes. That's how our XmlNodeReader is implemented now.
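The DOM trick can be sketched roughly as follows. This is not the actual XmlNodeReader code: for brevity it lets LoadXml() build the XmlDocumentType node from a DOCTYPE instead of creating it by hand, and the EntityExpansionSketch class, its Expand method and the "root" element name are assumptions made for the example.

```csharp
using System.Xml;

public static class EntityExpansionSketch
{
    // Expands one general entity declared in the given internal subset
    // and returns its replacement text.
    public static string Expand (string internalSubset, string entityName)
    {
        XmlDocument doc = new XmlDocument ();
        // The DOCTYPE gives the document an XmlDocumentType node that knows
        // the declared entities.
        doc.LoadXml ("<!DOCTYPE root [" + internalSubset + "]><root/>");

        // The reference gets its child nodes once it is attached to the tree.
        XmlEntityReference er = doc.CreateEntityReference (entityName);
        doc.DocumentElement.AppendChild (er);
        return er.InnerText;
    }
}
```

A custom reader that only has the DTD pieces at hand could then walk the ChildNodes of the reference to produce the nodes it should report after ResolveEntity().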

btw, XmlTextReader in System.Xml 2.0 can resolve entities. It means that XmlTextReader now always checks whether there is an entity reader inside the class. Actually a similar situation lies in XmlNodeReader and the DTD validating reader. They are mostly the same (still different though). As an example, I wrote a new XmlNodeReader and XmlTextReader based on the old implementations, and on entity handling they are so close. So I think it might be possible to extract the handling of entity-resolvable XmlReaders into one (abstract) class. I haven't tried though (since I cannot change the class hierarchy of XmlNodeReader and that of XmlTextReader).

A similar problem and possible solution lies under post validation information providers (such as the DTD validator and XSD validator, which handle default values). But I won't provide a common solution for it, because people should never use something like PSVI that makes documents inconsistent (well, entities do too, but that is an already-happened disaster).

2004 / 11 / 03: The United States of Absurdity (Permalink)

Am being disappointed in American citizens.

Anyways, XmlTextReader got 20% faster yesterday in my box and in my testing. With my pending patch, it even goes 33% (as total), but I haven't committed as yet because it is nasty, expanding some functions inline manually.

XQueryCommand is dropped. How about XQueryConvert?

From the revision log of latest W3C XQuery 1.0 working draft:

A value of type xs:QName is now defined to consist of a "triple": a namespace prefix, a namespace URI, and a local name. Including the prefix as part of the QName value makes it possible to cast any QName into a string when needed.

xs:QName is mapped to XmlQualifiedName, which only contains local name and namespace URI.

2004 / 11 / 02: CS1587 (Permalink)

I decided to postpone /doc patch checkin (well, requesting approval and checkin) for a while (maybe two weeks or so), since there are many dangerous factors (many changes and breakage) for committing patches right now. Instead, I decided to create the complete patch for /doc (that implies no more significant changes intended).

But that also means I have to add parser/tokenizer annoyance for CS1587 "XML comment is placed on an invalid language element which can not accept it." ... it is really annoying. Just to support that warning, my small patch grew to 1.5 times its previous size. And since a doc comment is not the kind of token that can be recognized by the parser (it is not an "error" to have those comments in improper code blocks), the task is mostly done by the tokenizer, and I need to keep track of token transitions along the way. Having tokenization control code both in the parser and in the tokenizer is a bad idea, but there was no better way (there was a similar annoyance in the XQuery parser).

Anyways, that CS1587 task is mostly done (I hope so). I still have to handle 'cref' related warnings, which are also an annoyance.

It is still a mystery that a person who dislikes the /doc feature is working on it (ah, but it was ditto for XmlSchema ;-).

[19:30] I noticed that I had to make significant changes related to the latest changes in mcs, which made my /doc patch mostly useless. Here's the latest patch.

2004 / 11 / 01: Japanese Monkeyguide Translation (Permalink)

Recently one Japanese person has been contributing monkeyguide translations (well, we know that monkeyguide is old). I put those translations here. Though it is rather a repository than good-to-browse pages. The files are still raw (because the old monkeyguide did not have index.html).

Some Japanese people complain that there are no Japanese resources. We have them. Please navigate from http://www.mono-project.com to "Resources" >> "Related Sites" >> "Mono Japanese Translation". The page style is old, but a large part of the translations is up-to-date (well, the web pages have slightly changed after the 1.0 release).

2004 / 10 / 13: On upcoming System.Xml 2.0 changes (Permalink)

Improved mcs /doc support patch

I had left my /doc patch broken for a couple of months. But since there are many voices of need, I decided to fix and improve it. Here is the latest patch am working on. I'll post it when I have finished the remaining warning stuff (thanks again to Marek for the list).

On upcoming System.Xml 2.0 changes

http://blogs.msdn.com/DareObasanjo/archive/2004/10/13/241591.aspx

In short, it is nice.

So now XQuery is being removed. No wonder. Apparently, XQuery won't become W3C REC until .NET 2.0 gets out. It is a good decision that Microsoft dropped XQuery from their forthcoming .NET FX, despite being regarded as "now XML 2.0 is not so fantastic". But just imagine; we know how MSXML brought confusion wrt http://www.w3.org/TR/WD-xsl, and we - well, at least I - don't want another disaster.

Actually, besides the progress of the spec, even with the latest working draft, there are some difficulties in the current .NET XQuery implementation strategy that premises the unity of CLR and XQuery datatypes. For example, with xs:gYearMonth mapped to System.DateTime, you can't implement section 17, Casting, of XQuery 1.0 and XPath 2.0 Functions and Operators. It also applies to conversion between xs:QName and xs:NOTATION (both are represented as XmlQualifiedName, while those types differ in casting to xs:string and xdt:untypedAtomic; it can be avoided by making a derived class for xs:NOTATION though).
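The gYearMonth problem is easy to see with plain XmlConvert. The sketch below is only an illustration of the information loss, not of any XQuery engine; the GYearMonthSketch class and its members are hypothetical names.

```csharp
using System;
using System.Xml;

public static class GYearMonthSketch
{
    // Parses the lexical form of xs:gYearMonth, e.g. "2004-10".
    public static DateTime FromGYearMonth (string s)
    {
        return XmlConvert.ToDateTime (s, "yyyy-MM");
    }

    // Parses the lexical form of xs:date (without a time zone), e.g. "2004-10-01".
    public static DateTime FromDate (string s)
    {
        return XmlConvert.ToDateTime (s, "yyyy-MM-dd");
    }

    // True when the two typed values collapse into the same CLR value.
    public static bool Indistinguishable (string gYearMonth, string date)
    {
        return FromGYearMonth (gYearMonth) == FromDate (date);
    }
}
```

Once both values are the same DateTime, a later cast to xs:string has no way to know whether it should produce the year-month form or the full date form unless the schema type is carried along separately.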

Other than XQuery, there seem to be some nice improvements. For example, XPathDocument is no longer editable (it makes very much sense), XmlReader.Create() supports XmlParserContext (maybe like I thought that we need something like IXmlParserContext), and so on. Am especially interested in XmlSchemaValidator, which might help people creating validating XML editors. I have no idea what it is like, but it sounds cool.

Oh, BTW it does not mean that our XQuery implementation will be totally discarded. Well, it will disappear from System.Xml.dll (and less important but from System.Data.SqlXml.dll too), but it might be extracted into another library. ATM, it cannot handle positional predicates, sequence type matchings, many XQuery functions, and more, but it can already handle some of the queries from XQuery Use Cases (you will need some special XmlResolver to handle those nonexisting document instances though).

2004 / 09 / 20: Leaving Japan for a while (Permalink)

Am going to Cambridge to attend XMLOpen 2004, to learn English^W the latest XML technologies. After that, vacations :-) Am looking forward to my first visit to Europe. Will be in London, Brussels (currently a dinner with Gert is planned), and Heidelberg. I'll appear again on 1st Oct, in the Ximian office, and will spend most of that month there.

new Microeconomics by Krugman/Wells

I found that the forthcoming Microeconomics book by Paul Krugman and Robin Wells contains a sample chapter on Technology, Information Goods and Network Externalities [pdf], referring to CTEA. Nothing should be new to Lessig readers though.

2004 / 09 / 04: XPathDocument Changes (Permalink)

As a sincere ex-law student the worst thing I dislike is false advertisement and unfair trade. And now am spending my time on persuading people saying that copyright is a natural right (sigh). I wonder if I should directly tell the fact that the origin of exclusive right had started just from British book companies.

I put my translation of Miguel's Longhorn Changes into Japanese. You can read it from here. There are no or few people who share the ideas for that matter with Miguel in Japan, so this translation should be informative for Japanese developers.

I had been saying that developers won't use XPathNavigator. I don't have any statistics, but it was obvious, like Avalon and XP. Actually, as a developer who has spent much time debugging XSLT that is based on XPathNavigator rather than DOM, I don't believe that XPathNavigator is better for practical development, while I don't deny that XPathNavigator is cool.

It was a while ago, but there was an announcement on System.Xml.XPathDocument being reverted. No wonder, and no need to be afraid. XPathEditableNavigator is said to still remain. Only XPathDocument is eliminated. The voices saying that "XPathDocument rocked!!" are really ignorable. XPathEditableNavigator is much easier than XPathDocument because it doesn't have to support transactions (AcceptChanges() and RejectChanges()). XPathEditableNavigator can be implemented over XmlDocument.

(I have no idea how XPathEditableNavigator will be provided though.)

To my feelings, to be IRevertibleChangeTracking, XPathDocument must have been like XmlDocument that might contain sequential text nodes, otherwise it must have had too-complex internal change states. It was too unnatural for XPathDocument, which is so close to XmlDocument, to be able to be much faster. There should actually be little performance difference between XPathDocument and XmlDocument (MS XPathDocument must be based on a tree node model, unlike our DTMXPathNavigator which is based on a document table model).

Some guys say that XmlDocument is 10x or more slower than XPathDocument. That is totally not realistic. To my knowledge, it only applies to some kinds of reverse-axis queries such as preceding-sibling::*. It is because XmlNode does not have a reference to PreviousSibling. If XmlNode gets such a reference, then its performance should not be different from XPathDocument. Now that the Microsoft XML team has publicly said that they will improve XmlDocument, they could make such changes.

Editable XPathNavigator from XPathDocument is still an annoyance for people like me who found that a read-only document structure is faster than an editable one (I was being forced to implement a slower XPathNavigator just for such a silly change). But I won't care. Our internal use of XPathNavigator will still be the faster DTMXPathNavigator.

There was another degrading discussion saying that XmlDocument's SelectNodes() will suck with XmlNamespaceManager. That is really not true. Now SelectNodes() should be able to accept IXmlNamespaceResolver like XPathDocument (XmlDocument will be improved, no?) and thus they could write something like "theNode.SelectNodes(xpath_string, new XmlNodeReader (theNode))". Or XmlNode could even implement IXmlNamespaceResolver (there are already similar methods like GetPrefixOfNamespace()).
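For reference, this is roughly what the XmlNamespaceManager-based call that the discussion complains about looks like with the existing API. The SelectNodesSketch class, its CountItems method and the "x" prefix are made up for the example.

```csharp
using System.Xml;

public static class SelectNodesSketch
{
    // Counts <item> elements in the given namespace anywhere in the document.
    public static int CountItems (string xml, string ns)
    {
        XmlDocument doc = new XmlDocument ();
        doc.LoadXml (xml);

        // The prefix used in the XPath expression is bound here; it does not
        // have to match the prefix (if any) used in the document itself.
        XmlNamespaceManager nsmgr = new XmlNamespaceManager (doc.NameTable);
        nsmgr.AddNamespace ("x", ns);

        return doc.SelectNodes ("//x:item", nsmgr).Count;
    }
}
```

The extra step is only the AddNamespace () call, which is what an IXmlNamespaceResolver taken from the node itself would make unnecessary.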

2004 / 08 / 26: Wondering around CLI function support for XQuery (Permalink)

Am designing something like CLI native function call support in our XQuery engine. Right now the code below runs with our cvs version.

XQueryCommand cmd = new XQueryCommand ();
cmd.Compile (new StringReader (@"
    declare namespace math='System.Math';
    declare function math:Log10($arg as xs:double) as xs:double external;
    <doc>{math:Log10(number(5.5))}</doc>"));
cmd.Execute ((IXPathNavigable) null, XmlWriter.Create (Console.Out));

$ mono func.exe
<doc>0.740362689494244</doc>

The original idea is from SAXON 8.0, which supports Java method invocation (and yes, it also looks like IXsltContextFunction). Currently my implementation immediately infers every external function as a native public static method, and I don't like this design (especially the "everything is a CLI method" design, and the point that we must declare every function; it could be easy module imports).

I wonder how Microsoft developers think about XQuery extensions.

2004 / 08 / 22: The truth on RELAX NG, XML Schema and XML serialization (Permalink)

In short, there is no reason you cannot use RELAX NG in your web services theoretically. That's just the matter of insufficient implementation support of frameworks rather than spec matter.

As I implemented RelaxngDatatypeProvider in Commons.Xml.Relaxng.dll that supports XML Schema datatypes, you can use RELAX NG to represent a "typed grammar". And it is possible for some grammars to map their items to runtime types (it is "theoretically" impossible to support runtime-type mapping for all kinds of RELAX NG grammars. Read more for details). On the other hand, XML Schema is (also) not always mappable to runtime types. You can try xsd.exe with the simple example schema below and see what useful classes it generates:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Foo" type="myComplex" />
  <xs:complexType name="myComplex">
    <xs:choice>
      <xs:sequence>
        <xs:element name="element_A" type="xs:string" />
        <xs:element name="element_B" type="xs:string" />
      </xs:sequence>
      <xs:sequence>
        <xs:element name="element_C" type="xs:string" />
        <xs:element name="element_D" type="xs:string" />
      </xs:sequence>
    </xs:choice>
  </xs:complexType>
</xs:schema>

(I haven't checked how JAXB 1.0 RI works here.)

The fact is: if you are lucky, your schema can be used for object mapping.

On the RELAX NG side, we don't have a runtime type mapping tool yet. At least, the target grammar must be "deterministic", which is one of the reasons why XML Schema is much more complex than RELAX NG (XML Schema forces deterministic schemas, while RELAX NG allows non-deterministic grammars). To implement it, first we have to create a non-deterministic grammar detection utility (like RelaxMeter by Kohsuke Kawaguchi, which is, however, for detecting ambiguity).

Once deterministic grammar detection is implemented, we can implement object mapping using the RelaxngDatatypeProvider noted above.

But note that, for usual developers, understanding "how we can create a deterministic grammar" might be more difficult than understanding and believing XML Schema blindly (actually it won't be, since XML Schema has some extraneous buggy things like substitution groups). In fact am one of those people who don't fully understand how deterministic model detection will work on RELAX NG.

Am not interested in other discussions such as "New X won't be your solution, since all existing systems are based on old Y". We're the people who believed that .NET Framework makes development easier, while it had abandoned legacy stuff.

2004 / 08 / 20: Hello, XQuery (Permalink)

I've been apart from cvs log history for a while, because I was XQuery pregnant.

using System.IO;
using System.Xml;
using System.Xml.Query;
public class Test
{
    public static void Main ()
    {
        XQueryCommand cmd = new XQueryCommand ();
        cmd.Compile (new StringReader ("<doc>hello, XQuery.</doc>"));
        cmd.Execute ((XmlResolver) null, new XmlTextWriter (System.Console.Out));
    }
}
_@r50 ~/tests/xml2_0/xquery
$ csc hello.cs -r:System.Data.SqlXml.dll -nologo

_@r50 ~/tests/xml2_0/xquery
$ ./hello
<doc>hello, XQuery.</doc>
_@r50 ~/tests/xml2_0/xquery
$ mono hello.exe
<doc>hello, XQuery.</doc>

Not sure when I can leave this hospital.

2004 / 08 / 07: NemerleCodeProvider (Permalink)

Today was the first day I tried to touch Nemerle. Inspired by Miguel's post, I created NemerleCodeProvider. There must be many mistakes caused by my ignorance of the Nemerle specification, but right now it would do something.

Since our xsd.exe can load external code provider, I tried my provider with xsd.exe.

$ cat test.xml
<root> <foo/> <bar/> <baz/> </root>
$ xsd test.xml
Written file .\test.xsd
$ xsd /c /g:Nemerle.Contrib.NemerleCodeProvider,NemerleCodeProvider.dll
Loaded custom generator type Nemerle.Contrib.NemerleCodeProvider,NemerleCodeProvider.dll .
Written file .\test.n --> [*1]
$ xsd /d /g:Nemerle.Contrib.NemerleCodeProvider,NemerleCodeProvider.dll
Loaded custom generator type Nemerle.Contrib.NemerleCodeProvider,NemerleCodeProvider.dll .
Written file .\test.n --> [*2]

[*1] test.n

[*2] test.n

The former test.n compiles, but curiously the latter one didn't compile. Today I have no clue (I tried to talk to nemerle community but could not get connected).

2004 / 08 / 01: XPathDocument2Editable (Permalink)

Yesterday I put XPathDocument2, saying that it could be easily made as editable. And (against my expectation) today I made another XML toy. Well, am mostly functioning just as a copy machine of (my) prior ones.

It is one day hacking, based on XPathEditableDocument I put last Thursday.

2004 / 07 / 31: wandering around XPathDocument2 (Permalink)

After making XmlDocumentNavigator editable, today I was trying to create another XPathDocument based on another tree model like XOM... that will be editable in a few days (on the same way as shown in XmlDocumentEditableNavigator), but I hoped too much for today's XML toy. In fact the basic tree model is created nearly a week ago, so today I just added Load() and XPathNavigator implementation (though I had to modify the large part of the tree model). Right now it just implements IXPathNavigable (it won't be difficult though), and has some limitation (e.g. MoveToId() won't work right now).

In fact, at first I intended to create a more useful document API like XOM for the new XPathDocument, but I found that the new XPathDocument had better be a simpler implementation that is not kind to "users" (for example, by omitting parameter checks). Thus, the tree model has become quite different from the original XOM way.

Like the MS one, this XPathNavigator is "a bit" faster than XmlDocumentNavigator, but it is not faster than our DTMXPathNavigator. I have tried only the navigation APIs used in XPathNavigatorReader; on reverse axes it should be much faster than XmlDocumentNavigator.

This code is just two days' hack, so it might have many bugs. Well, yesterday's XPathNavigatorReader was quite buggy, and I've fixed many of its bugs in cvs.

2004 / 07 / 30: sorry, today's toys are not so interesting ones (Permalink)

I've checked today's toy into cvs: SubtreeXmlReader.cs. It is used to implement XmlReader.ReadSubtree().

After creating it, I felt sorry, since it is not an interesting one. So today I made another subtree reader: XPathNavigatorReader.cs (for XPathNavigator.ReadSubtree()). It has in fact been in cvs for a while (and moreover I made it more than a year ago), but it did not behave as a fragment reader.

Well, though they are based on 2.0 bits, they could be easily modified to be usable on 1.x. Once such modification is done, you could replace XmlReader.ReadSubtree() and XPathNavigator.ReadSubtree() with them for backward compatibility. They are also runnable under MS.NET.
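For readers who have not seen the 2.0 API these toys implement, here is a minimal sketch of how XmlReader.ReadSubtree() is used from the caller's side. It calls the framework's own ReadSubtree(), not SubtreeXmlReader.cs, and the names SubtreeDemo and ElementNamesInFirstItem are made up for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class SubtreeDemo
{
    // Hypothetical helper: collects the names of all elements seen by a
    // subtree reader created on the first <item> element of the input.
    public static List<string> ElementNamesInFirstItem(string xml)
    {
        List<string> names = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            // ReadSubtree() must be called while positioned on an element.
            if (!reader.ReadToFollowing("item"))
                return names;

            // The subtree reader covers <item> itself and its descendants,
            // and reports end-of-input at the matching </item>.
            using (XmlReader sub = reader.ReadSubtree())
            {
                while (sub.Read())
                {
                    if (sub.NodeType == XmlNodeType.Element)
                        names.Add(sub.Name);
                }
            }
        }
        return names;
    }
}
```

Given `<root><item><a/><b/></item><item><c/></item></root>`, the list contains item, a and b; the second item is never visited, which is the "fragment reader" behavior discussed above.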

Update [7/31]: The XPathNavigatorReader implementation was so buggy... so please pick the correct implementation up from mono cvs.

2004 / 07 / 29: Who said that XmlDocument's XPathNavigator is not editable? (Permalink)

While implementing XML 2.0 bits, I noticed that I need something new in our implementation, such as a subtree XmlReader from XmlReader, or an XPathNavigator fragment reader. So there seem to be many interesting toys I have to implement.

In Microsoft's .NET Framework 2.0, the XPathNavigator that is created from XmlNode is not derived from XPathEditableNavigator. It is not cool, since DOM is editable. So here I hacked a new toy named XPathEditableDocument that implements IXPathEditable and is created from an XmlDocument. So far I have tried only a few members.

It can be used like:

    XmlDocument doc = new XmlDocument ();
    doc.LoadXml ("<root/>");
    XPathEditableDocument xp = new XPathEditableDocument (doc);
    XPathEditableNavigator nav = xp.CreateEditor ();

... and that XPathEditableNavigator can be (maybe ;-) used as usual. It is also runnable under MS.NET 2.0. The code contains a short testing driver, so comment it out before using it.
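To give an idea of what editing a DOM through a navigator looks like, here is a small sketch. Note the assumption: it does not use XPathEditableDocument at all, but the editing members (CanEdit, AppendChildElement) found on XPathNavigator in the released .NET 2.0 API, and EditableNavigatorDemo is a made-up name.

```csharp
using System.Xml;
using System.Xml.XPath;

public static class EditableNavigatorDemo
{
    // Hypothetical helper: appends <item>hello</item> under <root>
    // through a navigator, then returns the serialized document.
    public static string AppendChildViaNavigator()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root/>");

        XPathNavigator nav = doc.CreateNavigator();
        nav.MoveToFirstChild(); // from the document node to <root>

        // Only navigators over an editable store report CanEdit == true.
        if (nav.CanEdit)
            nav.AppendChildElement(string.Empty, "item", string.Empty, "hello");

        // The edit went into the underlying XmlDocument.
        return doc.OuterXml;
    }
}
```

The point is the same as with the toy above: the navigator is only a cursor, and the change ends up in the XmlDocument it wraps.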

Am not sure if I am going to implement (well, test enough) other interfaces implemented in XPathDocument (such as IRevertibleChangeTracking), but it would be useful for implementing some XPathEditableNavigator-dependent parts of an XQuery implementation (well, it doesn't have to be dependent on that class though).

I also think that we can adopt this class in our XmlNode.CreateNavigator(). There is little worry about breaking the existing stable implementation. In this code, XmlDocumentEditableNavigator is mostly just wrapping the internal XmlDocumentNavigator (mono's DocumentXPathNavigator). Of course, it can be supported only post-1.0.

I would like to recommend that the Microsoft XML guys make XmlDocument IXPathEditable in this way. It will make XPathEditableNavigator more stable (having two implementations is much better than having just one, especially when there are virtual members that are overridable by third parties). It doesn't have to be as complex as XPathDocument; some interfaces, such as transaction support, can be omitted.

2004 / 07 / 28: how to disappear completely (Permalink)

The day before yesterday, I wrote that XQuery 1.0 SchemaContextLoc will become obsolete soon. The same day W3C announced another new XQuery WD, which completely removed SchemaContextLoc from ValidateExpr. Wow. Am glad to know that, and I promptly eliminated SchemaContext from my parser stuff (only on my box so far). Such a niche functionality should be done by extensions, not by the standard. Now everything in its right place.

2004 / 07 / 27: 2,000,000 of listeners who had spent $1,000 in 5 years moved from commercial records to Winny (Permalink)

Hmm, my last entry went under 7/25. So I wanted to insert noise here.

ACCS is a kind of BSA in Japan. They reported that sales of commercial records decreased by $40,000,000 over the past 5 years, and that it is because of Winny, a file sharing tool (that did not exist 5 years ago). They also said that there are 2,000,000 Winny users. Thus, every one of those users, each having spent at least $1,000 in 5 years, must have ceased to buy commercial records because of Winny. Wow!

It shows how reliable their statistics are. If you find any literature that references their statistics, you had better advise the authors to find other sources. You can also rank such literature by yourself.

Similarly, if any of you know any associations or statistics (the latter would be preferable) in your country that are unreliable, please tell me. I decided to create a blacklist (mainly) for Japanese IP researchers.

2004 / 07 / 26: Looking forward to XQuery 1.1 (Permalink)

Today I checked in my first XQuery parser stuff, with some hacks on System.Xml.Query.XQueryCommand (in System.Data.SqlXml.dll). DISCLAIMER: It won't do anything other than parse ;-) I tried all the query examples contained in XQuery Use Cases, though it is still incomplete. Besides the "fn:empty" function call, only 3 examples fail (fn:empty fails for the reason this message describes: empty() and empty(item) are still(?) ambiguous to my parser). I put those extracted examples here.

The next stage is to design the static context model from the current abstract syntax tree. It will be done concurrently with a kind of XPathItemIterator design. Other new XML 2.0 APIs might be implemented when I want to run away from XQuery, when I want to use them in the XQuery implementation, and/or when I want to fill in missing bits, as I have done this month.

BTW, recently W3C published the updated working draft of XML Schema component designator. If you know XQuery, you will notice that the spec targets exactly the same thing as the terminal named SchemaContextLoc in XQuery 1.0. To me, it looks like a near-future version of XQuery 1.x will replace that SchemaContextLoc part, and XQuery 1.0 might become obsolete like http://www.w3.org/TR/WD-xsl in a few years (unless that Component Designator spec gets discarded).

Of course, that does not mean "it is not worth learning XQuery after publication of the 1.0 spec". XQuery might be one of the most useful XML processing languages. It just means that "some" schema-related part of XQuery would become obsolete. The same is true of XPath 2.0 and thus XSLT 2.0.

One possibility is that the component designator spec soon reaches W3C Recommendation status, and XQuery 1.0 uses that spec. That would be nice, but early implementations of XQuery will still be unstable then.

2004 / 07 / 15: on MS Compatibility or W3C Conformance (Permalink)

My recent inflammatory post on the MS/W3C conflict made it into an xml.com article (yes, what I wanted was not only to talk about XmlTextReader). Soon after, I got a response from Dare Obasanjo (Microsoft XML PM) via Miguel. And I found that the Microsoft feedback page is cool enough to let you view others' bug reports. I filed that matter there, but I got an even quicker reply from the development team via Dare. I believe Miguel is on the right track and I'd like to be as cool as he described. I'd expect that the MS guys would have taken (or would take) the same approach to the SAX.NET project they had commented on.

Oh, BTW I am not a "W3C God" believer (as shown in the excerpt in xml.com or even on Dare's weblog; I had been afraid of being called one). People should note that it is also in the ECMA CLI specification. (No need to worry about XmlCDataSection; XmlDocument is not part of ECMA CLI).

Anyway, that bug won't be fixed under MS.NET because (as Dare posted to the mono-devel-list; well, it's maybe waiting for approval) Microsoft customers might already be dependent on this bug (as Ian has already shown). So I'll fix it by providing an MS compatibility mode or a Mono improved mode. Yes, in the end nothing about the W3C standard conformance situation was solved here.

2004 / 07 / 12: documentation patch delayed (Permalink)

Roughly three months ago I wrote a /doc support patch for mcs. For several reasons, it is still not in cvs. The patch is not a small one, so we could not incorporate it before the mono 1.0 release. I had no trouble keeping the patch in my working mcs for those months, but we didn't want to add instability. After 1.0, Miguel asked me to check the patch into cvs. So I posted the patch to the list (recent one). Sad to say, the next day it broke strongname signing. I could not find where the problem lay, and it got working one day later. So I decided not to check it in now and to wait for a while.

I started .NET 2.0 XML tasks. It is my primary task until 1.2 release.

2004 / 07 / 08: Tanabata (Permalink)

Pablo (tetsuo) asked why I am not blogging. Well, I am ;-) but haven't posted for a while. Now am switching to lameblog.

I want to write about some lightweight tasks that are nonetheless a bit important. These couple of months we had to fix core individual classes such as System.Decimal. I also touched a few classes such as System.DateTime and System.Uri, mostly for the first time. I felt that those classes could be rewritten for performance and v2.0 compatibility. I didn't do such optimization since it needs more time than we could spend, and it's not the time to make them unstable.

I just put two examples, but there should be more.

Mono also (still) has quite a few tasks to do. For example, our CultureInfo is not complete. That is mostly a culture-specific matter, but there are more general problems, such as DateTime above and Calendar integration (how DateTime.Parse() should work against calendar-specific day and month names; how calendar-specific DayNames should be checked against different year/month/day counts such as in ThaiBuddhistCalendar; here I wish there were hackers from Thailand ;-).
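The "different year count" part is easy to see with the calendar class itself. A minimal sketch follows, using System.Globalization.ThaiBuddhistCalendar; CalendarDemo and its method names are made up for illustration, and it deliberately says nothing about how DateTime.Parse() should behave.

```csharp
using System;
using System.Globalization;

public static class CalendarDemo
{
    // Year of the given date as counted by the Thai Buddhist calendar
    // (the Gregorian year plus 543).
    public static int ThaiYear(DateTime date)
    {
        return new ThaiBuddhistCalendar().GetYear(date);
    }

    // Converts a Thai Buddhist year/month/day back to a DateTime at midnight.
    public static DateTime FromThai(int year, int month, int day)
    {
        return new ThaiBuddhistCalendar().ToDateTime(year, month, day, 0, 0, 0, 0);
    }
}
```

For example, ThaiYear(new DateTime(2004, 7, 8)) is 2547, and FromThai(2547, 7, 8) gives back July 8, 2004. Any parsing or formatting code that goes through such a calendar has to keep the two year numbers apart.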

If any of you want to help, please let us know.

Yesterday was "tanabata" in Japan; people write their wishes on cards and hang them on bamboo. Rupert also did:

rupert on bamboo