Massimiliano Mantione's Blog

Email

massi@ximian.com

14 Feb 2008

Hack week interlude!

On Monday and Tuesday I was implementing this crazy idea, but then I got stuck on a nasty design problem.
The idea is a long-term thing anyway... I never expected to do anything related to compiler and language design this week; I only wanted a flexible parsing framework, which will eventually be used in compilers.
But even that proved a bit hard. On Wednesday I was tempted to throw in the towel and do something different (like helping with the regex engine reimplementation, which will likely be the hack week hit in the Mono project).

Then I got fascinated by this post by uber-hacker lupus.
It's obviously a hack, but a clever and cute one.

And it has something in common with the idea of extending compilers... so I took a one-day break to see whether it could be done in a different way: without changing the mcs C# grammar, but instead modifying mcs to accept arbitrary "extensions". An extension plugs code into mcs that is executed at compile time and can in turn emit arbitrary code inside methods.

Thanks to Marek Safar, who was ready to help whenever I could not understand what mcs was doing, here is the code I can now write:

    using System;
    using Mono.CSharp.Extensions;

    #pragma compiler extensions enable

    namespace Test {
        public class Program {
            public static void Main () {
                IL.Ldstr ("Hello, world!");
                IL.Call ("System.Console", "WriteLine", "System.String");
            }
        }
    }

Of course, if you really insisted on using IL syntax, you could write an extension that allows code like this:

    using System;
    using Mono.CSharp.Extensions;

    #pragma compiler extensions enable

    namespace Test {
        public class Program {
            public static void Main () {
                IL.Emit (@"
                    ldstr ""Hello, world!""
                    call void class [mscorlib]System.Console::WriteLine(string)
                ");
            }
        }
    }

The point is, once you have hacked mcs to allow extensions, you can go wild and write your own :-)
And before anyone tells me: yes, Lisp and Scheme macros are better at this.
The mcs patch is minimal: the diff is 178 lines, with exactly 124 lines added, and more than one third of the patch deals with putting in place the "#pragma" that activates the detection of extensions (if you don't enable the pragma, mcs behaves normally).
Then, each extension has its own implementation, in an assembly clearly separate from the compiler, which (of course!) is only needed at compile time and not at run time. For instance, the "inline IL" compiler extension looks like this:

    using System;
    using System.Reflection;
    using System.Reflection.Emit;
    using Mono.CSharp;

    #pragma compiler extensions enable

    namespace Mono.CSharp.Extensions {
        public class IL {
            [CompileTimeInvocationAttribute("EmitLdstr")]
            public static extern void Ldstr (string s);

            public static void EmitLdstr (CompileTimeInvocation invocation, EmitContext ec) {
                ec.ig.Emit (OpCodes.Ldstr, ArgumentAsStringLiteral (invocation.Arguments [0]));
            }

            [CompileTimeInvocationAttribute("EmitCall")]
            public static extern void Call (string t, string m, string a);

            public static void EmitCall (CompileTimeInvocation invocation, EmitContext ec) {
                Type type = ArgumentAsType (invocation.Arguments [0]);
                string methodName = ArgumentAsStringLiteral (invocation.Arguments [1]);
                Type[] argumentTypes = ArgumentAsTypeArray (invocation.Arguments [2]);
                MethodInfo method = type.GetMethod (methodName, argumentTypes);
                ec.ig.Emit (OpCodes.Call, method);
            }
            ................

The idea is the following: a call to a method marked with "CompileTimeInvocationAttribute" is not emitted. Instead, the method named in the attribute is executed inside mcs during the "Emit" pass, and it is responsible for emitting the needed code.
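To play with the dispatch idea without patching mcs, here is a minimal, self-contained sketch of my own. It is not the patch: the attribute class below is a hypothetical stand-in, the emitter signature is simplified to (object[] args, ILGenerator ig) instead of the real CompileTimeInvocation and EmitContext, the WriteLine marker is invented for brevity, and a DynamicMethod plays the role of the method body that the compiler would be filling in.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical stand-in for the attribute named in the post (not the mcs type).
[AttributeUsage (AttributeTargets.Method)]
public sealed class CompileTimeInvocationAttribute : Attribute
{
    public readonly string EmitterName;

    public CompileTimeInvocationAttribute (string emitterName)
    {
        EmitterName = emitterName;
    }
}

public static class IL
{
    // Marker method: never really called, the "compiler" replaces the call.
    [CompileTimeInvocation ("EmitLdstr")]
    public static void Ldstr (string s)
    {
        throw new InvalidOperationException ("compile-time only");
    }

    // Assumed emitter shape: already-evaluated arguments plus an ILGenerator.
    public static void EmitLdstr (object[] args, ILGenerator ig)
    {
        ig.Emit (OpCodes.Ldstr, (string) args [0]);
    }

    // Invented marker that hard-codes Console.WriteLine (string).
    [CompileTimeInvocation ("EmitWriteLine")]
    public static void WriteLine ()
    {
        throw new InvalidOperationException ("compile-time only");
    }

    public static void EmitWriteLine (object[] args, ILGenerator ig)
    {
        MethodInfo method = typeof (Console).GetMethod ("WriteLine", new Type[] { typeof (string) });
        ig.Emit (OpCodes.Call, method);
    }
}

public static class Sketch
{
    // Plays the role of the Emit pass when it meets a call to `target`:
    // look up the attribute, find the emitter it names, and run it.
    static void EmitInvocation (MethodInfo target, object[] args, ILGenerator ig)
    {
        CompileTimeInvocationAttribute attr = (CompileTimeInvocationAttribute)
            Attribute.GetCustomAttribute (target, typeof (CompileTimeInvocationAttribute));
        if (attr == null)
            throw new NotSupportedException ("ordinary calls are out of scope for this sketch");

        MethodInfo emitter = target.DeclaringType.GetMethod (attr.EmitterName);
        emitter.Invoke (null, new object[] { args, ig });
    }

    public static Action Build ()
    {
        DynamicMethod dm = new DynamicMethod ("Hello", typeof (void), Type.EmptyTypes);
        ILGenerator ig = dm.GetILGenerator ();
        EmitInvocation (typeof (IL).GetMethod ("Ldstr"), new object[] { "Hello, world!" }, ig);
        EmitInvocation (typeof (IL).GetMethod ("WriteLine"), new object[0], ig);
        ig.Emit (OpCodes.Ret);
        return (Action) dm.CreateDelegate (typeof (Action));
    }

    public static void Main ()
    {
        Build () ();
    }
}
```

The marker methods throw on purpose: if one of them ever ran for real, the substitution did not happen. In the actual patch the arguments arrive as mcs expression nodes rather than as plain objects, which is why the real emitters go through helpers such as ArgumentAsStringLiteral.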

Now, to be fair: this is a hack, and a bad one at that. Extensions can use the internal mcs API, so changing mcs can break them. In practice, you are writing code that technically belongs inside the compiler, but you are allowed to keep it in a separate assembly.
Anyway, it works well, it's been fun, and I've learned a lot doing it!

And Cocos? Well, it's not dead at all. Taking the break allowed me to solve the design issue, and it's shaping up well. It still has the potential to be a good parsing framework. In particular, it should make it very easy to re-parse only the fragments of a program that have changed (think of an editor inside an IDE). And I am succeeding in keeping the semantic actions separate from the grammar, so that the same grammar can be reused in different tools. It just will not be working by Friday :-(
So, more hacking ahead :-)

This is a personal web page. Things said here do not represent the position of my employer.