Massimiliano Mantione's Blog

Email

massi@ximian.com

25 May 2006 (Permalink)

Chasing elusive JIT bugs...

%*@#!
Back from Sicily, I debugged the "last" issue related to deadce, but as soon as I committed the fix and re-enabled the five options described here, more amd64-specific issues surfaced.

The problem is that I cannot reproduce these bugs at all on the 64-bit machine I have access to.
I really need the bug reporters to help me find the problem.

But, since instructions on how to isolate the source of JIT problems without having JIT knowledge could be interesting to many, instead of just mailing them privately I'll publish them here (and then copy them to the wiki, too).
The only assumptions are that you can rebuild mono from source, that you know enough C to understand how #define works, and that you can search the code with grep.

So, assume you have a test case (even a whole program) that always fails with a specific JIT option (or combination of options), and always works without that option.

The first thing to do, when a JIT option causes trouble, is to identify one single method that is compiled incorrectly when the "bad" option is enabled, and correctly when it is disabled. In case of a crash this could be the method at the top of the stack trace, but often it is not. In any case, to avoid wasting time debugging correct code, it would be nice to have a way to tell for sure that you have really found the right method.

For this purpose, I often instrument the JIT so that it applies certain options only to a single method, or at least to a subset of the program's methods (by the way, to get a list of the compiled methods it is enough to run the test with a single "-v" flag passed to "mono" and grep the output for " emitted at ").
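As a rough illustration of that filtering step, here is a minimal C sketch that does what the grep does and also cuts the line at the marker. The function name is hypothetical, and the idea that the text before " emitted at " identifies the method is an assumption about the log layout; only the marker string itself comes from the text above.

```c
#include <stddef.h>
#include <string.h>

/* Marker mentioned in the text for spotting compiled methods in "-v" output. */
#define EMITTED_MARKER " emitted at "

/*
 * If `line` contains the marker, copy the text that precedes it into `out`
 * (truncated to fit, always NUL-terminated) and return 1.
 * Return 0 if the marker is absent or the arguments are unusable.
 * Assumption: whatever precedes the marker is enough to identify the method.
 */
int extract_emitted_prefix(const char *line, char *out, size_t out_size)
{
    const char *marker;
    size_t len;

    if (line == NULL || out == NULL || out_size == 0)
        return 0;

    marker = strstr(line, EMITTED_MARKER);
    if (marker == NULL)
        return 0;

    len = (size_t)(marker - line);
    if (len >= out_size)
        len = out_size - 1;   /* leave room for the terminator */
    memcpy(out, line, len);
    out[len] = '\0';
    return 1;
}
```

Feeding every log line through a helper like this and sorting the results would give the kind of method list that is handy for narrowing the search.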

Inside the mini directory, you can grep for macros like "MONO_APPLY_DEADCE_TO_SINGLE_METHOD", "MONO_APPLY_TREE_MOVER_TO_SINGLE_METHOD", "MONO_APPLY_SSAPRE_TO_SINGLE_METHOD", "MONO_INLINE_CALLED_LIMITED_METHODS", and "MONO_INLINE_CALLER_LIMITED_METHODS" (these last two only since r61101). Each of them, when true, enables code that compares the current method's full name with the value of a specific environment variable, and applies the option only if the strings match.

The macros with "single" in their name expect an exact match, while the ones with "limited" have a "starts with" behavior, which is more useful when searching "manually" (I sometimes use perl scripts to automate the search for the failing method). In any case, it is really easy to change this behavior.
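The difference between the two matching styles can be sketched in a few lines of C. This is only an illustration, not the code in the mini directory: all three function names are hypothetical, and treating an unset environment variable as "no match" is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Exact comparison, like the "single" macros are described to use. */
int name_matches_exactly(const char *method_name, const char *wanted)
{
    return method_name != NULL && wanted != NULL &&
           strcmp(method_name, wanted) == 0;
}

/* "Starts with" comparison, like the "limited" macros are described to use.
 * Note that an empty prefix matches every method name. */
int name_starts_with(const char *method_name, const char *prefix)
{
    return method_name != NULL && prefix != NULL &&
           strncmp(method_name, prefix, strlen(prefix)) == 0;
}

/* Hypothetical gate: apply the option only when the named environment
 * variable is set and the method name starts with its value.
 * Assumption: an unset variable means the option applies to nothing. */
int option_applies_to_method(const char *method_name, const char *env_var)
{
    const char *limit;

    if (env_var == NULL)
        return 0;
    limit = getenv(env_var);
    return limit != NULL && name_starts_with(method_name, limit);
}
```

Switching between the two behaviors is then just a matter of which comparison the gate calls.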

So, suppose that "-O=inline" causes a problem. You could set "MONO_INLINE_CALLER_LIMITED_METHODS" to 1, rebuild, and start running the test setting the environment variable "MONO_INLINE_CALLER_METHOD_NAME_LIMIT" initially to "System", to see if the failing method is in the System namespace, and then refining the search.
Keeping a sorted list of all the methods that were jitted when the test passed can be useful to direct the search. Note that I said "when the test passed", and not "failed", because the failure could have cut the logs and made the list incomplete...

At some point, you'll find one single method that changes the test result. Then, the best thing you can do is to run the test with lots of logging enabled (ideally put the "-v" option five times on the command line, and enable all the relevant debugging macros like "DEBUG_DEADCE" or "DEBUG_ALIAS_ANALYSIS").

Be warned that the resulting logs can be huge!
There are two basic ways to produce smaller logs: compile only the failing method with "mono --compile ...", or put an "if" at the beginning of "mini_method_compile" that sets "cfg->verbose_level" to 5 if the method name matches (at that point you can just hard code the string in the source!).
I prefer the second approach because it does not alter the test run: if the error depends on something else, like a class constructor having already been executed, you are still in the same conditions. Otherwise you could end up chasing ghosts.
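The shape of that "if" can be sketched as follows. This is a self-contained stand-in, not a patch against the real source: the struct models only the "verbose_level" field mentioned above, and the type name, function name and traced method string are all hypothetical.

```c
#include <string.h>

/* Stand-in for the JIT's per-method compilation state; only the field
 * mentioned in the text is modelled here. */
typedef struct {
    int verbose_level;
} SketchCompile;

/* Hard-coded on purpose, as the text suggests. Hypothetical method name:
 * replace it with the full name of the method you isolated. */
#define SKETCH_TRACED_METHOD "Example.Program:Suspect ()"

/* Raise the verbosity to 5 only for the one method under investigation,
 * leaving every other compilation exactly as it was. */
void sketch_raise_verbosity(SketchCompile *cfg, const char *method_full_name)
{
    if (cfg != NULL && method_full_name != NULL &&
        strcmp(method_full_name, SKETCH_TRACED_METHOD) == 0)
        cfg->verbose_level = 5;
}
```

In the real code the comparison would go at the top of "mini_method_compile" and use the JIT's own way of obtaining the method's full name; the rest of the run stays untouched, which is the whole point of this approach.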

When you have those logs, please send them to me: they are generally enough to understand what went wrong :-)

This is a personal web page. Things said here do not represent the position of my employer.