Massimiliano Mantione's Blog

Email

massi@ximian.com

04 May 2004 (Permalink)

It's rainy, but I'm beginning to see the light...

Things are getting more fun, but the results are not really satisfactory yet...

Bad news first: still no notebook, so I'm working from my home desktop, with just a dial-up connection. This also means that my one-year-old son, Michele, generally has to sleep and play in my (and my wife's) bedroom, at least while I am working.

Anyway, I managed to get something done. I rebuilt mono from the 20040502 CVS snapshot (I had to get monolite for that, but never mind), and easily applied the ABC (array bounds check) removal patch to it. What is "not satisfactory" is that the code works correctly (the bounds checks are removed), but there are no performance gains at all (actually, performance gets worse!).

This completely contradicts previous tests, so I investigated a bit.
The machine code for this loop:

  for (int i = 0; i < a.Length; i++)
  {
    a[i] = i;
  }

With bounds check removal, it becomes this:

  2b:	eb 0b                	jmp    38 
  2d:	8b c3                	mov    %ebx,%eax
  2f:	8b cf                	mov    %edi,%ecx
  31:	8d 44 88 10          	lea    0x10(%eax,%ecx,4),%eax
  35:	89 38                	mov    %edi,(%eax)
  37:	47                   	inc    %edi
  38:	8b c3                	mov    %ebx,%eax
  3a:	8b 40 0c             	mov    0xc(%eax),%eax
  3d:	3b f8                	cmp    %eax,%edi
  3f:	7c ec                	jl     2d 

And without bounds check removal, it looks like this:

  2b:	eb 14                	jmp    41 
  2d:	8b c3                	mov    %ebx,%eax
  2f:	8b cf                	mov    %edi,%ecx
  31:	39 48 0c             	cmp    %ecx,0xc(%eax)
  34:	0f 86 26 00 00 00    	jbe    60 
  3a:	8d 44 88 10          	lea    0x10(%eax,%ecx,4),%eax
  3e:	89 38                	mov    %edi,(%eax)
  40:	47                   	inc    %edi
  41:	8b c3                	mov    %ebx,%eax
  43:	8b 40 0c             	mov    0xc(%eax),%eax
  46:	3b f8                	cmp    %eax,%edi
  48:	7c e3                	jl     2d 

Now, it is obvious that the bounds check has been removed. What is not so obvious is why the code does not run faster (and yes, I know that JIT time should be factored out)!
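
To spell out the difference between the two listings, here is a rough C model of what each loop does. It is only a sketch: the `IntArray` struct and the function names are hypothetical, and the real Mono array header is laid out differently (the listings read the length at offset 0xc and the first element at offset 0x10). What is taken from the disassembly is the unsigned length-versus-index comparison (`cmp` followed by `jbe`) that exists only in the second listing, and the fact that both versions reload the length from the array object for the loop condition on every iteration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a managed int[]: a length field followed by
   the elements. The real Mono object header is not modeled here. */
typedef struct {
    uint32_t length;
    int32_t  data[];
} IntArray;

static IntArray *int_array_new(uint32_t length) {
    IntArray *a = calloc(1, sizeof(IntArray) + length * sizeof(int32_t));
    assert(a != NULL);
    a->length = length;
    return a;
}

/* Models the listing WITHOUT bounds check removal. Returns -1 where the
   JIT-compiled code would branch to its exception-throwing path. */
static int fill_checked(IntArray *a) {
    for (int32_t i = 0; i < (int32_t)a->length; i++) {
        /* cmp %ecx,0xc(%eax) ; jbe ...
           Unsigned compare: also rejects a negative index. */
        if (a->length <= (uint32_t)i)
            return -1;
        a->data[i] = i;
    }
    return 0;
}

/* Models the listing WITH bounds check removal: same loop, no per-access
   comparison. The loop condition still reads a->length each time around. */
static void fill_unchecked(IntArray *a) {
    for (int32_t i = 0; i < (int32_t)a->length; i++)
        a->data[i] = i;
}
```

Both functions leave the array with identical contents; the only difference is the extra compare and branch per store in `fill_checked`.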

Maybe I'll have a look at the profiler, and see where the execution time is actually spent. After all, during my next optimization tasks, using a profiler will be a must, so it's better to start immediately.

On other fronts, I have had a look at the existing implementations of Array.Copy and Buffer.BlockCopy, understood most of how they work and (even more important) understood how to provide an internal implementation for methods (by marking them "InternalCall" and adding the implementation in "icall.c").

At this point, the next thing to do is learn how to use the profiler well, so that I can properly understand what's going on...

This is a personal web page. Things said here do not represent the position of my employer.