I'm working on a hands-off log mechanism for my c# application.
Here's what I'd like it to look like:
function a(arg1, arg2, arg3, ...) calls function b(arg4, arg5, arg6, ...), which in turn calls log(), which is then able to detect the stack trace (this can be done via Environment.StackTrace) and the values with which each function (e.g. a and b) in the stack trace was called.
I want it to work in debug and release mode (or, at least, in debug mode).
Is this possible to do in .net?
Provably not possible:
By the time b is called, the space in the stack used by a's arg1 (the IL stack, so possibly it was never even put in a stack, but had been enregistered on the call) is not guaranteed to still be used by arg1.
By extension, if arg1 is a reference type, the object it referred to is not guaranteed to not have been garbage collected, if it isn't used after the call to b.
Edit:
A bit more detail, since your comment suggests you're not grokking this and still think it should be possible.
The calling conventions used by the jitter are not specified in the specs for any of the relevant standards, which gives implementers freedom to make improvements. They do indeed differ between 32-bit and 64-bit versions, and different releases.
However, articles from MS people suggest that the convention used is akin to the __fastcall convention. In your call to a
, arg1
would be put into the ECX register*, and arg2
into the EDX register (I'm simplifying by assuming 32-bit x86, with amd64 even more arguments are enregistered) of the core the code is running on. arg3
would be pushed on the stack and would indeed exist in memory.
Note that at this point, there is no memory location in which arg1
and arg2
exist, they're only in a CPU register.
In the course of executing the method itself, the registers and memory are used as necessary. And then b is called.
Now, if a
is going to need arg1
or arg2
it'll have to push that before it calls b
. But if it doesn't, then it won't - and things might even be re-ordered to reduce this need. Conversely, those registers may have already been used for something else by this point - the jitter isn't stupid, so if it needs a register or a slot on the stack and there's one going unused for the rest of the method, it's going to reuse that space. (For that matter, at the level above this, the C# compiler will reuse slots in the virtual stack that the IL it produces uses).
So, when b
is called, arg4
is placed in register ECX, arg5
into EDX and arg6
pushed on the stack. At this point, arg1 and arg2 don't exist, and you can no more find out what they were than you can read a book after it has been recycled and turned into toilet paper.
(Interesting note is that it's very common for a method to call another with the same arguments in the same position, in which case ECX and EDX can be just left alone).
Then, b
returns, putting its return value in the EAX register, or the EDX:EAX pair, or in memory with EAX pointing to it, depending on size; a
does some more work before putting its return in that register, and so on.
Now, this is assuming there haven't been any optimisations done. It's possible that in fact, b
wasn't called at all, but rather that its code was inlined. In this case whether the values were in registers or on the stack - and in the latter case, where they were on the stack - no longer has anything to do with b
's signature and everything to do with where the relevant values are during a
's execution, and it would be different in the case of another "call" to b
, or even in the case of another "call" to b
from a
, since the entire call of a
including its call to b
could have been inlined in one case, not inlined in another, and inlined differently in yet another. If for example, arg4
came straight from a value returned by another call, it could be in the EAX register at this point, while arg5
was in ECX as it was the same as arg1
and arg6
was somewhere half-way in the middle of the stack-space being used by a
.
Another possibility is that the call to b
was a tail-call that was eliminated: Because the call to b
was going to have its return value immediately returned in turn by a
(or some other possibilities), then rather than pushing to the stack, the values being used by a
are replaced in-place, and the return address changed so that the return from b
jumps back to the method that called a
, skipping some of the work (and reducing memory use to the extent that some functional style approaches that would overflow the stack instead work and indeed work well). In this case, during the call to b
, the parameters to a
are likely completely gone, even those that had been on the stack.
It's highly debatable whether this last case should even be considered an optimisation at all; some languages heavily depend upon it being done as with it they give good performance and without they give horrible performance if they even work at all (instead of overflowing the stack).
There can be all manner of other optimisations. There should be all manner of other optimisations - if the .NET team or the Mono team do something that makes my code faster or use less memory but otherwise behave the same, without my having to do something, I for one won't be complaining!
And that's assuming that the person writing the C# in the first place never changed the value of a parameter, which certainly isn't going to be true. Consider this code:
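    // The method needs its own type parameter <T> to compile.
    IEnumerable<T> RepeatedlyInvoke<T>(Func<T> factory, int count)
    {
        if (count < 0)
            throw new ArgumentOutOfRangeException(nameof(count));
        // count is modified on every iteration, before factory() runs.
        while (count-- != 0)
            yield return factory();
    }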
Even if the C# compiler and the jitter had been designed in such a wasteful way that you could guarantee parameters weren't changed in the ways described above, how could you know what count
had already been from within the invocation of factory
? Even on the first call it's different, and it's not like the above is strange code.
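To make the point about a changed parameter concrete, here is a minimal sketch. The extra `observed` list, the `ParameterMutationDemo` class and the `Run` helper are hypothetical additions for illustration only; they record what `count` holds at the moment each `factory` call is about to be made.

```csharp
using System;
using System.Collections.Generic;

public static class ParameterMutationDemo
{
    // Hypothetical variant of RepeatedlyInvoke: same loop, but it records
    // the current value of count just before each factory() call.
    public static IEnumerable<T> RepeatedlyInvoke<T>(Func<T> factory, int count, List<int> observed)
    {
        if (count < 0)
            throw new ArgumentOutOfRangeException(nameof(count));
        while (count-- != 0)
        {
            observed.Add(count); // already decremented by the loop condition
            yield return factory();
        }
    }

    // Enumerates the sequence with count = 3 and returns the recorded values.
    public static List<int> Run()
    {
        var observed = new List<int>();
        foreach (var item in RepeatedlyInvoke(() => "x", 3, observed))
        {
            // Nothing to do with the items; we only care about observed.
        }
        return observed;
    }
}
```

The caller passed 3, but the recorded values are 2, 1 and 0: at no point during a `factory` call does the parameter still hold the value it was called with.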
So, in summary:
- Jitter: Parameters are often enregistered. You can expect x86 to put 2 pointer, reference or integer parameters in registers and amd64 to put 4 pointer, reference or integer parameters and 4 floating-point parameters into registers. Enregistered parameters have no memory location to read them from.
- Jitter: Parameters on the stack are often over-written.
- Jitter: There may not be a real call at all, so there's no place to look for parameters as they could be anywhere.
- Jitter: The "call" may be re-using the same frame as the last one.
- Compiler: The IL may re-use slots for locals.
- Human: The programmer may change parameter values.
From all of that, how on earth is it going to be possible to know what arg1
was?
Now, add in the existence of garbage collection. Imagine if we could magically know what arg1
was anyway, despite all of this. If it was a reference to an object on the heap, it might still do us no good, because if all of the above meant that there were no more references active on the stack - and it should be clear that this quite definitely does happen - and the GC kicks in, then the object could have been collected. So all we can magically get hold of is a reference to something that no longer exists - indeed quite possibly to an area in the heap now being used for something else, bang goes the entire type safety of the entire framework!
It's not in the slightest bit comparable to reflection obtaining the IL, because:
- The IL is static, rather than just a state at a given point in time. Likewise, we can get a copy of our favourite books from a library a lot more easily than we can get back our reaction the first time we read them.
- The IL doesn't reflect the impact of inlining etc. anyway. If a call was inlined every time it was actually used, and then we used reflection to get a MethodBody of that method, the fact that it's normally inlined is irrelevant.
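As a rough illustration of what the static side does give you, the sketch below reads a caller's name and parameter declarations from a stack frame, and the size of a method's IL via reflection. The class and method names are made up for the example, and the `NoInlining` attributes are there only to keep the frames predictable; without them the jitter is free to do everything described above. Note that nothing here exposes an argument value.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class StaticMetadataDemo
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static string B(int arg4)
    {
        // Work happens after Describe() returns, so this is not a tail call.
        return "caller: " + Describe();
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static string Describe()
    {
        // Frame 0 is Describe itself; frame 1 is whoever called it.
        MethodBase caller = new StackTrace().GetFrame(1)?.GetMethod();
        if (caller == null)
            return "(unknown)";
        // ParameterInfo carries declared types and names only, never values.
        var parameters = caller.GetParameters()
            .Select(p => p.ParameterType.Name + " " + p.Name);
        return caller.Name + "(" + string.Join(", ", parameters) + ")";
    }

    // Size in bytes of the IL of B, read from static metadata.
    public static int IlLengthOfB()
    {
        MethodBody body = typeof(StaticMetadataDemo).GetMethod(nameof(B))?.GetMethodBody();
        return body?.GetILAsByteArray()?.Length ?? 0;
    }
}
```

Calling `StaticMetadataDemo.B(42)` should yield a description of `B` and its declared `int arg4` parameter, but the 42 itself is nowhere to be found in the frame information.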
The suggestions in other answers about profiling, AOP, and interception are as close as you're going to get.
*Actually, this is the real first parameter to instance members. Let's pretend everything is static so we don't have to keep pointing this out.
It's impossible in .NET. At runtime the JITter may decide to use CPU registers instead of the stack to store method parameters, or even overwrite the initial (passed) values on the stack. Making parameters loggable at any point in the source code would therefore cost .NET a great deal of performance.
As far as I know the only way you can do it in general is to use .net CLR profiling API. (Typemock framework for example is able to do such things and it uses CLR profiling API)
If you only need to intercept virtual functions/properties (including interfaces methods/properties) calls you can use any intercepting framework (Unity or Castle for example).
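For a feel of what interface interception looks like, here is a minimal sketch that uses the built-in `System.Reflection.DispatchProxy` (available in .NET Core and later) rather than Unity or Castle, purely so the example has no external dependencies. `ICalculator`, `Calculator` and `LoggingProxy<T>` are hypothetical names invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public interface ICalculator
{
    int Add(int x, int y);
}

public class Calculator : ICalculator
{
    public int Add(int x, int y) => x + y;
}

// Hypothetical proxy that records each call and its arguments
// before forwarding the call to the real object.
public class LoggingProxy<T> : DispatchProxy where T : class
{
    private T _target;
    private List<string> _log;

    public static T Wrap(T target, List<string> log)
    {
        // The created object implements T and derives from LoggingProxy<T>.
        object proxy = Create<T, LoggingProxy<T>>();
        var typed = (LoggingProxy<T>)proxy;
        typed._target = target;
        typed._log = log;
        return (T)proxy;
    }

    protected override object Invoke(MethodInfo targetMethod, object[] args)
    {
        // The arguments are handed to us here, at the call boundary,
        // which is the only point where they are reliably available.
        object[] safeArgs = args ?? Array.Empty<object>();
        _log.Add(targetMethod.Name + "(" + string.Join(", ", safeArgs) + ")");
        return targetMethod.Invoke(_target, args);
    }
}
```

Usage would look like `ICalculator calc = LoggingProxy<ICalculator>.Wrap(new Calculator(), log);`, after which every call made through `calc` is recorded. This only works for calls that go through the interface, which matches the restriction noted above.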
Here is some information about the .NET profiling API:
MSDN Magazine
MSDN Blogs
Brian Long's blog
This is not possible in C#. Instead, use an AOP approach and perform method argument logging when each method is called. This way you can centralize your logging code, make it reusable, and then you would just need to mark which methods require argument logging.
I believe this could be easily achievable using an AOP framework like PostSharp.
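The sketch below is not PostSharp's API; it is a hand-written approximation of the kind of entry logging an AOP tool would weave into each marked method, assuming you are willing to write the call yourself. `CallLog`, `Enter` and `Worker` are hypothetical names. The one real language feature it leans on is `[CallerMemberName]`, which has the compiler fill in the calling method's name.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public static class CallLog
{
    public static readonly List<string> Entries = new List<string>();

    // Hypothetical helper: records the caller's name and the arguments
    // it was given, captured at method entry while they still exist.
    public static void Enter(object[] args, [CallerMemberName] string member = "")
    {
        Entries.Add(member + "(" + string.Join(", ", args) + ")");
    }
}

public static class Worker
{
    public static int A(int arg1, int arg2)
    {
        CallLog.Enter(new object[] { arg1, arg2 });
        return B(arg1 * 2);
    }

    public static int B(int arg4)
    {
        CallLog.Enter(new object[] { arg4 });
        return arg4 + 1;
    }
}
```

The design point is the same as with AOP: the values are captured on entry to each method, so by the time the log is read nothing has to be recovered from registers or the stack.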