I've read many times that .NET technically does support tail call optimization (TCO) because it has an opcode for it, and that it's just C# that doesn't generate it.
I'm not exactly sure why TCO needs an opcode or what it would do. As far as I know, the requirement for being able to do TCO is that the results of a recursive call are not combined with any variables in the current function scope. If you don't have that, then I don't see how an opcode prevents you from having to keep a stack frame open. If you do have that, then can't the compiler always easily compile it to something iterative?
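To make the "compile it to something iterative" case concrete, here is a minimal sketch of a self-recursive tail call and the loop a compiler could in principle turn it into. It is written in Python rather than C# purely for brevity, and the function names are made up for illustration.

```python
def sum_to_recursive(n, acc=0):
    """Tail-recursive sum of 1..n: the recursive call is the last action."""
    if n == 0:
        return acc
    # Nothing from this frame is used after the call returns,
    # so the result is not combined with any local state.
    return sum_to_recursive(n - 1, acc + n)


def sum_to_loop(n, acc=0):
    """The same computation after a mechanical rewrite into a loop."""
    while n != 0:
        # Rebind the parameters instead of making a new call.
        n, acc = n - 1, acc + n
    return acc
```

Both functions compute the same value; the loop version simply reuses one frame, which is the transformation I am assuming a compiler can always do when a function tail-calls itself.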
So what is the point of an opcode? Obviously there's something I'm missing. In cases where TCO is possible at all, can't it always be handled at the compiler level rather than at the opcode level? What's an example of where it can't?
Guess: In a simple language like x86 assembler where you manage the stack "manually", you don't need an opcode - you can just set up the call stack appropriately.
But in something higher-level like .NET CIL, the stack is partially managed for you, and the whole act of invoking a function is a single opcode (e.g. call). So you need a different opcode to implement TCO - one that does "pass control flow to this function, but without creating a new stack frame".
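A case that seems relevant to this guess is a tail call whose target is a different function. The sketch below (Python again, with hypothetical function names) shows two mutually recursive functions: every call is in tail position, yet neither function can be turned into a loop by looking at its own body alone, so something at the call level would have to avoid pushing a new frame.

```python
def is_even(n):
    """True if the non-negative integer n is even."""
    if n == 0:
        return True
    # Tail call, but the target is another function, not this one.
    return is_odd(n - 1)


def is_odd(n):
    """True if the non-negative integer n is odd."""
    if n == 0:
        return False
    # Tail call back into is_even.
    return is_even(n - 1)
```

In CPython, which does not eliminate tail calls, each of these calls adds a stack frame, so a large enough argument hits the recursion limit even though no frame has any work left to do. Whether a particular compiler could still merge the two functions into one loop is a separate question; the sketch only shows that the per-function rewrite is not enough here.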
Following the links you already provided, this is the part that, it seems to me, answers your question most closely:
Source
The most interesting part in the context of your question, and the one that in my opinion makes it clearest among the many scenarios, is the security example mentioned above.
Security in .NET in many cases depends on the stack being accurate at runtime. That is why, as highlighted above, the burden is shared by both the source-to-CIL compiler and the (runtime) CIL-to-native JIT compiler, with the final say resting with the latter.