C#: Using pointer types as fields?

2019-03-25 02:08发布

问题:

In C#, it's possible to declare a struct (or class) that has a pointer type member, like this:

unsafe struct Node
{
  public Node* NextNode;
}

Is it ever safe (err.. ignore for a moment that ironic little unsafe flag..) to use this construction? I mean for longterm storage on the heap. From what I understand, the GC is free to move things around, and while it updates the references to something that's been moved, does it update pointers too? I'm guessing no, which would make this construction very unsafe, right?

I'm sure there are way superior alternatives to doing this, but call it morbid curiosity.

EDIT: There appears to be some confusion. I know that this isn't a great construction, I purely want to know if this is ever a safe construction, ie: is the pointer guaranteed to keep pointing to whatever you originally pointed it to?

The original C-code was used to traverse a tree (depth first) without recursion, where the tree is stored in an array. The array is then traversed by incrementing a pointer, unless a certain condition is met, then the pointer is set to the NextNode, where traversal continues. Of course, the same can in C# be accomplished by:

struct Node
{
  public int NextNode;
  ... // other fields
}

Where the int is the index in the array of the next node. But for performance reasons, I'd end up fiddling with pointers and fixed arrays to avoid bounds checks anyway, and the original C-code seemed more natural.

回答1:

Is it ever safe to use this construction? I mean for long term storage on the heap.

Yes. Doing so is usually foolish, painful and unnecessary, but it is possible.

From what I understand, the GC is free to move things around, and while it updates the references to something that's been moved, does it update pointers too?

No. That's why we make you mark it as unsafe.

I'm guessing no, which would make this construction very unsafe, right?

Correct.

I'm sure there are way superior alternatives to doing this, but call it morbid curiosity.

There certainly are.

is the pointer guaranteed to keep pointing to whatever you originally pointed it to?

Not unless you ensure that happens. There are two ways to do that.

Way one: Tell the garbage collector to not move the memory. There are two ways to do that:

  • Fix a variable in place with the "fixed" statement.

  • Use interop services to create a gc handle to the structures you wish to keep alive and in one place.

Doing either of these things will with high likelihood wreck the performance of the garbage collector.

Way two: Don't take references to memory that the garbage collector can possibly move. There are two ways to do that:

  • Only take addresses of local variables, value parameters, or stack-allocated blocks. Of course, in doing so you are then required to ensure that the pointers do not survive longer than the relevant stack frame, otherwise, you're referencing garbage.

  • Allocate a block out of the unmanaged heap and then use pointers inside that block. In essense, implement your own memory manager. You are required to correctly implement your new custom memory manager. Be careful.



回答2:

Why not:

struct Node
{
    public Node NextNode;
}

or at least:

struct Node
{
    public IntPtr NextNode;
}

You could use the fixed statement to prevent the GC to move pointers around.



回答3:

Yes, the garbage collector can move the objects around and, no, it will not update your pointers. You need to fix the objects you point to. More information can be found on this memory management explanation.

You can fix objects like this:

  unsafe {
     fixed (byte* pPtr = object) {
         // This will fix object in the memory
        }
     }
  }

The advantages of pointers are usually performance and interaction with other unsafe code. There will be no out-of-bounds checks etc, speeding up your code. But just as if you were programming in e.g. C you have to be very careful of what you are doing.



回答4:

Some obvious integrity checks have been excluded. The obvious problem with this is you have to allocate more than you will need because you cannot reallocate the buffer as the keyword fixed implies.

public unsafe class NodeList
{
    fixed Node _Nodes[1024];
    Node* _Current;

    public NodeList(params String[] data)
    {
        for (int i = 0; i < data.Length; i++)
        {
            _Nodes[i].Data = data[i];
            _Nodes[i].Next = (i < data.Length ? &_Nodes[i + 1] : null);     
        }

        _Current = &_Nodes[0];
    }

    public Node* Current()
    {
        return _Current++;  
    }
}

public unsafe struct Node
{
    public String Data;
    public Node* Next;
}


回答5:

A dangerous idea, but it may work:

When your array of structs exceeds a certain size (85000 bytes) it will be allocated on the Large Object Heap where blocks are scanned and collected but not moved...

The linked article points out the danger that a newer CLR version might move stuff on the LOH...