Clear C# String from memory

Published 2020-03-08 12:08

Question:

I'm trying to clear the memory contents of a C# string for security reasons. I'm aware of the SecureString class, but unfortunately I cannot use SecureString instead of String in my application. The strings which need to be cleared are created dynamically at runtime (e.g. I'm not trying to clear string literals).

Most search results I found basically stated that clearing the contents of a String is not possible (as strings are immutable) and that SecureString should be used.

Therefore, I came up with my own solution (using unsafe code) below. Testing shows that the solution works, but I'm still not sure whether there is anything wrong with it. Are there better ones?

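// Overwrites the characters of s in place. Interned strings are skipped
// unless clearInternedString is true, because their memory may be shared.
static unsafe bool clearString(string s, bool clearInternedString = false)
{
    if (clearInternedString || string.IsInterned(s) == null)
    {
        // Pin the string so the GC cannot move it while it is overwritten.
        fixed (char* c = s)
        {
            for (int i = 0; i < s.Length; i++)
                c[i] = '\0';
        }
        return true;
    }
    return false;
}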

EDIT: In response to the comments about the GC moving the string around before clearString gets called: what about the following snippet?

string s = new string('\0', len);
fixed (char* c = s)
{
    // copy data from secure location to s
    c[0] = ...;
    c[1] = ...;
    ...

    // do stuff with the string

    // clear the string
    for (int i = 0; i < s.Length; i++)
        c[i] = '\0';
}

Answer 1:

Your problem with this is that strings can move. If the GC runs, it can move the contents to a new location, but it won't zero out the old one. So even if you zero out the string in question, you have no guarantee that a copy of it doesn't exist elsewhere in memory.

The .NET garbage collector documentation describes this compacting behavior.

EDIT: here's your problem with the update:

// do stuff with the string

The problem is that once the string leaves your control, you lose the ability to make sure that it's secure. If it were entirely in your control, you wouldn't be limited to using only a string type. Simply put, this issue has been around for a long time, and no one has come up with a secure way of handling it. If you want to keep the data secure, it's best handled through other means. Clearing out the string is meant to prevent someone from finding it in a memory dump. If you can't use SecureString, the best way to stop that is to limit access to the machine the code is running on.



Answer 2:

Aside from the standard "You're stepping into unsafe territory" answer, which I hope explains itself, consider the following:

The CLR doesn't guarantee that there is only one instance of a string at any given point, and it doesn't guarantee that strings will be garbage collected. If I were to do the following:

var input = "somestring";
input += "sensitive info";
//do something with input
clearString(input, false);

What's the result of this? (Let's presume that I'm not using string literals, and these are instead inputs from some environment of some sort)

A string is created with the content of "somestring". Another string is created with content of "sensitive info", and yet another string is created with content of "somestringsensitive info". Only the latter string is cleared: "sensitive info" is not. It may or may not be immediately garbage collected.

Even if you're careful to ensure that you always clear out any string with sensitive information, the CLR still doesn't guarantee that only one instance of a string exists.
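One way to reduce the intermediate copies described above is to assemble the sensitive value in a char[] rather than with string concatenation, since an array can be overwritten without unsafe code. The following is only a minimal sketch: the class and method names (SensitiveBuffer, Concat, Clear) are hypothetical, and it does nothing about copies held by the source of the data or copies left behind if the GC relocates the array.

```csharp
using System;

// Hypothetical helper for illustration; not part of the original question.
static class SensitiveBuffer
{
    // Joins two character buffers into one new array without creating
    // any intermediate string objects.
    public static char[] Concat(char[] prefix, char[] secret)
    {
        var combined = new char[prefix.Length + secret.Length];
        Array.Copy(prefix, 0, combined, 0, prefix.Length);
        Array.Copy(secret, 0, combined, prefix.Length, secret.Length);
        return combined;
    }

    // Overwrites every element with '\0'. Arrays are mutable, so this
    // needs neither unsafe code nor pinning to compile.
    public static void Clear(char[] buffer)
    {
        Array.Clear(buffer, 0, buffer.Length);
    }
}
```

With this shape the caller has to clear every buffer it owns, including the inputs to Concat, once they are no longer needed.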

edit: With regard to your edit, simply pinning the string immediately may have the desired effect - no need to copy the string to another location or anything. You do need to do it immediately after receiving said string, and there are still other security issues to worry about. You cannot guarantee that, for example, the source of the string doesn't have a copy of it in ITS memory, without clearly understanding the source and exactly how it does things.

You also will not be able to mutate this string for obvious reasons (unless the mutated string is exactly the same size as the string), and you do need to be very careful that nothing you're doing can stomp on memory that isn't part of that string.

Also, if you pass it to other functions that you didn't write yourself, it may or may not be copied by that function.
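The "pin immediately, use, then clear" idea can be sketched with GCHandle instead of a fixed block, which keeps the string pinned across a callback. This is an illustration under stated assumptions, not a recommended pattern: PinnedClear and UseAndClear are hypothetical names, the string must have been created at runtime, and the callback must not let the string escape or copy it.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration only.
static class PinnedClear
{
    // Pins s, runs the caller's work, then zeroes the characters of s.
    public static void UseAndClear(string s, Action<string> work)
    {
        // Interned strings may share memory with unrelated code, so refuse them.
        if (string.IsInterned(s) != null)
            throw new ArgumentException("Refusing to clear an interned string.", nameof(s));

        // While the handle is allocated, the GC will not relocate the string.
        GCHandle handle = GCHandle.Alloc(s, GCHandleType.Pinned);
        try
        {
            work(s);
        }
        finally
        {
            // For a pinned string this is the address of its first character.
            IntPtr firstChar = handle.AddrOfPinnedObject();
            for (int i = 0; i < s.Length; i++)
                Marshal.WriteInt16(firstChar, i * sizeof(char), (short)0);
            handle.Free();
        }
    }
}
```

The loop writes exactly s.Length characters, so it stays within the string's own memory. Any copy made before the pin, or inside the callback, is untouched.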



Answer 3:

It's impossible to tell how many CLR and non-CLR functions your string passes through before it reaches your function where you're trying to clear it. These functions (managed and unmanaged) may create copies of the string for various reasons (possibly multiple copies).

You cannot possibly know all of these places and clear them all, so realistically you cannot guarantee that your password is cleared from memory. You should use SecureString instead, but you need to understand that the above still applies: at some point in your program you will receive the password and will have to hold it in memory (even if only briefly, while you're moving it into a SecureString). This means that your string will still go through chains of function calls that you don't control.
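The hand-off into a SecureString might look like the sketch below, assuming the password arrives as a char[] that the caller owns. SecureTransfer, ToSecureString and CountDigits are hypothetical names; CountDigits only stands in for "do something with the secret" and shows reading it back through an unmanaged buffer that is zeroed before it is freed. Note that the protection SecureString offers depends on the platform and runtime.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Security;

// Hypothetical helper for illustration only.
static class SecureTransfer
{
    // Copies the characters into a SecureString, then wipes the source array.
    public static SecureString ToSecureString(char[] source)
    {
        var secure = new SecureString();
        try
        {
            foreach (char c in source)
                secure.AppendChar(c);
            secure.MakeReadOnly();
            return secure;
        }
        catch
        {
            secure.Dispose();
            throw;
        }
        finally
        {
            // The plain-text copy is wiped whether or not the transfer succeeded.
            Array.Clear(source, 0, source.Length);
        }
    }

    // Reads the secret through a temporary unmanaged copy.
    public static int CountDigits(SecureString secure)
    {
        IntPtr ptr = Marshal.SecureStringToGlobalAllocUnicode(secure);
        try
        {
            int digits = 0;
            for (int i = 0; i < secure.Length; i++)
            {
                char c = unchecked((char)Marshal.ReadInt16(ptr, i * sizeof(char)));
                if (char.IsDigit(c)) digits++;
            }
            return digits;
        }
        finally
        {
            // Zeroes the unmanaged copy before releasing it.
            Marshal.ZeroFreeGlobalAllocUnicode(ptr);
        }
    }
}
```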



Answer 4:

As a user of SecureString, I sometimes get input from a regular string. I used to pin the incoming string's memory and zero it out once I had put the value in the SecureString, exactly like you are doing.

Then I ran into a bizarre bug where memory belonging to a 3rd party library (Redis) was getting zeroed. It turned out the 3rd party library had two string instances whose content was identical to my regular test input string ("password"). Apparently .NET had optimized all 3 strings to point to the same memory buffer, so when I pinned and zeroed my string's "own" memory, I was also zeroing the 3rd party library's memory. The Redis client library then failed to parse connection strings, with an error that "password" is not a recognized key.

So the lesson I learnt the hard way is to not zero the memory of a string, because it could also be the memory of another string with the same content.
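One plausible explanation for the sharing described above is string interning, although the answer does not confirm the cause. The sketch below uses hypothetical names (InternDemo, Run) and only compares references, without zeroing anything, to show that an interned string and a literal with the same content are one object, while a string built at runtime starts out separate.

```csharp
using System;

// Hypothetical demonstration; the names are illustrative only.
static class InternDemo
{
    // Returns (runtimeSharesLiteral, internedSharesLiteral).
    public static (bool, bool) Run()
    {
        string literal = "password";

        // Built at runtime, so this is a distinct object with equal content.
        string runtime = new string(new[] { 'p', 'a', 's', 's', 'w', 'o', 'r', 'd' });
        bool runtimeSharesLiteral = ReferenceEquals(runtime, literal);

        // Intern returns the instance already held in the intern pool.
        string interned = string.Intern(runtime);
        bool internedSharesLiteral = ReferenceEquals(interned, literal);

        return (runtimeSharesLiteral, internedSharesLiteral);
    }
}
```

Zeroing the interned instance would therefore also change every other reference to that literal in the process, which matches the symptom reported with the connection string key.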



Answer 5:

If you are really unable to use SecureString, and you're willing to write unsafe code, then you could write your own simple string class that uses unmanaged memory and ensures that all memory is zeroed before deallocation.
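A minimal sketch of such a class, assuming the secret arrives as a char[] and is only ever read one character at a time. UnmanagedSecret and its members are hypothetical; the buffer lives outside the managed heap, so the GC never moves or copies it, and it is zeroed before being freed.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical container for illustration only.
sealed class UnmanagedSecret : IDisposable
{
    private IntPtr _buffer;
    public int Length { get; }

    // Copies the characters into unmanaged memory and wipes the source array.
    public UnmanagedSecret(char[] source)
    {
        Length = source.Length;
        _buffer = Marshal.AllocHGlobal(Length * sizeof(char));
        Marshal.Copy(source, 0, _buffer, Length);
        Array.Clear(source, 0, source.Length);
    }

    // Reads a single character without materializing a managed string.
    public char this[int index]
    {
        get
        {
            if (_buffer == IntPtr.Zero)
                throw new ObjectDisposedException(nameof(UnmanagedSecret));
            if (index < 0 || index >= Length)
                throw new ArgumentOutOfRangeException(nameof(index));
            return unchecked((char)Marshal.ReadInt16(_buffer, index * sizeof(char)));
        }
    }

    public void Dispose()
    {
        Wipe();
        GC.SuppressFinalize(this);
    }

    // Fallback if Dispose is never called; not guaranteed to run if the
    // process is terminated.
    ~UnmanagedSecret() => Wipe();

    // Zeroes the buffer, then releases it.
    private void Wipe()
    {
        if (_buffer == IntPtr.Zero) return;
        for (int i = 0; i < Length; i++)
            Marshal.WriteInt16(_buffer, i * sizeof(char), (short)0);
        Marshal.FreeHGlobal(_buffer);
        _buffer = IntPtr.Zero;
    }
}
```

Wrapping the instance in a using statement keeps the lifetime of the secret short and makes the wipe deterministic.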

However, you can never truly ensure that your data is secure, as you never have full control over it. For example, a virus embedded deep enough could read that memory while the program is running. There is also a possibility that the process is terminated, in which case the destructor code won't run, leaving the data in unallocated memory, which could be allocated to another process and would still initially contain your sensitive data. Someone could easily use a tool such as Visual Studio to monitor the memory of a debugged process, or write a program that allocates memory and searches it for sensitive data.