How to distinguish between strings in heap or literals

Posted 2020-08-09 07:53

I have a use case where I receive pointers to strings that are either allocated on the heap or are string literals. Literals can't be freed, so it is a problem if the wrong kind gets passed to free. Is there a way to tell which strings were allocated and which were not?

char *b = "dont free me!";
if(!IS_LITERAL(b)) {
    free(b);
}

I imagine something like that.

My example:

Scenario 1: literal

char *b = "dont free me!";
struct elem* my_element = mylib_create_element(b);
// do smth
int result = mylib_destroy_element(my_element); // free literal, very bad

Scenario 2: in heap

char *b = malloc(sizeof(char)*17); // example
strncpy(b, "you can free me!",17);

struct elem* my_element = mylib_create_element(b);
// do smth
int result = mylib_destroy_element(my_element); // free heap, nice

How the user calls mylib_create_element(b); is not under my control. If he frees the string before calling mylib_destroy_element, the program can crash. So it has to be mylib_destroy_element that cleans up.

Tags: c
8 answers
Luminary・发光体
#2 · 2020-08-09 08:23

You can do the following:

typedef struct
{
    int is_literal;
    char *array;
} elem;

Every time you allocate elem.array on the heap, simply set is_literal to 0. When you point the array at a literal, set the flag to a non-zero value, e.g.:

elem foo;
foo.array = "literal";
foo.is_literal = 1 ;

or

elem bar;
bar.array = malloc(sizeof(char) * 10);
bar.is_literal = 0;

Then at the client side:

if(!bar.is_literal) {
    free(bar.array);
}

Simple as that.

forever°为你锁心
#3 · 2020-08-09 08:28

I've had a similar case recently. Here's what I did:

If you're making an API that accepts a string pointer and then uses it to create an object (mylib_create_element), a good idea would be to copy the string to a separate heap buffer and then free it at your discretion. This way, the user is responsible for freeing the string he used in the call to your API, which makes sense. It's his string, after all.

Note that this won't work if your API depends on the user changing the string after creating the object!
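
A minimal sketch of the copy-on-create idea described above, assuming a struct elem with a single text field (the question never shows the real layout, so the field name and the return codes are made up for illustration):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; the question never shows struct elem. */
struct elem {
    char *text;   /* private heap copy, always owned by the library */
};

/* Copies the caller's string, so the caller keeps ownership of `s`. */
struct elem *mylib_create_element(const char *s)
{
    if (s == NULL)
        return NULL;

    struct elem *e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;

    size_t len = strlen(s) + 1;   /* include the terminating '\0' */
    e->text = malloc(len);
    if (e->text == NULL) {
        free(e);
        return NULL;
    }
    memcpy(e->text, s, len);
    return e;
}

/* Frees only memory the library allocated itself. Returns 0 on success. */
int mylib_destroy_element(struct elem *e)
{
    if (e == NULL)
        return -1;
    free(e->text);
    free(e);
    return 0;
}
```

With this shape it no longer matters whether the caller passed a literal or a heap string, or whether he frees his own string before mylib_destroy_element runs. The cost is one extra allocation and copy per element.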

可以哭但决不认输i
#4 · 2020-08-09 08:30

On most Unixes, the linker provides the symbols 'etext' and 'edata'. If your pointer lies between 'etext' and 'edata', it points into statically initialized data. These symbols are not mentioned in any standard, so this usage is non-portable and at your own risk.

Example:

#include <stdio.h>
#include <stdlib.h>

/* Linker-provided symbols on many Unix systems; not part of the C standard. */
extern char edata;
extern char etext;

#define IS_LITERAL(b) ((b) >= &etext && (b) < &edata)

int main(void) {
    char *p1 = "static";
    char *p2 = malloc(10);
    printf("%d, %d\n", IS_LITERAL(p1), IS_LITERAL(p2));
    free(p2);
    return 0;
}
ゆ 、 Hurt°
#5 · 2020-08-09 08:33

This is exactly why the rule is that only the piece of code or module that created a string may free it. In other words, every string or piece of data is "owned" by the code unit that created it. Only the owner can free it. A function should never free data structures that it received as arguments.
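
One way to apply that rule to the functions from the question is sketched below, assuming the element only borrows the string. The struct elem layout and the return codes are hypothetical, since the question does not show them:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; the question never shows struct elem. */
struct elem {
    const char *text;   /* borrowed: points at the caller's string */
};

/* Stores the pointer without taking ownership of the string. */
struct elem *mylib_create_element(const char *s)
{
    struct elem *e = malloc(sizeof *e);
    if (e != NULL)
        e->text = s;
    return e;
}

/* Releases the element only; the string stays with whoever created it. */
int mylib_destroy_element(struct elem *e)
{
    if (e == NULL)
        return -1;
    free(e);
    return 0;
}
```

The caller then frees a heap string himself after mylib_destroy_element returns, and never frees a literal. The catch is that the string has to outlive the element, which the library cannot enforce.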

再贱就再见
#6 · 2020-08-09 08:38

Here is a practical way:

Although the C-language standard does not dictate this, for all identical occurrences of a given literal string in your code, the compiler generates a single copy within the RO-data section of the executable image.

In other words, every occurrence of the literal string "dont free me!" in your code is translated into the same memory address.

So at the point where you want to deallocate that string, you can simply compare its address with the address of the literal string "dont free me!":

if (b != "dont free me!") // address comparison
    free(b);

To emphasize this again, it is not imposed by the C-language standard, but it is practically implemented by any decent compiler of the language.


The above is merely a practical trick referring directly to the question at hand (rather than to the motivation behind this question).

Strictly speaking, if you've reached a point in your implementation where you have to distinguish between a statically allocated string and a dynamically allocated string, then I would tend to guess that your initial design is flawed somewhere along the line.

在下西门庆
#7 · 2020-08-09 08:44

Back in the early days, when an 80386 could have 8 megabytes of RAM maximum and the idea of making objects was being explained in every other magazine article, I did not like copying perfectly good literals into string objects (allocating and freeing the internal copy). I asked Bjarne about that, since a crude string class was one of his examples of C++ fancy-stuff.

He said don't worry about it.

Does this have to do with literals vs other char* pointers? You can always own the memory. I think so, from your ideas of looking for different memory segments.

Or is it more generally that ownership may or may not be given, there is no way to tell, and you need to store a flag: "hey, this is a heap object, but someone else is still using it and will take care of it later, OK?"

For the tractable case where it is "on the heap" or "not" (literals, globals, stack-based), you could have the free function know. If you supplied a matching set of allocate/maybe-free, it could be written to know what memory is under its control.
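
A rough sketch of such a matching allocate/maybe-free pair, assuming a simple linked-list registry of the pointers the library handed out. The names mylib_strdup and mylib_maybe_free are invented for this illustration, and the code is neither thread-safe nor fast for large numbers of strings:

```c
#include <stdlib.h>
#include <string.h>

/* One registry node per string handed out by mylib_strdup (hypothetical name). */
struct tracked {
    struct tracked *next;
    char *ptr;
};

static struct tracked *registry = NULL;

/* Allocates a tracked copy of `s`; returns NULL on allocation failure. */
char *mylib_strdup(const char *s)
{
    size_t len = strlen(s) + 1;   /* include the terminating '\0' */
    struct tracked *node = malloc(sizeof *node);
    char *copy = malloc(len);
    if (node == NULL || copy == NULL) {
        free(node);
        free(copy);
        return NULL;
    }
    memcpy(copy, s, len);
    node->ptr = copy;
    node->next = registry;
    registry = node;
    return copy;
}

/* Frees `p` only if mylib_strdup produced it. Returns 1 if freed, 0 otherwise. */
int mylib_maybe_free(const char *p)
{
    for (struct tracked **link = &registry; *link != NULL; link = &(*link)->next) {
        if ((*link)->ptr == p) {
            struct tracked *node = *link;
            *link = node->next;   /* unlink before releasing */
            free(node->ptr);
            free(node);
            return 1;
        }
    }
    return 0;   /* literal, global, stack, or foreign heap pointer: left alone */
}
```

A string from plain malloc is also left alone by mylib_maybe_free, so this only helps when heap strings are created through the matching allocator.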
