What are the barriers to understanding pointers an-第2页回答

2楼-- · 2018-12-31 01:21

I think it might actually be a syntax issue. The C/C++ syntax for pointers seems inconsistent and more complex than it needs to be.

Ironically, the thing that actually helped me to understand pointers was encountering the concept of an iterator in the c++ Standard Template Library. It's ironic because I can only assume that iterators were conceived as a generalization of the pointer.

Sometimes you just can't see the forest until you learn to ignore the trees.

0人赞添加讨论(0) 举报

旧人旧事旧时光

3楼-- · 2018-12-31 01:22

Pointers is a concept that for many can be confusing at first, in particular when it comes to copying pointer values around and still referencing the same memory block.

I've found that the best analogy is to consider the pointer as a piece of paper with a house address on it, and the memory block it references as the actual house. All sorts of operations can thus be easily explained.

I've added some Delphi code down below, and some comments where appropriate. I chose Delphi since my other main programming language, C#, does not exhibit things like memory leaks in the same way.

If you only wish to learn the high-level concept of pointers, then you should ignore the parts labelled "Memory layout" in the explanation below. They are intended to give examples of what memory could look like after operations, but they are more low-level in nature. However, in order to accurately explain how buffer overruns really work, it was important that I added these diagrams.

Disclaimer: For all intents and purposes, this explanation and the example memory layouts are vastly simplified. There's more overhead and a lot more details you would need to know if you need to deal with memory on a low-level basis. However, for the intents of explaining memory and pointers, it is accurate enough.

Let's assume the THouse class used below looks like this:

type
    THouse = class
    private
        FName : array[0..9] of Char;
    public
        constructor Create(name: PChar);
    end;

When you initialize the house object, the name given to the constructor is copied into the private field FName. There is a reason it is defined as a fixed-size array.

In memory, there will be some overhead associated with the house allocation, I'll illustrate this below like this:

---[ttttNNNNNNNNNN]---
     ^   ^
     |   |
     |   +- the FName array
     |
     +- overhead

The "tttt" area is overhead, there will typically be more of this for various types of runtimes and languages, like 8 or 12 bytes. It is imperative that whatever values are stored in this area never gets changed by anything other than the memory allocator or the core system routines, or you risk crashing the program.

Allocate memory

Get an entrepreneur to build your house, and give you the address to the house. In contrast to the real world, memory allocation cannot be told where to allocate, but will find a suitable spot with enough room, and report back the address to the allocated memory.

In other words, the entrepreneur will choose the spot.

THouse.Create('My house');

Memory layout:

---[ttttNNNNNNNNNN]---
    1234My house

Keep a variable with the address

Write the address to your new house down on a piece of paper. This paper will serve as your reference to your house. Without this piece of paper, you're lost, and cannot find the house, unless you're already in it.

var
    h: THouse;
begin
    h := THouse.Create('My house');
    ...

Memory layout:

    h
    v
---[ttttNNNNNNNNNN]---
    1234My house

Copy pointer value

Just write the address on a new piece of paper. You now have two pieces of paper that will get you to the same house, not two separate houses. Any attempts to follow the address from one paper and rearrange the furniture at that house will make it seem that the other house has been modified in the same manner, unless you can explicitly detect that it's actually just one house.

Note This is usually the concept that I have the most problem explaining to people, two pointers does not mean two objects or memory blocks.

var
    h1, h2: THouse;
begin
    h1 := THouse.Create('My house');
    h2 := h1; // copies the address, not the house
    ...

    h1
    v
---[ttttNNNNNNNNNN]---
    1234My house
    ^
    h2

Freeing the memory

Demolish the house. You can then later on reuse the paper for a new address if you so wish, or clear it to forget the address to the house that no longer exists.

var
    h: THouse;
begin
    h := THouse.Create('My house');
    ...
    h.Free;
    h := nil;

Here I first construct the house, and get hold of its address. Then I do something to the house (use it, the ... code, left as an exercise for the reader), and then I free it. Lastly I clear the address from my variable.

Memory layout:

    h                        <--+
    v                           +- before free
---[ttttNNNNNNNNNN]---          |
    1234My house             <--+

    h (now points nowhere)   <--+
                                +- after free
----------------------          | (note, memory might still
    xx34My house             <--+  contain some data)

Dangling pointers

You tell your entrepreneur to destroy the house, but you forget to erase the address from your piece of paper. When later on you look at the piece of paper, you've forgotten that the house is no longer there, and goes to visit it, with failed results (see also the part about an invalid reference below).

var
    h: THouse;
begin
    h := THouse.Create('My house');
    ...
    h.Free;
    ... // forgot to clear h here
    h.OpenFrontDoor; // will most likely fail

Using h after the call to .Free might work, but that is just pure luck. Most likely it will fail, at a customers place, in the middle of a critical operation.

    h                        <--+
    v                           +- before free
---[ttttNNNNNNNNNN]---          |
    1234My house             <--+

    h                        <--+
    v                           +- after free
----------------------          |
    xx34My house             <--+

As you can see, h still points to the remnants of the data in memory, but since it might not be complete, using it as before might fail.

Memory leak

You lose the piece of paper and cannot find the house. The house is still standing somewhere though, and when you later on want to construct a new house, you cannot reuse that spot.

var
    h: THouse;
begin
    h := THouse.Create('My house');
    h := THouse.Create('My house'); // uh-oh, what happened to our first house?
    ...
    h.Free;
    h := nil;

Here we overwrote the contents of the h variable with the address of a new house, but the old one is still standing... somewhere. After this code, there is no way to reach that house, and it will be left standing. In other words, the allocated memory will stay allocated until the application closes, at which point the operating system will tear it down.

Memory layout after first allocation:

    h
    v
---[ttttNNNNNNNNNN]---
    1234My house

Memory layout after second allocation:

                       h
                       v
---[ttttNNNNNNNNNN]---[ttttNNNNNNNNNN]
    1234My house       5678My house

A more common way to get this method is just to forget to free something, instead of overwriting it as above. In Delphi terms, this will occur with the following method:

procedure OpenTheFrontDoorOfANewHouse;
var
    h: THouse;
begin
    h := THouse.Create('My house');
    h.OpenFrontDoor;
    // uh-oh, no .Free here, where does the address go?
end;

After this method has executed, there's no place in our variables that the address to the house exists, but the house is still out there.

Memory layout:

    h                        <--+
    v                           +- before losing pointer
---[ttttNNNNNNNNNN]---          |
    1234My house             <--+

    h (now points nowhere)   <--+
                                +- after losing pointer
---[ttttNNNNNNNNNN]---          |
    1234My house             <--+

As you can see, the old data is left intact in memory, and will not be reused by the memory allocator. The allocator keeps track of which areas of memory has been used, and will not reuse them unless you free it.

Freeing the memory but keeping a (now invalid) reference

Demolish the house, erase one of the pieces of paper but you also have another piece of paper with the old address on it, when you go to the address, you won't find a house, but you might find something that resembles the ruins of one.

Perhaps you will even find a house, but it is not the house you were originally given the address to, and thus any attempts to use it as though it belongs to you might fail horribly.

Sometimes you might even find that a neighbouring address has a rather big house set up on it that occupies three address (Main Street 1-3), and your address goes to the middle of the house. Any attempts to treat that part of the large 3-address house as a single small house might also fail horribly.

var
    h1, h2: THouse;
begin
    h1 := THouse.Create('My house');
    h2 := h1; // copies the address, not the house
    ...
    h1.Free;
    h1 := nil;
    h2.OpenFrontDoor; // uh-oh, what happened to our house?

Here the house was torn down, through the reference in h1, and while h1 was cleared as well, h2 still has the old, out-of-date, address. Access to the house that is no longer standing might or might not work.

This is a variation of the dangling pointer above. See its memory layout.

Buffer overrun

You move more stuff into the house than you can possibly fit, spilling into the neighbours house or yard. When the owner of that neighbouring house later on comes home, he'll find all sorts of things he'll consider his own.

This is the reason I chose a fixed-size array. To set the stage, assume that the second house we allocate will, for some reason, be placed before the first one in memory. In other words, the second house will have a lower address than the first one. Also, they're allocated right next to each other.

Thus, this code:

var
    h1, h2: THouse;
begin
    h1 := THouse.Create('My house');
    h2 := THouse.Create('My other house somewhere');
                         ^-----------------------^
                          longer than 10 characters
                         0123456789 <-- 10 characters

Memory layout after first allocation:

                        h1
                        v
-----------------------[ttttNNNNNNNNNN]
                        5678My house

Memory layout after second allocation:

    h2                  h1
    v                   v
---[ttttNNNNNNNNNN]----[ttttNNNNNNNNNN]
    1234My other house somewhereouse
                        ^---+--^
                            |
                            +- overwritten

The part that will most often cause crash is when you overwrite important parts of the data you stored that really should not be randomly changed. For instance it might not be a problem that parts of the name of the h1-house was changed, in terms of crashing the program, but overwriting the overhead of the object will most likely crash when you try to use the broken object, as will overwriting links that is stored to other objects in the object.

Linked lists

When you follow an address on a piece of paper, you get to a house, and at that house there is another piece of paper with a new address on it, for the next house in the chain, and so on.

var
    h1, h2: THouse;
begin
    h1 := THouse.Create('Home');
    h2 := THouse.Create('Cabin');
    h1.NextHouse := h2;

Here we create a link from our home house to our cabin. We can follow the chain until a house has no NextHouse reference, which means it's the last one. To visit all our houses, we could use the following code:

var
    h1, h2: THouse;
    h: THouse;
begin
    h1 := THouse.Create('Home');
    h2 := THouse.Create('Cabin');
    h1.NextHouse := h2;
    ...
    h := h1;
    while h <> nil do
    begin
        h.LockAllDoors;
        h.CloseAllWindows;
        h := h.NextHouse;
    end;

Memory layout (added NextHouse as a link in the object, noted with the four LLLL's in the below diagram):

    h1                      h2
    v                       v
---[ttttNNNNNNNNNNLLLL]----[ttttNNNNNNNNNNLLLL]
    1234Home       +        5678Cabin      +
                   |        ^              |
                   +--------+              * (no link)

In basic terms, what is a memory address?

A memory address is in basic terms just a number. If you think of memory as a big array of bytes, the very first byte has the address 0, the next one the address 1 and so on upwards. This is simplified, but good enough.

So this memory layout:

    h1                 h2
    v                  v
---[ttttNNNNNNNNNN]---[ttttNNNNNNNNNN]
    1234My house       5678My house

Might have these two address (the leftmost - is address 0):

h1 = 4
h2 = 23

Which means that our linked list above might actuall look like this:

    h1 (=4)                 h2 (=28)
    v                       v
---[ttttNNNNNNNNNNLLLL]----[ttttNNNNNNNNNNLLLL]
    1234Home      0028      5678Cabin     0000
                   |        ^              |
                   +--------+              * (no link)

It is typical to store an address that "points nowhere" as a zero-address.

In basic terms, what is a pointer?

A pointer is just a variable holding a memory address. You can typically ask the programming language to give you its number, but most programming languages and runtimes tries to hide the fact that there is a number beneath, just because the number itself does not really hold any meaning to you. It is best to think of a pointer as a black box, ie. you don't really know or care about how it is actually implemented, just as long as it works.

0人赞添加讨论(0) 举报

琉璃瓶的回忆

4楼-- · 2018-12-31 01:24

I think the main barrier to understanding pointers is bad teachers.

Almost everyone are taught lies about pointers: That they are nothing more than memory addresses, or that they allow you to point to arbitrary locations.

And of course that they are difficult to understand, dangerous and semi-magical.

None of which is true. Pointers are actually fairly simple concepts, as long as you stick to what the C++ language has to say about them and don't imbue them with attributes that "usually" turn out to work in practice, but nevertheless aren't guaranteed by the language, and so aren't part of the actual concept of a pointer.

I tried to write up an explanation of this a few months ago in this blog post -- hopefully it'll help someone.

(Note, before anyone gets pedantic on me, yes, the C++ standard does say that pointers represent memory addresses. But it does not say that "pointers are memory addresses, and nothing but memory addresses and may be used or thought of interchangeably with memory addresses". The distinction is important)

0人赞添加讨论(0) 举报

余生请多指教

5楼-- · 2018-12-31 01:26

Some answers above have asserted that "pointers aren't really hard", but haven't gone on to address directly where "pointer are hard!" comes from. Some years back I tutored first year CS students (for only one year, since I clearly sucked at it) and it was clear to me that the idea of pointer is not hard. What's hard is understanding why and when you would want a pointer.

I don't think you can divorce that question - why and when to use a pointer - from explaining broader software engineering issues. Why every variable should not be a global variable, and why one should factor out similar code into functions (that, get this, use pointers to specialize their behaviour to their call site).

0人赞添加讨论(0) 举报

回忆，回不去的记忆

6楼-- · 2018-12-31 01:27

I think that what makes pointers tricky to learn is that until pointers you're comfortable with the idea that "at this memory location is a set of bits that represent an int, a double, a character, whatever".

When you first see a pointer, you don't really get what's at that memory location. "What do you mean, it holds an address?"

I don't agree with the notion that "you either get them or you don't".

They become easier to understand when you start finding real uses for them (like not passing large structures into functions).

0人赞添加讨论(0) 举报

浮光初槿花落

7楼-- · 2018-12-31 01:28

In my first Comp Sci class, we did the following exercise. Granted, this was a lecture hall with roughly 200 students in it...

Professor writes on the board: int john;

John stands up

Professor writes: int *sally = &john;

Sally stands up, points at john

Professor: int *bill = sally;

Bill stands up, points at John

Professor: int sam;

Sam stands up

Professor: bill = &sam;

Bill now points to Sam.

I think you get the idea. I think we spent about an hour doing this, until we went over the basics of pointer assignment.

0人赞添加讨论(0) 举报

What are the barriers to understanding pointers an

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间