What is the address of a function in a C++ program

As the function is set of instruction stored in one contiguous block of memory.

And address of a function (entry point) is the address of the first instruction in the function. (from my knowledge)

And thus we can say that the address of function and the address of the first instruction in the function will be the same (In this case the first instruction is the initialization of a variable.).

But the program below contradicts the above line.

code:

#include<iostream>
#include<stdio.h>
#include<string.h>
using namespace std;
char ** fun()
{
    static char * z = (char*)"Merry Christmas :)";
    return &z;
}
int main()
{
    char ** ptr = NULL;

    char ** (*fun_ptr)(); //declaration of pointer to the function
    fun_ptr = &fun;

    ptr = fun();

    printf("\n %s \n Address of function = [%p]", *ptr, fun_ptr);
    printf("\n Address of first variable created in fun() = [%p]", (void*)ptr);
    cout<<endl;
    return 0;
}

One output example is:

 Merry Christmas :) 
 Address of function = [0x400816]
 Address of first variable created in fun() = [0x600e10]

So, here the address of function and the address of first variable in function is not same. Why so?

I searched on google but can't come up with the exact required answer and being new to this language I exactly can't catch some of contents on net.

标签： c++ function function-pointers memory-address

10条回答

Fickle 薄情

2楼-- · 2019-01-22 02:21

In the text of your question you say:

And thus we can say that the address of function and the address of the first instruction in the function will be the same (In this case the first instruction is the initialization of a variable.).

but in the code you don't get the address of the first instruction in the function but the address of some local variable declared in the function.

A function is code, a variable is data. They are stored in different memory areas; they don't even reside in the same memory block. Due to security restrictions imposed by the OSes nowadays, the code is stored in memory blocks that are marked as read-only.

As far as I know, the C language doesn't provide any way to get the address of a statement in the memory. Even if it would provide such a mechanism, the start of the function (the function's address in memory) is not the same as the address of the machine code generated from the first C statement.

Before the code generated from the first C statement, the compiler generates a function prolog that (at least) saves the current value of the stack pointer and makes room for function's local variables. This means several assembly instructions before any code generated from the first statement of the C function.

0人赞添加讨论(0) 举报

小情绪 Triste *

3楼-- · 2019-01-22 02:23

variables declared within a function is not allocated where you see in the code automatic variables (variables defined locally in a function) is given a suitable place in the stack memory when function is about to be called , this is done during compile time by the compiler, thus the address of the first instruction is have nothing to do with variables it's about the executable instructions

0人赞添加讨论(0) 举报

家丑人穷心不美

4楼-- · 2019-01-22 02:24

What exactly is the address of a function in a C++ program?

Like other variables, an address of a function is the space allocated for it. In other words, it is the memory location where the instructions (machine code) for the operation performed by the function is stored.

To understand this, take a deep look on memory layout of a program.

A program's variables and executable code/instructions are stored at different segments of memory (RAM). Variables goes to any of STACK, HEAP, DATA and BSS segment while executable code goes to the CODE segment. Look at the general memory layout of a program

Now you can see that there are different memory segments for variables and instructions. They are stored at different memory locations. The function address is the address which is located at the CODE segment.

So, you are confusing the term first statement with first executable instruction. When function call is invoked, program counter is updated with the address of the function. Therefore, the function pointer points to the first instruction of the function stored in memory.

0人赞添加讨论(0) 举报

Viruses.

5楼-- · 2019-01-22 02:25

As you say, the address of function may be (it will depend on the system) the address of the first instruction of the function.

It is the answer. Instruction will not share address with variables in typical environment in which same address space is used for instructions and data.

If they share the samee address, instrution will be destroyed by assigning to the variables!

0人赞添加讨论(0) 举报

叼着烟拽天下

6楼-- · 2019-01-22 02:28

The address of a normal function is where the instructions start (if there is no vTable involved).

For the variables it depends:

static variables are stored in another place.
parameters are pushed on the stack or kept in registers.
local variables are also pushed on the stack or kept in registers.

Unless the function is inlined or optimized away.

0人赞添加讨论(0) 举报

我命由我不由天

7楼-- · 2019-01-22 02:30

Alright, this is going to be fun. We get to go from the extremely abstract concept of what a function pointer is in C++ all the way down to the assembly code level, and thanks to some of the particular confusions we're having, we even get to discuss stacks!

Let's start at the highly abstract side, because that's clearly the side of things you're starting from. you have a function char** fun() that you're playing with. Now, at this level of abstraction, we can look at what operations are permitted on function pointers:

We can test if two function pointers are equal. Two function pointers are equal if they point at the same function.
We can do inequality testing on those pointers, allowing us to do sorting of such pointers.
We can deference a function pointer, which results in a "function" type which is really confusing to work with, and I will choose to ignore it for now.
We can "call" a function pointer, using the notation you used: fun_ptr(). The meaning of this is identical to calling whatever function is being pointed at.

That's all they do at the abstract level. Underneath that, compilers are free to implement it however they see fit. If a compiler wanted to have a FunctionPtrType which is actually an index into some big table of every function in the program, they could.

However, this is typically not how it is implemented. When compiling C++ down to assembly/machine code, we tend to take advantage of as many architecture-specific tricks as possible, to save runtime. On real life computers, there is almost always an "indirect jump" operation, which reads a variable (usually a register), and jumps over to begin executing the code that's stored at that memory address. Its almost univeral that functions are compiled into contiguous blocks of instructions, so if you ever jump to the first instruction in the block, it has the logical effect of calling that function. The address of the first instruction happens to satisfy every one of the comparisons required by C++'s abstract concept of a function pointer and it happens to be exactly the value the hardware needs to use an indirect jump to call the function! That's so convenient, that virtually every compiler chooses to implement it that way!

However, when we start talking about why the pointer you thought you were looking at was the same as the function pointer, we have to get into something a bit more nuanced: segments.

Static variables are stored separate from the code. There's a few reasons for that. One is that you want your code as tight as possible. You don't want your code speckled with the memory spaces to store variables. It'd be inefficient. You'd have to skip over all sorts of stuff, rather than just getting to plow through it. There's also a more modern reason: most computers allow you to mark some memory as "executable" and some "writable." Doing this helps tremendously for dealing with some really evil hacker tricks. We try to never mark something as both executable and writable at the same time, in case a hacker cleverly finds a way to trick our program into overwriting some of our functions with their own!

Accordingly, there is typically a .code segment (using that dotted notation simply because its a popular way to notate it in many architectures). In this segment, you find all of the code. The static data will go in somewhere like .bss. So you may find your static string stored quite far away from the code that operates on it (typically at least 4kb away, because most modern hardware allows you to set execute or write permissions at the page level: pages are 4kb in lots of modern systems)

Now the last piece... the stack. You mentioned storing things on the stack in a confusing way, which suggests it may be helpful to give it a quick going over. Let me make a quick recursive function, because they are more effective at demonstrating what is going on in the stack.

int fib(int x) {
    if (x == 0)
        return 0;

    if (x == 1)
        return 1;

    return fib(x-1)+fib(x-2);
}

This function calculates the Fibonacci sequence using a rather inefficient but clear way of doing it.

We have one function, fib. This means &fib is always a pointer to the same place, but we're clearly calling fib many times, so each one needs its own space right?

On the stack we have what are called "frames." Frames are not the functions themselves, but rather they are sections of memory which this particular invocation of the function is allowed to use. Every time you call a function, like fib, you allocate a little more space on the stack for its frame (or, more pedantically, it will allocate it after you make the call).

In our case, fib(x) clearly needs to store the result of fib(x-1) while executing fib(x-2). It can't store this in the function itself, or even in the .bss segment because we don't know how many times it is going to get recursed. Instead, it allocates space on the stack to store its own copy of the result of fib(x-1) while fib(x-2) is operating in its own frame (using the exact same function, and the same function address). When fib(x-2) returns, fib(x) simply loads up that old value, which it is certain has not been touched by anyone else, adds the results, and returns it!

How does it do this? Virtually every processor out there has some support for a stack in hardware. On x86, this is known as the ESP register (extended-stack pointer). Programs generally agree to treat this as a pointer to the next spot in the stack where you can start storing data. You're welcome to move this pointer around to build yourself space for a frame, and move in. When you finish executing, you are expected to move everything back.

In fact, on most platforms, the first instruction in your function is not the first instruction in the final compiled version. Compilers inject a few extra ops to manage this stack pointer for you, so that you never even have to worry about it. On some platforms, like x86_64, this behavior is often even mandatory and specified in the ABI!

So in all we have:

.code segment - where your function's instructions are stored. The function pointer will point to the first instruction in here. This segment is typically marked "execute/read only," preventing your program from writing to it after it has been loaded.
.bss segment - where your static data will get stored, because it can't be part of the "execute only" .code segment if it wants to be data.
the stack - where your functions can store frames, which keep track of the data needed just for that one instantation, and nothing more. (Most platforms also use this to store the information about where to return to after a function finishes)
the heap - This didn't appear in this answer, because your question doesn't include any heap activities. However, for completeness, I've left it here so that it doesn't surprise you later.

0人赞添加讨论(0) 举报

1 2 下一页

What is the address of a function in a C++ program

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间