Just trying to understand how to address a single character in an array of strings. Also, this of course will allow me to understand pointers to pointers subscripting in general.
If I have char **a
and I want to reach the 3rd character of the 2nd string, does this work: **((a+1)+2)
? Seems like it should...
相关问题
- Multiple sockets for clients to connect to
- Do the Java Integer and Double objects have unnece
- how to split a list into a given number of sub-lis
- What is the best way to do a search in a large fil
- How to get the maximum of more than 2 numbers in V
Quote from the wikipedia article on C pointers -
In C, array indexing is formally defined in terms of pointer arithmetic; that is, the language specification requires that array[i] be equivalent to *(array + i). Thus in C, arrays can be thought of as pointers to consecutive areas of memory (with no gaps), and the syntax for accessing arrays is identical for that which can be used to dereference pointers. For example, an array can be declared and used in the following manner:
So, in response to your question - yes, pointers to pointers can be used as an array without any kind of other declaration at all!
Try
a[1][2]
. Or*(*(a+1)+2)
.Basically, array references are syntactic sugar for pointer dereferencing. a[2] is the same as a+2, and also the same as 2[a] (if you really want unreadable code). An array of strings is the same as a double pointer. So you can extract the second string using either a[1] or
*(a+1)
. You can then find the third character in that string (call it 'b' for now) with either b[2] or*(b + 2)
. Substituting the original second string for 'b', we end up with either a[1][2] or*(*(a+1)+2)
.Theres a brilliant C programming explanation in the book Hacking the art of exploitation 2nd Edition by Jon Erickson which discusses pointers, strings, worth a mention for the programming explanation section alone https://leaksource.files.wordpress.com/2014/08/hacking-the-art-of-exploitation.pdf.
Although the question has already been answered, someone else wanting to know more may find the following highlights from Ericksons book useful to understand some of the structure behind the question.
Headers
Examples of header files available for variable manipulation you will probably use.
stdio.h - http://www.cplusplus.com/reference/cstdio/
stdlib.h - http://www.cplusplus.com/reference/cstdlib/
string.h - http://www.cplusplus.com/reference/cstring/
limits.h - http://www.cplusplus.com/reference/climits/
Functions
Examples of general purpose functions you will probably use.
malloc() - http://www.cplusplus.com/reference/cstdlib/malloc/
calloc() - http://www.cplusplus.com/reference/cstdlib/calloc/
strcpy() - http://www.cplusplus.com/reference/cstring/strcpy/
Memory
"A compiled program’s memory is divided into five segments: text, data, bss, heap, and stack. Each segment represents a special portion of memory that is set aside for a certain purpose. The text segment is also sometimes called the code segment. This is where the assembled machine language instructions of the program are located".
"The execution of instructions in this segment is nonlinear, thanks to the aforementioned high-level control structures and functions, which compile into branch, jump, and call instructions in assembly language. As a program executes, the EIP is set to the first instruction in the text segment. The processor then follows an execution loop that does the following:"
"1. Reads the instruction that EIP is pointing to"
"2. Adds the byte length of the instruction to EIP"
"3. Executes the instruction that was read in step 1"
"4. Goes back to step 1"
"Sometimes the instruction will be a jump or a call instruction, which changes the EIP to a different address of memory. The processor doesn’t care about the change, because it’s expecting the execution to be nonlinear anyway. If EIP is changed in step 3, the processor will just go back to step 1 and read the instruction found at the address of whatever EIP was changed to".
"Write permission is disabled in the text segment, as it is not used to store variables, only code. This prevents people from actually modifying the program code; any attempt to write to this segment of memory will cause the program to alert the user that something bad happened, and the program will be killed. Another advantage of this segment being read-only is that it can be shared among different copies of the program, allowing multiple executions of the program at the same time without any problems. It should also be noted that this memory segment has a fixed size, since nothing ever changes in it".
"The data and bss segments are used to store global and static program variables. The data segment is filled with the initialized global and static variables, while the bss segment is filled with their uninitialized counterparts. Although these segments are writable, they also have a fixed size. Remember that global variables persist, despite the functional context (like the variable j in the previous examples). Both global and static variables are able to persist because they are stored in their own memory segments".
"The heap segment is a segment of memory a programmer can directly control. Blocks of memory in this segment can be allocated and used for whatever the programmer might need. One notable point about the heap segment is that it isn’t of fixed size, so it can grow larger or smaller as needed".
"All of the memory within the heap is managed by allocator and deallocator algorithms, which respectively reserve a region of memory in the heap for use and remove reservations to allow that portion of memory to be reused for later reservations. The heap will grow and shrink depending on how much memory is reserved for use. This means a programmer using the heap allocation functions can reserve and free memory on the fly. The growth of the heap moves downward toward higher memory addresses".
"The stack segment also has variable size and is used as a temporary scratch pad to store local function variables and context during function calls. This is what GDB’s backtrace command looks at. When a program calls a function, that function will have its own set of passed variables, and the function’s code will be at a different memory location in the text (or code) segment. Since the context and the EIP must change when a function is called, the stack is used to remember all of the passed variables, the location the EIP should return to after the function is finished, and all the local variables used by that function. All of this information is stored together on the stack in what is collectively called a stack frame. The stack contains many stack frames".
"In general computer science terms, a stack is an abstract data structure that is used frequently. It has first-in, last-out (FILO) ordering , which means the first item that is put into a stack is the last item to come out of it. Think of it as putting beads on a piece of string that has a knot on one end—you can’t get the first bead off until you have removed all the other beads. When an item is placed into a stack, it’s known as pushing, and when an item is removed from a stack, it’s called popping".
"As the name implies, the stack segment of memory is, in fact, a stack data structure, which contains stack frames. The ESP register is used to keep track of the address of the end of the stack, which is constantly changing as items are pushed into and popped off of it. Since this is very dynamic behavior, it makes sense that the stack is also not of a fixed size. Opposite to the dynamic growth of the heap, as the stack change s in size, it grows upward in a visual listing of memory, toward lower memory addresses".
"The FILO nature of a stack might seem odd, but since the stack is used to store context, it’s very useful. When a function is called, several things are pushed to the stack together in a stack frame. The EBP register—sometimes called the frame pointer (FP) or local base (LB) pointer —is used to reference local function variables in the current stack frame. Each stack frame contains the parameters to the function, its local variables, and two pointers that are necessary to put things back the way they were: the saved frame pointer (SFP) and the return address. The SFP is used to restore EBP to its previous value, and the return address is used to restore EIP to the next instruction found after the function call. This restores the functional context of the previous stack frame".
Strings
"In C, an array is simply a list of n elements of a specific data type. A 20-character array is simply 20 adjacent characters located in memory. Arrays are also referred to as buffers".
"In the preceding program, a 20-element character array is defined as str_a, and each element of the array is written to, one by one. Notice that the number begins at 0, as opposed to 1. Also notice that the last character is a 0".
"(This is also called a null byte.) The character array was defined, so 20 bytes are allocated for it, but only 12 of these bytes are actually used. The null byte Programming at the end is used as a delimiter character to tell any function that is dealing with the string to stop operations right there. The remaining extra bytes are just garbage and will be ignored. If a null byte is inserted in the fifth element of the character array, only the characters Hello would be printed by the printf() function".
"Since setting each character in a character array is painstaking and strings are used fairly often, a set of standard functions was created for string manipulation. For example, the strcpy() function will copy a string from a source to a destination, iterating through the source string and copying each byte to the destination (and stopping after it copies the null termination byte)".
"The order of the functions arguments is similar to Intel assembly syntax destination first and then source. The char_array.c program can be rewritten using strcpy() to accomplish the same thing using the string library. The next version of the char_array program shown below includes string.h since it uses a string function".
Find more information on C strings
http://www.cs.uic.edu/~jbell/CourseNotes/C_Programming/CharacterStrings.html
http://www.tutorialspoint.com/cprogramming/c_strings.htm
Pointers
"The EIP register is a pointer that “points” to the current instruction during a programs execution by containing its memory address. The idea of pointers is used in C, also. Since the physical memory cannot actually be moved, the information in it must be copied. It can be very computationally expensive to copy large chunks of memory to be used by different functions or in different places. This is also expensive from a memory standpoint, since space for the new destination copy must be saved or allocated before the source can be copied. Pointers are a solution to this problem. Instead of copying a large block of memory, it is much simpler to pass around the address of the beginning of that block of memory".
"Pointers in C can be defined and used like any other variable type. Since memory on the x86 architecture uses 32-bit addressing, pointers are also 32 bits in size (4 bytes). Pointers are defined by prepending an asterisk (*) to the variable name. Instead of defining a variable of that type, a pointer is defined as something that points to data of that type. The pointer.c program is an example of a pointer being used with the char data type, which is only 1byte in size".
"As the comments in the code indicate, the first pointer is set at the beginning of the character array. When the character array is referenced like this, it is actually a pointer itself. This is how this buffer was passed as a pointer to the printf() and strcpy() functions earlier. The second pointer is set to the first pointers address plus two, and then some things are printed (shown in the output below)".
"The address-of operator is often used in conjunction with pointers, since pointers contain memory addresses. The addressof.c program demonstrates the address-of operator being used to put the address of an integer variable into a pointer. This line is shown in bold below".
"An additional unary operator called the dereference operator exists for use with pointers. This operator will return the data found in the address the pointer is pointing to, instead of the address itself. It takes the form of an asterisk in front of the variable name, similar to the declaration of a pointer. Once again, the dereference operator exists both in GDB and in C".
"A few additions to the addressof.c code (shown in addressof2.c) will demonstrate all of these concepts. The added printf() functions use format parameters, which I’ll explain in the next section. For now, just focus on the programs output".
"When the unary operators are used with pointers, the address-of operator can be thought of as moving backward, while the dereference operator moves forward in the direction the pointer is pointing".
Find out more about Pointers & memory allocation
Professor Dan Hirschberg, Computer Science Department, University of California on computer memory https://www.ics.uci.edu/~dan/class/165/notes/memory.html
http://cslibrary.stanford.edu/106/
http://www.programiz.com/c-programming/c-dynamic-memory-allocation
Arrays
Theres a simple tutorial on multi-dimensional arrays by a chap named Alex Allain available here http://www.cprogramming.com/tutorial/c/lesson8.html
Theres information on arrays by a chap named Todd A Gibson available here http://www.augustcouncil.com/~tgibson/tutorial/arr.html
Iterate an Array
Linked Lists vs Arrays
Arrays are not the only option available, information on Linked List.
http://www.eternallyconfuzzled.com/tuts/datastructures/jsw_tut_linklist.aspx
Conclusion
This information was written simply to pass on some of what I have read throughout my research on the topic that might help others.
Iirc, a string is actually an array of chars, so this should work:
Almost, but not quite. The correct answer is:
because you need to first de-reference to one of the actual string pointers and then you to de-reference that selected string pointer down to the desired character. (Note that I added extra parenthesis for clarity in the order of operations there).
Alternatively, this expression:
will also work!....and perhaps would be preferred because the intent of what you are trying to do is more self evident and the notation itself is more succinct. This form may not be immediately obvious to people new to the language, but understand that the reason the array notation works is because in C, an array indexing operation is really just shorthand for the equivalent pointer operation. ie: *(a+x) is same as a[x]. So, by extending that logic to the original question, there are two separate pointer de-referencing operations cascaded together whereby the expression a[x][y] is equivalent to the general form of *((*(a+x))+y).
You don't have to use pointers.
Then:
That's a one and an l.
You could use pointers if you want...
or even
As someone pointed out above.
This is because array names are pointers.