Just a fundamental question regarding Python and the .join() method:
import difflib

# f1 and f2 hold the paths of the two files to compare
file1 = open(f1, "r")
file2 = open(f2, "r")
file3 = open("results", "w")
diff = difflib.Differ()
result = diff.compare(file1.read(), file2.read())
file3.write("".join(result))
The above snippet of code yields a nice output stored in a file called "results", in string format, showing the differences between the two files line-by-line. However, I notice that if I just print "result" without using .join(), the interpreter prints a message that includes a memory address. When I tried to write the result to the file without using .join(), the interpreter told me that only strings and character buffers may be passed to write(), and not generator objects. So based on all the evidence that I have gathered, please correct me if I am wrong:
result = diff.compare(file1.read(),file2.read()) <---- result is a generator object?
result is a list of strings, with result itself being the reference to the first string?
.join() takes a memory address and points to the first, and then iterates over the rest of the addresses of strings in that structure?
A generator object is an object that returns a pointer?
I apologize if my questions are unclear, but I basically wanted to ask the Python veterans if my deductions were correct. My question is less about the observable results and more about the inner workings of Python. I appreciate all of your help.
join is a method of strings. That method takes any iterable and iterates over it and joins the contents together. (The contents have to be strings, or it will raise an exception.)
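As a minimal sketch of that behaviour (written in Python 3 syntax, unlike the Python 2 session further down; the sample values are made up for illustration), join accepts a list, a tuple, or a generator alike, and rejects items that are not strings:

```python
# join accepts any iterable of strings: a list, a tuple, or a generator.
from_list = ", ".join(["a", "b", "c"])
from_tuple = ", ".join(("a", "b", "c"))
from_generator = ", ".join(s.upper() for s in "abc")
print(from_list, "|", from_tuple, "|", from_generator)

# Items that are not strings make join raise TypeError.
numbers = [1, 2, 3]
try:
    ", ".join(numbers)
except TypeError as exc:
    print("TypeError:", exc)

# Converting each item to a string first avoids the exception.
joined_numbers = ", ".join(str(n) for n in numbers)
print(joined_numbers)
```

The separator is the string the method is called on, so "".join(...) simply concatenates the pieces.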
If you attempt to write the generator object directly to the file, you will just get the generator object itself, not its contents. join "unrolls" the contents of the generator.
You can see what is going on with a simple, explicit generator:
def gen():
    yield 'A'
    yield 'B'
    yield 'C'
>>> g = gen()
>>> print g
<generator object gen at 0x0000000004BB9090>
>>> print ''.join(g)
ABC
The generator doles out its contents one at a time. If you try to look at the generator itself, it doesn't dole anything out and you just see it as "generator object". To get at its contents, you need to iterate over them. You can do this with a for loop, with the next function, or with any of various other functions/methods that iterate over things (str.join among them).
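The three ways of iterating mentioned above can be sketched side by side. This reuses the gen function from the example; the variable names are only illustrative, and the syntax is Python 3:

```python
def gen():
    yield 'A'
    yield 'B'
    yield 'C'

# 1. A for loop pulls items until the generator is finished.
for_items = []
for item in gen():
    for_items.append(item)

# 2. next() pulls exactly one item per call.
g = gen()
first = next(g)
second = next(g)

# 3. Anything else that iterates consumes what is left; only 'C' remains.
rest = list(g)

# A generator can be consumed only once, so joining it now gives ''.
leftover = ''.join(g)
print(for_items, first, second, rest, repr(leftover))
```

The last step is worth noting: once a generator has been iterated to the end, a second join (or a second loop) finds nothing left.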
When you say that result "is a list of strings" you are getting close to the idea. A generator (or any iterator) is sort of like a "potential list". Instead of actually being a list of all its contents all at once, it lets you peel off each item one at a time.
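One way to see the "potential list" idea is to record when each item is actually produced. In this sketch, numbered_lines and log are hypothetical names introduced only for the demonstration:

```python
log = []

def numbered_lines():
    for n in range(1, 4):
        log.append(n)            # record the moment each item is produced
        yield "line %d\n" % n

lazy = numbered_lines()
produced_before = list(log)      # nothing has been produced yet

first = next(lazy)
produced_after_one = list(log)   # only the first item has been produced

as_list = list(lazy)             # turn the remaining items into a real list
print(produced_before, produced_after_one, as_list)
```

Calling the generator function runs none of its body. Each item comes into existence only when something asks for it, and list() is what turns the "potential list" into an actual one.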
None of the objects is a "memory address". The string representation of a generator object (like that of many other objects) includes a memory address, so if you print it (as above) or write it to a file, you'll see that address. But that doesn't mean that object "is" that memory address, and the address itself isn't really usable as such. It's just a handy identifying tag so that if you have multiple objects you can tell them apart.
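Applied to the code in the question, the same distinction looks roughly like this. It is a sketch in Python 3 syntax under some assumptions: the two lists of lines are made-up stand-ins for the files, and io.StringIO stands in for the "results" file.

```python
import difflib
import io

# Hypothetical stand-ins for the lines of the two files being compared.
old_lines = ["apple\n", "banana\n", "cherry\n"]
new_lines = ["apple\n", "blueberry\n", "cherry\n"]

result = difflib.Differ().compare(old_lines, new_lines)

# The printed form is only a label; producing it does not iterate anything.
label = repr(result)
print(label)  # something like <generator object ... at 0x...>

out = io.StringIO()  # in-memory stand-in for the "results" file
try:
    out.write(result)  # rejected: write() wants a string, not a generator
except TypeError as exc:
    print("TypeError:", exc)

out.write("".join(result))  # join iterates the generator into one string
diff_text = out.getvalue()
print(diff_text)
```

The address in the label is never used to reach the diff lines. They are produced only when join iterates over result.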