Can anyone explain this bizarre bug iterating over a set?

I had a loop of the form for thing in a_set:. It was working incorrectly because, occasionally and inconsistently, it would pull the same thing from the set twice. (This does not cause the program to crash. It just gets the wrong answer.) I was not able to determine anything that was deterministic about the wrong behavior; but my attempts to debug it made it very clear that the bizarreness was happening sometimes. In the cases in which I observed it most closely, there were 3 items in the set (before and after) and the loop executed 4 times, once with a repeat of one of the items. The items were references to objects of a class I had created (treated more like a C struct). The bad behavior went away when I changed the for statement to for thing in list(a_set):.

I am at a total loss to explain the wrong behavior. I am very certain that nothing in the body of the loop can cause what it is doing to happen twice or change the value of the thing variable. I am fairly certain that what is going on in the loop could not try to affect the composition of the set. Furthermore, even if it could, I believe that would cause a RuntimeError. I am at a complete loss for coming up with hypotheses about what could possibly be causing this. The lack of repeatability running the same code consecutively is especially mysterious. My attempts to recreate the symptom in a simpler scenario have failed. Nevertheless, I would feel silly about leaving the list() invocation in there just to solve a problem I cannot explain. Anyone else's hypothesizing would be welcome. I need ideas about what sorts of things I should be trying to eliminate in debugging it.

Update: I think this question was incorrectly put on hold based on a claim that it was off topic. The lack of reproducibility was the issue in this case, and I suspected that there was some nuance of the language that I was missing. Indeed, that does turn out to be the case, and MSeifert's answer put me on to what was causing it. However, it was not quite as simple as what he speculated, as I note in a comment on his answer.

I also confused the issue by saying the objects in the set were mutable. They are not. They are references to objects whose attributes are changeable. (That could have been inferred from what I wrote, but I was incorrectly using the word "mutable" in a general sense and not in the Python technical sense.) What is hashed is the address of the object, independent of the values of its attributes. Were those object references mutable, Python would never have let me put them in a set in the first place.

If the error went away when you added list(a_set), it's very likely that you changed the set during the iteration. Normally this raises a RuntimeError, but (at least in CPython) the check only compares the set's size, so if you add as many elements as you remove between two steps it doesn't trigger:

a = {1,2,3}
for item in a:
    print(item)
    a.add(item+3)  # add one item
    a.remove(item) # remove one item

This prints the numbers 1 to 31 (the exact count is an implementation detail, so you may see a different one). Before and after the loop, as well as at the beginning of each iteration, the set contains 3 elements.
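For contrast, here is a minimal sketch of the case where the error does show up: the loop only adds, so the size is different by the time the iterator takes its next step. The helper name iterate_and_grow is made up for illustration, and the behaviour described is what CPython does.

```python
def iterate_and_grow(items):
    """Add one element per step without removing any, so the size changes."""
    seen = []
    try:
        for item in items:
            seen.append(item)
            items.add(item + 100)  # the set now holds one more element
    except RuntimeError as exc:
        # Raised when the iterator is asked for the next element and
        # notices that the size differs from the size it started with.
        return seen, str(exc)
    return seen, None


seen, error = iterate_and_grow({1, 2, 3})
print(len(seen), error)
```

The loop body runs once, and the very next step fails. With a matching remove in the body the size would be back to 3 at that point, which is why the add-and-remove loop above goes unnoticed.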

However, if I add a list call, it creates a copy (as a list) of the original set and iterates only over the elements that were present in the original set:

a = {1,2,3}
for item in list(a):
    print(item)
    a.add(item+3)
    a.remove(item)

print(a)

prints:

1
2
3
set([4, 5, 6])   # totally changed!
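If you would rather not copy the set, another option is to leave it alone during the loop and apply the changes afterwards. This is only a sketch of that idea; the names to_add and to_remove are mine, not something from your code.

```python
a = {1, 2, 3}
to_add = set()
to_remove = set()

for item in a:           # the set itself is not touched inside the loop
    to_add.add(item + 3)
    to_remove.add(item)

a -= to_remove           # apply the recorded changes once iteration is over
a |= to_add
print(sorted(a))
```

The set ends up holding 4, 5 and 6, the same as with the list call. Removing before adding matters if an element could appear in both to_add and to_remove.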

In the comments you noted that the objects you have in the set are mutable, so even though you might think you remove and add the same element, it may not be the same element anymore (from the point of view of the set). In general you shouldn't put mutable objects in a set or use them as keys in a dict, because you have to be really careful that mutating them cannot affect the result of their __hash__ or __eq__ methods.
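To make that warning concrete, here is a small sketch with a hypothetical Point class whose hash is computed from an attribute that can change. It is not meant to resemble your class, it only shows what goes wrong when the hash moves after the object is already in the set (described for CPython).

```python
class Point(object):
    """Hypothetical class whose hash depends on a changeable attribute."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, Point) and self.x == other.x

    def __hash__(self):
        return hash(self.x)


p = Point(1)
s = {p}
p.x = 2  # the set still files p under the hash it had when it was added

found_by_lookup = p in s                # hash-based lookup, uses the new hash
found_by_scan = any(q is p for q in s)  # plain iteration over the elements
print(found_by_lookup, found_by_scan)
```

The lookup reports False while the scan reports True: the object is still stored in the set, but the set can no longer find it by hash, so a later remove of that object would fail as well.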

Just an example that iterates over a seemingly "random" number of set elements:

class Fun(object):
    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return '{self.__class__.__name__}({self.value})'.format(self=self)

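    def __eq__(self, other):
        return self.value == other.value

    # Python 3 sets __hash__ to None when only __eq__ is defined, so keep
    # the default identity-based hash explicitly (Python 2 does this anyway).
    __hash__ = object.__hash__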

a = {Fun(1),Fun(2),Fun(3)}
for item in a:
    print(item)
    a.add(Fun(item.value+3))
    a.remove(item)

This will actually show a "random" number of Fun objects each time I run the snippet. It is not really random: it depends on the hashes of the instances, and in this case the hash depends on the id of each instance, which changes each time I run the code.
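A side effect of hashing by identity is worth seeing on its own. The sketch below uses a hypothetical Tagged class (value-based __eq__, default identity-based hash, kept explicitly so it also runs on Python 3) and assumes CPython, where the default hash is derived from the object's address.

```python
class Tagged(object):
    """Hypothetical class: equality by value, hash by identity."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Tagged) and self.value == other.value

    # Assumption for this sketch: keep the default, identity-based hash.
    __hash__ = object.__hash__


first = Tagged(1)
second = Tagged(1)
s = {first, second}

print(first == second)   # compares the values
print(len(s))            # counts the elements the set kept
print(Tagged(1) in s)    # looks up a third, freshly created instance
```

The two instances compare equal, yet the set keeps both, and a fresh equal instance is not found. Where each instance lands in the set's internal table therefore depends on its address rather than its value, which is why the iteration count in the Fun example changes from run to run.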