Perl memory management when overwriting objects

Posted 2019-08-07 12:48

Question:

My question is about how Perl manages the data of objects internally.

When creating objects in Perl, the new subroutine will typically return a reference to a blessed object.

Take the following code as an example:

# Create a new object
my $object = Object->new(%data1);

# Create a new object with the same variable
$object = Object->new(%data2);

From the first call to new we create an $object that references some blessed %data1.

Visual representation:

The "\" symbolizes a reference.

$object ════> \{bless %data1}

This would look as follows in memory:

The "&" symbolizes an address

MEMORY:
----------------------------------
&{bless %data1} ════> bless %data1

Then with the second call to new the value of $object is changed to reference some other blessed %data2

Visual representation:

$object ══/ /══> \{bless %data1}  # The connection to %data1 is broken
   ║
   ╚═══════════> \{bless %data2}

Now memory would look like this:

MEMORY:
----------------------------------
&{bless %data1} ════> bless %data1
&{bless %data2} ════> bless %data2

The problem is that, now that $object no longer stores the reference \{bless %data1}, the address &{bless %data1} and any data stored at that address are lost forever. There is no longer any way to access the data stored at that location from the script.

My question is: is Perl smart enough to remove the data stored in &{bless %data1} once the reference to this data is lost forever, or will Perl keep that data in memory, potentially causing a memory leak?

Answer 1:

Given

package Object {
   sub new { my $class = shift; bless({ @_ }, $class) }
}

my $object = Object->new( a => 1, b => 2 );

Before the second assignment, you have

           +============+    +==========+
$object -->[ Reference ----->[ Blessed  ]
           +============+    [ Hash     ]
                             [          ]   +==========+
                             [ a: --------->[ 1        ]
                             [          ]   +==========+
                             [          ]
                             [          ]   +==========+
                             [ b: --------->[ 2        ]
                             [          ]   +==========+
                             +==========+

(The arrows represent pointers.)

Perl uses reference counting to determine when to free variables. As part of the assignment, the old reference stored in $object is overwritten, which decrements the reference count of the hash it pointed to, causing the hash to be freed[1]. This will drop the reference count of the values, causing them to be freed[1].
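
To make the counting visible, here is a minimal sketch that reads the internal reference count through the core B module. The refcount helper is a hypothetical name introduced only for this illustration, and the absolute numbers are an implementation detail, so the sketch compares counts relative to each other only.

```perl
use strict;
use warnings;
use B ();   # core module that exposes interpreter internals

package Object {
    sub new { my $class = shift; return bless({ @_ }, $class) }
}

# Hypothetical helper: reference count of the thing the argument points to.
# $_[0] is used directly so the helper does not create another copy of the reference.
sub refcount { return B::svref_2object( $_[0] )->REFCNT }

my $object = Object->new( a => 1, b => 2 );
my $count_single = refcount($object);   # only $object points at the hash

my $alias = $object;                    # a second reference to the same hash
my $count_shared = refcount($object);   # one higher than before

undef $alias;                           # drop the second reference again
my $count_after = refcount($object);    # back to the first value

print "single: $count_single, shared: $count_shared, after: $count_after\n";
```

When the last reference goes away the count reaches zero, and that is the point at which the hash and its values are released.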


In Perl, you get memory leaks when you have cyclic references.

{
   my $parent = Node->new();
   my $child = Node->new();
   $parent->{children} = [ $child ];
   $child->{parent} = $parent;
}

Before exiting the block, you have

       +-----------------------------------------------------+
       |                                                     |
       +-->+============+    +==========+                    |
           [ Reference ----->[ Blessed  ]                    |
$parent -->+============+    [ Hash     ]                    |
                             [          ]   +==========+     |
                             [ children --->[ Array    ]     |
                             [          ]   [          ]     |
                             +==========+   [ 0: ---------+  |
                                            [          ]  |  |
                                            +==========+  |  |
                                                          |  |
       +--------------------------------------------------+  |
       |                                                     |
       +-->+============+    +==========+                    |
           [ Reference ----->[ Blessed  ]                    |
$child --->+============+    [ Hash     ]                    |
                             [          ]                    |
                             [ parent: ----------------------+
                             [          ]
                             +==========+

After exiting the block, you have

       +-----------------------------------------------------+
       |                                                     |
       |   +============+    +==========+                    |
       +-->[ Reference ----->[ Blessed  ]                    |
           +============+    [ Hash     ]                    |
                             [          ]   +==========+     |
                             [ children --->[ Array    ]     |
                             [          ]   [          ]     |
                             +==========+   [ 0: ---------+  |
                                            [          ]  |  |
                                            +==========+  |  |
                                                          |  |
       +--------------------------------------------------+  |
       |                                                     |
       |   +============+    +==========+                    |
       +-->[ Reference ----->[ Blessed  ]                    |
           +============+    [ Hash     ]                    |
                             [          ]                    |
                             [ parent -----------------------+
                             [          ]
                             +==========+

The memory didn't get freed: because there's a reference cycle, everything is still being referenced. Since you have no way of accessing this structure (no variable names reference anything in it), it's a memory leak.
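
The difference can be observed with a destructor. The sketch below assumes a bare-bones Node class (the snippet above never defines one) whose DESTROY bumps a counter named $destroyed. Both names are made up for the illustration.

```perl
use strict;
use warnings;

my $destroyed = 0;   # number of Node objects destroyed so far

package Node {
    sub new     { my $class = shift; return bless( {}, $class ) }
    sub DESTROY { $destroyed++ }
}

{
    my $lonely = Node->new();            # not part of any cycle
}
my $after_plain = $destroyed;            # the object was freed at scope exit

{
    my $parent = Node->new();
    my $child  = Node->new();
    $parent->{children} = [ $child ];
    $child->{parent}    = $parent;       # closes the cycle
}
my $after_cycle = $destroyed;            # unchanged: both nodes are still alive

print "destroyed after plain block:  $after_plain\n";
print "destroyed after cyclic block: $after_cycle\n";
```

The two nodes of the cycle are only cleaned up when the interpreter shuts down, which is too late for a long-running process.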


  1. Assuming nothing else references (points to) those variables.


Answer 2:

You're misunderstanding the way parameter passing works. $object becomes a newly created reference whose contents may be affected by the data passed to the constructor new, but it won't be a reference to the hashes %data1 or %data2 themselves, as new is given only the key/value contents of those hashes.
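
A small sketch of that point, assuming a constructor that copies its arguments into a fresh anonymous hash (the question does not show its new, so this one is a guess at a typical implementation):

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

package Object {
    # Assumed constructor: copies the key/value list into a new anonymous hash.
    sub new { my $class = shift; return bless( { @_ }, $class ) }
}

my %data1  = ( a => 1, b => 2 );
my $object = Object->new(%data1);   # new sees a flat list of keys and values

$object->{a} = 99;                  # changes the object's own hash only

# Compare the address of the object's hash with the address of %data1.
my $same_hash = refaddr($object) == refaddr( \%data1 ) ? 1 : 0;

print "same hash? $same_hash; original a = $data1{a}; object a = $object->{a}\n";
```

So %data1 lives and dies by its own scope, independently of the object built from its contents.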

The bottom line of your question seems to be whether Perl is smart enough to deallocate objects when they are no longer used, and the answer is that it is, yes.

Perl keeps a count of references to each data item, and if that ever falls to zero (i.e. there is no longer any way to reach that data) then the data is considered to be available for reuse.

The only case where Perl can cause a memory leak is when a data structure contains a reference to itself. In that case the number of external references may fall to zero, but the structure's reference to itself keeps its count above zero, so the data is never deleted.

It is also much safer to avoid package variables and use only lexical variables declared with my. Lexical variables are destroyed automatically as they go out of scope, which reduces the count of any reference that they may have contained. Package variables, declared with our, exist for the lifetime of the process and won't trigger this safeguard.
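
As a rough illustration of the scoping difference, the sketch below uses a made-up Tracked class whose DESTROY records the object's name in @log. None of these names come from the question.

```perl
use strict;
use warnings;

my @log;   # names of destroyed objects, in order of destruction

package Tracked {
    sub new     { my ( $class, $name ) = @_; return bless( { name => $name }, $class ) }
    sub DESTROY { my $self = shift; push @log, $self->{name} }
}

our $global;   # package variable: stays alive until the program ends

{
    my $lexical = Tracked->new('lexical');
    $global     = Tracked->new('global');
}   # $lexical goes out of scope here, $global does not

my @destroyed_after_block = @log;
print "destroyed after the block: @destroyed_after_block\n";

undef $global;   # releasing the reference by hand frees the object too
my @destroyed_after_undef = @log;
print "destroyed after undef:     @destroyed_after_undef\n";
```

The object held by the package variable is not leaked in the strict sense, since it is still reachable, but it stays in memory until something clears or overwrites the variable.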

If you explain a little more about why you need this information, then I am sure you will get much better answers.



Answer 3:

Perl uses a method called reference counting - it counts how many times a variable is referenced. It keeps that data in memory until the reference count drops to zero.

In your example, the first object created will disappear automatically as soon as you reassign $object. However, there's a caveat: if the object ends up in a circular reference (for example, one set up inside new), this won't happen. You can use weaken from Scalar::Util to deal with this.

You can watch it by creating a DESTROY method, which is called when an object is 'freed'.
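
Something along these lines shows both effects. Node, its name field and the @log array are placeholders chosen for this sketch. weaken is the real function exported by Scalar::Util.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my @log;   # names of destroyed objects, in order of destruction

package Node {
    sub new     { my ( $class, $name ) = @_; return bless( { name => $name }, $class ) }
    sub DESTROY { my $self = shift; push @log, $self->{name} }
}

my $object = Node->new('first');
$object    = Node->new('second');      # 'first' is destroyed by this assignment
my @after_reassign = @log;

{
    my $parent = Node->new('parent');
    my $child  = Node->new('child');
    $parent->{children} = [ $child ];
    $child->{parent}    = $parent;
    weaken( $child->{parent} );        # the back-reference no longer counts
}
my @after_block = @log;                # parent and child were both destroyed

print "after reassignment: @after_reassign\n";
print "after the block:    @after_block\n";
```

Without the weaken call, neither parent nor child would show up in the log until the program exits.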



Answer 4:

Perl has reference-counting garbage collection. I'm not seeing anything in your code that would trip it up. Even if there were, there's weaken in Scalar::Util, among other options.