My question is about how Perl manages the data of objects internally.
When creating objects in Perl, the new subroutine will typically return a reference to a blessed object.
Take the following code as an example:
# Create a new object
my $object = Object->new(%data1);
# Create a new object with the same variable
$object = Object->new(%data2);
From the first call to new we create an $object that references some blessed %data1
Visual representation:
The "\" symbolizes a reference.
$object ════> \{bless %data1}
This would look as follows in memory:
The "&" symbolizes an address
MEMORY:
----------------------------------
&{bless %data1} ════> bless %data1
Then with the second call to new the value of $object is changed to reference some other blessed %data2
Visual representation:
$object ══/ /══> \{bless %data1} # The connection to %data1 is broken
║
╚═══════════> \{bless %data2}
Now memory would look like this:
MEMORY:
----------------------------------
&{bless %data1} ════> bless %data1
&{bless %data2} ════> bless %data2
The problem is that now that $object no longer stores the reference \{bless %data1}, the address &{bless %data1} and any data stored at this address are lost forever. There is no possible way to access the data stored at that location from the script anymore.
My question is: is Perl smart enough to remove the data stored in &{bless %data1} once the reference to this data is lost forever, or will Perl keep that data in memory, potentially causing a memory leak?
Given
package Object {
sub new { my $class = shift; bless({ @_ }, $class) }
}
my $object = Object->new( a => 1, b => 2 );
Before the second assignment, you have
+============+ +==========+
$object -->[ Reference ----->[ Blessed ]
+============+ [ Hash ]
[ ] +==========+
[ a: --------->[ 1 ]
[ ] +==========+
[ ]
[ ] +==========+
[ b: --------->[ 2 ]
[ ] +==========+
+==========+
(The arrows represent pointers.)
Perl uses reference counting to determine when to free variables. As part of the assignment, the reference count of the variable currently referenced by the name (the Reference) will be decremented, causing it to be freed[1]. This will drop the reference count of the hash, causing it to be freed[1]. This will drop the reference count of the values, causing them to be freed[1].
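The chain of frees described above can be observed with a DESTROY method. This is a minimal sketch, assuming a hypothetical name field and a @log array that are not part of the original code; they exist only to record which object was destroyed.

```perl
use strict;
use warnings;

my @log;    # hypothetical: records the names of destroyed objects, in order

package Object {
    sub new { my $class = shift; return bless({ @_ }, $class) }

    # Perl calls DESTROY when the last reference to the object goes away.
    sub DESTROY { my $self = shift; push @log, $self->{name} }
}

my $object = Object->new( name => 'first' );

# The assignment drops the only reference to 'first', so it is freed here.
$object = Object->new( name => 'second' );

my $seen = "@log";
print "destroyed so far: $seen\n";    # destroyed so far: first
```

The second object is still referenced by $object at the print, so only the first one shows up in the log.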
In Perl, you get memory leaks when you have cyclic references.
{
my $parent = Node->new();
my $child = Node->new();
$parent->{children} = [ $child ];
$child->{parent} = $parent;
}
Before exiting the block, you have
+-----------------------------------------------------+
| |
+-->+============+ +==========+ |
[ Reference ----->[ Blessed ] |
$parent -->+============+ [ Hash ] |
[ ] +==========+ |
[ children --->[ Array ] |
[ ] [ ] |
+==========+ [ 0: ---------+ |
[ ] | |
+==========+ | |
| |
+--------------------------------------------------+ |
| |
+-->+============+ +==========+ |
[ Reference ----->[ Blessed ] |
$child --->+============+ [ Hash ] |
[ ] |
[ parent: ----------------------+
[ ]
+==========+
After exiting the block, you have
+-----------------------------------------------------+
| |
| +============+ +==========+ |
+-->[ Reference ----->[ Blessed ] |
+============+ [ Hash ] |
[ ] +==========+ |
[ children --->[ Array ] |
[ ] [ ] |
+==========+ [ 0: ---------+ |
[ ] | |
+==========+ | |
| |
+--------------------------------------------------+ |
| |
| +============+ +==========+ |
+-->[ Reference ----->[ Blessed ] |
+============+ [ Hash ] |
[ ] |
[ parent -----------------------+
[ ]
+==========+
The memory didn't get freed because each part of the structure is still referenced by another part of the cycle, so no reference count ever reaches zero. Since you have no way of accessing this structure (no variable names reference anything in it), it's a memory leak.
[1] Assuming nothing else references (points to) those variables.
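To see the leak from the Node example for yourself, you can count DESTROY calls. This is a rough sketch: the Node constructor and the $destroyed counter are assumptions added for illustration, since the original does not show the Node class.

```perl
use strict;
use warnings;

my $destroyed = 0;    # hypothetical counter of destroyed Node objects

package Node {
    sub new     { my $class = shift; return bless({}, $class) }
    sub DESTROY { $destroyed++ }
}

{
    my $parent = Node->new();
    my $child  = Node->new();
    $parent->{children} = [ $child ];
    $child->{parent}    = $parent;    # closes the cycle
}
my $after_cycle = $destroyed;
print "after cyclic block: $after_cycle\n";      # after cyclic block: 0

{
    my $parent = Node->new();
    my $child  = Node->new();
    $parent->{children} = [ $child ];    # one-way link only, no cycle
}
my $after_acyclic = $destroyed;
print "after acyclic block: $after_acyclic\n";   # after acyclic block: 2
```

Only the two nodes from the second block are destroyed when their block exits; the cyclic pair stays allocated until the interpreter shuts down.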
You're misunderstanding the way parameter passing works. $object becomes a newly-created reference whose contents may be affected by the data passed to the constructor new, but it won't be a reference to the hashes %data1 or %data2 themselves, as new is given only the key/value contents of those hashes.
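A short sketch of that point, assuming a constructor like the one shown in the other answer (the question does not show its own new): the object holds a fresh hash built from a copy of the key/value list, so changing the object leaves %data1 untouched.

```perl
use strict;
use warnings;

package Object {
    # Assumed constructor: builds a new anonymous hash from the argument list.
    sub new { my $class = shift; return bless({ @_ }, $class) }
}

my %data1  = ( a => 1, b => 2 );
my $object = Object->new(%data1);    # %data1 is flattened into a list of keys and values

$object->{a} = 99;                   # changes the object's own hash only
print "$data1{a}\n";                 # 1

# Comparing the two references numerically compares their addresses.
my $same = ( $object == \%data1 ) ? 'yes' : 'no';
print "same hash? $same\n";          # same hash? no
```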
The bottom line of your question seems to be whether Perl is smart enough to deallocate objects when they are no longer used, and the answer is that it is, yes.
Perl keeps a count of references to each data item, and if that ever falls to zero (i.e. there is no longer any way to reach that data) then the data is considered to be available for reuse.
The only case where Perl can cause a memory leak is when a data structure contains a reference to itself, directly or indirectly. In that case the number of external references may fall to zero, but the data keeps itself from being deleted because its own reference keeps the count from falling to zero.
It is also much safer to avoid package variables, and use only lexical variables declared with my. Lexical variables will be destroyed automatically as they go out of scope, and so reduce the count of any reference that they may have contained. Package variables, declared with our, will exist for the lifetime of the process and won't trigger this safeguard.
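The difference between the two kinds of variable can be sketched as follows. The Object class, its name argument, and the @log array are hypothetical and used only to make destruction visible.

```perl
use strict;
use warnings;

my @log;    # hypothetical: names of destroyed objects, in order

package Object {
    sub new     { my ( $class, $name ) = @_; return bless( { name => $name }, $class ) }
    sub DESTROY { my $self = shift; push @log, $self->{name} }
}

our $global;    # package variable: not released at the end of any block

{
    my $lexical = Object->new('lexical');    # released when the block exits
    $global     = Object->new('global');     # still referenced after the block
}
my $after_block = "@log";
print "destroyed after block: $after_block\n";    # destroyed after block: lexical

undef $global;    # dropping the reference by hand frees the object
my $after_undef = "@log";
print "destroyed after undef: $after_undef\n";    # destroyed after undef: lexical global
```

A package variable does not leak in the strict sense, since you can still reach it and clear it yourself, but nothing does that for you.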
If you explain a little more about why you need this information then I am sure you will get much better answers.
Perl uses a method called reference counting - it counts how many times a variable is referenced. It keeps that data in memory until the reference count drops to zero.
In your example, the first object created will disappear automatically as soon as you reassign $object. However there's a caveat: if you create a circular reference within your object during the new process, this won't happen. You can use weaken from Scalar::Util to deal with this.
You can watch it by creating a DESTROY method, which is called when an object is 'freed'.
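Both suggestions can be combined in one small sketch. The Node class and the $destroyed counter are assumptions for illustration; weaken is the real function exported by Scalar::Util.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;    # hypothetical counter incremented by DESTROY

package Node {
    sub new     { my $class = shift; return bless({}, $class) }
    sub DESTROY { $destroyed++ }
}

{
    my $parent = Node->new();
    my $child  = Node->new();
    $parent->{children} = [ $child ];
    $child->{parent}    = $parent;

    # The back-reference no longer counts toward the parent's reference count.
    weaken( $child->{parent} );
}

print "destroyed: $destroyed\n";    # destroyed: 2
```

Weakening the child-to-parent link is the usual choice, because the parent is what keeps its children alive and not the other way round.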
Perl has reference-counting garbage collection. I'm not seeing anything in your code that would trip it up. Even if there were, there's weaken in Scalar::Util, among other options.