Are keySet entries of a WeakHashMap never null?

2019-06-16 03:06发布

问题:

If I iterate over the key set of a WeakHashMap, do I need to check for null values?

WeakHashMap<MyObject, WeakReference<MyObject>> hm
    = new WeakHashMap<MyObject, WeakReference<MyObject>>();

for ( MyObject item : hm.keySet() ) {
    if ( item != null ) { // <- Is this test necessary?
        // Do something...
    } 
}

In other words, can elements of the WeakHashMap be collected while I am iterating over them?

EDIT

For the sake of this question, one can assume that no null entries is added in the hash map.

回答1:

Again from WeakHashMap javadoc:

A hashtable-based Map implementation with weak keys. An entry in a WeakHashMap will automatically be removed when its key is no longer in ordinary use. More precisely, the presence of a mapping for a given key will not prevent the key from being discarded by the garbage collector, that is, made finalizable, finalized, and then reclaimed. When a key has been discarded its entry is effectively removed from the map, so this class behaves somewhat differently from other Map implementations.

Which I read as: Yep... When there are no remaining external references to a Key in WeakHaskMap, then that Key maybe GC'd, making the associated Value unreachable, so it to (presuming there are no external references directly to it) is elligible for GC.

I'm going to test this theory. It's only my interpretation of the doco... I don't have any experience with WeakHashMap... but I immediately see it's potential as "memory-safe" object-cache.

Cheers. Keith.


EDIT: Exploring WeakHashMap... specifically testing my theory that an external-references to the particular key would cause that key to be retained... which is pure bunkum ;-)

My test harness:

package forums;

import java.util.Set;
import java.util.Map;
import java.util.WeakHashMap;
import krc.utilz.Random;

public class WeakCache<K,V> extends WeakHashMap<K,V>
{
  private static final int NUM_ITEMS = 2000;
  private static final Random RANDOM = new Random();

  private static void runTest() {
    Map<String, String> cache = new WeakCache<String, String>();
    String key; // Let's retain a reference to the last key object
    for (int i=0; i<NUM_ITEMS; ++i ) {
      /*String*/ key = RANDOM.nextString();
      cache.put(key, RANDOM.nextString());
    }

    System.out.println("There are " + cache.size() + " items of " + NUM_ITEMS + " in the cache before GC.");

    // try holding a reference to the keys
    Set<String> keys = cache.keySet();
    System.out.println("There are " + keys.size() + " keys");

    // a hint that now would be a good time to run the GC. Note that this
    // does NOT guarantee that the Garbage Collector has actually run, or
    // that it's done anything if it did run!
    System.gc();

    System.out.println("There are " + cache.size() + " items of " + NUM_ITEMS + " remaining after GC");
    System.out.println("There are " + keys.size() + " keys");
  }

  public static void main(String[] args) {
    try {
      for (int i=0; i<20; ++i ) {
        runTest();
        System.out.println();
      }
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

The (rathering perplexing, I think) results of one test-run:

There are 1912 items of 2000 in the cache before GC.
There are 1378 keys
There are 1378 items of 2000 remaining after GC
There are 909 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 1961 items of 2000 remaining after GC
There are 1588 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 1936 items of 2000 remaining after GC
There are 1471 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 2000 items of 2000 remaining after GC
There are 1669 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 2000 items of 2000 remaining after GC
There are 1264 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 2000 items of 2000 remaining after GC
There are 1770 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 2000 items of 2000 remaining after GC
There are 1679 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 2000 items of 2000 remaining after GC
There are 1774 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 2000 items of 2000 remaining after GC
There are 1668 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 2000 items of 2000 remaining after GC
There are 0 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 2000 items of 2000 remaining after GC
There are 1834 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 2000 items of 2000 remaining after GC
There are 0 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 2000 items of 2000 remaining after GC
There are 0 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 2000 items of 2000 remaining after GC
There are 0 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 2000 items of 2000 remaining after GC
There are 0 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 2000 items of 2000 remaining after GC
There are 0 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 2000 items of 2000 remaining after GC
There are 0 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 429 items of 2000 remaining after GC
There are 0 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 0 items of 2000 remaining after GC
There are 0 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 0 items of 2000 remaining after GC
There are 0 keys

It would appear that keys are still disappearing WHILE my code is executing... possibly a micro-sleep is required after the GC-hint... to give the GC time to do it's stuff. Anyway, this "volatility" is interesting behaviour.


EDIT 2: Yup, adding the line try{Thread.sleep(10);}catch(Exception e){} directly after the System.gc(); makes the results "more predictable".

There are 1571 items of 2000 in the cache before GC.
There are 1359 keys
There are 0 items of 2000 remaining after GC
There are 0 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 0 items of 2000 remaining after GC
There are 0 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 0 items of 2000 remaining after GC
There are 0 keys

There are 2000 items of 2000 in the cache before GC.
There are 2000 keys
There are 0 items of 2000 remaining after GC
There are 0 keys

.... and so on for 20 runs ...

Hmmm... A cache that just completely disappears when the GC kicks in... at arbitrary times in a real app... not much use... Hmmm... What is WeakHashMap for I wonder? ;-)


Last EDIT, I promise

Here's my krc/utilz/Random (used in the above test)

package krc.utilz;

import java.io.Serializable;
import java.nio.charset.Charset;

/**
 * Generates random values. Extends java.util.Random to do all that plus:<ul>
 * <li>generate random values in a given range, and
 * <li>generate Strings of random characters and random length.
 * </ul>
 * <p>
 * Motivation: I wanted to generate random Strings of random length for test 
 *  data in some jUnit tests, and was suprised to find no such ability in the
 *  standard libraries... so I googled it, and came up with Glen McCluskey's
 *  randomstring function at http://www.glenmccl.com/tip_010.htm. Then I thought
 *  aha, that's pretty cool, but if we just extended it a bit, and packaged it
 *  properly then it'd be useful, and reusable. Cool!
 * See: http://www.glenmccl.com/tip_010.htm
 * See: http://forum.java.sun.com/thread.jspa?threadID=5117756&messageID=9406164
 */
public class Random extends java.util.Random  implements Serializable
{

  private static final long serialVersionUID = 34324;
  public static final int DEFAULT_MIN_STRING_LENGTH = 5;
  public static final int DEFAULT_MAX_STRING_LENGTH = 25;

  public Random() {
    super();
  }

  public Random(long seed) {
    super(seed);
  }

  public double nextDouble(double lo, double hi) {
    double n = hi - lo;
    double i = super.nextDouble() % n;
    if (i < 0) i*=-1.0;
    return lo + i;
  }

  /**
   * @returns a random int between lo and hi, inclusive.
   */
  public int nextInt(int lo, int hi) 
    throws IllegalArgumentException
  {
    if(lo >= hi) throw new IllegalArgumentException("lo must be < hi");
    int n = hi - lo + 1;
    int i = super.nextInt() % n;
    if (i < 0) i = -i;
    return lo + i;
  }

  /**
   * @returns a random int between lo and hi (inclusive), but exluding values
   *  between xlo and xhi (inclusive).
   */
  public int nextInt(int lo, int hi, int xlo, int xhi) 
    throws IllegalArgumentException
  {
    if(xlo < lo) throw new IllegalArgumentException("xlo must be >= lo");
    if(xhi > hi) throw new IllegalArgumentException("xhi must be =< hi");
    if(xlo > xhi) throw new IllegalArgumentException("xlo must be >= xhi");
    int i;
    do {
      i = nextInt(lo, hi);
    } while(i>=xlo && i<=xhi);
    return(i);
  }

  /**
   * @returns a string (of between 5 and 25 characters, inclusive) 
   *  consisting of random alpha-characters [a-z]|[A-Z].
   */
  public String nextString()
    throws IllegalArgumentException
  {
    return(nextString(DEFAULT_MIN_STRING_LENGTH, DEFAULT_MAX_STRING_LENGTH));
  }

  /**
   * @returns a String (of between minLen and maxLen chars, inclusive) 
   *  which consists of random alpha-characters. The returned string matches
   *  the regex "[A-Za-z]{$minLen,$maxLan}". 
   * @nb: excludes the chars "[\]^_`" between 'Z' and 'a', ie chars (91..96).
   * @see: http://www.neurophys.wisc.edu/comp/docs/ascii.html
   */
  public String nextString(int minLen, int maxLen)
    throws IllegalArgumentException
  {
    if(minLen < 0) throw new IllegalArgumentException("minLen must be >= 0");
    if(minLen > maxLen) throw new IllegalArgumentException("minLen must be <= maxLen");
    return(nextString(minLen, maxLen, 'A', 'z', '[', '`'));
  }

  /**
   * @does: generates a String (of between minLen and maxLen chars, inclusive) 
   *  which consists of characters between lo and hi, inclusive.
   */
  public String nextString(int minLen, int maxLen, char lo, char hi)
    throws IllegalArgumentException
  {
    if(lo < 0) throw new IllegalArgumentException("lo must be >= 0");
    String retval = null;
    try {
      int n = minLen==maxLen ? maxLen : nextInt(minLen, maxLen);
      byte b[] = new byte[n];
      for (int i=0; i<n; i++)
        b[i] = (byte)nextInt((int)lo, (int)hi);
      retval = new String(b, Charset.defaultCharset().name());
    } catch (Exception e) {
      e.printStackTrace();
    }
    return retval;
  }

  /**
   * @does: generates a String (of between minLen and maxLen chars, inclusive) 
   *  which consists of characters between lo and hi, inclusive, but excluding
   *  character between 
   */
  public String nextString(int minLen, int maxLen, char lo, char hi, char xlo, char xhi) 
    throws IllegalArgumentException
  {
    if(lo < 0) throw new IllegalArgumentException("lo must be >= 0");
    String retval = null;
    try {
      int n = minLen==maxLen ? maxLen : nextInt(minLen, maxLen);
      byte b[] = new byte[n];
      for (int i=0; i<n; i++) {
        b[i] = (byte)nextInt((int)lo, (int)hi, (int)xlo, (int)xhi);
      }
      retval = new String(b, Charset.defaultCharset().name());
    } catch (Exception e) {
      e.printStackTrace();
    }
    return retval;
  }

}


回答2:

I'm not familiar with WeakHashMap, but you might have one null object. see this example:

public static void main(String[] args)
{
    WeakHashMap<Object, WeakReference<Object>> hm
    = new WeakHashMap<Object, WeakReference<Object>>();
    hm.put(null, null);
    for ( Object item : hm.keySet() ) {
        if ( item == null ) { 
          System.out.println("null object exists");  
        } 
    }
}


回答3:

From the WeakHashMap documentation, the key that is placed into the hash map is of a templated type, which means it is inherited from java.lang.object. As a result, it may be null. So, a key may be null.



回答4:

Assuming that you don't insert a null key value into a WeakHashMap, you do not need to check whether the key value of the iteration is null when iterating through the key set. You also do not need to check whether getKey() called on the Map.Entry instance of the iteration is null when iterating through the entry set. Both are guaranteed by the documentation, but it's somewhat indirect; it is the contract of Iterator.hasNext() that provides these guarantees.

The JavaDoc for WeakHashMap states:

Each key object in a WeakHashMap is stored indirectly as the referent of a weak reference. Therefore a key will automatically be removed only after the weak references to it, both inside and outside of the map, have been cleared by the garbage collector.

The JavaDoc for Iterator.hasNext() states:

Returns true if the iteration has more elements. (In other words, returns true if next() would return an element rather than throwing an exception.)

Because the key set and entry set views satisfy the Set contract (as required by the Map contract which WeakHashMap implements), the iterators returned by the Set.iterator() method must satisfy the Iterator contract.

When hasNext() returns true, the Iterator contract requires that the next call to next() on the Iterator instance must return a valid value. The only way for WeakHashMap to satisfy the Iterator contract is for the hasNext() implementation to keep a strong reference to the next key when it returns true, thereby preventing the weak reference to the key value held by the WeakHashMap from being cleared by the garbage collector, and, as a consequence, preventing the entry from being automatically removed from the WeakHashMap so that next() has a value to return.

Indeed, if you look at the source of WeakHashMap, you will see that the HashIterator inner class (used by the key, value, and entry iterator implementations) has a currentKey field that holds a strong reference to the current key value and a nextKey field that holds a strong reference to the next key value. The currentKey field allows HashIterator to implement Iterator.remove() in full compliance with that method's contract. The nextKey field allows HashIterator to satisfy the contract of hasNext().

That being said, suppose you want to gather up all key values in the map by calling toArray(), and then iterate through this snapshot of key values. There are a few cases to consider:

  1. If you call the no-argument toArray() method that returns Object[] or pass in a zero-length array, as in:

    final Set<MyObject> items = hm.keySet();
    for (final MyObject item : items.toArray(new MyObject[0])) {
        // Do something...
    }
    

    .. then you do not need to check whether item is null because in both cases, the array returned will be trimmed to hold the exact number of elements that were returned by the iterator.

  2. If you pass in an array of length >= the WeakHashMap's then-current size, as in:

    final Set<MyObject> items = hm.keySet();
    for (final MyObject item : items.toArray(new MyObject[items.size()])) {
        if (null == item) {
            break;
        }
        // Do something...
    }
    

    .. then the null check is necessary. The reason is that, in the time between when size() returned a value (which is used to create the array to store the keys) and toArray() finishes iterating through the keys of the WeakHashMap, an entry may have been automatically removed. This is the "room to spare" case mentioned in the JavaDoc for Collection.toArray():

    If this collection fits in the specified array with room to spare (i.e., the array has more elements than this collection), the element in the array immediately following the end of the collection is set to null. (This is useful in determining the length of this collection only if the caller knows that this collection does not contain any null elements.)

    Because you know that you have not inserted a null key value into the WeakHashMap, you can break upon seeing the first null value (if you see a null).

  3. A slight variant of the previous case, if you pass in an array of non-zero length, then you need the null check for the reason that the "room to spare" case might happen at runtime.