For optimal performance, the usage pattern should be one where matches are very common (especially after "warm-up") and, as with most hash-based maps and sets, hash codes are uniformly distributed. Collisions are also slightly more expensive than with HashMap or HashSet, since hash codes are not used in resolving collisions; that is, an equals() comparison is done against every symbol in the same bucket.
Finally, rehashing is also more expensive: because hash codes are not stored, rehashing requires every entry's hash code to be recalculated. Hash codes are not stored in order to reduce memory usage, in the hope of better memory locality.
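The cost model described above can be illustrated with a minimal sketch. SketchSymbolTable and all of its members are hypothetical names invented for this example and are not the API of the documented class; the sketch assumes symbols are looked up by String, a power-of-two bucket array, and a fixed 0.75 growth threshold.

```java
/** Hypothetical illustration only; not the documented SymbolTable class. */
final class SketchSymbolTable {
    /** Bucket entry: holds the symbol and a link, but no cached hash code. */
    private static final class Entry {
        final String symbol;
        final Entry next;

        Entry(String symbol, Entry next) {
            this.symbol = symbol;
            this.next = next;
        }
    }

    private Entry[] buckets = new Entry[16]; // power of two, so masking works
    private int size;

    private static int indexFor(String symbol, int capacity) {
        int h = symbol.hashCode();
        h ^= (h >>> 16);              // spread the high bits into the index
        return h & (capacity - 1);
    }

    /** Returns the canonical instance for the given symbol, adding it if absent. */
    String findSymbol(String candidate) {
        int index = indexFor(candidate, buckets.length);
        // No stored hash to pre-filter with: equals() runs on every entry in the bucket.
        for (Entry e = buckets[index]; e != null; e = e.next) {
            if (e.symbol.equals(candidate)) {
                return e.symbol;
            }
        }
        if (size >= buckets.length * 3 / 4) {
            rehash();
            index = indexFor(candidate, buckets.length);
        }
        buckets[index] = new Entry(candidate, buckets[index]);
        size++;
        return candidate;
    }

    /** Doubles the bucket array; every entry's hash code is recalculated. */
    private void rehash() {
        Entry[] grown = new Entry[buckets.length * 2];
        for (Entry head : buckets) {
            for (Entry e = head; e != null; e = e.next) {
                int index = indexFor(e.symbol, grown.length);
                grown[index] = new Entry(e.symbol, grown[index]);
            }
        }
        buckets = grown;
    }

    int size() { return size; }

    int capacity() { return buckets.length; }
}
```

Once the table is warm, nearly every call ends in the loop's early return, which is why a high match rate matters more here than insertion or rehash cost.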
The usual usage pattern is to create a single "master" instance and either use that instance sequentially or create derived "child" instances which, after use, are asked to return any symbol additions to the master instance. In either case the benefit is that the symbol table gets initialized so that further uses are more efficient, as eventually all symbols needed will already be in the symbol table. At that point no more symbol String allocations are needed, nor are any changes to the symbol table itself.
Note that while individual SymbolTable instances are NOT thread-safe (much like generic collection classes), separate "child" instances can be used concurrently, each by its own thread, without synchronization. However, the master table can be used concurrently with child instances only if access to the master instance is read-only (i.e. no modifications are done).
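The master/child pattern described above might look roughly like the following sketch. MasterChildSketch, createMaster, makeChild, release and mergeFrom are hypothetical names chosen for illustration, not the documented API. The sketch also assumes that a child starts from a full copy of the master's symbols and that the master's own methods are synchronized, which is one simple way to keep master access safe.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the master/child usage pattern. */
final class MasterChildSketch {
    private final MasterChildSketch master;     // null for the master itself
    private final Map<String, String> symbols;  // symbol -> canonical instance
    private boolean dirty;                      // true once this table added a symbol

    private MasterChildSketch(MasterChildSketch master, Map<String, String> symbols) {
        this.master = master;
        this.symbols = symbols;
    }

    static MasterChildSketch createMaster() {
        return new MasterChildSketch(null, new HashMap<>());
    }

    /** Creates a child seeded with everything this table currently knows. */
    synchronized MasterChildSketch makeChild() {
        return new MasterChildSketch(this, new HashMap<>(symbols));
    }

    /** Not thread-safe: each child is meant to be used by one thread at a time. */
    String findSymbol(String candidate) {
        String known = symbols.get(candidate);
        if (known != null) {
            return known;
        }
        symbols.put(candidate, candidate);
        dirty = true;
        return candidate;
    }

    /** Called after use: hands any new symbols back to the master. */
    void release() {
        if (master != null && dirty) {
            master.mergeFrom(symbols);
            dirty = false;
        }
    }

    private synchronized void mergeFrom(Map<String, String> additions) {
        for (String symbol : additions.keySet()) {
            symbols.putIfAbsent(symbol, symbol);
        }
    }

    synchronized int size() {
        return symbols.size();
    }
}
```

A child created after an earlier child has been released already contains that child's symbols, so lookups for them return the existing String instances without touching the master.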
addSymbol will always return the same string reference. The symbol table performs the same task as String.intern() with the following differences:
Source by QName or by resource bundle name.
@author Clement Wong
SymbolTable has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the SymbolTable, and the initial capacity is simply the capacity at the time the SymbolTable is created. Note that the SymbolTable is open: in the case of a "hash collision", a single bucket stores multiple entries, which must be searched sequentially. The load factor is a measure of how full the SymbolTable is allowed to get before its capacity is automatically increased. When the number of entries in the SymbolTable exceeds the product of the load factor and the current capacity, the capacity is increased by calling the rehash method.
Generally, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the time cost to look up an entry (which is reflected in most SymbolTable operations, including addSymbol and containsSymbol).
The initial capacity controls a tradeoff between wasted space and the need for rehash operations, which are time-consuming. No rehash operations will ever occur if the initial capacity is greater than the maximum number of entries the SymbolTable will contain divided by its load factor. However, setting the initial capacity too high can waste space.
If many entries are to be made into a SymbolTable, creating it with a sufficiently large capacity may allow the entries to be inserted more efficiently than letting it perform automatic rehashing as needed to grow the table.
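The sizing rule above is plain arithmetic, which the following sketch spells out. CapacitySketch and its two methods are hypothetical helpers written for this example; they are not part of SymbolTable and only restate the rule given in the text.

```java
/** Hypothetical helper illustrating the capacity and load factor arithmetic. */
final class CapacitySketch {
    private CapacitySketch() { }

    /**
     * Smallest capacity strictly greater than maxEntries / loadFactor,
     * i.e. a capacity at which the rehash condition is never met.
     */
    static int capacityFor(int maxEntries, float loadFactor) {
        return (int) Math.floor(maxEntries / (double) loadFactor) + 1;
    }

    /** The growth condition from the text: entries exceed loadFactor * capacity. */
    static boolean needsRehash(int entries, int capacity, float loadFactor) {
        return entries > capacity * loadFactor;
    }
}
```

For example, with 3000 expected entries and the default load factor of .75, capacityFor returns 4001, and needsRehash(3000, 4001, 0.75f) is false. With a capacity of exactly 4000, the 3001st entry would exceed the threshold of 3000 and trigger a rehash.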
@see SymbolHash @author Andy Clark @author John Kim, IBM @version $Id: SymbolTable.java 985518 2010-08-14 16:02:52Z mrglavas $
'.', that is, they are resolved relative to the symbol table's {@link #root() root}. Once {@link #enter(String) created}, a scope remains in the symbol table and the corresponding AST node should be associated with that scope by setting the corresponding {@link Constants#SCOPE property} to the scope's qualified name. Subsequent traversals over that node can then automatically {@link #enter(Node) enter} and {@link #exit(Node) exit} that scope. Alternatively, if traversing out of tree order, the current scope can be set {@link #setScope(SymbolTable.Scope) explicitly}. To support different name spaces within the same scope, this class can optionally {@link #toNameSpace mangle} and {@link #fromNameSpace unmangle} unqualified symbols. By convention, a name in any name space besides the default name space is prefixed by the name of the name space and an opening parenthesis '(' and suffixed by a closing parenthesis ')'.
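The mangling convention in the last sentence can be sketched as follows. The method names echo the toNameSpace and fromNameSpace links above, but the class NameSpaceSketch, the parameter order, and the handling of unmangled input are assumptions made for this illustration and are not taken from the documented class.

```java
/** Hypothetical illustration of the name space mangling convention. */
final class NameSpaceSketch {
    private NameSpaceSketch() { }

    /** Mangles an unqualified symbol into the given name space: space(symbol). */
    static String toNameSpace(String symbol, String space) {
        return space + '(' + symbol + ')';
    }

    /** Returns true if the symbol has been mangled into the given name space. */
    static boolean isInNameSpace(String symbol, String space) {
        return symbol.startsWith(space + '(') && symbol.endsWith(")");
    }

    /**
     * Recovers the plain symbol from a mangled one. A symbol without the
     * "space(...)" wrapper is assumed to be in the default name space and
     * is returned unchanged.
     */
    static String fromNameSpace(String symbol) {
        int open = symbol.indexOf('(');
        if (open < 0 || !symbol.endsWith(")")) {
            return symbol;
        }
        return symbol.substring(open + 1, symbol.length() - 1);
    }
}
```

Under these assumptions, a symbol Point in a name space called tag is stored as tag(Point), so it cannot clash with a Point in the default name space of the same scope.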
@author Robert Grimm
@version $Revision: 1.34 $