Daemon acting as a coordinator for creating and updating index cardinality statistics.
The need for updated statistics is currently determined when compiling a SELECT query. The unit of work is then scheduled with this daemon, and the work itself is carried out in a separate thread. If the worker thread doesn't exist, it is created; if it is idle, the unit of work is processed immediately; if it is busy, the unit of work has to wait in the queue.
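The scheduling behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the daemon's actual code: the class `StatsUpdateScheduler` and all of its members are hypothetical names. It also makes one simplifying assumption, namely that the worker thread exits when the queue is empty, so the "idle" and "doesn't exist" cases collapse into one.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical sketch of a scheduler with one lazily created worker thread. */
class StatsUpdateScheduler {
    private final Queue<Runnable> queue = new ArrayDeque<>();
    private Thread worker;      // null while no worker thread exists
    private int completed;      // number of units of work that have finished

    /** Called by the user thread: enqueue the unit of work and return at once. */
    synchronized void schedule(Runnable unitOfWork) {
        queue.add(unitOfWork);
        if (worker == null) {
            // No worker thread exists, so create one to drain the queue.
            worker = new Thread(this::drainQueue, "index-stat-worker");
            worker.setDaemon(true);
            worker.start();
        }
        // Otherwise the worker is busy and the unit of work waits in the queue.
    }

    /** Runs in the worker thread until the queue is empty. */
    private void drainQueue() {
        while (true) {
            Runnable next;
            synchronized (this) {
                next = queue.poll();
                if (next == null) {
                    worker = null;   // assumption: the worker exits when idle
                    notifyAll();
                    return;
                }
            }
            try {
                next.run();          // the actual work happens outside the lock
            } catch (RuntimeException e) {
                // A real daemon would log the failure; the sketch just moves on.
            }
            synchronized (this) {
                completed++;
            }
        }
    }

    /** Blocks until all scheduled work is done (for demonstration only). */
    synchronized void awaitIdle() {
        boolean interrupted = false;
        while (worker != null) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true;
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    synchronized int completedCount() {
        return completed;
    }
}
```

Because the queue check and the clearing of `worker` happen under the same monitor as `schedule`, a unit of work added while the worker is shutting down is either picked up by that worker or triggers the creation of a new one.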
The daemon code has a notion of a background task. If the update is run as a background task, it will try to affect other activity in the Derby database as little as possible. As far as possible, it will not set locks on the conglomerates it scans, and if it needs to take locks it will give up immediately if the locks cannot be obtained. In some cases it will also roll back to release locks already taken, and then retry. Since we are accessing shared structures the background work may still interfere with the user activity in the database due to locking, but all such operations carried out by the daemon are of short duration.
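The "give up immediately, roll back, then retry" pattern can be illustrated with a small sketch. It uses `java.util.concurrent.locks.ReentrantLock` as a stand-in for the database lock manager, which is an assumption made purely for illustration; the class `BackgroundLocking`, its methods, and the `maxAttempts` parameter are hypothetical and do not appear in the daemon.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of no-wait locking with rollback and retry. */
class BackgroundLocking {

    /**
     * Tries to take every lock without waiting. If any lock is unavailable,
     * the locks already taken are released (the "roll back") and false is
     * returned.
     */
    static boolean tryLockAll(List<ReentrantLock> locks) {
        List<ReentrantLock> held = new ArrayList<>();
        for (ReentrantLock lock : locks) {
            if (!lock.tryLock()) {          // never blocks
                for (ReentrantLock h : held) {
                    h.unlock();
                }
                return false;
            }
            held.add(lock);
        }
        return true;
    }

    /**
     * Runs the work while holding all locks, retrying a bounded number of
     * times. Returns false if the locks could never be obtained.
     */
    static boolean runWithRetry(List<ReentrantLock> locks, Runnable work,
                                int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (tryLockAll(locks)) {
                try {
                    work.run();             // kept short to limit interference
                    return true;
                } finally {
                    for (ReentrantLock lock : locks) {
                        lock.unlock();
                    }
                }
            }
            Thread.yield();                 // let the lock holder make progress
        }
        return false;
    }
}
```

The key property is that the background task never waits on a lock: user activity that holds a lock is never queued behind the statistics update, at the cost of the update sometimes being abandoned.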
The high level flow of an update to index statistics is:
- schedule update (the only action carried out by the user thread)
- for each index:
  - scan index
  - invalidate statements dependent on current statistics
  - drop existing statistics
  - add new statistics
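The per-index steps can be sketched with an in-memory model. Everything here is hypothetical: the class `IndexStatsFlow`, the `statistics` map standing in for the stored statistics, and the `log` list standing in for the invalidation and dictionary operations. The sketch also assumes that "cardinality" means the number of distinct values for each leading prefix of the key columns, computed in a single pass over rows in index order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of the update flow for index statistics. */
class IndexStatsFlow {
    /** Stand-in for stored statistics: index name -> distinct counts. */
    final Map<String, long[]> statistics = new HashMap<>();
    /** Records the order of the steps, for illustration only. */
    final List<String> log = new ArrayList<>();

    /**
     * Scans rows given in index (sorted) order and counts, for each leading
     * prefix of the key columns, how many distinct values it takes.
     */
    static long[] scan(List<int[]> sortedRows, int keyColumns) {
        long[] distinct = new long[keyColumns];
        int[] prev = null;
        for (int[] row : sortedRows) {
            int firstDiff = 0;
            if (prev != null) {
                while (firstDiff < keyColumns
                        && row[firstDiff] == prev[firstDiff]) {
                    firstDiff++;
                }
            }
            // Every prefix that includes the first differing column has a
            // new value; for the first row that is every prefix.
            for (int i = firstDiff; i < keyColumns; i++) {
                distinct[i]++;
            }
            prev = row;
        }
        return distinct;
    }

    /** Applies the four per-index steps in the order listed above. */
    void update(Map<String, List<int[]>> indexes, int keyColumns) {
        for (Map.Entry<String, List<int[]>> entry : indexes.entrySet()) {
            String index = entry.getKey();
            long[] fresh = scan(entry.getValue(), keyColumns);  // scan index
            log.add("invalidate " + index);  // invalidate dependent statements
            statistics.remove(index);        // drop existing statistics
            log.add("drop " + index);
            statistics.put(index, fresh);    // add new statistics
            log.add("add " + index);
        }
    }
}
```

Note that the scan completes before anything is invalidated or dropped, so the existing statistics remain available for as long as possible.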
List of possible improvements:
- Reduce potential impact of multiple invalidations (per table), probably by finding a way to invalidate only once after all indexes for a table have had their statistics updated. So far invalidation has proven to be the most difficult piece of the puzzle due to the interaction with the data dictionary and sensitivity to concurrent activity for the table.
Implementation notes: List of potential cleanups before going into a release:
- Consider removing all tracing code. May involve improving logging if parts of the trace output are valuable enough.