This spout reads records from files. New records can be appended to these files while the spout is running; they are emitted on the next pass through the event loop. The spout may also reach the end of a file of records and notice that a new file has been started. In that case, it moves to the new file and starts emitting the records read from it.
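The tailing behavior can be illustrated with a minimal sketch. It is not the TailSpout implementation: the class name FileTailer, the poll() method, and the assumption that records are newline-delimited UTF-8 text are all hypothetical and chosen only for illustration. Each call to poll() stands in for one pass through the event loop and returns whatever complete records were appended since the previous call.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: returns records appended to a file since the last poll. */
class FileTailer {
    private final String path;
    private long offset; // byte offset of the first unread record

    FileTailer(String path, long startOffset) {
        this.path = path;
        this.offset = startOffset;
    }

    /** Reads any complete, newline-terminated records past the current offset. */
    List<String> poll() throws IOException {
        List<String> records = new ArrayList<>();
        try (RandomAccessFile in = new RandomAccessFile(path, "r")) {
            long length = in.length();
            if (length <= offset) {
                return records; // nothing new since the last pass
            }
            // Sketch only: reads the whole unread tail into memory at once.
            byte[] buf = new byte[(int) (length - offset)];
            in.seek(offset);
            in.readFully(buf);
            int start = 0;
            for (int i = 0; i < buf.length; i++) {
                if (buf[i] == '\n') {
                    records.add(new String(buf, start, i - start, StandardCharsets.UTF_8));
                    start = i + 1;
                }
            }
            // Advance only past complete records; a partial trailing record
            // is left in place and picked up on a later poll.
            offset += start;
        }
        return records;
    }

    long offset() {
        return offset;
    }
}
```

Moving on to a newly started file would amount to creating a fresh FileTailer at offset 0 once the current file stops growing; how the next file is detected is left out of this sketch.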
A TailSpout can operate in two modes. In unreliable mode, transactions are read, but acks and fails are ignored, so transactions that are not processed are lost. The TailSpout checkpoints its own progress periodically and on deactivation so that, when it is restarted, it resumes close to where it left off. After a clean shutdown the restart is exact; after an unclean shutdown it can be made arbitrarily precise.
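One way to picture the periodic checkpoint is the sketch below. It is an illustration under assumptions, not the TailSpout's actual checkpoint code: the Checkpointer class, its two-line file format (file name, then offset), and the write-then-rename approach are all hypothetical. maybeSave() would be called from the event loop and save() directly on deactivation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch: saves (file name, offset) periodically and on demand. */
class Checkpointer {
    private final Path checkpointFile;
    private final long intervalMillis;
    private long lastSaved = -1; // -1 means nothing has been saved yet

    Checkpointer(Path checkpointFile, long intervalMillis) {
        this.checkpointFile = checkpointFile;
        this.intervalMillis = intervalMillis;
    }

    /** Saves only if the interval has elapsed; returns true if a checkpoint was written. */
    boolean maybeSave(String dataFile, long offset, long nowMillis) throws IOException {
        if (lastSaved >= 0 && nowMillis - lastSaved < intervalMillis) {
            return false;
        }
        save(dataFile, offset);
        lastSaved = nowMillis;
        return true;
    }

    /** Unconditional save, e.g. on deactivation, which makes a clean restart exact. */
    void save(String dataFile, long offset) throws IOException {
        // Write to a temporary sibling and rename it into place so that a crash
        // mid-write is unlikely to leave a truncated checkpoint behind.
        Path tmp = checkpointFile.resolveSibling(checkpointFile.getFileName() + ".tmp");
        Files.write(tmp, (dataFile + "\n" + offset + "\n").getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, checkpointFile,
                StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

In this sketch, a shorter intervalMillis narrows the window of records that are re-read after an unclean shutdown, at the cost of more frequent writes.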
In reliable mode, an in-memory table of unacknowledged transactions is kept. Transactions that fail are retransmitted, and those that succeed are removed from the table. When a checkpoint is taken, the file name and the offset of the earliest unacknowledged transaction in each file are retained. On restart, each file with unacknowledged transactions is replayed from that offset. Usually only one file is replayed (the current one), or possibly the current and the previous file. This generally replays a few extra transactions, but if there is a cool-down period on shutdown, very few extra transactions should be retransmitted on startup.
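The bookkeeping described for reliable mode can be sketched as follows. This is a minimal illustration, not the TailSpout's own data structure: the PendingTable class and its method names are hypothetical, and records are represented as plain strings for simplicity. A sorted map per file makes the earliest unacknowledged offset cheap to find at checkpoint time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: in-memory table of unacknowledged transactions. */
class PendingTable {
    // file name -> (offset -> record); TreeMap keeps offsets in ascending order
    private final Map<String, TreeMap<Long, String>> pending = new HashMap<>();

    /** Remembers a transaction that has been emitted but not yet acknowledged. */
    void emitted(String file, long offset, String record) {
        pending.computeIfAbsent(file, f -> new TreeMap<>()).put(offset, record);
    }

    /** A successful transaction is removed from the table. */
    void ack(String file, long offset) {
        TreeMap<Long, String> offsets = pending.get(file);
        if (offsets == null) {
            return;
        }
        offsets.remove(offset);
        if (offsets.isEmpty()) {
            pending.remove(file); // the file is fully acknowledged
        }
    }

    /** A failed transaction stays in the table; it is returned for retransmission. */
    String fail(String file, long offset) {
        TreeMap<Long, String> offsets = pending.get(file);
        return offsets == null ? null : offsets.get(offset);
    }

    /** For each file, the offset of its earliest unacknowledged transaction. */
    Map<String, Long> checkpoint() {
        Map<String, Long> result = new HashMap<>();
        for (Map.Entry<String, TreeMap<Long, String>> e : pending.entrySet()) {
            result.put(e.getKey(), e.getValue().firstKey());
        }
        return result;
    }
}
```

On restart, replaying each file from its checkpointed offset re-emits everything after the earliest unacknowledged transaction, including any later ones that had already been acknowledged. That is where the extra transactions come from.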
One possible extension would be to allow the system to run in reliable mode without saving the offsets of unacked transactions. This would work well for orderly shutdowns, but very badly in the event of a disorderly shutdown.