The basic idea is that when we first connect, we supply a list of as many servers as we can, each as a (hostname, port) pair. We connect to each of these servers and query it for any other servers that might be in the cloud. A server might be reachable by more than one route, but we can collapse the duplicates because each server returns a unique identifier.
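As a rough illustration of the discovery step, the sketch below walks outward from the seed list and groups addresses by the identifier each server reports. The function name `discover`, the `query` callback, and its return shape (an identifier plus a list of peer addresses) are assumptions made for this example, not the actual protocol.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Address = Tuple[str, int]  # (hostname, port)


def discover(
    seeds: Iterable[Address],
    query: Callable[[Address], Tuple[str, List[Address]]],
) -> Dict[str, Set[Address]]:
    """Map each unique server id to every address that reached it.

    `query` is a hypothetical callback: given an address, it returns
    (server_id, peer_addresses) or raises OSError if unreachable.
    """
    servers: Dict[str, Set[Address]] = {}
    seen: Set[Address] = set()
    pending: List[Address] = list(seeds)

    while pending:
        addr = pending.pop()
        if addr in seen:
            continue
        seen.add(addr)
        try:
            server_id, peers = query(addr)
        except OSError:
            continue  # skip servers we cannot reach right now
        # Two routes to the same server land under the same id.
        servers.setdefault(server_id, set()).add(addr)
        pending.extend(p for p in peers if p not in seen)

    return servers
```

Keying on the identifier rather than the address is what keeps a server with several routes from being counted more than once.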
When it comes time to log, we look in a topic => server cache. If we find a preferred server there, we use it. Otherwise, we pick a server at random.
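The selection rule can be sketched in a few lines. The names `pick_server`, `topic_cache`, and `connected` are hypothetical, and the check that a cached server is still connected is an added assumption rather than something stated above.

```python
import random
from typing import Dict, Set


def pick_server(topic: str, topic_cache: Dict[str, str], connected: Set[str]) -> str:
    """Return the cached server id for `topic`, else a random connected one."""
    preferred = topic_cache.get(topic)
    # Assumption: only honor the cache if we still hold a connection.
    if preferred is not None and preferred in connected:
        return preferred
    if not connected:
        raise LookupError("no connected servers to choose from")
    # random.choice needs a sequence, so turn the set into a list first.
    return random.choice(sorted(connected))
```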
When we send a log message, the response may include a redirect to a different server. If so, we make sure that we have connected to every known form of that server's address and cache the new server for later. Then we retry the send to the new server. We may also hold on to the message so it can be retried later.
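A minimal sketch of the redirect loop follows. The `send` and `connect` callbacks, the `"redirect"` key in the response, and the `max_redirects` limit are all assumptions made for illustration; the real response format is not described here.

```python
from typing import Callable, Dict


def send_with_redirects(
    topic: str,
    message: str,
    server_id: str,
    send: Callable[[str, str, str], dict],
    connect: Callable[[str], None],
    topic_cache: Dict[str, str],
    max_redirects: int = 3,
) -> str:
    """Send `message`, following redirects; return the server that accepted it.

    `send(server_id, topic, message)` is a hypothetical callback returning a
    dict that contains a "redirect" key when another server should be used.
    """
    for _ in range(max_redirects + 1):
        response = send(server_id, topic, message)
        target = response.get("redirect")
        if target is None:
            return server_id
        connect(target)               # make sure we can reach the new server
        topic_cache[topic] = target   # remember it for later sends
        server_id = target            # retry against the new server
    raise RuntimeError("gave up after too many redirects")
```

The redirect cap is there only so that two servers pointing at each other cannot keep the client looping forever.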
If a log request results in an error, we try to drop the connection that caused it. This might ultimately cause us to forget a host or even a cache entry, but we will attempt to re-open these connections later, usually because some other server refers us back to that host. During server cloud reorganizations, we may not reconnect to the same server.
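The cleanup after a failed request might look like the sketch below. The name `drop_server` and the two dictionaries are hypothetical, and the connection object is only assumed to offer a `close()` method.

```python
from typing import Any, Dict, List


def drop_server(
    server_id: str,
    connections: Dict[str, Any],
    topic_cache: Dict[str, str],
) -> List[str]:
    """Forget a failing server; return the topics whose cache entry was removed."""
    conn = connections.pop(server_id, None)
    if conn is not None:
        try:
            conn.close()  # assumption: connections expose close()
        except OSError:
            pass  # the connection is already broken, nothing more to do
    # Evict every topic that still prefers the dropped server.
    stale = [topic for topic, sid in topic_cache.items() if sid == server_id]
    for topic in stale:
        del topic_cache[topic]
    return stale
```

Nothing here schedules a reconnect; in keeping with the description above, the host is simply picked up again if a later referral points back to it.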