Can we avoid the "address already in use" errors by using SO_REUSEADDR? Typical results are
[07-24 19:32:49] INFO ReuseExperimentMule2067TestCase [main]: Measuring average run length for 100 repeats without reuse and a pause of 100 ms [07-24 19:33:49] INFO ReuseExperimentMule2067TestCase [main]: Average run length: 57.3 +/- 33.15131973240282 [07-24 19:33:49] INFO ReuseExperimentMule2067TestCase [main]: Measuring average run length for 100 repeats with reuse and a pause of 100 ms [07-24 19:35:32] INFO ReuseExperimentMule2067TestCase [main]: Average run length: 100.0 +/- 0.0 [07-24 19:35:32] INFO ReuseExperimentMule2067TestCase [main]: Measuring average run length for 100 repeats without reuse and a pause of 10 ms [07-24 19:35:48] INFO ReuseExperimentMule2067TestCase [main]: Average run length: 96.8 +/- 7.332121111929359 [07-24 19:35:48] INFO ReuseExperimentMule2067TestCase [main]: Measuring average run length for 100 repeats with reuse and a pause of 10 ms [07-24 19:36:04] INFO ReuseExperimentMule2067TestCase [main]: Average run length: 100.0 +/- 0.0 [07-24 19:36:04] INFO ReuseExperimentMule2067TestCase [main]: Measuring average run length for 100 repeats without reuse and a pause of 1 ms [07-24 19:36:10] INFO ReuseExperimentMule2067TestCase [main]: Average run length: 75.8 +/- 37.690317058894586 [07-24 19:36:10] INFO ReuseExperimentMule2067TestCase [main]: Measuring average run length for 100 repeats with reuse and a pause of 1 ms [07-24 19:36:18] INFO ReuseExperimentMule2067TestCase [main]: Average run length: 100.0 +/- 0.0
which suggest that enabling address re-use could help with the issue. Note that if a single socket (ie a single port number) is reused for all tests we often zeroes eveywhere (even with waits of 2sec and similar between iterations/tests). This suggests that once the error occurs, the socket enters a long-lived "broken" state. All this is by AC on linux, dual CPU, Java 1.4 - I suspect results will vary like crazy in different contexts.