About rm and tm merge message threads #6087
PleaseGiveMeTheCoke started this conversation in General
Context
In pull request #6061, the rm and tm merge message threads were unified into one. Is this a safe, effective, and reasonable improvement? Here is the theoretical and empirical support for doing so.
Theoretical Support
Increase the number of merge messages sent at a time
With two threads, each thread only merges messages from a single client, either rm or tm. With one thread, messages from both rm and tm are merged into the same batch.
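To illustrate the point, here is a minimal sketch (hypothetical class and message names, not the actual Seata implementation): when rm and tm producers share one queue, a single merge thread drains messages from both clients into one batch per flush.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: with a single merge thread, rm and tm messages
// share one basket, so each flush can batch messages from both clients.
public class SingleMergeQueue {
    private final LinkedBlockingQueue<String> basket = new LinkedBlockingQueue<>();

    // Both RM and TM producers offer into the same basket.
    public void offer(String msg) {
        basket.offer(msg);
    }

    // The single merge thread drains everything currently queued into one batch.
    public List<String> drainBatch() {
        List<String> batch = new ArrayList<>();
        basket.drainTo(batch);
        return batch;
    }

    public static void main(String[] args) {
        SingleMergeQueue q = new SingleMergeQueue();
        q.offer("rm:branchRegister");
        q.offer("tm:globalBegin");
        q.offer("rm:branchReport");
        // A single flush now carries both rm and tm messages.
        System.out.println(q.drainBatch().size()); // prints 3
    }
}
```

With two separate threads, the rm batch and the tm batch above would have been sent as two smaller merged requests instead of one.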
Simpler and lighter design
The volume of tm messages is relatively small, and message merging is disabled by default, so the merge thread is often not started at all. From a design perspective, it is therefore more appropriate to make mergeSendExecutorService a static class attribute so that only a single copy is retained.
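The design point can be sketched as follows (the holder class name and thread name are hypothetical; only the identifier mergeSendExecutorService comes from the discussion): a static field guarantees one executor per JVM, shared by all client instances, instead of one per rm/tm client.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: holding the merge-send executor in a static field
// means every client instance shares one single-thread executor, rather
// than each rm/tm client creating its own.
public class MergeSendExecutorHolder {

    // One copy per JVM, shared by all client instances.
    private static final ExecutorService MERGE_SEND_EXECUTOR =
            Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "mergeSendThread");
                t.setDaemon(true); // don't keep the JVM alive for this thread
                return t;
            });

    public static ExecutorService get() {
        return MERGE_SEND_EXECUTOR;
    }
}
```

Every call site then submits merge-send tasks to the same single thread, which is what makes the unified design "simpler and lighter".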
Reduce overhead
The merge threads mostly perform non-blocking asynchronous I/O, so a single thread can focus on dispatching asynchronous requests, eliminating the context-switching overhead of running two threads.
Reduced blocking and empty polling
Adding a double check reduces unnecessary empty polling: when the list of messages to be sent is not empty, they are sent immediately with priority; when the list is empty, the thread is allowed to sleep longer instead of spinning.
Data Support
Test Tools
Performance was tested with JMH; CPU occupancy was measured with arthas.
Test Code
Test Data
Before thread merging
Performance Testing Data
8 threads
Warmup Iteration 2: 12139329.543 ±(99.9%) 848561.019 ns/op
Warmup Iteration 3: 11963078.561 ±(99.9%) 960887.166 ns/op
Warmup Iteration 4: 12981660.479 ±(99.9%) 1189936.352 ns/op
Warmup Iteration 5: 12492056.457 ±(99.9%) 883896.984 ns/op
Iteration 1: 13048335.217 ±(99.9%) 1128320.673 ns/op
Iteration 2: 13633428.156 ±(99.9%) 1217889.924 ns/op
Iteration 3: 17895714.077 ±(99.9%) 906983.698 ns/op
Iteration 4: 14793398.420 ±(99.9%) 1110027.805 ns/op
Iteration 5: 13455755.155 ±(99.9%) 658577.878 ns/op
Result "io.seata.core.rpc.netty.v1.MergedThreadTest.sendRequest":
14565326.205 ±(99.9%) 7590835.199 ns/op [Average]
(min, avg, max) = (13048335.217, 14565326.205, 17895714.077), stdev = 1971315.795
CI (99.9%): [6974491.006, 22156161.404] (assumes normal distribution)
16 threads
Warmup Iteration 2: 19813489.277 ±(99.9%) 968142.331 ns/op
Warmup Iteration 3: 20198656.297 ±(99.9%) 749479.682 ns/op
Warmup Iteration 4: 20573418.230 ±(99.9%) 923624.110 ns/op
Warmup Iteration 5: 20253840.605 ±(99.9%) 1177409.617 ns/op
Iteration 1: 20197239.104 ±(99.9%) 888597.560 ns/op
Iteration 2: 20102682.120 ±(99.9%) 985867.184 ns/op
Iteration 3: 21468587.136 ±(99.9%) 664266.590 ns/op
Iteration 4: 20359510.680 ±(99.9%) 548540.336 ns/op
Iteration 5: 19561127.870 ±(99.9%) 980520.992 ns/op
Result "io.seata.core.rpc.netty.v1.MergedThreadTest.sendRequest":
20337829.382 ±(99.9%) 2693668.053 ns/op [Average]
(min, avg, max) = (19561127.870, 20337829.382, 21468587.136), stdev = 699537.039
CI (99.9%): [17644161.329, 23031497.435] (assumes normal distribution)
CPU Occupancy Testing Data
After thread merging
Performance Testing Data
8 threads
Warmup Iteration 2: 11032606.896 ±(99.9%) 510359.137 ns/op
Warmup Iteration 3: 11430048.542 ±(99.9%) 878519.234 ns/op
Warmup Iteration 4: 10928250.943 ±(99.9%) 862919.770 ns/op
Warmup Iteration 5: 10542404.005 ±(99.9%) 1162601.368 ns/op
Iteration 1: 10176809.317 ±(99.9%) 491029.968 ns/op
Iteration 2: 10377410.748 ±(99.9%) 704883.549 ns/op
Iteration 3: 9895530.796 ±(99.9%) 1226249.260 ns/op
Iteration 4: 11388840.466 ±(99.9%) 928430.523 ns/op
Iteration 5: 11275567.639 ±(99.9%) 929608.316 ns/op
Result "io.seata.core.rpc.netty.v1.MergedThreadTest.sendRequest":
10622831.793 ±(99.9%) 2583784.694 ns/op [Average]
(min, avg, max) = (9895530.796, 10622831.793, 11388840.466), stdev = 671000.680
CI (99.9%): [8039047.099, 13206616.487] (assumes normal distribution)
16 threads
Warmup Iteration 2: 16809090.598 ±(99.9%) 788713.429 ns/op
Warmup Iteration 3: 16926928.220 ±(99.9%) 1060596.141 ns/op
Warmup Iteration 4: 17194963.678 ±(99.9%) 1115132.284 ns/op
Warmup Iteration 5: 17232139.011 ±(99.9%) 1122411.667 ns/op
Iteration 1: 17220331.141 ±(99.9%) 678882.820 ns/op
Iteration 2: 16812748.717 ±(99.9%) 980338.756 ns/op
Iteration 3: 16119182.067 ±(99.9%) 758937.643 ns/op
Iteration 4: 15774596.784 ±(99.9%) 790845.394 ns/op
Iteration 5: 16058259.592 ±(99.9%) 972153.339 ns/op
Result "io.seata.core.rpc.netty.v1.MergedThreadTest.sendRequest":
16397023.660 ±(99.9%) 2302378.216 ns/op [Average]
(min, avg, max) = (15774596.784, 16397023.660, 17220331.141), stdev = 597920.311
CI (99.9%): [14094645.445, 18699401.876] (assumes normal distribution)
CPU Occupancy Testing Data
Test Conclusion
Thread merging has little effect on CPU utilization, but it has a more significant effect on request delivery time, reducing average request time by about 25%.
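The reduction can be recomputed from the reported JMH averages above (the class and method names here are illustrative):

```java
// Recomputing the latency reduction from the reported JMH averages (ns/op).
public class ReductionCheck {

    static double reductionPercent(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        double r8  = reductionPercent(14565326.205, 10622831.793); // ~27.1% at 8 threads
        double r16 = reductionPercent(20337829.382, 16397023.660); // ~19.4% at 16 threads
        System.out.printf("8 threads: %.1f%%, 16 threads: %.1f%%%n", r8, r16);
    }
}
```

The two configurations give roughly 27% and 19% reductions respectively, which is consistent with the "about 25%" conclusion, though the gain narrows as the thread count grows.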