Board Thread:Wikia Discussion/@comment-26091666-20151103015609/@comment-25637142-20151129141717

It has what would be called the following factors:
 * Type of fleet (2 levels)
 * Difficulty (3 levels)
 * Debuff status (3 levels)

2 x 3 x 3 = 18 combinations (or cases)

Since I only care about type of fleet, I split those combinations into two sets of data with 9 combinations each for comparison. This is called "blocking".
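
To make the counting concrete, here is a rough Python sketch of the split. Only the level counts (2, 3, 3) and the fleet names CTF and STF come from this thread; the difficulty and debuff level names are placeholders I made up for illustration.

```python
from itertools import product

# Only the level counts (2, 3, 3) are from the post; the difficulty and
# debuff level names below are hypothetical placeholders.
FLEETS = ["CTF", "STF"]
DIFFICULTIES = ["easy", "medium", "hard"]
DEBUFFS = ["none", "partial", "full"]

# Full factorial: every fleet x difficulty x debuff combination.
cases = list(product(FLEETS, DIFFICULTIES, DEBUFFS))

# Block on fleet type: one set of (difficulty, debuff) cells per fleet.
blocks = {fleet: [(d, s) for f, d, s in cases if f == fleet] for fleet in FLEETS}

# Each (difficulty, debuff) cell occurs once per fleet, giving matched pairs.
pairs = [(("CTF", d, s), ("STF", d, s)) for d, s in product(DIFFICULTIES, DEBUFFS)]

print(len(cases), len(blocks["CTF"]), len(blocks["STF"]), len(pairs))
# prints: 18 9 9 9
```

Each of the 9 pairs holds difficulty and debuff status fixed, so any difference inside a pair can only come from the fleet type (or from noise).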

If the hypothesis "Both CTF and STF have the same debuff" is true, the samples for each of these paired combinations should be the same within a statistical tolerance. With this small sample size, you should not expect all 8/9 cases to show up as "precisely the same".
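
A minimal sketch of that paired check, assuming the samples are stored per (difficulty, debuff) cell for each fleet. The function name, the data layout, the tolerance and the numbers in the usage lines are all invented for illustration; none of them are real measurements from this thread.

```python
from statistics import median

def compare_paired_medians(ctf, stf, tolerance):
    """Compare medians cell by cell for cells present in both fleets.

    ctf, stf: dicts mapping a (difficulty, debuff) cell to a list of samples.
    Returns {cell: (median_ctf, median_stf, within_tolerance)}.
    """
    result = {}
    for cell in sorted(ctf.keys() & stf.keys()):
        m_ctf = median(ctf[cell])
        m_stf = median(stf[cell])
        result[cell] = (m_ctf, m_stf, abs(m_ctf - m_stf) <= tolerance)
    return result

# Made-up numbers, purely to show the call shape.
ctf_samples = {("easy", "none"): [10, 12, 11]}
stf_samples = {("easy", "none"): [11, 13, 12]}
print(compare_paired_medians(ctf_samples, stf_samples, tolerance=2))
# prints: {('easy', 'none'): (11, 12, True)}
```

With real data the tolerance would have to come from the spread of the samples, not be picked by hand. The point is only that "same debuff" means "medians agree within noise", not "medians are identical".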

Heck, I would expect some small differences between the compared medians, since no error range was specified. I'm seeing none at the moment.