Quote Originally Posted by therother
You may well have a point, but IMO it does not undermine the fundamental reason for this test. We are attempting to test whether the AI acts differently in the two cases, where the player triggers the AI in exactly the same way and the only variable is whether the game is saved and reloaded in one case.
The goal seems to be to confirm statistically that there is in fact a difference, and only then evaluate whether that difference has a gameplay impact, correct?
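For what it's worth, here is a rough sketch of how that first step (is there a difference at all?) could be checked once the runs are in. It is only an illustration, not the method anyone in this thread has committed to: I am assuming each run can be boiled down to a single number (some per-battle measure of AI behaviour, whatever we agree on), and the function name `permutation_test` and its parameters are my own invention. It uses a plain two-sample permutation test on the difference in means, which avoids assuming the numbers are normally distributed.

```python
import random
from statistics import mean


def permutation_test(no_reload, reload, n_perm=10_000, seed=0):
    """Two-sided permutation test on the difference in means.

    no_reload, reload: one number per run for each case (hypothetical
    measure; the thread has not fixed what is being counted).
    Returns an approximate p-value for "both cases behave the same".
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(mean(no_reload) - mean(reload))
    pooled = list(no_reload) + list(reload)
    n = len(no_reload)
    hits = 0
    for _ in range(n_perm):
        # Shuffle the labels: if saving/loading makes no difference,
        # any split of the pooled runs is as likely as the real one.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n]) - mean(pooled[n:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Feed it the two lists of results from the no-reload and reload runs. A small p-value (say below 0.05) would suggest the difference is unlikely to be chance, and only then would it be worth arguing about whether the size of the difference matters for gameplay.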