A study by Duncan Irschick at the University of Massachusetts drew my attention. It says :
Men Are More Accurate than Women When Hitting a Target with Force in the Dark
The story in itself is interesting, but what particularly struck me was that it was described as “… a small study …”.
I totally agree with that since they “…tested four male and three female adults”.
Yes, right : 4 men and 3 women.
The first reaction of somebody with a statistical/data mining background is : “How on earth can a self-respecting scientist publish results on differences between men and women with such a small sample ?” And I do not only mean self-respecting : he is also respected by others and is a first-class scientist.
So there must be something else. Could it be that with such a small sample you can indeed do some thorough statistics ?
Let us try it out.
The case at hand is men and women hitting at something with a hammer. I do not know the details, but for the present purpose it is simple to use some fake data.
Let us take an extreme case :
If they must hit some target, suppose that the four men missed the target by 20, 18, 22 and 21 centimeters respectively. The women, being much more accurate, missed by only 3, 5 and 6 centimeters.
With a simple t-test we find that the two means, 20.25 for the men and 4.67 for the women, are significantly different (p=0.0003; two-tailed).
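For readers who want to see where such a number comes from, here is a minimal sketch of the first step : the two means and the t statistic. The text does not say which variant of the t-test was used, so this assumes Welch's unequal-variance version; the function name `welch_t` is just an illustrative choice.

```python
from math import sqrt
from statistics import mean, variance

# Fake data from the text: miss distances in centimeters.
men = [20, 18, 22, 21]
women = [3, 5, 6]


def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom.

    Assumption: the unequal-variance (Welch) variant; the text only says
    "a simple t-test".
    """
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


t_stat, df = welch_t(men, women)
print(f"mean men = {mean(men):.2f}, mean women = {mean(women):.2f}")
print(f"t = {t_stat:.2f}, df = {df:.2f}")
```

The t statistic is simply the difference between the two means divided by its standard error, so a large gap combined with little spread inside each group gives a large t even when there are only seven people.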
So if all 3 women are far better than all 4 men we have a proven case !
With one woman being a bit less accurate than one man we get the following : let us assume that the best man, instead of missing by 18 centimeters, misses by 10 centimeters, and the worst woman misses by 11 centimeters instead of 6.
The difference is still significant (p=0.0227; two-tailed).
Let us try a third one : we take case 2 but make two of the three worst men somewhat better and the second-best woman somewhat less accurate (men: 10, 14, 15, 22; women: 3, 8, 11) : it is not significant any more (p=0.0664; two-tailed).
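The three cases can be rerun with a few lines of Python. This is only a sketch under assumptions : it uses Welch's unequal-variance t-test (the variant is not stated above) and gets the two-tailed p-value by numerically integrating the Student t density, so that nothing beyond the standard library is needed. The helper names are illustrative. Depending on the variant and the software used, the exact p-values can differ somewhat from the ones quoted above, but the pattern of significant, significant, not significant at the 5% level should be the same.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=20000):
    """Two-tailed p-value, using Simpson's rule on the density over [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central = total * h / 3  # probability that T lies between 0 and |t|
    return 1 - 2 * central


# The three fake data sets from the text (miss distances in centimeters).
scenarios = {
    "case 1": ([20, 18, 22, 21], [3, 5, 6]),
    "case 2": ([20, 10, 22, 21], [3, 5, 11]),
    "case 3": ([10, 14, 15, 22], [3, 8, 11]),
}

results = {}
for name, (men, women) in scenarios.items():
    t_stat, df = welch_t(men, women)
    results[name] = two_tailed_p(t_stat, df)
    verdict = "significant" if results[name] < 0.05 else "not significant"
    print(f"{name}: t = {t_stat:.2f}, df = {df:.2f}, "
          f"p = {results[name]:.4f} ({verdict} at 5%)")
```

In practice a statistics package would do the p-value step for you; the hand-rolled integration is here only to keep the example self-contained.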
So, even with such small samples it is perfectly acceptable to draw conclusions, as long as the difference between the groups is large compared with the spread within them.
But for a data miner, who is used to working with millions of observations, it still feels a bit weird !