In statistics, the Behrens–Fisher distribution, named after Ronald Fisher and Walter Behrens, is a parameterized family of probability distributions arising from the solution of the Behrens–Fisher problem, proposed first by Behrens and several years later by Fisher. The Behrens–Fisher problem is that of statistical inference concerning the difference between the means of two normally distributed populations when the ratio of their variances is not known (and in particular, it is not known that their variances are equal).
The Behrens–Fisher distribution is the distribution of a random variable of the form
$$T_2\cos\theta - T_1\sin\theta,$$
where T<sub>1</sub> and T<sub>2</sub> are independent random variables each with a Student's t-distribution, with respective degrees of freedom ν<sub>1</sub> = n<sub>1</sub> − 1 and ν<sub>2</sub> = n<sub>2</sub> − 1, and θ is a constant. Thus the family of Behrens–Fisher distributions is parameterized by ν<sub>1</sub>, ν<sub>2</sub>, and θ.
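The definition can be explored numerically. The following is a minimal Monte Carlo sketch, not a reference implementation: it assumes the variable has the form T<sub>2</sub> cos θ − T<sub>1</sub> sin θ (the form that Fisher's pivotal quantity takes), and the function names `student_t` and `sample_behrens_fisher` are hypothetical, chosen only for this illustration.

```python
import math
import random


def student_t(nu, rng):
    """Draw one Student's t variate with nu degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    # A chi-squared variate with nu degrees of freedom is Gamma(shape=nu/2, scale=2).
    chi2 = rng.gammavariate(nu / 2.0, 2.0)
    return z / math.sqrt(chi2 / nu)


def sample_behrens_fisher(nu1, nu2, theta, size, seed=0):
    """Monte Carlo draws of T2*cos(theta) - T1*sin(theta).

    T1 and T2 are independent t variates with nu1 and nu2 degrees of freedom.
    """
    rng = random.Random(seed)
    c, s = math.cos(theta), math.sin(theta)
    return [student_t(nu2, rng) * c - student_t(nu1, rng) * s for _ in range(size)]
```

With θ = 0 the draws reduce to a plain t-distribution with ν<sub>2</sub> degrees of freedom, which gives a quick sanity check on the sampler.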
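Suppose it were known that the two population variances are equal, and samples of sizes n<sub>1</sub> and n<sub>2</sub> are taken from the two populations:
$$
\begin{aligned}
X_{1,1},\ldots,X_{1,n_1} &\sim \operatorname{i.i.d.}\ N(\mu_1,\sigma^2), \\
X_{2,1},\ldots,X_{2,n_2} &\sim \operatorname{i.i.d.}\ N(\mu_2,\sigma^2),
\end{aligned}
$$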
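where "i.i.d." means that the random variables are independent and identically distributed, and N denotes the normal distribution. The two sample means are
$$
\bar X_1 = \frac{X_{1,1}+\cdots+X_{1,n_1}}{n_1}, \qquad
\bar X_2 = \frac{X_{2,1}+\cdots+X_{2,n_2}}{n_2}.
$$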
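The usual "pooled" unbiased estimate of the common variance σ<sup>2</sup> is then
$$
S_\mathrm{pooled}^2
= \frac{\sum_{k=1}^{n_1}(X_{1,k}-\bar X_1)^2 + \sum_{k=1}^{n_2}(X_{2,k}-\bar X_2)^2}{n_1+n_2-2}
= \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2},
$$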
where S<sub>1</sub><sup>2</sup> and S<sub>2</sub><sup>2</sup> are the usual unbiased (Bessel-corrected) estimates of the two population variances.
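Under these assumptions, the pivotal quantity
$$
\frac{(\mu_2-\mu_1)-(\bar X_2-\bar X_1)}{\sqrt{\dfrac{S_\mathrm{pooled}^2}{n_1}+\dfrac{S_\mathrm{pooled}^2}{n_2}}}
$$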
has a t-distribution with n<sub>1</sub> + n<sub>2</sub> − 2 degrees of freedom. Accordingly, one can find a confidence interval for μ<sub>2</sub> − μ<sub>1</sub> whose endpoints are
$$
\bar X_2 - \bar X_1 \pm A\, S_\mathrm{pooled}\sqrt{\frac{1}{n_1}+\frac{1}{n_2}},
$$
where A is an appropriate quantile of the t-distribution.
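The equal-variance interval can be sketched in a few lines. This is an illustrative sketch only: the function name `pooled_t_interval` is hypothetical, and the quantile A is passed in by the caller (for example, read from a t-table for n<sub>1</sub> + n<sub>2</sub> − 2 degrees of freedom) rather than computed.

```python
import math
from statistics import mean, variance


def pooled_t_interval(x1, x2, a):
    """Interval for mu2 - mu1 assuming equal population variances.

    `a` is the t quantile for n1 + n2 - 2 degrees of freedom, supplied by the caller.
    """
    n1, n2 = len(x1), len(x2)
    # statistics.variance is the Bessel-corrected (n - 1) sample variance.
    s1_sq, s2_sq = variance(x1), variance(x2)
    s_pooled_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    centre = mean(x2) - mean(x1)
    half_width = a * math.sqrt(s_pooled_sq * (1.0 / n1 + 1.0 / n2))
    return centre - half_width, centre + half_width
```

For instance, `pooled_t_interval([1, 2, 3], [2, 4, 6], 2.776)` uses the tabulated 0.975 quantile for 4 degrees of freedom and gives an interval centred at 2 with a half-width of about 3.58.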
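However, in the Behrens–Fisher problem, the two population variances are not known to be equal, nor is their ratio known. Fisher considered the pivotal quantity
$$
\frac{(\mu_2-\mu_1)-(\bar X_2-\bar X_1)}{\sqrt{\dfrac{S_1^2}{n_1}+\dfrac{S_2^2}{n_2}}}. \tag{1}
$$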
This can be written as
$$T_2\cos\theta - T_1\sin\theta,$$
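where
$$
T_i = \frac{\mu_i-\bar X_i}{S_i/\sqrt{n_i}} \qquad \text{for } i = 1, 2
$$
are the usual one-sample t-statistics and
$$
\tan\theta = \frac{S_1/\sqrt{n_1}}{S_2/\sqrt{n_2}},
$$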
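and one takes θ to be in the first quadrant. The algebraic details are as follows:
$$
\begin{aligned}
\frac{(\mu_2-\mu_1)-(\bar X_2-\bar X_1)}{\sqrt{\dfrac{S_1^2}{n_1}+\dfrac{S_2^2}{n_2}}}
&= \frac{\mu_2-\bar X_2}{\sqrt{\dfrac{S_1^2}{n_1}+\dfrac{S_2^2}{n_2}}}
 - \frac{\mu_1-\bar X_1}{\sqrt{\dfrac{S_1^2}{n_1}+\dfrac{S_2^2}{n_2}}} \\
&= \underbrace{\frac{\mu_2-\bar X_2}{S_2/\sqrt{n_2}}}_{T_2}
   \cdot \underbrace{\left(\frac{S_2/\sqrt{n_2}}{\sqrt{\dfrac{S_1^2}{n_1}+\dfrac{S_2^2}{n_2}}}\right)}_{\cos\theta}
 - \underbrace{\frac{\mu_1-\bar X_1}{S_1/\sqrt{n_1}}}_{T_1}
   \cdot \underbrace{\left(\frac{S_1/\sqrt{n_1}}{\sqrt{\dfrac{S_1^2}{n_1}+\dfrac{S_2^2}{n_2}}}\right)}_{\sin\theta} \\
&= T_2\cos\theta - T_1\sin\theta.
\end{aligned}
$$
The fact that the sum of the squares of the expressions in parentheses above is 1 implies that they are the cosine and sine of some angle. Because both expressions are positive, that angle lies in the first quadrant.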
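The decomposition can be checked numerically. The sketch below assumes the one-sample statistics are T<sub>i</sub> = (μ<sub>i</sub> − X̄<sub>i</sub>)/(S<sub>i</sub>/√n<sub>i</sub>) and that tan θ = (S<sub>1</sub>/√n<sub>1</sub>)/(S<sub>2</sub>/√n<sub>2</sub>); the function name `fisher_pivot_parts` and the trial values of μ<sub>1</sub> and μ<sub>2</sub> are hypothetical, used only to exercise the identity.

```python
import math
from statistics import mean, stdev


def fisher_pivot_parts(x1, x2, mu1, mu2):
    """Return (pivot, t1, t2, theta) for trial values mu1, mu2 of the population means."""
    n1, n2 = len(x1), len(x2)
    se1 = stdev(x1) / math.sqrt(n1)  # S1 / sqrt(n1)
    se2 = stdev(x2) / math.sqrt(n2)  # S2 / sqrt(n2)
    # Fisher's pivotal quantity: denominator is sqrt(S1^2/n1 + S2^2/n2).
    pivot = ((mu2 - mu1) - (mean(x2) - mean(x1))) / math.hypot(se1, se2)
    t1 = (mu1 - mean(x1)) / se1
    t2 = (mu2 - mean(x2)) / se2
    # atan2 with two positive arguments returns an angle in the first quadrant.
    theta = math.atan2(se1, se2)
    return pivot, t1, t2, theta


pivot, t1, t2, theta = fisher_pivot_parts([1.0, 2.5, 4.0, 3.5], [0.5, 6.0, 2.0], 2.0, 3.0)
# The pivot should agree with T2*cos(theta) - T1*sin(theta) up to rounding error.
difference = pivot - (t2 * math.cos(theta) - t1 * math.sin(theta))
```

Here `difference` should be zero up to floating-point rounding, whatever trial means are supplied.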
The Behrens–Fisher distribution is actually the conditional distribution of the quantity (1) above, given the values of the quantities labeled cos θ and sin θ. In effect, Fisher conditions on ancillary information.
Fisher then found the "fiducial interval" whose endpoints are
$$
\bar X_2 - \bar X_1 \pm A\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}},
$$
where A is the appropriate percentage point of the Behrens–Fisher distribution. Fisher claimed that the probability that μ<sub>2</sub> − μ<sub>1</sub> is in this interval, given the data (ultimately the Xs), is the probability that a Behrens–Fisher-distributed random variable is between −A and A.
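A rough numerical version of this interval is sketched below. It is an assumption-laden illustration rather than Fisher's own procedure: the percentage point A is estimated by Monte Carlo simulation instead of being read from published tables, θ is taken from tan θ = (S<sub>1</sub>/√n<sub>1</sub>)/(S<sub>2</sub>/√n<sub>2</sub>), and the function names `bf_abs_quantile` and `fiducial_interval` are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def bf_abs_quantile(nu1, nu2, theta, level, draws=50000, seed=0):
    """Monte Carlo estimate of A with P(|T2 cos(theta) - T1 sin(theta)| <= A) = level."""
    rng = random.Random(seed)

    def student_t(nu):
        # t variate = standard normal / sqrt(chi-squared / nu);
        # chi-squared with nu degrees of freedom is Gamma(shape=nu/2, scale=2).
        z = rng.gauss(0.0, 1.0)
        return z / math.sqrt(rng.gammavariate(nu / 2.0, 2.0) / nu)

    c, s = math.cos(theta), math.sin(theta)
    values = sorted(abs(student_t(nu2) * c - student_t(nu1) * s) for _ in range(draws))
    index = min(draws - 1, math.ceil(level * draws) - 1)
    return values[index]


def fiducial_interval(x1, x2, level=0.95, draws=50000, seed=0):
    """Approximate fiducial interval for mu2 - mu1 with a simulated percentage point."""
    n1, n2 = len(x1), len(x2)
    se1 = stdev(x1) / math.sqrt(n1)
    se2 = stdev(x2) / math.sqrt(n2)
    theta = math.atan2(se1, se2)  # first-quadrant angle with tan(theta) = se1 / se2
    a = bf_abs_quantile(n1 - 1, n2 - 1, theta, level, draws, seed)
    centre = mean(x2) - mean(x1)
    half_width = a * math.hypot(se1, se2)
    return centre - half_width, centre + half_width
```

Because A is simulated, the endpoints vary slightly with `draws` and `seed`; increasing `draws` reduces that noise at the cost of run time.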
Bartlett showed that this "fiducial interval" is not a confidence interval because it does not have a constant coverage rate. Fisher did not consider that a cogent objection to the use of the fiducial interval.