What is the hypergeometric distribution?

The hypergeometric distribution is a discrete distribution that models the number of events in a sample of fixed size drawn from a population whose total size is known. Each item in the sample has two possible outcomes (either an event or a nonevent). The sample is drawn without replacement, so every item in the sample is different: once an item is chosen from the population, it cannot be chosen again. Therefore, a particular item's chance of being selected increases on each trial, assuming that it has not yet been selected. For example, in a population of 10 items, a given item has a 1 in 10 chance of being chosen on the first trial and, if it was not chosen, a 1 in 9 chance on the second trial.

Use the hypergeometric distribution for samples drawn from relatively small populations, without replacement. For example, this distribution is used in Fisher's exact test to test the difference between two proportions and in acceptance sampling by attributes for sampling from an isolated lot of finite size.

The hypergeometric distribution is described by 3 parameters: population size, event count in population, and sample size.

For example, you receive one special-order shipment of 500 labels. Suppose 2% of the labels are defective, so the event count in the population is 10 (0.02 × 500). You sample 40 labels and want to determine the probability of 3 or more defective labels in that sample.

The probability of 3 or more defective labels in the sample is 0.0384.
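
As a rough cross-check of the label example, the probability can be computed directly from binomial coefficients. This is a minimal sketch using only Python's standard library; the helper name hypergeom_pmf is hypothetical and chosen for illustration, and N, M, and n stand for the population size, event count in population, and sample size.

```python
from math import comb

def hypergeom_pmf(k, N, M, n):
    """P(X = k): probability of exactly k events in a sample of n items
    drawn without replacement from N items, M of which are events.
    (Hypothetical helper, not part of any statistics package.)"""
    # k must be feasible: no more events than exist, no more nonevents than exist.
    if k < max(0, n - (N - M)) or k > min(n, M):
        return 0.0
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

# Label example: 500 labels, 10 defective, sample of 40.
N, M, n = 500, 10, 40

# P(X >= 3) = 1 - [P(X = 0) + P(X = 1) + P(X = 2)]
p_at_least_3 = 1.0 - sum(hypergeom_pmf(k, N, M, n) for k in range(3))
print(f"P(3 or more defective) = {p_at_least_3:.4f}")
```

The result should agree with the 0.0384 stated above to four decimal places.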

Example of calculating hypergeometric probabilities

Suppose there are ten cars you would like to test drive (N = 10), and five of them have turbo engines (M = 5). If you test drive three of the cars (n = 3), what is the probability that exactly two of the three cars that you drive will have turbo engines?

  1. Choose Calc > Probability Distributions > Hypergeometric.
  2. Choose Probability.
  3. In Population size (N), enter 10. In Event count in population (M), enter 5. In Sample size (n), enter 3.
  4. Choose Input constant, and enter 2.
  5. Click OK.

The probability that you will randomly select two cars with turbo engines when you test drive three of the ten cars you are interested in is 41.67%.
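
The same result can be checked by hand or in code. The sketch below is an illustration outside Minitab, not a description of how Minitab computes the value; it counts the ways to pick 2 of the 5 turbo cars and 1 of the 5 non-turbo cars, divided by the ways to pick any 3 of the 10 cars. The variable names are assumptions made for this example.

```python
from math import comb

N = 10  # population size: cars available to test drive
M = 5   # event count in population: cars with turbo engines
n = 3   # sample size: cars actually driven
k = 2   # number of turbo cars wanted in the sample

ways_turbo = comb(M, k)          # choose 2 of the 5 turbo cars
ways_other = comb(N - M, n - k)  # choose 1 of the 5 non-turbo cars
ways_total = comb(N, n)          # choose any 3 of the 10 cars

p_two_turbo = ways_turbo * ways_other / ways_total
print(f"P(exactly 2 turbo) = {p_two_turbo:.2%}")
```

Here the counts are 10 × 5 = 50 favorable samples out of 120 possible samples, or 5/12.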

The difference between the hypergeometric and the binomial distributions

Both the hypergeometric distribution and the binomial distribution describe the number of times an event occurs in a fixed number of trials. In the binomial distribution, trials are independent, so the probability of an event is the same on every trial. For the hypergeometric distribution, each trial changes the probability for each subsequent trial because there is no replacement.
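
To make the difference concrete, the following sketch compares the two distributions for a population of 10 items with 5 events and a sample of 3. The binomial column assumes sampling with replacement at a constant event probability of M / N = 0.5. Both helper functions are hypothetical names written for this illustration.

```python
from math import comb

def hypergeom_pmf(k, N, M, n):
    """P(X = k) when sampling n items without replacement (hypothetical helper)."""
    if k < max(0, n - (N - M)) or k > min(n, M):
        return 0.0
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    """P(X = k) for n independent trials with constant probability p (hypothetical helper)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

N, M, n = 10, 5, 3
p = M / N  # event probability on the first trial

print("k  hypergeometric  binomial")
for k in range(n + 1):
    print(f"{k}  {hypergeom_pmf(k, N, M, n):.4f}          {binom_pmf(k, n, p):.4f}")
```

With such a small population, the two distributions differ noticeably: the probability of exactly two events is 50/120 (about 0.4167) without replacement versus 0.375 with replacement. The gap narrows as the population grows relative to the sample.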
