Data considerations for 2 Proportions

To ensure that your results are valid, consider the following guidelines when you collect data, perform the analysis, and interpret your results.

The sample data should be selected randomly

In statistics, random samples are used to make generalizations, or inferences, about a population. If your data are not collected randomly, your results may not represent the population. For more information, go to Randomness in samples of data.

The data can contain only two categories, such as pass/fail or 1/0

If your data are counts rather than two-category outcomes, such as the number of defects per unit, use 2-Sample Poisson Rate instead. For more information on data types, go to Data types you can analyze with a hypothesis test.
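
As an illustration of the two-category requirement, the following minimal Python sketch reduces raw observations to the number of events and trials that a 2 proportions analysis works from. The function name summarize_binary and the sample values are hypothetical and are not part of any statistical package.

```python
def summarize_binary(observations, event):
    """Return (events, trials, sample proportion) for two-category data.

    `event` is the category counted as the outcome of interest.
    """
    if not observations:
        raise ValueError("no observations supplied")
    categories = set(observations)
    if len(categories) > 2:
        # More than two categories: the data do not fit a 2 proportions test.
        raise ValueError(f"expected at most two categories, found {len(categories)}")
    trials = len(observations)
    events = sum(1 for value in observations if value == event)
    return events, trials, events / trials


# Hypothetical samples, one coded as pass/fail and one coded as 1/0.
line_a = ["pass", "pass", "fail", "pass", "fail"]
line_b = [1, 0, 1, 1, 1, 0, 1, 1]

print(summarize_binary(line_a, event="pass"))
print(summarize_binary(line_b, event=1))
```

Either coding works, as long as each sample contains no more than two distinct values.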

Each observation should be independent from all other observations

For observations to be independent, the probability of a particular outcome must not depend on any previous outcome. For example, if you flip a coin twice and record whether heads or tails is face up, the outcome of the second flip does not depend on the outcome of the first flip. If your observations are not independent, your results may not be valid. For more information, go to What are independent trials?
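
The coin-flip example can be checked with a small simulation. The sketch below assumes a fair coin and uses a fixed seed so that the run is repeatable; the function name second_flip_heads_rates is hypothetical. If the flips are independent, the proportion of heads on the second flip should be about the same whether the first flip was heads or tails.

```python
import random


def second_flip_heads_rates(n_pairs=100_000, seed=0):
    """Simulate pairs of fair coin flips.

    Returns the proportion of heads on the second flip, split by
    whether the first flip was heads or tails.
    """
    rng = random.Random(seed)
    after_heads = []  # second-flip results when the first flip was heads
    after_tails = []  # second-flip results when the first flip was tails
    for _ in range(n_pairs):
        first_is_heads = rng.random() < 0.5
        second_is_heads = rng.random() < 0.5
        if first_is_heads:
            after_heads.append(second_is_heads)
        else:
            after_tails.append(second_is_heads)
    rate_after_heads = sum(after_heads) / len(after_heads)
    rate_after_tails = sum(after_tails) / len(after_tails)
    return rate_after_heads, rate_after_tails


rate_h, rate_t = second_flip_heads_rates()
print(f"P(heads on flip 2 | heads on flip 1) ~ {rate_h:.3f}")
print(f"P(heads on flip 2 | tails on flip 1) ~ {rate_t:.3f}")
```

Both estimates should be close to 0.5. A clear gap between them would suggest that the second outcome depends on the first.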

Determine an appropriate sample size

Your sample should be large enough so that the following are true:

The estimates have enough precision.
The confidence intervals are narrow enough to be useful.
You have adequate protection against type I and type II errors.

To determine the appropriate sample size for your hypothesis test, go to Power and Sample Size for 2 Proportions.
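
For a rough sense of the sample sizes involved, the following Python sketch applies the standard normal-approximation formula for comparing two proportions with equal group sizes and a two-sided test. It is an approximation under those stated assumptions and is not necessarily the method that a given software package uses. The function name sample_size_two_proportions and its default values for alpha and power are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect p1 versus p2.

    Assumes a two-sided test, equal group sizes, and the normal
    approximation without a continuity correction.
    """
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("p1 and p2 must be strictly between 0 and 1")
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # guards against type I error
    z_power = NormalDist().inv_cdf(power)          # guards against type II error
    p_bar = (p1 + p2) / 2                          # pooled proportion under the null
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# Hypothetical planning values: a 50% rate versus a 60% rate.
print(sample_size_two_proportions(0.50, 0.60))
```

Smaller differences between the two proportions, a lower alpha, or a higher target power all increase the required sample size.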