Kappa measures the degree of agreement of the nominal or ordinal assessments made by multiple appraisers when assessing the same samples.

For example, 45 patients are assessed by two different doctors for a particular disease. How often will the doctors' diagnoses of the condition (positive or negative) agree? A different example of nominal assessments is inspectors rating defects on TV screens. Do they consistently agree on their classifications of bubbles, divots, and dirt?

Kappa values range from -1 to +1. The higher the value of kappa, the stronger the agreement.

When:

- Kappa = 1, perfect agreement exists.
- Kappa = 0, agreement is the same as would be expected by chance.
- Kappa < 0, agreement is weaker than expected by chance; this rarely occurs.

Usually a kappa value of at least 0.70 is required, but kappa values close to 0.90 are preferred.

When you have ordinal ratings, such as defect severity ratings on a scale of 1-5, Kendall's coefficients, which take ordering into consideration, are usually more appropriate statistics to determine association than kappa alone.

You can calculate Cohen's kappa only if one of the following two conditions is true:

- Two appraisers each evaluate one trial on each sample.
- One appraiser evaluates two trials on each sample.
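When either condition holds, Cohen's kappa reduces to a short calculation: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each rater's marginal proportions. The sketch below computes it from scratch for the two-doctors scenario; the patient data is invented purely for illustration.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two appraisers rating the same samples.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance.
    """
    n = len(ratings_a)
    # Observed agreement: fraction of samples where the appraisers match.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each appraiser's marginal proportions,
    # summed over the rating categories.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical diagnoses (positive/negative) from two doctors on 10 patients.
doc1 = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "pos", "neg", "neg"]
doc2 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "pos", "neg", "neg"]
print(round(cohen_kappa(doc1, doc2), 3))  # -> 0.8
```

Here the doctors agree on 9 of 10 patients (p_o = 0.9), but their marginal rates alone would produce 50% agreement by chance (p_e = 0.5), so kappa = 0.8, strong agreement by the guideline above.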

Kendall's coefficient of concordance indicates the degree of association of ordinal assessments made by multiple appraisers when assessing the same samples. Kendall's coefficient is commonly used in attribute agreement analysis.

Kendall's coefficient values can range from 0 to 1. The higher the value of Kendall's coefficient, the stronger the association. Usually Kendall's coefficients of 0.9 or higher are considered very good. A high or significant Kendall's coefficient means that the appraisers are applying essentially the same standard when assessing the samples.
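As a minimal sketch of how the coefficient of concordance is obtained (ratings converted to within-appraiser ranks, no tie correction; the inspector data is invented for illustration), Kendall's W can be computed as W = 12S / (m²(n³ − n)), where m is the number of appraisers, n the number of samples, and S the sum of squared deviations of the per-sample rank totals from their mean:

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j + 2) / 2  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kendalls_w(ratings):
    """Kendall's coefficient of concordance, without the tie correction.

    ratings[a][s] is appraiser a's rating of sample s.
    W = 12*S / (m^2 * (n^3 - n)), with S the sum of squared deviations
    of the per-sample rank totals from their mean.
    """
    m, n = len(ratings), len(ratings[0])
    rank_rows = [average_ranks(row) for row in ratings]
    totals = [sum(row[s] for row in rank_rows) for s in range(n)]
    mean_total = sum(totals) / n
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical severity ratings (1-5 scale): 3 inspectors, 4 samples.
inspectors = [[1, 3, 2, 4],
              [1, 3, 2, 4],
              [2, 3, 1, 4]]
print(round(kendalls_w(inspectors), 3))  # -> 0.911
```

The first two inspectors rank the samples identically and the third disagrees only on the two least severe samples, so W is close to 1. Note this sketch omits the tie correction, so it understates W when an appraiser gives several samples the same rating; Minitab's implementation may differ in that detail.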

If you provide a known rating for each sample, Minitab also calculates Kendall's correlation coefficients: one coefficient for each appraiser, to measure that appraiser's agreement with the known standard, and an overall coefficient to measure the agreement of all appraisers with the standard. The correlation coefficient helps you determine whether an appraiser is consistent but inaccurate.

Unlike the coefficient of concordance, Kendall's correlation coefficient values can range from -1 to +1. A positive value indicates positive association. A negative value indicates negative association. The higher the magnitude, the stronger the association.

Use Kendall's correlation coefficients and their p-values to choose between two opposing hypotheses, based on your sample data:

- H_{0}: There is no association between the ratings of the appraisers and the known standard.
- H_{1}: The ratings of the appraisers are associated with the known standard.

The p-value provides the likelihood of obtaining your sample, with its particular Kendall's correlation coefficient, if the null hypothesis (H_{0}) is true. If the p-value is less than or equal to a predetermined level of significance (α-level), then you reject the null hypothesis and claim support for the alternative hypothesis.
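This decision rule can be sketched with SciPy's `kendalltau` function, assuming SciPy is available; the severity ratings below are hypothetical:

```python
from scipy.stats import kendalltau

# Hypothetical severity ratings (1-5 scale): one appraiser vs. the known standard.
standard  = [1, 2, 2, 3, 4, 5, 5, 3, 4, 1]
appraiser = [1, 2, 3, 3, 4, 5, 4, 3, 4, 2]

# kendalltau returns the correlation coefficient and its p-value.
tau, p_value = kendalltau(standard, appraiser)

alpha = 0.05  # predetermined significance level
if p_value <= alpha:
    print(f"tau={tau:.3f}, p={p_value:.4f}: reject H0; ratings are "
          "associated with the standard.")
else:
    print(f"tau={tau:.3f}, p={p_value:.4f}: fail to reject H0.")
```

Here the appraiser tracks the standard closely (every disagreement is off by only one severity level), so tau is high and the p-value falls below alpha, supporting the alternative hypothesis.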

- When your classifications are nominal (true/false, good/bad, crispy/crunchy/soggy), use kappa.
- When your classifications are ordinal (ratings made on a scale), in addition to kappa statistics, use Kendall's coefficient of concordance.
- When your classifications are ordinal and you have a known standard for each trial, in addition to kappa statistics, use Kendall's correlation coefficient.

Kappa statistics represent absolute agreement between ratings, while Kendall's coefficients measure the associations between ratings. Therefore, kappa statistics treat all misclassifications equally, but Kendall's coefficients do not. For instance, Kendall's coefficients consider misclassifying a perfect object (rating = 5) as bad (rating = 1) more serious than misclassifying it as very good (rating = 4).
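This difference can be seen numerically. In the sketch below (invented ratings, SciPy assumed available for Kendall's correlation, and a small from-scratch Cohen's kappa helper), two appraisers each misrate exactly one sample against a known standard: one rates a 5 as a 4, the other rates the same 5 as a 1. Kappa scores the two appraisers identically, while Kendall's correlation penalizes the distant misclassification more:

```python
from collections import Counter
from scipy.stats import kendalltau

def kappa(x, y):
    # Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement).
    n = len(x)
    p_o = sum(a == b for a, b in zip(x, y)) / n
    cx, cy = Counter(x), Counter(y)
    p_e = sum(cx[c] * cy[c] for c in cx) / (n * n)
    return (p_o - p_e) / (1 - p_e)

standard = [1, 2, 3, 4, 5]   # hypothetical known ratings
near     = [1, 2, 3, 4, 4]   # last sample misrated 5 -> 4
far      = [1, 2, 3, 4, 1]   # last sample misrated 5 -> 1

# Kappa sees one disagreement in each case, so the two scores are identical...
print(round(kappa(standard, near), 3), round(kappa(standard, far), 3))  # -> 0.75 0.75
# ...while Kendall's correlation rewards the near miss and punishes the far one.
tau_near, _ = kendalltau(standard, near)
tau_far, _ = kendalltau(standard, far)
print(round(tau_near, 3), round(tau_far, 3))  # -> 0.949 0.316
```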