The primary difference is the way each command defines a match in replicated experiments, where each appraiser assesses each subject multiple times.
One command defines each replicate as 1 opportunity for a match. Conversely, the other command defines each set of replicates as 1 opportunity for a match, and counts a match only when every replicate in the set agrees with the standard.

For example, suppose each appraiser assesses each subject 3 times. If Appraiser A correctly matches the standard 2 out of 3 times for Subject A, then the per-set command awards Appraiser A 0 matches out of 1 opportunity assessing Subject A. But the per-replicate command awards Appraiser A 2 matches out of 3 opportunities assessing Subject A.

Likewise, if Appraiser B assesses Subject B and correctly matches the standard 3 out of 3 times, then the per-set command awards Appraiser B 1 match out of 1 opportunity assessing Subject B. But the per-replicate command awards Appraiser B 3 matches out of 3 opportunities assessing Subject B.

The different ways of counting matches may cause the same data to yield different agreement percentages when analyzed using both commands.
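The two counting rules can be sketched in a few lines of Python using the Appraiser A and Appraiser B figures from the example. This is only an illustration of the arithmetic: the function names `per_replicate` and `per_set` are hypothetical, and pooling both appraisers into a single percentage is an assumption made here to show the effect, not a description of how either command reports its results.

```python
# Each list records whether the appraiser matched the standard on each replicate.
# The values mirror the example: 2 of 3 for Appraiser A, 3 of 3 for Appraiser B.
assessments = {
    ("Appraiser A", "Subject A"): [True, True, False],
    ("Appraiser B", "Subject B"): [True, True, True],
}


def per_replicate(assessments):
    """Count every replicate as one opportunity for a match."""
    matches = sum(sum(reps) for reps in assessments.values())
    opportunities = sum(len(reps) for reps in assessments.values())
    return matches, opportunities


def per_set(assessments):
    """Count every set of replicates as one opportunity.

    A set is a match only if all of its replicates match the standard.
    """
    matches = sum(all(reps) for reps in assessments.values())
    opportunities = len(assessments)
    return matches, opportunities


for label, count in (("per replicate", per_replicate), ("per set", per_set)):
    matches, opportunities = count(assessments)
    percent = 100 * matches / opportunities
    print(f"{label}: {matches}/{opportunities} = {percent:.1f}%")
```

With these inputs the per-replicate rule gives 5 matches out of 6 opportunities, while the per-set rule gives 1 match out of 2, so the same data produce noticeably different agreement percentages.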