TY - JOUR
T1 - An examination of interrater reliability for scoring the Rorschach Comprehensive System in eight data sets
AU - Meyer, Gregory J.
AU - Hilsenroth, Mark J.
AU - Baxter, Dirk
AU - Exner, John E.
AU - Fowler, James Chris
AU - Piers, Craig C.
AU - Resnick, Justin
PY - 2002/1/1
Y1 - 2002/1/1
N2 - In this article, we describe interrater reliability for the Comprehensive System (CS; Exner, 1993) in 8 relatively large samples, including (a) students, (b) experienced researchers, (c) clinicians, (d) clinicians and then researchers, (e) a composite clinical sample (i.e., a to d), and 3 samples in which randomly generated erroneous scores were substituted for (f) 10%, (g) 20%, or (h) 30% of the original responses. Across samples, 133 to 143 statistically stable CS scores had excellent reliability, with median intraclass correlations of .85, .96, .97, .95, .93, .95, .89, and .82, respectively. We also demonstrate reliability findings from this study closely match the results derived from a synthesis of prior research, CS summary scores are more reliable than scores assigned to individual responses, small samples are more likely to generate unstable and lower reliability estimates, and Meyer's (1997a) procedures for estimating response segment reliability were accurate. The CS can be scored reliably, but because scoring is the result of coder skills clinicians must conscientiously monitor their accuracy.
AB - In this article, we describe interrater reliability for the Comprehensive System (CS; Exner, 1993) in 8 relatively large samples, including (a) students, (b) experienced researchers, (c) clinicians, (d) clinicians and then researchers, (e) a composite clinical sample (i.e., a to d), and 3 samples in which randomly generated erroneous scores were substituted for (f) 10%, (g) 20%, or (h) 30% of the original responses. Across samples, 133 to 143 statistically stable CS scores had excellent reliability, with median intraclass correlations of .85, .96, .97, .95, .93, .95, .89, and .82, respectively. We also demonstrate reliability findings from this study closely match the results derived from a synthesis of prior research, CS summary scores are more reliable than scores assigned to individual responses, small samples are more likely to generate unstable and lower reliability estimates, and Meyer's (1997a) procedures for estimating response segment reliability were accurate. The CS can be scored reliably, but because scoring is the result of coder skills clinicians must conscientiously monitor their accuracy.
UR - http://www.scopus.com/inward/record.url?scp=0035987074&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0035987074&partnerID=8YFLogxK
U2 - 10.1207/S15327752JPA7802_03
DO - 10.1207/S15327752JPA7802_03
M3 - Article
C2 - 12067192
AN - SCOPUS:0035987074
SN - 0022-3891
VL - 78
SP - 219
EP - 274
JO - Journal of Personality Assessment
JF - Journal of Personality Assessment
IS - 2
ER -