Test-Level Output: Summary Statistics
After the Introduction, the report provides test-level summary statistics based on raw number-correct scores. These statistics are reported for the total score (all items), the actual score (scored items only), pretest items only, and each domain or content area. The columns of this table are defined as follows.
Label | Explanation
Items | number of items in that portion of the test
Mean | average number correct
SD | standard deviation, a measure of dispersion (a range of ± two SDs from the mean includes approximately 95% of the examinees, if their number-correct scores are normally distributed)
Min score | the minimum number of items an examinee answered correctly
Max score | the maximum number of items an examinee answered correctly
Mean P | average item difficulty statistic for that portion; also the average proportion-correct score if there are no omitted responses (not reported if there are no multiple choice items)
Item Mean | average of the item means for polytomous items (not reported if there are no polytomous items)
Mean R | average item-total correlation for that portion of the test
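The column definitions above can be illustrated with a minimal Python sketch. It is not Iteman's implementation: the function names `pearson` and `summary_stats` are hypothetical, items are assumed to be scored 0/1 with no omitted responses, the SD is taken as the population standard deviation, and Mean R is taken as the average uncorrected Pearson item-total correlation. Iteman may differ on any of these points.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # Undefined (division by zero) if either sequence has no variance,
    # e.g. an item that every examinee answered correctly.
    return sxy / (sxx * syy) ** 0.5


def summary_stats(responses):
    """Summarize a portion of a test.

    responses: one row per examinee, each row a list of 0/1 item scores
    (assumption: dichotomous items, no omitted responses).
    """
    totals = [sum(row) for row in responses]   # number-correct scores
    items = list(zip(*responses))              # one column per item
    return {
        "Items": len(items),
        "Mean": mean(totals),
        "SD": pstdev(totals),                  # assumption: population SD
        "Min score": min(totals),
        "Max score": max(totals),
        "Mean P": mean(mean(col) for col in items),
        "Mean R": mean(pearson(col, totals) for col in items),
    }


# Made-up data: five examinees, four items.
example = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
stats = summary_stats(example)
```

Passing only the columns for one portion of the test (for example, the pretest items) gives the statistics for that portion. In that case the item-total correlations are computed against that portion's total, which may or may not match Iteman's convention.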
The test-level summary table allows you to make important comparisons among the various parts of the test. For example, are the new pretest items of comparable difficulty to the current scored items? Are the items in Domain 2 more difficult than those in Domain 1? Are the mean and standard deviation (SD) of the raw scores in line with what you expected? In the example below, the pretest items were relatively difficult, and the items in Domain 1 were relatively easy.
Example Summary Statistics
Test-Level Output: Reliability Analysis
The reliability analysis provides a table that summarizes the reliability statistics computed by Iteman. Coefficient α (alpha) and the SEM (based on α) are computed for all items, scored items only, pretest items only, and for each domain separately. Three forms of split-half reliability are computed. First, the test is randomly divided into two halves and the Pearson product-moment correlation is computed between the total scores on the two halves. Split-half correlations are also provided between the total scores on the first half and the second half of the test, and between the total scores on the odd-numbered and even-numbered items. Because each of these correlations is computed from half-length tests, and so tends to understate the reliability of the full-length test, the Spearman-Brown corrected correlations are also provided.
An introduction to these statistics, and many others, is provided in the Appendix.
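As a rough illustration of the reliability statistics described in this section, the following Python sketch computes coefficient α, an SEM based on α, and split-half correlations with the Spearman-Brown correction. It is a sketch under stated assumptions, not Iteman's code: all function names are hypothetical, items are assumed to be scored 0/1, population variances are used throughout, and the random split uses an arbitrary seed.

```python
import random
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5  # undefined if either half has no variance


def coefficient_alpha(responses):
    """alpha = k/(k-1) * (1 - sum of item variances / total-score variance)."""
    k = len(responses[0])
    item_vars = [pvariance(col) for col in zip(*responses)]
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def sem_from_alpha(responses):
    """SEM = SD of total scores * sqrt(1 - alpha); population SD assumed."""
    total_sd = pvariance([sum(row) for row in responses]) ** 0.5
    return total_sd * (1 - coefficient_alpha(responses)) ** 0.5


def spearman_brown(r):
    """Step a half-test correlation up to full test length."""
    return 2 * r / (1 + r)


def split_half(responses, half_a, half_b):
    """Return (half-test correlation, Spearman-Brown corrected value)."""
    a = [sum(row[i] for i in half_a) for row in responses]
    b = [sum(row[i] for i in half_b) for row in responses]
    r = pearson(a, b)
    return r, spearman_brown(r)


def three_splits(k, seed=0):
    """Item positions (0-based) for the three kinds of split described above."""
    order = list(range(k))
    random.Random(seed).shuffle(order)  # seed is an arbitrary choice
    return {
        "random": (order[: k // 2], order[k // 2:]),
        "first-second": (list(range(k // 2)), list(range(k // 2, k))),
        # Odd-numbered items (1st, 3rd, ...) sit at even 0-based positions.
        "odd-even": (list(range(0, k, 2)), list(range(1, k, 2))),
    }
```

Calling `split_half(responses, *three_splits(k)["odd-even"])` on a response matrix with `k` items returns the uncorrected and corrected odd-even correlations. Because α is a ratio of variances, it does not depend on whether population or sample variances are used, as long as the choice is consistent. The SEM does depend on that choice.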
Example Reliability Analysis
Test-Level Output: Scores
Next, a grouped frequency distribution figure is presented, showing the distribution of number-correct scores for the scored items.
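A grouped frequency distribution of this kind can be tabulated with a few lines of Python. This is only a sketch: the function name `grouped_frequencies` and the interval width are assumptions, and Iteman's own choice of intervals is not described here.

```python
from collections import Counter


def grouped_frequencies(scores, width):
    """Count number-correct scores in consecutive intervals of a given width.

    Returns a list of (low, high, count) tuples covering every interval
    from the lowest to the highest observed score, including empty ones.
    """
    # Map each score to the lower bound of its interval, then count.
    counts = Counter((s // width) * width for s in scores)
    lo, hi = min(counts), max(counts)
    return [
        (start, start + width - 1, counts.get(start, 0))
        for start in range(lo, hi + width, width)
    ]


# Made-up scores, grouped into intervals of width 5.
for low, high, n in grouped_frequencies([0, 1, 2, 3, 4, 4, 5, 9], 5):
    print(f"{low:>3}-{high:<3} {'*' * n}")
```

The loop at the end prints one row of asterisks per interval, a plain-text stand-in for the bars of the figure.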
Example Score Distribution
