
2018-19 Summative Technical Report

Chapter 8 Change in Test Scores from Previous Year

8.1 Introduction

This chapter reports only the differences between the 2017–18 and 2018–19 summative test administrations and results. The term “change” is used to describe the differences between the two administrations. The study of differences is confined to a two-year time frame in order to include as many states and as much data as possible without confounding change within a fixed set of states with changes in membership or member participation. Adding time points generally reduces the number of states, and the amount of data, that can be included if all time points are to represent the same members and the same level of participation.

Readers may be able to discern longer-range trends in student performance and other aspects of the Smarter Balanced assessment by studying separate but consecutive trend reports, each based on two or three annual time points. A trend report issued in 2018 included three time points: the 2015–16, 2016–17, and 2017–18 assessments. The present report and future technical reports will include results for two consecutive annual summative assessments.

States included in some or all of the analyses performed for this chapter are shown in Table 8.1. To be included in the analyses for a given grade, a state had to administer the test to students in that grade in both administrations. Some analyses, such as those for test duration, had stricter requirements, which are described in the sections containing the results of those analyses. In addition to participating in both administrations for a given grade, member jurisdictions are included only if they provided their student data to Smarter Balanced in both years and student performance was reported on, or could be transformed to, the Smarter Balanced reporting scale.

To protect confidentiality, results are never reported for a single state. Therefore, results that include Idaho are reported at the high school level, which will also include grade 11 students from the additional states indicated in Table 8.1.

Table 8.1: STATES INCLUDED IN ANALYSES OF STUDENT DATA
Grade  | ELA CAT                    | ELA PT                     | Math CAT                   | Math PT
3 to 7 | CA,DE,HI,ID,OR,SD,VI,VT,WA | CA,DE,HI,ID,OR,SD,VI,VT,WA | CA,DE,HI,ID,OR,SD,VI,VT,WA | CA,DE,ID,OR,SD,VI,VT,WA
8      | CA,DE,HI,ID,OR,SD,VT,WA    | CA,DE,HI,ID,OR,SD,VT,WA    | CA,DE,HI,ID,OR,SD,VT,WA    | CA,DE,ID,OR,SD,VT,WA
11     | CA,HI,ID,OR,SD,VI,WA       | CA,HI,ID,OR,SD,VI,WA       | CA,HI,ID,OR,SD,VI,WA       | CA,ID,OR,SD,VI,WA

8.2 Change in Student Performance

Table 8.2 shows mean scale scores and standard deviations for overall student performance, by grade, for the 2017–18 and 2018–19 administrations, along with the difference between years. Mean ELA/literacy scale scores increased at all grades. Mean mathematics scale scores increased at the lower grades and decreased slightly at grade 8 and high school. Effect sizes, defined here as the ratio of the change to the standard deviation of the scale score in the previous year (2017–18), were generally quite small. The standard deviation of scale scores tends to increase with grade in both subjects.

Table 8.3 shows the change in percent proficient, the percentage of students at or above the Level 3 cut score. Patterns of change in percent proficient are similar to those for change in scale scores: percent proficient increased at the lower grades in both subjects and increased only slightly or decreased at grade 8 and high school.
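For readers who wish to reproduce these summary statistics from their own data, the sketch below computes the year-to-year change in mean scale score, the effect size as defined above (change divided by the prior-year standard deviation), and percent proficient relative to a Level 3 cut score. All values in the example are illustrative, not taken from the tables.

```python
import numpy as np

# Illustrative values only; they are not taken from Table 8.2 or Table 8.3.
mean_2018, sd_2018 = 2500.0, 100.0   # prior-year (2017-18) mean and SD
mean_2019 = 2503.0                   # current-year (2018-19) mean

change = mean_2019 - mean_2018
effect_size = change / sd_2018       # effect size as defined in the text
print(f"change = {change:.1f}, effect size = {effect_size:.3f}")

# Percent proficient: the percentage of students at or above the Level 3 cut score.
rng = np.random.default_rng(0)
scores = rng.normal(mean_2019, sd_2018, size=10_000)   # simulated scale scores
level3_cut = 2520.0                                     # hypothetical cut score
pct_proficient = 100 * np.mean(scores >= level3_cut)
print(f"percent proficient = {pct_proficient:.1f}")
```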

Table 8.2: CHANGE IN STUDENT SCALE SCORES
Subject | Grade | N 2018 | Mean 2018 | SD 2018 | N 2019 | Mean 2019 | SD 2019 | Change | Effect Size
ELA/Lit. 3 760,057 2425 90.7 767,867 2426 91.4 1 1e-06
4 791,917 2466 97.0 764,422 2468 97.2 2 2e-06
5 801,703 2500 98.5 796,299 2504 98.6 4 5e-06
6 806,524 2521 98.2 801,488 2522 97.5 2 2e-06
7 789,642 2546 102.0 808,058 2549 103.5 3 4e-06
8 675,251 2564 103.0 683,061 2564 104.8 0 1e-06
HS 619,755 2598 117.0 621,955 2600 117.8 3 4e-06
Math 3 762,712 2433 84.8 770,706 2435 85.0 2 3e-06
4 794,458 2471 86.4 767,104 2474 86.7 3 4e-06
5 803,725 2495 94.7 798,483 2498 95.8 3 4e-06
6 808,325 2514 108.6 803,387 2515 109.3 1 1e-06
7 791,064 2528 115.2 809,585 2530 115.9 2 2e-06
8 675,393 2546 126.2 683,709 2544 127.2 -2 -3e-06
HS 647,424 2565 127.4 637,170 2564 129.8 -1 -1e-06
Table 8.3: CHANGE IN PERCENT PROFICIENT
Subject | Grade | Prof Pct 2018 | Prof Pct 2019 | Change
ELA/Lit. 3 48.6 48.9 0.2
4 49.5 50.0 0.6
5 51.0 52.5 1.5
6 48.3 48.9 0.6
7 50.8 51.7 0.9
8 51.3 51.0 -0.2
HS 58.5 59.2 0.7
Math 3 49.7 50.6 0.9
4 44.4 45.8 1.4
5 37.9 39.2 1.3
6 38.4 38.9 0.5
7 38.7 39.1 0.4
8 38.5 37.9 -0.6
HS 32.9 32.4 -0.5

8.3 Change in Student Demographics

Student demographics by year, and year-to-year change in demographics, are shown in Table 8.4 and Table 8.5. The years in the column headings of these tables represent the spring year of the administration; for example, the heading “2018” refers to the 2017–18 administration.

The numbers in Table 8.4 and Table 8.5 are weighted averages across grades and members. Differences among grades in how much, and in what direction, change occurred will not be evident in these tables. Where change appears to be zero for a given demographic group, it is possible that one or more grades show substantial change for that group, and different grades may show change in different directions. Despite these possibilities, the numbers in Table 8.4 and Table 8.5 generally hold up across grades.
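For readers working from student-level data, the sketch below shows one way to produce this kind of pooled summary; pooling student records across grades and members is equivalent to weighting grade- and member-level means by their student counts. The file name and column layout are hypothetical, and a single demographic grouping column is assumed for simplicity.

```python
import pandas as pd

# Hypothetical student-level file: one row per tested student, with columns
# grade, group (a single demographic grouping, for simplicity), scale_score,
# and proficient (1 if at or above the Level 3 cut score, else 0).
students = pd.read_csv("ela_students_2019.csv")

summary = students.groupby("group").agg(
    n=("scale_score", "size"),
    mean_ss=("scale_score", "mean"),
    prof_pct=("proficient", lambda s: 100 * s.mean()),
)
summary["total_pct"] = 100 * summary["n"] / summary["n"].sum()
print(summary.round(1))
```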

Readers of this report will most likely be interested primarily in results for a given state, in addition to overall and even grade-specific results. Table 8.4 and Table 8.5 are intended primarily to suggest directions for more specific investigations based on state-specific data.

Overall, Table 8.4 and Table 8.5 show no substantial changes in student demographics that would account for any substantial change in overall student performance. They do, however, indicate possible changes in the achievement of specific demographic groups that may be worth investigating at the state level using state-specific and grade-specific data.

Table 8.4: CHANGE IN ELA/LITERACY STUDENT DEMOGRAPHICS
Group | Total Pct 2018 | Total Pct 2019 | Mean SS 2018 | Mean SS 2019 | Prof Pct 2018 | Prof Pct 2019 | Change Total Pct | Change Mean SS | Change Prof Pct
Total 100.0 100.0 2514 2516 50.9 51.6 0.0 2.1 0.7
Female 48.9 48.9 2528 2529 56.1 56.5 0.0 1.2 0.4
Male 51.1 51.1 2501 2504 45.9 46.8 0.0 2.9 0.9
American Indian or Alaska Native 0.9 0.9 2469 2470 31.5 31.7 0.0 0.0 0.2
Asian 8.0 7.9 2577 2579 74.9 75.3 0.0 2.3 0.4
Black/African American 6.7 6.7 2459 2460 29.6 30.2 0.0 1.9 0.6
Native Hawaiian or Pacific Islander 1.0 1.0 2489 2488 38.8 38.1 0.0 -1.8 -0.7
Hispanic/Latino Ethnicity 39.3 39.5 2488 2492 39.2 40.6 0.1 3.7 1.4
White 36.1 35.7 2537 2539 61.8 62.0 -0.4 1.2 0.2
Two or More Races 4.4 4.6 2534 2536 60.4 61.3 0.2 2.8 1.0
Unidentified Race 3.8 4.1 2520 2518 52.9 51.5 0.3 -3.8 -1.4
LEP Status 15.2 14.4 2415 2418 14.0 14.0 -0.8 2.2 0.0
IDEA Indicator 11.4 11.8 2419 2423 15.6 16.5 0.4 3.4 0.8
Section 504 Status 2.0 2.0 2527 2537 50.1 53.8 0.0 9.1 3.6
Economic Disadvantage Status 55.6 56.0 2482 2485 37.7 38.7 0.4 2.6 1.0
Table 8.5: CHANGE IN MATHEMATICS STUDENT DEMOGRAPHICS
Group | Total Pct 2018 | Total Pct 2019 | Mean SS 2018 | Mean SS 2019 | Prof Pct 2018 | Prof Pct 2019 | Change Total Pct | Change Mean SS | Change Prof Pct
Total 100.0 100.0 2505 2507 40.2 40.8 0.0 1.2 0.5
Female 48.9 48.9 2507 2508 39.9 40.2 0.0 0.7 0.4
Male 51.1 51.1 2504 2506 40.6 41.3 0.0 1.8 0.7
American Indian or Alaska Native 0.9 0.9 2458 2458 21.7 21.6 0.0 -0.9 -0.1
Asian 8.0 8.0 2590 2593 71.3 72.0 0.0 3.3 0.7
Black/African American 6.7 6.7 2442 2444 18.8 19.2 0.0 0.9 0.5
Native Hawaiian or Pacific Islander 1.1 1.0 2477 2475 27.8 27.4 0.0 -1.9 -0.4
Hispanic/Latino Ethnicity 39.3 39.5 2474 2477 27.1 28.3 0.2 2.8 1.2
White 36.2 35.7 2531 2531 51.4 51.4 -0.5 0.2 0.1
Two or More Races 4.4 4.5 2525 2527 49.2 50.2 0.1 2.6 1.0
Unidentified Race 3.8 4.1 2506 2503 40.2 39.0 0.3 -4.0 -1.1
LEP Status 15.4 14.7 2419 2420 13.8 13.8 -0.7 0.7 0.0
IDEA Indicator 11.3 11.7 2407 2409 12.3 12.7 0.4 2.1 0.4
Section 504 Status 2.0 2.0 2523 2524 39.7 39.7 0.1 1.7 0.0
Economic Disadvantage Status 55.6 56.0 2471 2473 26.9 27.7 0.4 1.6 0.8

8.4 Change in Testing Times

Table 8.6 and Table 8.7 show changes in test start dates and test durations from 2017–18 to 2018–19. All member jurisdictions listed in Table 8.1 are included in Table 8.6. Table 8.7 excludes HI and OR because those members did not use the same blueprint in both administrations.

Table 8.6 shows that, on average, testing started slightly later in 2018–19 than in 2017–18 at all grades except high school. The changes were generally small (less than a day for grades 3 to 8) and may reflect year-to-year differences in school calendars or school closures due to weather and similar causes.

Table 8.7 shows that on average, students spent slightly more time taking the mathematics test and about the same amount of time taking the ELA/literacy test in 2018–19 compared to the previous administration. There was minor grade-to-grade variation in these changes.

Table 8.6: CHANGE IN TEST START DAY*
Subject | Grade | Mean 2018 | Min 2018 | Max 2018 | Mean 2019 | Min 2019 | Max 2019 | Diff. Between Means
ELA/Literacy 3 119.7 38 186 120.2 52 189 0.5
4 119.7 38 191 120.2 51 192 0.4
5 119.7 38 183 120.0 50 196 0.3
6 119.6 38 184 120.0 50 191 0.3
7 118.7 38 184 119.0 51 182 0.3
8 118.6 38 187 118.8 26 190 0.2
HS 111.8 -32 194 108.2 -34 180 -3.6
Mathematics 3 127.8 38 191 128.3 51 187 0.4
4 128.1 38 192 128.6 52 192 0.5
5 128.1 38 186 128.4 59 196 0.4
6 127.0 38 185 127.5 50 182 0.5
7 125.4 38 184 125.9 59 183 0.5
8 125.4 38 187 125.8 26 192 0.4
HS 118.7 -32 180 115.4 -34 180 -3.4
* Day = number of days before (-) or after December 31 of the preceding calendar year (December 31, 2017 for the 2017–18 administration; December 31, 2018 for the 2018–19 administration).
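For reference, the start-day metric can be reproduced with simple date arithmetic. The sketch below assumes the 2018–19 administration, whose reference date is December 31, 2018.

```python
from datetime import date

REFERENCE = date(2018, 12, 31)   # reference date for the 2018-19 administration

def start_day(test_start: date) -> int:
    # Days after (+) or before (-) the December 31 reference date.
    return (test_start - REFERENCE).days

print(start_day(date(2019, 4, 30)))   # 120: a typical spring start date
print(start_day(date(2018, 11, 29)))  # -32: a start date before the reference date
```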
Table 8.7: CHANGE IN TEST DURATION (IN MINUTES)
Subject | Grade | CAT 2018 | PT 2018 | Total 2018 | CAT 2019 | PT 2019 | Total 2019 | Change CAT | Change PT | Change Total
English Language Arts/Literacy 3 101.4 124.2 225.0 99.0 121.2 220.2 -2.4 -3.0 -4.8
4 106.2 132.6 238.8 103.2 131.4 234.6 -3.0 -1.2 -4.2
5 106.8 129.0 235.8 103.2 129.6 232.8 -3.6 0.6 -3.0
6 110.4 130.2 241.2 118.2 125.4 243.6 7.8 -4.8 2.4
7 102.0 114.6 216.6 102.0 117.6 219.6 0.0 3.0 3.0
8 98.4 114.0 212.4 103.2 112.8 216.0 4.8 -1.2 3.6
11 84.6 82.8 167.4 84.6 84.6 169.2 0.0 1.8 1.8
Average NA NA NA NA NA NA 0.5 -0.7 -0.2
Mathematics 3 85.8 45.6 131.4 87.0 46.8 133.8 1.2 1.2 2.4
4 88.8 45.0 133.2 94.2 47.4 141.0 5.4 2.4 7.8
5 91.2 67.8 158.4 97.8 67.2 165.0 6.6 -0.6 6.6
6 103.2 58.2 161.4 106.2 59.4 166.2 3.0 1.2 4.8
7 92.4 34.2 126.6 94.2 36.0 130.2 1.8 1.8 3.6
8 102.6 42.6 145.2 103.8 42.0 146.4 1.2 -0.6 1.2
11 75.6 30.6 106.2 76.2 34.8 111.0 0.6 4.2 4.8
Average NA NA NA NA NA NA 2.8 1.4 4.5

8.5 Change in the Item Pool

There are two important reasons why year-to-year changes in the item pool would not be expected to change measured student achievement on Smarter Balanced assessments. The first is that Smarter Balanced equates student measures across years through industry-standard scale construction and linking methods. These methods rely on item response theory (IRT) models, in which student achievement can be measured independently of item difficulty: apart from measurement error, a hard test and an easy test should produce the same measure for the same student.

The second reason is that, with computer adaptive testing (CAT), tests delivered to a student from two different item pools will have practically the same level of difficulty for that student, and therefore practically the same measurement precision. One would therefore not expect the average scale score or the achievement-level percentages for a population of students to vary with modest changes in the difficulty of the item pool.
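The role of item difficulty can be illustrated with a small simulation. The sketch below assumes a generic two-parameter logistic (2PL) item response function of the kind summarized by the IRTa and IRTb columns in Tables 8.8 and 8.9 (see Chapter 5 for the operational scoring models); it is a schematic illustration, not the Smarter Balanced scoring procedure. Maximum-likelihood ability estimates from an easier item set and a harder item set recover approximately the same ability for the same simulated student, differing only by measurement error.

```python
import numpy as np

def p_correct(theta, a, b):
    # Two-parameter logistic item response function: probability of a correct
    # response given ability theta, discrimination a (cf. IRTa), and difficulty b (cf. IRTb).
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def mle_theta(responses, a, b):
    # Brute-force maximum-likelihood estimate of theta over a grid.
    grid = np.linspace(-4, 4, 801)
    p = p_correct(grid[:, None], a, b)                  # (grid points, items)
    loglik = (responses * np.log(p) + (1 - responses) * np.log(1 - p)).sum(axis=1)
    return grid[np.argmax(loglik)]

rng = np.random.default_rng(7)
theta_true = 0.5                                         # one simulated student
for label, shift in [("easier item set", -0.5), ("harder item set", +0.5)]:
    a = np.full(40, 0.7)                                 # illustrative discriminations
    b = rng.normal(shift, 1.0, 40)                       # the two sets differ in difficulty
    responses = (rng.random(40) < p_correct(theta_true, a, b)).astype(float)
    print(label, "-> estimated theta:", round(float(mle_theta(responses, a, b)), 2))
```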

In sum, Smarter Balanced test users can be confident that year-to-year changes in the difficulty of the item pool will not substantially change measured student achievement. This confidence rests on the methods Smarter Balanced uses to construct and maintain the measurement scale (IRT) and to select items for students (CAT).

Nevertheless, year-to-year changes in the item pool may be of interest in their own right. The general public may not be prepared to accept the arguments given above for why changes in the item pool would not be expected to cause substantial changes in measures of student achievement. It may therefore be reassuring to policymakers and others to see evidence that the item pool is not changing drastically from year to year.

Table 8.8 and Table 8.9 show changes in the CAT and PT item pools from 2017–18 to 2018–19. Compared to the previous year (2017–18), there were fewer performance task (PT) items in ELA/literacy but more in mathematics, and slightly fewer CAT items in both subjects. In both subjects and all grades, the change in the average difficulty (IRTb) of the CAT item pool was extremely small. The difficulty of the PT item pools decreased substantially in most grades in both subjects. This change was intentional: PT pools have historically been somewhat too difficult relative to examinees’ achievement. Changes in the average item discrimination parameter (IRTa) were generally insubstantial, although the PT pools showed more change than the CAT pools, as one might expect given that the PT pools also showed larger changes in average item difficulty.

Table 8.10 and Table 8.11 show the overlap of the 2017–18 and 2018–19 item pools. With one exception (a single grade 4 mathematics item), there were no new CAT items in the 2018–19 pool; all other CAT items in the 2018–19 pool were also in the 2017–18 pool. Comparing Table 8.10 and Table 8.11 shows that the PT pools generally changed in two ways: (1) items from the 2018 pool were dropped, and (2) new items were added. The newly added items were generally easier than the average difficulty of the corresponding 2018 PT pool and, though not shown, easier than the dropped items. These contrasts explain the general tendency of the PT pools to be easier in 2019 than in 2018.
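As a sketch of how pool summaries of the kind shown in Tables 8.8 through 8.11 can be assembled from item metadata, the code below computes per-grade pool sizes, average IRT parameters, their year-to-year changes, and the split between carried-over and new items. The file names and column names are hypothetical.

```python
import pandas as pd

# Hypothetical item-metadata files: one row per item with columns
# item_id, subject, grade, component ("CAT" or "PT"), irt_a, irt_b.
pool_2018 = pd.read_csv("item_pool_2017_18.csv")
pool_2019 = pd.read_csv("item_pool_2018_19.csv")

def summarize(pool):
    # Pool size and average IRT parameters by subject, grade, and component,
    # the quantities summarized in Tables 8.8 and 8.9.
    return pool.groupby(["subject", "grade", "component"]).agg(
        n=("item_id", "size"), irt_a=("irt_a", "mean"), irt_b=("irt_b", "mean")
    )

change = summarize(pool_2019) - summarize(pool_2018)   # year-to-year differences

# Overlap of the two pools (Tables 8.10 and 8.11): carried-over vs. new items.
common_ids = set(pool_2018["item_id"]) & set(pool_2019["item_id"])
new_2019 = pool_2019[~pool_2019["item_id"].isin(common_ids)]
print(change.round(3).head())
print(f"{len(common_ids)} common items; {len(new_2019)} new items in 2018-19")
```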

Table 8.8: CHANGE IN THE CAT ITEM POOL
Subject | Grade | N 2018 | IRTa 2018 | IRTb 2018 | N 2019 | IRTa 2019 | IRTb 2019 | Change N | Change IRTa | Change IRTb
ELA 3 940 0.664 -0.434 867 0.662 -0.526 -73 -0.002 -0.092
4 898 0.593 0.123 823 0.586 0.044 -75 -0.007 -0.079
5 881 0.602 0.502 787 0.598 0.409 -94 -0.004 -0.093
6 819 0.555 0.969 811 0.556 0.957 -8 0.001 -0.012
7 744 0.536 1.250 735 0.535 1.252 -9 -0.001 0.002
8 819 0.537 1.278 815 0.537 1.286 -4 0.000 0.008
11 2,631 0.491 1.771 2,612 0.491 1.764 -19 0.000 -0.007
MATH 3 1,269 0.827 -0.771 1,234 0.827 -0.790 -35 0.000 -0.019
4 1,341 0.816 -0.089 1,325 0.818 -0.094 -16 0.002 -0.005
5 1,291 0.759 0.573 1,268 0.759 0.565 -23 0.000 -0.008
6 1,187 0.689 1.153 1,147 0.690 1.140 -40 0.001 -0.013
7 1,085 0.717 1.877 1,047 0.716 1.871 -38 -0.001 -0.006
8 972 0.581 2.315 915 0.574 2.284 -57 -0.007 -0.031
11 2,736 0.581 2.585 2,610 0.577 2.568 -126 -0.004 -0.017
Table 8.9: CHANGE IN THE PT ITEM POOL
Subject | Grade | N 2018 | IRTa 2018 | IRTb 2018 | N 2019 | IRTa 2019 | IRTb 2019 | Change N | Change IRTa | Change IRTb
ELA 3 48 0.702 0.394 38 0.668 -0.022 -10 -0.034 -0.416
4 63 0.635 0.586 44 0.624 0.160 -19 -0.011 -0.426
5 73 0.690 0.937 50 0.676 0.455 -23 -0.014 -0.482
6 47 0.839 1.131 38 0.796 0.724 -9 -0.043 -0.407
7 60 0.772 1.349 48 0.800 1.007 -12 0.028 -0.342
8 68 0.694 1.511 50 0.689 1.163 -18 -0.005 -0.348
11 80 0.593 1.972 56 0.597 1.518 -24 0.004 -0.454
MATH 3 80 0.890 -0.521 95 0.887 -0.660 15 -0.003 -0.139
4 94 0.856 -0.058 116 0.849 -0.128 22 -0.007 -0.070
5 85 0.758 1.012 105 0.744 0.784 20 -0.014 -0.228
6 71 0.734 0.789 92 0.705 0.779 21 -0.029 -0.010
7 85 0.893 1.563 92 0.843 1.444 7 -0.050 -0.119
8 58 0.878 1.809 79 0.780 1.567 21 -0.098 -0.242
11 61 0.662 2.674 70 0.629 2.662 9 -0.033 -0.012
Table 8.10: OVERLAP OF CAT ITEM POOLS
Subject | Grade | N Common | IRTa Common | IRTb Common | N New | IRTa New | IRTb New
ELA 3 867 0.662 -0.526 NA NA NA
4 823 0.586 0.044 NA NA NA
5 787 0.598 0.409 NA NA NA
6 811 0.556 0.957 NA NA NA
7 735 0.535 1.252 NA NA NA
8 815 0.537 1.286 NA NA NA
11 2,612 0.491 1.764 NA NA NA
MATH 3 1,234 0.827 -0.790 NA NA NA
4 1,324 0.817 -0.094 1 0.934 0.227
5 1,268 0.759 0.565 NA NA NA
6 1,147 0.690 1.140 NA NA NA
7 1,047 0.716 1.871 NA NA NA
8 915 0.574 2.284 NA NA NA
11 2,610 0.577 2.568 NA NA NA
Table 8.11: OVERLAP OF PT ITEM POOLS
Subject | Grade | N Common | IRTa Common | IRTb Common | N New | IRTa New | IRTb New
ELA 3 28 0.728 -0.011 10 0.500 -0.054
4 36 0.651 0.158 8 0.502 0.172
5 40 0.733 0.409 10 0.449 0.640
6 28 0.904 0.714 10 0.494 0.751
7 38 0.852 0.981 10 0.605 1.107
8 40 0.757 1.013 10 0.414 1.759
11 46 0.604 1.579 10 0.564 1.237
MATH 3 75 0.887 -0.546 20 0.883 -1.089
4 94 0.856 -0.058 22 0.818 -0.427
5 85 0.758 1.012 20 0.686 -0.188
6 70 0.733 0.802 22 0.615 0.704
7 77 0.879 1.519 15 0.658 1.057
8 57 0.874 1.819 22 0.536 0.914
11 54 0.641 2.773 16 0.592 2.288