Introduction and Overview

Technical Report Approach

This report provides evidence in support of the validity and reliability of the Smarter Balanced interim assessments for the 2018-19 school year. The interim assessments are routinely updated by replacing items and adding tests, but updates made after the 2018-19 school year are not represented in this report. In particular, focused interim assessment blocks (FIABs) and Tools for Teachers became available for the 2019-20 school year and are not included here. Present-tense descriptions of the interim assessments in this report should not be taken to apply to interim assessments after 2018-19. There are two types of interim assessments: interim comprehensive assessments (ICAs) and interim assessment blocks (IABs). Information about the overall system, which includes the summative assessment and Digital Library1, is provided for context.

At the outset, it should be recognized that demonstrating validity and reliability is an ongoing process. The validity and reliability evidence provided here includes evidence from the initial pilot and field-test phases as well as from more recent operational assessments. Members do not provide interim assessment response data or scores to the Consortium for analysis. Consequently, much of the evidence in this report focuses on the development of test items and the characteristics of test forms.

Interim tests are not secure. They may be administered in a standard manner, or teachers may use interim items or tasks as a basis for class discussion or individual feedback. The interim assessments may be administered to the same students several times during the year, and teachers may administer individual assessments at any grade level (e.g., a teacher may administer a grade 4 interim assessment to students in grade 3). The reliability and validity evidence provided in this technical report holds only for tests administered for the first time in a standardized setting, assuming that students have not been exposed to the items and that each student does his or her own work. When teachers use interim items or tasks as a basis for class discussion or individual feedback, student scores may not have the properties described in this report.

Smarter Balanced provides a customizable Online Summative Assessment Test Administration Manual (TAM) (Smarter Balanced, 2017j) for summative assessments that may also be used for standardized interim assessment administrations. Some states have developed their own interim assessment test administration manuals; members can usually find the appropriate customized version on their state’s assessment portal. Beginning with the 2019-20 administration, Smarter Balanced will provide a customizable Interim Assessment Guide that members may use to create or update their state-specific interim assessment TAMs.

The Standards for Educational and Psychological Testing (American Educational Research Association [AERA], American Psychological Association [APA], & National Council on Measurement in Education [NCME], 2014), hereafter referred to as the Standards, was used as the foundation for developing the necessary validity evidence. This information is necessary for understanding the degree to which the Consortium is meeting its goals and, in some cases, what further work remains to improve the system as it evolves operationally.

Overview of the Smarter Balanced Assessment System

The Smarter Balanced Assessment Consortium’s (Smarter Balanced) assessment system includes a set of balanced components designed to meet diverse student needs for all Consortium members. This system provides valid, reliable, and fair assessments of the deep disciplinary understanding and higher-order thinking skills increasingly demanded by a knowledge-based global economy. The system is based on the belief that assessment must support ongoing improvements in instruction and learning experiences for students that lead to outcomes valued by all stakeholders. Smarter Balanced supports the goals of its members who seek to ensure that all students leave high school prepared for postsecondary success in college or a career through a planned sequence of educational experiences and opportunities. The system was grounded in the strong foundational assessments, policies, and procedures of its members, including supports and resources from institutions of higher education (IHEs) and workplace representatives. The Consortium expanded on these proven successes to create a high-quality, balanced assessment system based on the Common Core State Standards (CCSS) in English language arts/literacy (ELA/literacy) and mathematics. This report focuses on both interim assessment types—the interim comprehensive assessments (ICAs) and the interim assessment blocks (IABs). Information about the overall system is provided for context.

Smarter Balanced staff provides expert guidance and facilitates member-driven decisions regarding the maintenance and enhancement of the system as required to fulfill its mission to improve teaching and learning. Smarter Balanced members retain flexibility regarding how to customize the system so that it may best be used as part of their approach to improving their local educational systems. The Smarter Balanced assessment system strategically uses a variety of item types, including performance tasks, to sample the full range of the CCSS. The Consortium also deploys essential accessibility resources that are embedded in the test to ensure fair and accurate assessment of all students, including students with disabilities, English language learners, and low- and high-performing students. The Smarter Balanced system includes the following components:

  • Summative assessments that determine students’ progress toward college and career readiness in ELA/literacy and mathematics. The summative assessments are given at the end of the school year and consist of two parts: a computer adaptive test and a performance task. These secure summative assessments incorporate a variety of item types, including technology-enhanced items, constructed-response items, and performance tasks, each deliberately designed to measure specific content. The assessments include writing at every grade and ask students to solve multi-step, real-world problems in mathematics.
  • Interim assessments that allow teachers to check student progress throughout the year, providing them with information they can use to improve instruction and help students meet the challenge of college and career readiness standards. These tools are used at the discretion of schools and districts. Teachers can employ them to check students’ progress in mastering specific concepts at strategic points during the school year. There are two types of interim assessments: 1) interim comprehensive assessments (ICAs), which test the same content as the summative assessments and report scores on the same scale; and 2) interim assessment blocks (IABs), which focus on smaller sets of related concepts and provide more detailed information for instructional purposes. The interim assessments incorporate items that are developed alongside, and according to the same processes as, the items in the summative assessment; items are not designated for interim or summative use during item development. The interim assessments provide more flexible administration options to assist educators in determining what students know and can do in relation to the CCSS. In contrast to the summative assessment, the interim assessments are available only in fixed-form format.
  • A Digital Library2 that is an online collection of high-quality instructional and professional learning resources contributed by educators for educators. These resources are aligned with the intent of the CCSS and help educators implement the formative assessment process to improve teaching and learning. Educators can use the materials to engage in professional learning communities, differentiate instruction for diverse learners, engage students in their own learning, improve assessment literacy, and design professional development opportunities. The Digital Library also allows educators to comment on and rate resources and share their expertise with colleagues across the country in online discussion forums.
  • Open-source technology that members can use to deliver assessments and report results to educators, parents, and students.
  • Cross-member communications to inform stakeholders about Smarter Balanced activities and to ensure a common focus on the goal of college and career readiness for all students.

The innovative and efficient use of technology serves as a central feature of this balanced assessment system. Some central notions concerning technology use are:

  • the Smarter Balanced system uses computer adaptive testing to increase the precision and efficiency of the summative tests;
  • the expanded use of technology enables the development of innovative and realistic item types that measure student achievement across a wide performance continuum, providing opportunities for educator and administrator professional development and local capacity building; and
  • an interoperable electronic platform leverages cross-member state resources. Through this platform, Smarter Balanced delivers assessments and produces standardized reports that are cost-effective, timely, and useful for a range of audiences in tracking and analyzing student progress toward college and career readiness at the individual student, demographic group, classroom, school, district, and state levels.
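The core idea behind computer adaptive testing is that each item is chosen to be maximally informative about the examinee's current ability estimate. The following is a minimal, hypothetical sketch of that idea under a Rasch (one-parameter logistic) model; it is illustrative only and does not represent the Smarter Balanced adaptive algorithm, item bank, or reporting scale.

```python
import math
import random

def prob_correct(theta, b):
    """Rasch model: probability of a correct response at ability theta, item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def information(theta, b):
    """Fisher information an item of difficulty b provides at ability theta."""
    p = prob_correct(theta, b)
    return p * (1.0 - p)

def run_cat(item_bank, true_theta, n_items=10, seed=0):
    """Administer n_items adaptively to a simulated examinee; return the ability estimate."""
    rng = random.Random(seed)
    available = list(item_bank)
    theta = 0.0  # start at the center of the ability scale
    responses = []
    for _ in range(n_items):
        # Select the unused item that is most informative at the current estimate.
        b = max(available, key=lambda d: information(theta, d))
        available.remove(b)
        correct = rng.random() < prob_correct(true_theta, b)
        responses.append((b, correct))
        # Newton-Raphson steps toward the maximum-likelihood ability estimate,
        # clamped to [-4, 4] so perfect score patterns cannot diverge.
        for _ in range(25):
            grad = sum(x - prob_correct(theta, d) for d, x in responses)
            hess = -sum(information(theta, d) for d, _ in responses)
            theta = max(-4.0, min(4.0, theta - grad / hess))
    return theta

item_bank = [i / 10.0 - 2.0 for i in range(41)]  # invented difficulties from -2.0 to +2.0
estimate = run_cat(item_bank, true_theta=0.5)
```

Because the next item always targets the current estimate, an adaptive test concentrates measurement where the examinee actually performs, which is the source of the precision and efficiency gains noted above.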

In summary, the Smarter Balanced learning and assessment system is grounded in a sound Theory of Action. This system promotes research-supported classroom practice and incorporates a balanced set of technology-enabled tools, innovative assessments, and classroom support materials intended to work coherently to facilitate teaching and learning.

Overview and Background of the Smarter Balanced Theory of Action

The Smarter Balanced Assessment Consortium supports the development and implementation of learning and assessment systems that reshape education in member states to improve student outcomes. Through expanded use of technology and targeted professional development, the Theory of Action calls for the integration of learning and assessment systems, leading to more informed decision-making and higher-quality instruction and ultimately increasing the number of students who are well prepared for college and careers.

The ultimate goal of Smarter Balanced is to ensure that all students leave high school prepared for postsecondary success in college or a career as a result of increased student learning and improved teaching. This approach suggests that enhanced learning will result when high-quality assessments support ongoing improvements in instruction and learning. A quality assessment system strategically “balances” summative, interim, and formative components (Darling-Hammond & Pecheone, 2010). An assessment system must provide valid measurement across the full range of performance on common academic content, including assessment of deep disciplinary understanding and higher-order thinking skills increasingly demanded by a knowledge-based economy.

Six Principles Underlying the Smarter Balanced Theory of Action

The Smarter Balanced assessment system is guided by a set of six principles shared by assessment systems in high-achieving nations and in some high-achieving states in the U.S.

  1. Assessments are grounded in a thoughtful standards-based curriculum and are managed as part of an integrated system of standards, curriculum, assessment, instruction, and teacher development. Curriculum and assessments are organized around a well-defined set of learning progressions along multiple dimensions within subject areas. Formative assessment processes and tools and interim assessments are conceptualized in tandem with summative assessments, all of which are linked to the Common Core State Standards (CCSS) and supported by a unified technology platform.

  2. Assessments produce evidence of student performance on challenging tasks that represent the CCSS. Instruction and assessments seek to teach and evaluate knowledge and skills that generalize and can transfer to higher education and multiple work domains. These assessments emphasize deep knowledge of core concepts and ideas within and across the disciplines—along with analysis, synthesis, problem-solving, communication, and critical thinking—thereby requiring a focus on complex performances and specific concepts, facts, and skills.

  3. Teachers are integrally involved in the development and scoring of assessments. While many assessment components are efficiently scored with computer assistance, teachers must also be involved in the formative and summative assessment systems so that they understand and teach to the full intent of the standards while becoming more skilled in their own classroom assessment practices.

  4. The development and implementation of the assessment system is a state-led effort with a transparent and inclusive governance structure. Assessments are structured to improve teaching and learning. Assessments are designed to develop an understanding of the learning standards, of what constitutes high-quality work, of the degree to which students are approaching college and career readiness, and of what is needed for further student learning.

  5. Assessment, reporting, and accountability systems provide useful information on multiple measures that is educative for all stakeholders. Reporting of assessment results is timely and meaningful in order to guide curriculum and professional development decisions. Results can offer specific information about areas of performance so that teachers can follow up with targeted instruction, students can better target their own efforts, and administrators and policymakers can fully understand what students know and can do.

  6. Design and implementation strategies adhere to established professional standards. The development of an integrated, balanced assessment system is an enormous undertaking, requiring commitment to established quality standards in order for the system to be credible, fair, and technically sound. Smarter Balanced continues to be committed to developing an assessment system that meets critical elements required by U.S. DOE Peer Review Guidance, relying heavily on the Standards as its core resource for quality design.

The primary rationale of the Smarter Balanced assessments is that these six principles can interact to improve the intended student outcomes (i.e., college and career readiness).

Purpose of the Smarter Balanced Assessment System

The Smarter Balanced purpose statements are organized into three categories: (a) summative assessments, (b) interim assessments, and (c) formative assessments and Digital Library3 tools and resources. This report provides information about the interim assessments. The purposes of the summative and formative assessments and Digital Library resources are also stated in this section to provide context for interim assessments as a component of the assessment system.

Summative Assessments

The purposes of the Smarter Balanced summative assessments are to provide valid, reliable, and fair information about:

  • students’ ELA/literacy and mathematics achievement with respect to the CCSS in grades 3 to 8 and high school;

  • whether students prior to grade 11 have demonstrated sufficient academic proficiency in ELA/literacy and mathematics to be on track for achieving college readiness;

  • whether grade 11 students have sufficient academic proficiency in ELA/literacy and mathematics to be ready to take credit-bearing, transferable college courses after completing their high school coursework;

  • students’ annual progress toward college and career readiness in ELA/literacy and mathematics;

  • how instruction can be improved at the classroom, school, district, and state levels;

  • students’ ELA/literacy and mathematics proficiencies for federal accountability purposes and potentially for state and local accountability systems; and

  • student achievement in ELA/literacy and mathematics that is equitable for all students and targeted student groups.

Interim Assessments

The purposes of the Smarter Balanced interim assessments are to provide valid, reliable, and fair information about:

  • student progress toward the mastery of the skills in ELA/literacy and mathematics measured by the summative assessment;

  • student performance at the claim or cluster of assessment targets so teachers and administrators can better measure students’ performance against end-of-year expectations and adjust instruction accordingly;

  • individual and group (e.g., school, district) performance at the claim level in ELA/literacy and mathematics to determine whether teaching and learning are on target;

  • teacher-moderated scoring of student responses to constructed-response items as a professional development vehicle to enhance teacher capacity to evaluate student work aligned to the standards; and

  • student progress toward the mastery of skills measured in ELA/literacy and mathematics across all students and targeted student groups.

Formative Assessments and Digital Library Resources

The purposes of the Smarter Balanced formative assessment resources are to provide tools and resources to:

  • improve teaching and learning;

  • help teachers monitor their students’ progress throughout the school year;

  • illustrate how teachers and other educators can use assessment data to engage students in monitoring their own learning;

  • help teachers and other educators align instruction, curricula, and assessments to the learning standards and end-of-year expectations;

  • assist teachers and other educators in using the summative and interim assessments to improve instruction at the individual student, classroom, and school levels; and

  • offer professional development and resources for how to use assessment information to improve instruction and decision-making in the classroom.

Overview of Report Chapters

The structure of this interim assessment technical report follows that of the summative technical report, except that it contains no chapter on scale scores and norms (Chapter 5 in the summative technical report). The chapters shown below are also found in the summative technical report (though not under the same numbers) and include the same essential elements prescribed by the Standards (AERA, APA, & NCME, 2014):

Chapter 1: Validity
Chapter 2: Reliability, Precision, and Errors of Measurement
Chapter 3: Test Fairness
Chapter 4: Test Design
Chapter 5: Test Administration
Chapter 6: Reporting and Interpretation

Brief synopses of the chapters contained in this interim assessment technical report are given below to guide further review. At the suggestion of our members, we have written practical descriptions of the purpose of the evidence in each chapter to provide context for teachers, parents, and other stakeholders.

Chapter 1: Validity

In a sense, all of the information in this technical report provides validity evidence. Chapter 1 is special in that it provides information about test purposes and the overall approach to showing how scores are appropriate for those purposes. The information in this chapter answers questions such as:

  • For what purpose was the interim assessment designed to be used?
  • What evidence shows that test scores are appropriate for these uses?
  • What are the intended test score interpretations for specific uses?

Evidence bearing on these questions does not change with each administration or testing cycle. Therefore, the validity information presented in Chapter 1 repeats and supplements the validity information in Chapter 1 of previous technical reports.

Chapter 2: Reliability, Precision, and Errors of Measurement

The degree of accuracy and precision of scores contributes to evidence about appropriate interpretations and uses of test scores. Decisions must be made with full understanding of measurement error and reliability. Chapter 2 presents information about how the test performs in terms of measurement precision, reliability, classification consistency, and other technical criteria. The information is based on simulation studies and operational test data from the item pool and school year identified in the title of this report. Information presented in this chapter can answer questions such as:

  • How accurate and reliable are Smarter Balanced interim test scores?
  • Are Smarter Balanced test scores equally accurate and reliable for all students?
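As a concrete illustration of the quantities this chapter discusses, the sketch below estimates internal-consistency reliability (Cronbach's alpha) and a raw-score standard error of measurement (SEM) from simulated item responses. The data and model are invented for demonstration; this is not the procedure, item pool, or IRT scale actually used for the Smarter Balanced analyses.

```python
import numpy as np

rng = np.random.default_rng(0)
n_students, n_items = 200, 20

# Simulate dichotomous item scores driven by a latent ability, so that
# items are positively correlated (as real test items would be).
theta = rng.normal(0.0, 1.0, size=(n_students, 1))   # latent abilities
b = rng.normal(0.0, 1.0, size=(1, n_items))          # item difficulties
p = 1.0 / (1.0 + np.exp(-(theta - b)))               # response probabilities
scores = (rng.random((n_students, n_items)) < p).astype(float)

# Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / total-score variance)
k = n_items
item_vars = scores.var(axis=0, ddof=1)
total_var = scores.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Classical SEM: SD of total scores times sqrt(1 - reliability)
sem = np.sqrt(total_var) * np.sqrt(1.0 - alpha)
```

The SEM puts reliability in score units: a band of roughly plus or minus one SEM around an observed score conveys the measurement imprecision that users of interim scores should keep in mind.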

Chapter 3: Test Fairness

Test fairness concerns whether test scores can be interpreted in the same way for all students regardless of race, gender, special needs, and other characteristics. Evidence for test fairness includes documentation of industry-standard procedures for item development and review, appropriate use of professional judgment (e.g., bias review of items), and statistical procedures for detecting potential bias in test items. Information presented in Chapter 3 can answer questions such as:

  • How were test questions and tasks developed to ensure fairness to all students?
  • How is the test administered so that each student can demonstrate their skills?
  • How does one know that the test is fair to all students?
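The statistical side of that item screening can be illustrated with one widely used differential item functioning (DIF) statistic, the Mantel-Haenszel odds ratio, which compares the odds of a correct response for reference and focal groups after matching on total score. The counts and flagging threshold below are invented; readers should consult the chapter itself for the procedures Smarter Balanced actually uses.

```python
import math

def mantel_haenszel_odds_ratio(strata):
    """strata: list of 2x2 tables (a, b, c, d), one per matched score level, where
    a = reference-group correct, b = reference-group incorrect,
    c = focal-group correct,     d = focal-group incorrect."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Made-up counts at three total-score levels for a single item.
strata = [(40, 10, 35, 15), (30, 20, 28, 22), (15, 35, 12, 38)]
or_mh = mantel_haenszel_odds_ratio(strata)

# ETS delta-scale transformation; roughly, |delta| >= 1.5 (with statistical
# significance) flags the most severe ("C") DIF category.
delta_mh = -2.35 * math.log(or_mh)
```

An odds ratio near 1 (delta near 0) indicates that matched examinees in the two groups perform similarly on the item; flagged items go to bias review panels rather than being removed automatically.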

Chapter 4: Test Design

Test design is predominantly focused on the content validity of the test. Tasks and items must represent the domain of knowledge and skill as intended. For Smarter Balanced assessments, test design includes the relationship of claims and targets to the underlying CCSS, item development, test blueprints, and scoring procedures. Information in Chapter 4 can answer questions such as:

  • What’s on the test? Is it consistent with stated test purposes?
  • Does each student get a set of questions that fully represents the content domain?

Chapter 5: Test Administration

Part of test validity rests on the assumption that assessments are administered in a standard manner. Unlike the summative assessment, which is administered only in a standardized fashion, interim assessments can be administered in either a standardized or a non-standardized fashion. For standardized administration, the Consortium provides a common administration template that members may customize, within limits, for their specific use. Chapter 5 describes the customizable Smarter Balanced Online Test Administration Manual. The information in this chapter can answer questions such as:

  • What are the conditions for test administration to assure that every student was afforded the same chance for success?
  • How was the test administered to allow for accessibility for all students?

Chapter 6: Reporting and Interpretation

Reports based on test scores are among the most public-facing features of an assessment program. They must be useful and accurate, supporting the decisions and purposes for which the assessment was designed while discouraging inappropriate conclusions and comparisons. Chapter 6 provides examples of the Smarter Balanced suite of reports and interpretive information and discusses intended uses of report information. Information in this chapter can answer questions such as:

  • What information do Smarter Balanced reports of the interim assessments contain?
  • What do scores mean?
  • How can the reports best be used by teachers and parents?

Acknowledgments

Below is a partial list of individuals and groups that contributed time and expertise to the work of the Consortium.

Technical Advisory Committee

The Technical Advisory Committee (TAC) provides guidance on technical assessment matters pertaining to validity and reliability, accuracy, and fairness. Members of the TAC are highly regarded national experts who have been widely published in their fields. Areas of expertise include: assessment design, computer adaptive testing (CAT), assessment accommodations, uses of tests, mathematics, and English language arts/literacy. The following is a list of committee members as of January 1, 2019.

  • Randy Bennett, Ph.D. - ETS
  • Derek C. Briggs, Ph.D. - University of Colorado
  • Susan M. Brookhart, Ph.D. - Duquesne University
  • Gregory J. Cizek, Ph.D. - University of North Carolina
  • Shelbi Cole, Ph.D. - Student Achievement Partners
  • David T. Conley, Ph.D. - University of Oregon
  • Brian Gong, Ph.D. - The Center for Assessment
  • Edward Haertel, Ph.D. - Stanford University
  • Gerunda Hughes, Ph.D. - Howard University
  • G. Gage Kingsbury, Ph.D. - Psychometric Consultant
  • James W. Pellegrino, Ph.D. - University of Illinois, Chicago
  • Barbara Plake, Ph.D. - University of Nebraska, Lincoln
  • W. James Popham, Ph.D. - UCLA, Emeritus
  • Guillermo Solano-Flores, Ph.D. - Stanford University
  • Martha Thurlow, Ph.D. - University of Minnesota/NCEO
  • Sheila Valencia, Ph.D. - University of Washington

Students with Disabilities Advisory Committee

The Students with Disabilities Advisory Committee is composed of national experts in learning disabilities, assistive technology, and accessibility and accommodations policy. This committee provides feedback to Smarter Balanced staff, workgroups, and contractors to ensure that the assessments provide valid, reliable, and fair measures of achievement and growth for students with disabilities. The following is a list of committee members.

  • Donald D. Deshler, Ph.D.
  • Barbara Ehren, Ed.D.
  • Cheryl Kamei-Hannan, Ph.D.
  • Jacqueline F. Kearns, Ed.D.
  • Susan Rose, Ph.D.
  • Jim Sandstrum
  • Ann C. Schulte, Ph.D.
  • Richard Simpson, Ed.D.
  • Stephen W. Smith, Ph.D.
  • Martha L. Thurlow, Ph.D.

Performance and Practice Committee

The Performance and Practice Committee is composed of nearly 20 educators from around the nation who were nominated by state chiefs. This committee assesses how effectively the Smarter Balanced assessments meet their designed purposes and works to deepen overall stakeholder investment. The following is a list of committee members and their member affiliation.

  • Kandi Greaves (Vermont)
  • Mary Jo Faust (Delaware)
  • Shannon Mashinchi (Oregon)
  • Susan Green (California)
  • Steve Seal (California)
  • Tanya Golden (California)
  • Crista Anderson (Montana)
  • Melissa Speetjens (Hawaii)
  • Mike Nelson (Idaho)
  • Abby Olinger Quint (Connecticut)
  • Michelle Center (California)
  • Todd Bloomquist (Oregon)
  • Jim O’Neill (Montana)
  • Jen Paul (Michigan)
  • Toni Wheeler (Washington)
  • Joe Willhoft (Consultant)
  • Susan Brookhart (Technical Advisory Committee)

  1. The Digital Library was retired May 2020. Tools for Teachers was launched in June 2020 and is the new home for formative assessment instructional supports.

  2. See note 1.

  3. See note 1.