Student Growth Percentiles (SGP) Data Sets

Student Growth Percentiles (SGP) is a method for analyzing normative student achievement data to assess student progress over time. Student growth models use statistical methods based on conditional density to describe the relationships between students’ assessment scores over time and to estimate percentile growth projections/trajectories from these relationships. These percentiles report a student’s performance relative to that of their academic peers. SGP is an alternative to traditional achievement reports and can be used by teachers, schools, districts, and families to monitor a student’s academic progress over time.

SGPs are based on up to two years of historical MCAS data and compare a student’s current assessment score with the scores of their academic peers in the same grade across the state. Academic peers are defined as all students statewide in the same grade and content area who have similar assessment score histories. Demographic characteristics, such as gender and socioeconomic status, and educational programs (e.g., sheltered English immersion and special education) do not determine who counts as an academic peer; they serve as groupings for summarizing results.
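
To make the idea of a peer-relative percentile concrete, here is a deliberately simplified sketch in plain Python. It is not the method the SGP package uses: it treats "academic peers" as students with exactly the same single prior score and reports the share of those peers who scored lower on the current test. The field names (prior, current) and all scores are made up for illustration.

```python
def peer_percentile(student, others):
    """Percent of peers (same prior score) whose current score is below the student's.

    Simplifying assumption: peers are matched on one identical prior score.
    """
    peers = [o for o in others if o["prior"] == student["prior"]]
    if not peers:
        return None  # no comparable peers, so no percentile can be reported
    below = sum(1 for p in peers if p["current"] < student["current"])
    return round(100 * below / len(peers))


# Hypothetical statewide records: prior-year and current-year scores.
others = [
    {"prior": 400, "current": 390},
    {"prior": 400, "current": 410},
    {"prior": 400, "current": 420},
    {"prior": 400, "current": 450},
    {"prior": 450, "current": 470},
]

student = {"prior": 400, "current": 430}
growth = peer_percentile(student, others)
print(growth)
```

In this toy data the student outscores three of the four peers who started from the same prior score. A real analysis uses longer score histories and a statistical model rather than exact matching.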

The SGPdata package provides 4 exemplar data sets for running SGP analyses. The first, sgpData, specifies data in the WIDE format used by lower-level SGP functions such as studentGrowthPercentiles and studentGrowthProjections. Two more, sgpData_LONG and sgptData_LONG, specify data in the LONG format used by higher-level wrapper functions such as abcSGP, prepareSGP, and analyzeSGP. The fourth, sgpData_INSTRUCTOR_NUMBER, is a student-instructor lookup table that accompanies the LONG-format data.
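
The difference between the WIDE and LONG layouts can be illustrated with a small sketch in plain Python. It does not use the SGP package, and the column names (ID, SS_2021, YEAR, SCALE_SCORE) and scores are assumptions chosen for illustration, not the actual layout of sgpData. WIDE data has one row per student with one score column per year; LONG data has one row per student per year.

```python
# Hypothetical WIDE-format rows: one row per student, one score column per year.
wide_rows = [
    {"ID": 1, "SS_2021": 410, "SS_2022": 425, "SS_2023": 440},
    {"ID": 2, "SS_2021": 380, "SS_2022": None, "SS_2023": 395},
]


def wide_to_long(rows, prefix="SS_"):
    """Turn one-row-per-student records into one-row-per-student-per-year records."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if not column.startswith(prefix) or value is None:
                continue  # skip the ID column and missing scores
            long_rows.append(
                {
                    "ID": row["ID"],
                    "YEAR": int(column[len(prefix):]),
                    "SCALE_SCORE": value,
                }
            )
    return long_rows


long_rows = wide_to_long(wide_rows)
for record in long_rows:
    print(record)
```

The LONG layout drops the missing 2022 score for student 2 instead of carrying an empty cell. This is one reason panel data with irregular testing histories is often easier to manage in LONG form.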

In addition, SGPdata includes a user guide and technical documents to assist users in understanding the SGP process. Additional training resources on the SGP model are also available from OSPI at the Student Growth School and District Resources webpage.

To run SGP analyses with the SGPdata package, you will need a computer with R installed. R is a free, open-source statistical programming language and environment that runs on many operating systems. You can find additional information about R and resources for getting started at https://cran.r-project.org/.

The sgptData_LONG data set is an anonymized panel data set that contains assessment data in the LONG format for 8 windows (3 windows annually) across 3 content areas. It includes teacher identifiers and demographic variables so that the summarizeSGP function can produce teacher-level aggregates. The sgpData_INSTRUCTOR_NUMBER data set additionally contains an instructor lookup table that the prepareSGP function uses when creating instructor-level SGP aggregates.

The instructor number recorded in sgpData_INSTRUCTOR_NUMBER is an integer that identifies the instructor associated with a student’s test record. An instructor can be associated with more than one student, and a student can be assigned to multiple instructors over the course of a testing year. A student may therefore have a different instructor number for each test taken during the school year. Because this variable links each student’s data to the correct teacher, it is a key for the SGP analysis process. Without it, instructor-level results can be inconsistent and may not reflect the performance of the students a teacher actually taught.
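
The role of the instructor number as a key can be sketched in plain Python. This is an illustration under stated assumptions, not the SGP package's implementation: the record layout, the column names (ID, CONTENT_AREA, YEAR, SGP, INSTRUCTOR_NUMBER), and all values are hypothetical. The median is used here only as one common way to summarize a group of SGPs.

```python
from collections import defaultdict
from statistics import median

# Hypothetical student test records with an already-computed SGP.
records = [
    {"ID": 1, "CONTENT_AREA": "MATHEMATICS", "YEAR": 2023, "SGP": 60},
    {"ID": 2, "CONTENT_AREA": "MATHEMATICS", "YEAR": 2023, "SGP": 40},
    {"ID": 3, "CONTENT_AREA": "MATHEMATICS", "YEAR": 2023, "SGP": 75},
]

# Hypothetical lookup table: one row per (test record, instructor) pair.
# Student 2 is linked to two instructors for the same test record.
lookup = [
    {"ID": 1, "CONTENT_AREA": "MATHEMATICS", "YEAR": 2023, "INSTRUCTOR_NUMBER": 101},
    {"ID": 2, "CONTENT_AREA": "MATHEMATICS", "YEAR": 2023, "INSTRUCTOR_NUMBER": 101},
    {"ID": 2, "CONTENT_AREA": "MATHEMATICS", "YEAR": 2023, "INSTRUCTOR_NUMBER": 102},
    {"ID": 3, "CONTENT_AREA": "MATHEMATICS", "YEAR": 2023, "INSTRUCTOR_NUMBER": 102},
]


def record_key(row):
    """Identify a test record by student, content area, and year."""
    return (row["ID"], row["CONTENT_AREA"], row["YEAR"])


def median_sgp_by_instructor(records, lookup):
    """Join test records to instructors via the lookup, then summarize per instructor."""
    sgp_by_key = {record_key(r): r["SGP"] for r in records}
    sgps = defaultdict(list)
    for link in lookup:
        key = record_key(link)
        if key in sgp_by_key:  # ignore lookup rows with no matching test record
            sgps[link["INSTRUCTOR_NUMBER"]].append(sgp_by_key[key])
    return {instructor: median(values) for instructor, values in sgps.items()}


result = median_sgp_by_instructor(records, lookup)
print(result)
```

Because student 2 appears under both instructors, that student's SGP counts toward both summaries. If a lookup row is missing or wrong, the same student's growth is silently credited to the wrong instructor or to none.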