Date of Award

Spring 4-18-2024

Degree Type

Thesis

Degree Name

MS Data Science

Department

Mathematics and Computer Science

Advisor

Manfred Minimair, PhD

Committee Member

Nathan Kahl, PhD

Committee Member

Kobi Abayomi, PhD

Keywords

regression models, reading complexity, readability, reading scores, excerpt

Abstract

Reading is an essential skill for academic success. To help students develop their critical reading skills, it is essential to offer texts that are within their capability yet challenging enough to push them beyond their current boundaries. At present, most excerpts are matched to reader level using readability tools such as Flesch-Kincaid Grade Level scoring. The core issue with reading scores built around Flesch-Kincaid Grade Level scoring is that the scoring process is not readily available to the public and lacks validation studies. In this project, reading complexity will be determined by applying exploratory data analysis (EDA) and training regression models to predict a readability score. This score will be used to determine a passage’s reading complexity: a positive target value will denote a simple excerpt of text, while a negative target value will denote a more difficult or complex excerpt. The focus of this M.S. thesis is to build an understanding of the ‘Target’ score so that readability scores can serve as a trusted resource. We will examine the ‘Target’ variable and the textual excerpts through EDA to gain insight into how the regression models should be trained on the data. Each regression model outputs predicted readability scores, and its accuracy is measured to show how well it assigns a proper readability score. The accuracy of a given regression model in turn determines the validity of its readability predictions.
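
As a rough illustration of the approach described in the abstract, the sketch below fits a one-feature least-squares regression that maps an excerpt to a ‘Target’-style score, where positive values mean simpler text. It is not the thesis's actual pipeline: the feature (mean word length), the helper names `mean_word_length`, `fit_line`, `rmse`, and `predict`, and the four training excerpts with their target values are all hypothetical and chosen only to keep the example self-contained.

```python
import math
import re


def mean_word_length(excerpt):
    """Average number of letters per word; a crude, assumed proxy for difficulty."""
    words = re.findall(r"[A-Za-z']+", excerpt)
    return sum(len(w) for w in words) / len(words)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(preds, ys):
    """Root mean squared error between predictions and true targets."""
    return math.sqrt(sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys))


# Hypothetical toy data: (excerpt, target). Positive target = simpler text.
TRAIN = [
    ("The cat sat. The dog ran. We had fun.", 1.2),
    ("She went to the park and played with her friends all day.", 0.4),
    ("The committee deliberated extensively regarding institutional priorities.", -1.5),
    ("Photosynthesis converts electromagnetic radiation into biochemical energy.", -2.0),
]

xs = [mean_word_length(text) for text, _ in TRAIN]
ys = [target for _, target in TRAIN]
slope, intercept = fit_line(xs, ys)


def predict(excerpt):
    """Predicted readability target for a new excerpt."""
    return slope * mean_word_length(excerpt) + intercept


# Training error only; a real evaluation would use held-out excerpts.
train_rmse = rmse([predict(text) for text, _ in TRAIN], ys)
print(f"slope={slope:.3f} intercept={intercept:.3f} train RMSE={train_rmse:.3f}")
```

On this toy data the fitted slope is negative, so excerpts with longer words receive lower (more complex) predicted targets. A realistic model would use richer text features and report error on excerpts not seen during training.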

Included in

Data Science Commons
