Student Shadow Observation Protocol (SSOP)
One of my favorite high school projects started as something I was only asked to help with, but I ended up loving it so much that I took over the whole thing and grew it into something large-scale.
One day in September of my senior year, an EOP teacher approached me. I was known for being… okay with computers, and my Stats teacher at the time had referred me. The teacher asked me to help fix a Google Sheet from a previous system for monitoring 1-on-1 student activity, one that was hard-coded and difficult to understand. Of course, I helped and fixed the bug (someone had been trying to read data from another sheet that wasn’t attached, making it inaccessible), but I also asked, “Wouldn’t it be nice if this were done in a way that ANYONE could read the data?” The existing setup had been really confusing for teachers who had little technological experience and might’ve been on the older side. I was really interested in giving the whole thing a facelift and reconstructing it from the ground up, so that’s what I did.
Initial Work
Everything starts with the data. We needed a simple way to store the observations teachers made. A Pandas DataFrame is perfect for this. It’s essentially a smart spreadsheet that Python can work with.
    import pandas as pd

    file_path = 'SSOP_Observation_Log.xlsx'

    # Use pandas to read the Excel file directly into a DataFrame.
    df = pd.read_excel(file_path)
    print("loaded data")

    # this was a debug print that showed a little summary that i just never deleted.
    print(df.head())
Machine learning models don't understand words like "listening" or "talking." I had to find some way to… tokenize them, meaning turn each word into a number the model can work with. Naturally, I used the Tokenizer from TensorFlow/Keras for this. It RIPPED. Very useful tool. This was one of the first “professional” or practical things I built using TF, or Python in general, and it really opened my eyes to what an optimized project could look like.
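To give a rough idea of what tokenizing means here, below is a minimal sketch written in plain Python. It is an illustrative stand-in, not the Keras Tokenizer itself and not the project's actual code: the function names (clean, fit_vocab, to_padded_sequences), the sample notes, and the choice to pad at the end with zeros are all assumptions made for the example.

```python
import string

# Illustrative stand-in for a word-index tokenizer (not the Keras implementation).
_PUNCT_TO_SPACE = str.maketrans({c: " " for c in string.punctuation})


def clean(text):
    """Lowercase, replace punctuation with spaces, and split on whitespace."""
    return text.lower().translate(_PUNCT_TO_SPACE).split()


def fit_vocab(texts):
    """Map each distinct word to an integer id, most frequent first (ids start at 1)."""
    counts = {}
    for text in texts:
        for word in clean(text):
            counts[word] = counts.get(word, 0) + 1
    # Sort by descending frequency, then alphabetically, so ids are deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: i for i, word in enumerate(ordered, start=1)}


def to_padded_sequences(texts, vocab, max_len):
    """Convert texts to fixed-length id lists; unknown words are dropped, 0 pads the end."""
    rows = []
    for text in texts:
        ids = [vocab[w] for w in clean(text) if w in vocab][:max_len]
        rows.append(ids + [0] * (max_len - len(ids)))
    return rows


# Hypothetical observation notes, modeled on the phrases quoted in this post.
notes = ["listening to teacher", "talking with friends", "Listening, on-topic"]
vocab = fit_vocab(notes)
sequences = to_padded_sequences(notes, vocab, max_len=4)
print(sequences)
```

Every note ends up as a list of integers of the same length, which is the shape of input a neural network expects.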
We define a simple neural network using TensorFlow/Keras. This model will learn to associate the numerical sequences with the correct categories ('On-Task', 'Off-Task', etc.).
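The original model code isn't reproduced here, so the following is only a sketch of what a small Keras text classifier of this kind could look like. The layer choices, VOCAB_SIZE, MAX_LEN, and the two-item CATEGORIES list are assumptions for illustration (the real category list went beyond the two named above).

```python
import numpy as np
import tensorflow as tf

# All sizes below are illustrative assumptions, not the project's real settings.
VOCAB_SIZE = 1000  # number of distinct word ids, including 0 for padding
MAX_LEN = 10       # word ids kept per observation note
CATEGORIES = ["On-Task", "Off-Task"]  # assumed subset of the real labels

model = tf.keras.Sequential([
    # Turn each word id into a small learned vector.
    tf.keras.layers.Embedding(input_dim=VOCAB_SIZE, output_dim=16),
    # Average the word vectors into one vector per note.
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(16, activation="relu"),
    # One probability per category.
    tf.keras.layers.Dense(len(CATEGORIES), activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Untrained forward pass on two dummy padded sequences, just to check shapes.
dummy = np.array([[1, 6, 5] + [0] * (MAX_LEN - 3),
                  [4, 8, 2] + [0] * (MAX_LEN - 3)])
probabilities = model.predict(dummy, verbose=0)
print(probabilities.shape)
```

Training would then be a call to model.fit with the padded sequences and integer category labels. Until that happens, the probabilities above are meaningless and only confirm that the pieces connect.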
The goal was to create a system that was both powerful on the backend and simple on the frontend. The data collection method had to be accessible for teachers who were not technologically inclined. We set up a shadowing system where an observer would log a student's activity in five-minute intervals using plain, natural language. They could write anything from "listening to teacher" and "on-topic" to "talking with friends." This approach removed the need for rigid forms and allowed for more nuanced observations.
The core of the project was the data processing pipeline I built in Python. I used the pandas library to handle the initial data ingestion and organization. I also wrote proper documentation and hosted tutorials on using the templated sheets (one per day, per student), which were fed into the protocol once they were finished. The natural language entries were then fed into the previously mentioned TensorFlow model, whose automated classification was, surprisingly to me at the time, really efficient at turning qualitative observations into quantitative data.
Finally, the aggregated data was exported into spreadsheets. These reports gave teachers a clear, period-by-period breakdown of how each student spent their day, calculated as percentages. The insights were immediate. Teachers could see which parts of a lesson had high engagement and, more importantly, which parts were less accessible to ELL students. For example, the data might reveal a consistent drop in engagement during a specific lecture segment, prompting the teacher to redesign it with more visual aids or interactive components. The SSOP provided concrete, actionable data that helped educators adjust their methods to better support every student. It was a fulfilling project that demonstrated how a targeted application of machine learning can create a direct, positive impact in education.
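The period-by-period percentages described above come down to counting categories per period and dividing by that period's total. Here is a minimal sketch of that arithmetic using only the standard library so it runs anywhere. The sample rows and the period_breakdown helper are hypothetical, not the project's real data or code.

```python
from collections import Counter, defaultdict

# Hypothetical classified log: (class period, predicted category),
# one entry per five-minute interval.
classified = [
    (1, "On-Task"), (1, "On-Task"), (1, "Off-Task"), (1, "On-Task"),
    (2, "Off-Task"), (2, "On-Task"),
]


def period_breakdown(rows):
    """Return {period: {category: percent of that period's intervals}}."""
    counts = defaultdict(Counter)
    for period, category in rows:
        counts[period][category] += 1
    report = {}
    for period, tally in counts.items():
        total = sum(tally.values())
        report[period] = {cat: round(100 * n / total, 1) for cat, n in tally.items()}
    return report


report = period_breakdown(classified)
for period, shares in sorted(report.items()):
    print(period, shares)
```

In a pandas-based pipeline, the same table could likely be produced with a group-by or cross-tabulation on the period and category columns before exporting to a spreadsheet.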