This blog post was written by Gaelan Venturi, Senior Computer Science Instructor at the Governor’s School for Science and Technology. He is scheduled to present his session, “Data-Powered Research: Integrating Research and Data Science,” at 10 a.m. Wednesday, Nov. 15.
I started working as a computer science instructor at the Governor’s School for Science and Technology (GSST) in 2021. GSST has three strands: Engineering, Computational, and Biological Sciences. We admit the best and brightest high school students from the surrounding school districts in their junior year, and they stay with GSST through their senior year, all while still attending their home schools in their districts.
My primary mission was to give second-year GSST students in the computer science strand, all of them seniors, a comprehensive understanding of programming. Things took an exciting turn when I learned I would also introduce them to the vast world of data science. This became possible through a partnership with Virginia Tech, which lets us use the data science curriculum from its Computational Modeling and Data Analysis (CMDA) program at GSST. With the curriculum in hand and with the help of my colleague Laura Vobrak, I eventually brought data science skills to all second-year GSST students in their senior year of high school.
The Rising Significance of Data Science
Though data science is a relatively new academic discipline, its transformative impact can’t be overstated. The field emerged around the early 2000s, and data science education has been expanding since 2010. Virginia Tech’s own CMDA major began in 2015. We’re seeing a trickle-down effect: data science started as part of statistics, came into its own as a graduate and undergraduate major, and now, with me playing a part, is making its way into public schools at the high school level.
Data science is inherently interdisciplinary. That makes it an excellent fit for many different fields, but it also makes it hard to teach. Part of the reason it started at such high levels in academia and worked its way down is that the amount of knowledge required to begin working in it is immense: mathematics up through calculus, a thorough understanding of statistics, programming proficiency in multiple languages, the ability to communicate effectively with others, and more. These skills are acquired over years of experience, and few high school students will have them going into their senior year.
The Journey Begins with a Partnership
Virginia Tech has experience teaching data science to students who don’t have all those skills by default. Its curriculum centers on teaching all of them within a single course: students learn math, statistics, programming, and communication in one class. That makes it a perfect fit for bringing high school students into the world of data science.
During my inaugural year, I primarily taught data science to computer science students. They benefited immensely, especially from the addition of the programming language R. They already had a year’s worth of programming from their junior year, so the programming material was relatively straightforward for them to learn. However, I quickly realized that the engineering and biological sciences students were missing out on this invaluable resource, which left them without a primer for data analysis with programming in college.
Expanding the Horizons: Integrating Data Science into Research Methodology
Driven to make data science accessible to all, I championed its integration into our Research Methodology course, a fundamental part of GSST’s curriculum. This course equips students with essential research skills, from conducting literature reviews and writing grant proposals to drafting scientific papers. Most importantly, it pairs students with mentors from the scientific community, facilitating hands-on research projects. Introducing data science to the entire senior class posed its own set of challenges. Over 60% of the students had never dabbled in programming, so I had to start from scratch, teaching them not just data science but foundational programming concepts.
The GSST initiative, co-led by my colleague Laura Vobrak and me, has redefined how our students approach scientific research. By setting aside certain days each week for data science and the curriculum provided by Virginia Tech, I have taught all GSST students how to import, clean, analyze, and present data using the R programming language. The course doesn’t cover enough material to guarantee that these students will earn straight A’s in college research. They’ll still need to learn the tools and research methods of their respective fields. Still, it gives them a head start: they’ll be a step ahead when they reach more advanced college courses and are asked to learn programming or perform statistical analyses.
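For readers who have never seen this workflow, the four steps of importing, cleaning, analyzing, and presenting data can be sketched in a few lines. This is only an illustrative sketch, not material from the Virginia Tech curriculum: it uses Python’s standard library rather than R, and the plant-height dataset, the column names, and the function names are all made up for the example.

```python
import csv
import io
import statistics

# Hypothetical raw data: plant heights (cm) by treatment group.
# In a real project this text would come from a file on disk.
RAW_CSV = """group,height_cm
control,12.1
control,11.8
control,
fertilizer,14.9
fertilizer,15.3
fertilizer,n/a
"""


def import_rows(text):
    """Import: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Clean: keep only rows whose height parses as a number."""
    cleaned = []
    for row in rows:
        try:
            height = float(row["height_cm"])
        except ValueError:
            continue  # skip blank or non-numeric measurements
        cleaned.append((row["group"], height))
    return cleaned


def analyze(cleaned):
    """Analyze: compute the mean height for each group."""
    by_group = {}
    for group, height in cleaned:
        by_group.setdefault(group, []).append(height)
    return {group: statistics.mean(values) for group, values in by_group.items()}


def present(means):
    """Present: format the results as a small text table."""
    lines = [f"{'group':<12}{'mean height (cm)':>18}"]
    for group, mean in sorted(means.items()):
        lines.append(f"{group:<12}{mean:>18.2f}")
    return "\n".join(lines)


if __name__ == "__main__":
    rows = import_rows(RAW_CSV)
    print(present(analyze(clean_rows(rows))))
```

Keeping each step in its own small function mirrors the order in which students meet the ideas, and it makes clear that the two unusable measurements are dropped during cleaning, before any averages are computed.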
A Call to Action: The Importance of Early Exposure to Data Science
Research in modern science hinges on data analysis. Through partnerships, GSST pairs students with mentors for hands-on research challenges, and by weaving data science into our curriculum, we help ensure they are ready for college and careers. Our aim isn’t to produce data scientists but to offer a solid foundation, so that data analytics won’t be new to students when they take their first college science course. Why wait until college to introduce Python, R, or data strategies? These early skills can shape students’ careers, and starting even in junior year can be beneficial. Data is central to advancing science. High schools are making data analysis a core subject, alongside the sciences, math, and reading, to prepare students for a data-centric future.