| Title | Platform/Publisher | Description | URL | Year | Data Formats |
|---|---|---|---|---|---|
| iSnap - Introductory Programming | DataShop | "iSnap logs all student actions to a remote database, including any interactions with the user interface and coding area. It... | Link | 2017 | ProgSnap |
| Scratch Dataset | GitHub | "A dataset of 250K recent Scratch projects from 100K different authors scraped from the Scratch project repository. We processed the... | Link | ||
| Hour of Code 2013 | Code.org | | Link | 2014 | |
| ShortAnswersIDSV | Harvard Dataverse | This data set contains exam questions and answers from an introductory course to computer science | Link | ||
| Supplementary data for study: Challenges Faced by Teaching Assistants in Computer Science Education Across Europe | DataverseNO | This data includes the themes, sub-themes, codes and exemplary quotes from the analysis of reflection essays for the study "Challenges... | Link | 2020 | |
| CSEDM 2019 Data Challenge | DataShop | The dataset used in the challenge comes from a study of novice Python programmers working with the ITAP intelligent tutoring... | Link | 2016 | |
| CodeWorkout data Spring 2019 | DataShop | | Link | 2019 | ProgSnap 2 |
| Supplementary data for study: Understanding the Relation Between Study Behaviors and Educational Design (Study 1) | DataverseNO | It has been identified that the first-year experience is crucial to student motivation and throughput of study programs, therefore it... | Link | 2018 | |
| Code Hunt | GitHub | Code Hunt is a serious education game which has been played by over 140,000 students and enthusiasts over the past... | Link | | |
| Blackbox | ACM Digital Library | Blackbox is a perpetual data collection project that collects data from worldwide users of the BlueJ IDE -- a programming... | Link | | |
| 2019 CS1 Keystroke Data | Harvard Dataverse | Keystroke data collected from CS1 student participants during 2019 at Utah State University. See readme.txt for detailed information. This dataset... | Link | 2019 | ProgSnap 2 |
| 2021 CS1 Keystroke Data | Harvard Dataverse | Keystroke data collected from CS1 student participants during fall 2021 semester at Utah State University. See readme.txt for detailed information.... | Link | 2021 | ProgSnap 2 |
| CloudCoder | GitHub | CloudCoder is an open source web-based programming exercise system (inspired by CodingBat). It is designed to make it easy for... | Link | ||
| OLI Introductory Programming with Media | DataShop | | Link | 2010 | |
| CloudCoder | | | Link | | |
| Mob Programming | DataShop | | Link | | |
| RedBlackTreeTutor | DataShop | | Link | | |
| TMC | | | Link | | |
| KC Modeling for Programming | DataShop | Step-by-step analysis of students solving introductory programming questions in Python | Link | 2016 | Custom |
| Fall 2019 use of OpenDSA Formal Languages eTextbook | DataShop | Student utilization of the eTextbook, student perceptions, and performance on exams | Link | 2019 | -- |
| Runestone Interactive | DataShop | Analysis of student hint-seeking behaviour in relation to time spent | Link | 2019 | |
| OLI Principles of Computing | DataShop | Python | Link | 2021 | -- |
| Python Trace Table Tutor | DataShop | | Link | | |
| QuizJET | DataShop | | Link | | |
| ReadingCircle | DataShop | | Link | | |
| Utrecht Python Datasets | DataShop | | Link | | |
| INFSCI OOP Studies | DataShop | | Link | | |
| Australian Institute Python Datasets | DataShop | | Link | | |
| E-learning Design Course Instances | DataShop | | Link | 2022 | |
| Robomission | GitHub | Block-based environments are today commonly used for introductory programming activities like those that are part of the Hour of Code... | Link | ||
| CodeBench | CodeBench | CodeBench is a Programming Online Judge developed by the Institute of Computing (IComp) of the Federal University of Amazonas, Brazil.... | Link | 2023 | |
| How Creatively Are We Teaching and Assessing Creativity in Computing Education: A Systematic Literature Review | Zenodo | | Link | 2021 | |
| METRECC Africa 2020 data | Apollo - University of Cambridge Repository | This file includes the responses from the 58 study participants to the survey questions on demographics, years of teaching experience,... | Link | 2021 | |
| FalconCode | FalconCode | FalconCode, a collection of over 1.5 million Python programs from over two thousand undergraduate students, captures over five semesters' worth of code samples from our introduction to computing course, which is taken by every student regardless of their academic major. | Link | | |
| IDE Action Log Dataset from a CS1 MOOC | Zenodo | This is a dataset containing Integrated Development Environment (IDE) logs from an introductory programming MOOC. The dataset contains information... | Link | 2017 | |
| The Conventional versus a constructionist-Scratch programming instructions and students achievements in higher education CS1 classes. | Mendeley Data | | Link | | |
| Dataset: Recursive problem solving in the online learning environment CodingBat by computer science students | DZHW | | Link | 2017 | |
| Programming steps working group at ITiCSE'22 | GitHub | The data is from an online introductory programming course using Dart language. The students have varied backgrounds and study from... | Link | | |
| Concept Map for Cybersecurity Courses | GitLab | | Link | | |
| Cybersecurity Literature Review | Zenodo | This paper discusses trends and implications for further research in cybersecurity education. | Link | 2019 | |
| Distributed System Syllabi | Zenodo | The authors map 51 offerings of distributed systems courses from different schools to two popular curriculum initiatives. | Link | 2020 | |
| Supplementary materials for the paper "Hyperstyle: A Tool for Assessing the Code Quality of Solutions to Programming Assignments" | Zenodo | | Link | 2021 | |
| Group Work in Learning Programming | DZHW | The research project "Digital Programming in Teams" (DiP-iT) investigates how collaborative learning in computer science studies can be didactically developed... | Link | 2020 | |
| Discovering Misconceptions in formal methods using ITS | OSF | In this data repository we store the data for the paper Discovering and quantifying misconceptions in formal methods using intelligent... | Link | 2022 | |
| CS1QA | GitHub | Repository for CS1QA: A Dataset for assisting Code-based Question Answering in an Introductory Programming Course, published at NAACL 2022 The... | Link | ||
| Artifacts of FSE-2017 paper on an Intelligent Tutoring System for Programming | GitHub | In our ESEC/FSE-17 paper titled A Feasibility Study of Using Automated Program Repair for Introductory Programming Assignments, we apply four... | Link | 2015 | |
| Dataset of Program Source Codes Solving Unique Programming Exercises Generated by Digital Teaching Assistant | Zenodo | The programming exercises were automatically generated by the Digital Teaching Assistant (DTA) system that automates a massive Python programming course... | Link | 2022 | |
| Dataset for the evaluation of student-level outcomes of a primary school Computer Science curricular reform | Zenodo | Student learning and perception data from three studies with respectively 1384, 2433 and 1644 grade 3-6 students (ages 7-11) and... | Link | 2022 | |
| Unravelling the numerical and spatial underpinnings of computational thinking: a pre-registered replication study | OSF | | Link | 2022 | |

This dataset catalog is a compilation of open-source datasets in computing education, curated by the "Where is the data? Finding and reusing datasets in computing education" CompEd '23 working group. The working group aims to make research data more accessible and to encourage open data practices in the computing education research (CER) community. For more information, please refer to the working group's paper: Kiesler, Natalie, John Impagliazzo, Katarzyna Biernacka, Amanpreet Kapoor, Zain Kazmi, Sujeeth Goud Ramagoni, Aamod Sane, Keith Tran, Shubbhi Taneja, and Zihan Wu. "Where's the Data? Exploring Datasets in Computing Education." In Proceedings of the ACM Conference on Global Computing Education Vol 2, pp. 209-210. 2023.