Data Privacy - the Self-Evident problem ignored by EdTechs
Dr. Mansoor Agha Siddiqui
Academic Head at IvyLeague Career Services, Master Trainer for Teachers at ETS, Visiting Faculty at IMS and Raus IAS
Let’s look at the implications for the future before we turn to a present-day problem that EdTechs are ignoring:
Yesterday, I was facilitating a teacher training workshop for three coaching institutes. It was a mixed group of 40 teachers, from rookies to highly experienced ones, and I had also asked five students to join because I needed feedback on the attitude I was trying to develop in the teachers. What surprised me was how concerned both the students and the teachers were about their test performance and class performance data. They wanted to know whether I would share that data with their employers or with anyone else. I told them about the company policy of deleting all data, whether diagnostic test data, final test scores, or administrative data such as attendance and class interactivity. The teachers were worried that low scores would be used against them by the HR department during appraisals, while the students were wary of their test and administrative data being accessible to future employers.
It’s the year 2026, and Roma, a 23-year-old college graduate, has started applying for her first job. She has the right qualifications, good grades and a strong resume for an entry-level position at her dream company. But unknown to her, another organization holds a lot of data about her and is getting paid to share it with all potential employers. A year earlier, she had missed a lot of classes and tests at her college because her father was hospitalized. An EdTech company handling the college LMS had compiled data on her academic career and was selling it to potential employers.
The series of missed classes in her college data raises a red flag in the algorithm the HR department uses to screen applicants, because the data carries no granular detail about why those classes were missed. The missed classes are labelled as truancy, and Roma will now have a tough time and will likely be eliminated before she gets a chance at an interview.
EdTech companies routinely track a variety of student data metrics, generally for non-nefarious reasons. This data is generated and tracked internally to feed the Artificial Intelligence algorithms that govern personalization and the leaderboards that increase engagement. The data is also used externally to market product upgrades or additional products to users. However, such data collection poses significant user privacy risks, including unclear security protocols, heavily concentrated data aggregation, a lack of transparency and communication around terms and conditions, and insufficient policies regarding data archiving. While Big Data gives EdTech entrepreneurs the opportunity to create innovative technology solutions for educational issues, it has also ushered in a wave of privacy issues.
EdTech startups undergo frequent acquisitions and mergers. Recently Byju’s acquired WhiteHat Jr, while Unacademy acquired PrepLadder and CodeChef. Each such deal increases the risk that student data privacy safeguards will not be maintained. For example, consider the student data management system PowerSchool, which illustrates the difficulty of maintaining student privacy when leadership or ownership changes. The PowerSchool system tracks student data in a number of sensitive areas, ranging from attendance to behavioral misconduct to performance on academic assessments. The company has changed ownership three times in 16 years. It was bought first by Apple, then by Pearson, and now by Vista Equity Partners. A high rate of ownership turnover is common among ventures in the EdTech space. Every time a company changes hands, however, it opens the possibility of weakened protections around its student data.
Companies that occupy a disproportionate share of the market, such as Google, raise greater privacy concerns. Google has gained massive market share in classrooms in part because the company’s size allows it to develop quality products that can be offered to users for free. For example, Google Apps for Education [GAFE] is on pace to hit 180 million users by 2022. Microsoft Teams had 115 million daily active users in October 2020 and is projected to hit 200 million in July 2021. Zoom has already hit 300 million daily meeting participants globally. This growth should raise serious concerns for two primary reasons.
First, school administrators who place everything in a single GAFE account (or a comparable product such as Microsoft 365 for Education) make it possible for a single hacked administrator login to reveal a swath of student data, including student work, teacher feedback, grades and class history.
Second is the issue of mining student data. Google makes about 90 percent of its money from selling ads, and it collects and mines user data on an ongoing basis. In response to a lawsuit brought forward by the Electronic Frontier Foundation, Google admitted that it mined data from G Suite for Education users who use core services outside of G Suite for Education, contrary to its user license agreements. This G Suite for Education user data includes name, email address, telephone number, device information, and IP address. In response to another lawsuit, Google admitted that it scanned student emails for advertising purposes. In fact, the state of Mississippi recently sued Google for illegally harvesting student data and asked the company to fully disclose its data tracking practices. Google relies on data mining because the practice lets the company make a profit while offering its products to users for free. The issue of data mining as a component of an EdTech company’s business model extends to Facebook, which makes 98 percent of its money from advertising and is also giving away a free education software product. The EU found that the company illegally changed its position regarding data mining for WhatsApp users in order to better advertise to target customers. The recent mass migration of users from WhatsApp to Telegram, following Facebook’s change to its data privacy policy, is a warning bell to all.
Why isn’t there a greater public outcry to address these student data privacy issues? There are a few reasons. First, it can be difficult for everyday users to notice what data companies are collecting on them, how they are storing it, and how they are using it. Second, companies often aren’t transparent about what they’re doing. inBloom, a non-profit organization focused on personalized learning in K-12 education, was forced to shut down after significant financial losses that followed parent outcry over its lack of disclosure about what data it was using and why. Similarly, Clever, a company that specializes in keeping educational applications synced, made only slight changes to how it handled opting out of data collection and how it processed school and teacher feedback after a handful of savvy users attacked it over a clause in its privacy policy. The policy initially gave the company free rein to change whatever it wanted, whenever it wanted, without telling schools. This had also been true of ClassDojo’s policy several years ago, although the policy has been updated since 2014 and now states that “We won’t reduce your rights under this Privacy Policy without your explicit consent. If we make any significant changes, we’ll provide prominent notice by posting a notice on the Service and/or notifying you by email (using the email address you provided), so you can review and make sure you know about them.”
We need to have a conversation about what companies are doing and about how EdTech data privacy practices should be regulated. EdTech companies need to earn user trust, and they should do so by raising consumer awareness about what data they are using and how they are using it. EdTech users (parents, students and educators) need to do their homework and understand the privacy policies of the technology products that they’re using. While this may seem daunting, CoSN’s Protecting Privacy in Connected Learning Toolkit is an excellent resource that offers an in-depth guide to key federal student data privacy laws. It also includes guidance on how key laws operate together, suggested contract terms, explanations of metadata and data de-identification, and the use of click-wrap agreements.
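For readers who want a concrete picture of what data de-identification can look like, here is a minimal Python sketch. It is not taken from the CoSN toolkit or from any vendor’s system. The field names, the `SECRET_KEY` value, and the helper functions `pseudonymize` and `deidentify` are all hypothetical and chosen only for illustration. The sketch drops direct identifiers (name, email, phone, IP address) and replaces the student ID with a keyed hash, so that records can still be linked to one another without exposing who the student is.

```python
import hashlib
import hmac

# Hypothetical secret key; in a real system it would be stored separately
# from the data and never shared with whoever receives the records.
SECRET_KEY = b"replace-with-a-secret-key"

# Assumed field names for direct identifiers (illustrative only).
DIRECT_IDENTIFIERS = {"name", "email", "phone", "ip_address"}


def pseudonymize(student_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace a student ID with a keyed hash (HMAC-SHA256, truncated)."""
    digest = hmac.new(key, student_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(record: dict, key: bytes = SECRET_KEY) -> dict:
    """Return a copy of the record without direct identifiers.

    The original record is left unchanged; the student ID is replaced
    by a pseudonym so records for the same student still match up.
    """
    cleaned = {
        field: value
        for field, value in record.items()
        if field not in DIRECT_IDENTIFIERS and field != "student_id"
    }
    cleaned["pseudonym"] = pseudonymize(record["student_id"], key)
    return cleaned


if __name__ == "__main__":
    record = {
        "student_id": "S-1001",
        "name": "Roma",
        "email": "roma@example.com",
        "ip_address": "203.0.113.7",
        "test_score": 72,
        "attendance_pct": 64,
    }
    print(deidentify(record))
```

A sketch like this only pseudonymizes the data. Someone holding the key, or enough side information about a student, could still re-identify a record, which is why contract terms and retention policies matter alongside any technical measure.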
Moreover, EdTech products should be monitored more closely as part of a larger dialogue about how and under what conditions companies should be allowed to advertise in classrooms. As an EdTech community, if we continue to ignore privacy concerns, educators and parents may avoid the use of technology solutions, which would reduce the use and impact of EdTech products. For this reason, understanding and addressing the real concerns around student data privacy is essential to the continued growth of the EdTech ecosystem, before a class action suit stops us in our tracks.