Why Is Poor Data Quality a Problem for Higher Education?
Ra'ed Awdeh, PhD
Digital Transformation Leader || Bridging Technology & Business Strategy || CIO ● CTO ● Advisor ● Consultant
Universities are rich in data: data about students, faculty, and staff; data on learning activities and processes; data about study programs and courses; research data produced by faculty and students; data collected by network devices and sensors; and more.
The question is whether universities are able to realize the true power of their data. Are they using it to drive student success, guide decisions, overcome challenges, improve operational efficiency, and inform responsible strategies and plans? Some are, but many are not.
As universities make greater use of their data, they have also discovered that some of it is bad. Data-driven decision making depends heavily on the quality of the data used. If the data is unreliable or untrustworthy, that is, of poor quality, the decisions and actions based on it will not help and can actually do harm. Simply put, garbage in, garbage out.
Data quality issues are not a problem only for higher education. They persist across all domains and affect every organization that collects and uses data. That is, all organizations!
If you are an administrator at a university, do any of the following scenarios sound familiar?
If some universities struggle with such basic, straightforward expectations, imagine how they fare with more serious, mission-impacting questions: those pertaining to at-risk students or personalized learning, to planning new programs or closing existing ones, to library investments that better serve university stakeholders, or to funding decisions for research.
High-Quality Data
Data is of high quality when it meets business needs and the expectations of its end users. This can be assessed using several data quality dimensions including but not limited to:
Consequences of Poor-Quality Data
Poor data quality impacts many areas at higher education institutions. Some of them are described below.
Incorrect decisions
Almost all decisions and workflows in a university rely on data. When there is bad data in a system, business processes fail, and such errors can be costly. The consequences range from applicants being wrongly rejected or admitted, to students being registered in courses that are not in their study plans, to wrong academic advising decisions, to inaccurate tuition billing, to wrong decisions about financial aid eligibility, to payroll errors. Consider, for example, the miscalculation of course section fill rates, which can result in either overcrowded classes or underfilled classes and wasted resources.
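To make the fill-rate example concrete, here is a minimal Python sketch. It assumes that a fill rate is simply distinct enrolled students divided by section capacity, and that the bad data takes the form of a duplicated enrollment record. The record layout, the course codes, and the `fill_rate` function are hypothetical illustrations, not taken from any particular student information system.

```python
# Hypothetical enrollment records: (student_id, section_code).
# S2 appears twice in MATH101, e.g. after a retried registration.
enrollments = [
    ("S1", "MATH101"),
    ("S2", "MATH101"),
    ("S2", "MATH101"),  # duplicate record (assumed data quality issue)
    ("S3", "MATH101"),
    ("S4", "PHYS101"),
]


def fill_rate(records, section_code, capacity, dedupe=True):
    """Return enrolled seats divided by capacity for one section.

    With dedupe=True each student is counted once; with dedupe=False
    every record is counted, so duplicates inflate the result.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    student_ids = [sid for sid, sec in records if sec == section_code]
    count = len(set(student_ids)) if dedupe else len(student_ids)
    return count / capacity


raw_rate = fill_rate(enrollments, "MATH101", capacity=4, dedupe=False)
clean_rate = fill_rate(enrollments, "MATH101", capacity=4, dedupe=True)
```

With the raw records the section looks full, while the deduplicated count shows that one seat is still open. A planner relying on the raw figure might open an extra section that is not needed.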
Reduced productivity
When data is suspected to be unreliable, business users must do one or more of the following activities, which waste time and resources and take them away from their primary functions.
Under deadline pressures, some users may resort to localized ad-hoc data corrections, which can result in more data quality issues.
Missed opportunities
Poor-quality data in higher education can lead to inaccurate student forecasting and misleading predictions. This can be costly, as a university might end up investing resources in courses that will not be of interest to students or programs that will not attract enough applicants.
Poor-quality data can also make it difficult for a university to reach out to applicants, students, and alumni, and stay connected with its customers and stakeholders.
Furthermore, poor-quality data can negatively impact the reputation of a university, which in turn results in missed potential opportunities. These days, even a single negative student experience can quickly go viral on social media, costing a university many new applicants.
Accreditation risks
Higher education institutions are required to regularly undergo lengthy accreditation and licensure processes that rely heavily on data. Institutions are evaluated against specific standards to ensure that they are providing high-quality education to their students. Accreditation agencies decide on an institution's compliance with the standards based on the data it provides. If that data (e.g., on graduation rates) is of poor quality, it can lead to inaccurate assessments and can affect the institution's accreditation status. Furthermore, poor data quality makes it difficult for an institution to accurately track its performance and identify areas where it needs to improve to meet accreditation standards.
Data Quality Challenges
Many challenges hinder data quality. The main ones are listed below.
Poor data governance
Many universities still deal with data as a byproduct of systems and not as a product or an asset that has value. They do not recognize that they have data quality issues and do not have formal programs to govern and manage their data. At best, they engage in sporadic data correction projects that are short-lived without getting to the root causes of the data quality issues.
Furthermore, compared to other industries, higher education is probably slower in recognizing the need for dedicated senior data positions such as chief data officers. Chief data officers are usually responsible for leading data management and governance initiatives and strategies as well as establishing data governance policies, procedures, standards, and guidelines.
Data entry errors
The journey of a student at a university often starts with an application process in which students enter their details. Unfortunately, many data entry errors occur at this early stage. Students may not understand how to enter certain information correctly, or they may simply not bother to do so! There are also many data entry points in other university processes where data can be incorrectly entered by students, faculty, academic administrators, and staff. Bad data then propagates to and corrupts downstream systems and processes!
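One common way to limit such errors is to validate records at the point of entry, before they reach downstream systems. The following Python sketch is only an illustration under assumed rules: the field names (`full_name`, `email`, `date_of_birth`), the simplified email pattern, and the `validate_application` function are all hypothetical, and a real application form would need rules agreed with the data owners.

```python
import re
from datetime import date

# Deliberately simple pattern (an assumption): something@something.something
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

REQUIRED_FIELDS = ("full_name", "email", "date_of_birth")


def validate_application(record: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []

    # Required fields must be present and not blank.
    for field in REQUIRED_FIELDS:
        if not str(record.get(field, "")).strip():
            errors.append(f"{field} is required")

    # Email must at least look like an address.
    email = str(record.get("email", "")).strip()
    if email and not EMAIL_RE.match(email):
        errors.append("email is not well-formed")

    # Date of birth must be an ISO date that lies in the past.
    dob = str(record.get("date_of_birth", "")).strip()
    if dob:
        try:
            parsed = date.fromisoformat(dob)
        except ValueError:
            errors.append("date_of_birth must be an ISO date (YYYY-MM-DD)")
        else:
            if parsed >= date.today():
                errors.append("date_of_birth must be in the past")

    return errors


good = {"full_name": "A. Student", "email": "a.student@example.edu",
        "date_of_birth": "2001-02-13"}
bad = {"full_name": "  ", "email": "not-an-email",
       "date_of_birth": "13/02/2001"}

good_errors = validate_application(good)
bad_errors = validate_application(bad)
```

Returning every problem at once, rather than stopping at the first, lets the form show the applicant everything that needs fixing in a single pass.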
Many data sources and types
Due to the nature of their business, higher education institutions’ data comes from a variety of sources and is stored in a variety of systems. These include student information systems, learning management systems, enterprise resource planning systems, customer relationship management systems, surveys of all types, external sources, etc. Universities collect data on anything, everything, and all the time!
Add to that the fact that many types of data are being collected. The vast majority of data is no longer structured. Emails, documents, images, videos, and social media posts are unstructured data and are therefore more difficult to process, manage, and analyze.
The multitude of data sources and data types can lead to data quality issues.
Poor data integration
This is related to the previous challenge. Although universities engage in occasional initiatives to integrate systems and reduce data redundancies, many still lack the needed data and system integration expertise. The problem is compounded by the fact that most core systems used in higher education are commercial systems. Vendors are sometimes unable or unwilling to do what is necessary to enable the integration of their systems with other existing ones. This results in silos and fragmented systems, often hosting overlapping details about the same entities (e.g., students). Things get messier if the owners of the different systems employ different business definitions of their data. For example, they may disagree on what constitutes an "active student" or how to define "student retention".
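The "active student" problem can be sketched in a few lines of Python. Both definitions below are assumptions made up for illustration: one system treats a student as active if enrolled in the current term, the other if the student logged in to the learning management system within the last 30 days. The field names and sample records are hypothetical.

```python
# Hypothetical merged view of three students across two systems.
students = [
    {"id": "S1", "enrolled_this_term": True, "days_since_lms_login": 3},
    {"id": "S2", "enrolled_this_term": True, "days_since_lms_login": 90},
    {"id": "S3", "enrolled_this_term": False, "days_since_lms_login": 5},
]


def active_by_enrollment(student):
    """Assumed registrar-style definition: enrolled in the current term."""
    return student["enrolled_this_term"]


def active_by_lms(student, max_days=30):
    """Assumed LMS-style definition: logged in within max_days."""
    return student["days_since_lms_login"] <= max_days


enrollment_active = {s["id"] for s in students if active_by_enrollment(s)}
lms_active = {s["id"] for s in students if active_by_lms(s)}

# Students the two definitions disagree about (symmetric difference).
disputed = enrollment_active ^ lms_active
```

Both systems report two active students, yet they agree on only one of them. Equal headline counts can therefore hide a real disagreement, which is why a shared business definition matters more than reconciling totals.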
Outdated data
A major data quality challenge at any university is outdated or obsolete data. Outdated data may persist because of a lack of processes, systems, and/or resources to update data when and where needed. The problem accumulates over time and makes it very difficult to put the data to any good use. Common examples include student contact information, department codes, course codes, and student academic statuses.
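A simple starting point for dealing with outdated data is to record when each item was last verified and to flag anything older than an agreed threshold. The Python sketch below assumes a hypothetical `last_verified` date on each contact record and a one-year threshold; both are illustrative choices, not a recommended standard.

```python
from datetime import date, timedelta

# Hypothetical contact records; last_verified is None if never confirmed.
contacts = [
    {"student_id": "S1", "last_verified": date(2023, 1, 10)},
    {"student_id": "S2", "last_verified": date(2024, 5, 1)},
    {"student_id": "S3", "last_verified": None},
]


def stale_records(records, today, max_age_days=365):
    """Return IDs of records never verified or verified before the cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        r["student_id"]
        for r in records
        if r["last_verified"] is None or r["last_verified"] < cutoff
    ]


# "today" is passed in explicitly so the check is reproducible.
stale = stale_records(contacts, today=date(2024, 6, 1))
```

The flagged IDs can then feed a routine re-verification process, such as asking students to confirm their details at registration, instead of a one-off cleanup.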
Traditional IT structures and processes
Nowadays, the IT department in any university plays a major role in enabling the university’s mission and objectives, and delivering services to students, faculty, and staff. The IT department is responsible for building, managing, and maintaining the technological infrastructure of the university, including computing and data storage. The IT department is also responsible for software solutions, whether they are built in-house, procured, or leased.
However, if the IT department lacks a deep understanding of the data and of data management principles, this can lead to data quality issues. Examples include not accounting for data modeling in the Software Development Life Cycle (SDLC), poor data integration, noncompliance with data governance (if any), bad practices such as making changes directly in databases, flawed data cleansing practices, workarounds that keep end users happy at the expense of data quality, and insufficient data quality monitoring.
AI and Data Quality
We cannot talk about data without talking about AI! But the focus here is only on how AI can be used to support data quality work. (Of course, AI can also be used to identify patterns and trends in data, help universities make informed decisions, identify potential risks, etc.)
AI can play a significant role in ensuring data quality:
What Can Be Done?
Universities must address the challenges listed above. However, it starts with data governance, as data governance informs and guides all data management activities and work, which should lead to high-quality data that is fit for purpose. Data quality efforts can be performed without data governance, but data quality cannot be sustained without data governance.
Data governance helps to ensure data is entered correctly, and defined, codified, and interpreted consistently throughout the institution. Without data governance, there will not be uniform rules for data collection, storage, integration, protection, retrieval, analysis, and disposal, and business users will not trust the data.
High data quality, informed and guided by sound data governance, will have a positive influence on any higher education institution in terms of student enrollment, student success, efficiency, productivity, innovation, etc. Instead of wasting time and resources checking if the data can be trusted, users can spend more time using and extracting value from the data.
With sound data governance, a university is expected to:
Concluding Remarks
Data quality is critical for universities to be able to make correct and smart decisions about their programs, students, faculty, and resources. Without high-quality data, universities might make serious mistakes, miss opportunities to improve, or even risk losing their accreditation status. Bad data can also harm their reputation. On the other hand, good data can help improve student outcomes, optimize resource utilization, and inform decision-making.
Robust data governance is needed for sustained data quality, and universities must recognize this need. Universities must begin treating data as an asset that has value and can be used to create value. Universities are encouraged to invest the necessary resources and expertise to achieve this goal.
What do you think?