Why Is Poor Data Quality a Problem for Higher Education?
Image from https://tinyurl.com/5y75am57

Universities are rich in data: data about students, faculty, and staff; data on learning activities and processes; data about study programs and courses; research data produced by faculty and students; data collected by network devices and sensors; and more.

The question is whether universities are able to realize the true power of their data. Are they using it to drive student success, guide decisions, overcome challenges, improve operational efficiency, and inform responsible strategies and plans? Some are, but many are not.

As universities have increased their use of data, they have also discovered that they have bad data. Data-driven decision making depends heavily on the quality of the data used. If the data is unreliable or untrustworthy, that is, of poor quality, the decisions and actions based on it will not help and can actually cause harm. Simply put, garbage in, garbage out.

Data quality issues are not a problem only for higher education. They persist across all domains and affect every organization that collects and uses data. That is, all organizations!

If you are an administrator at a university, do any of the following scenarios sound familiar?

  • Obtaining different answers when asking about the number of students you have, depending on who you ask?
  • Struggling to quantify your faculty workload?
  • Not being able to contact alumni or even students, although you have their phone numbers and email addresses?
  • Facing difficulties in tracking your faculty research information?
  • Dealing with upset students who believe they have fulfilled the requirements of their programs only to discover that they still have a few more courses to complete?
  • Trying to explain why different analysts across the university get different results from the data?
  • Having grade discrepancies between the SIS and LMS?

If some universities struggle with such basic, straightforward expectations, imagine how they would fare with more serious, mission-impacting questions, such as those pertaining to at-risk students or personalized learning, planning for new programs or closing existing ones, library investments to better serve university stakeholders, or funding decisions for research.



High-Quality Data

Data is of high quality when it meets business needs and the expectations of its end users. This can be assessed using several data quality dimensions including but not limited to:

  • Accuracy – Does the data truly reflect the real-world events or entities being described?
  • Completeness – Does all the mandatory data exist? Are all the fields complete? Are there any gaps in the data?
  • Consistency – Is the data consistent across the different systems? Does the data in one system align with the data stored in other systems?
  • Timeliness – Is the delay between the actual event occurrence and the availability of the related data to the business users acceptable? Is the data age suitable for its intended use?
  • Validity – Does the data conform to established formats, rules, and standards, including internal, external, and industry standards?
  • Integrity – Does the data remain connected and traceable, even as it is stored and used in diverse systems? According to some definitions, data integrity encompasses accuracy and consistency over the entire life cycle of the data.
  • Uniqueness – Do you have duplicate data for the same event, object, or entity? Does similar data appear more than once within the same dataset or across different datasets?
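
Some of these dimensions can be measured directly. The following is a minimal sketch, in Python, of how completeness, validity, and uniqueness might be scored for a handful of student records. The records, field names, and the email rule are all invented for illustration; a real institution would use its own agreed-on rules.

```python
import re
from collections import Counter

# Hypothetical student records; field names and values are invented for illustration.
records = [
    {"student_id": "S001", "email": "amal@example.edu", "program": "CS"},
    {"student_id": "S002", "email": "", "program": "Math"},
    {"student_id": "S003", "email": "not-an-email", "program": "CS"},
    {"student_id": "S001", "email": "amal@example.edu", "program": "CS"},
]

# Deliberately simple format rule (an assumption, not a full email standard).
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, field):
    """Share of rows where the field is present and non-empty."""
    filled = sum(1 for row in rows if row.get(field))
    return filled / len(rows)


def validity(rows, field, pattern):
    """Share of non-empty values that match the expected format."""
    values = [row[field] for row in rows if row.get(field)]
    if not values:
        return 1.0
    return sum(1 for value in values if pattern.match(value)) / len(values)


def duplicate_keys(rows, key):
    """Key values that appear more than once in the dataset."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


email_completeness = completeness(records, "email")          # 3 of 4 rows have an email
email_validity = validity(records, "email", EMAIL_PATTERN)   # 2 of 3 emails are well formed
duplicates = duplicate_keys(records, "student_id")           # S001 appears twice
```

Accuracy and timeliness are harder to score this way, since they require comparing the data against the real world or against event timestamps.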



Consequences of Poor-Quality Data

Poor data quality impacts many areas at higher education institutions. Some of them are described below.

Incorrect decisions

Almost all decisions and workflows in a university rely on data. When there is bad data in a system, business processes fail and such errors can be costly. This can range from applicants being wrongly rejected or admitted, to students being registered in courses that are not in their study plans, to wrong academic advising decisions, to inaccurate tuition billings, to wrong decisions about financial aid eligibility, to payroll errors, etc. Consider, for example, the miscalculation of course section fill rates, which can result in either overcrowded classes, or underfilled classes and a lot of waste.
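
To illustrate the fill rate example, here is a small sketch in Python. It assumes that fill rate means distinct enrolled students divided by seat capacity, and that a duplicate registration row has crept into the data; the student IDs and capacity are made up.

```python
def fill_rate(enrolled_ids, capacity):
    """Fill rate = distinct enrolled students / seat capacity."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return len(set(enrolled_ids)) / capacity


# Hypothetical enrollment rows: S002 appears twice because of a duplicate record.
enrollment_rows = ["S001", "S002", "S002", "S003"]
capacity = 4

naive_rate = len(enrollment_rows) / capacity               # counts rows: section looks full
deduplicated_rate = fill_rate(enrollment_rows, capacity)   # counts students: one seat is free
```

A planner relying on the naive figure would treat the section as full and might open a second, unnecessary section.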

Reduced productivity

When data is suspected to be unreliable, business users must do one or more of the following activities, which waste time and resources and take them away from their primary functions.

  • Validate the accuracy of the data
  • Perform manual workarounds
  • Re-collect the same data
  • Correct the erroneous data

Under deadline pressures, some users may resort to localized ad-hoc data corrections, which can result in more data quality issues.

Missed opportunities

Poor-quality data in higher education can lead to inaccurate student forecasting and misleading predictions. This can be costly, as a university might end up investing resources in courses that will not be of interest to students or programs that will not attract enough applicants.

Poor-quality data can also make it difficult for a university to reach out to applicants, students, and alumni, and stay connected with its customers and stakeholders.

Furthermore, poor-quality data can negatively impact the reputation of a university, which in turn results in missed potential opportunities. These days, even a single negative student experience can quickly go viral on social media, costing a university many new applicants.

Accreditation risks

Higher education institutions are required to regularly undergo lengthy accreditation and licensure processes that rely heavily on data. Institutions are evaluated against specific standards to ensure that they are providing high-quality education to their students. Based on the provided data, accreditation agencies make informed decisions about an institution's compliance with the standards. If the data provided by an institution (e.g., on graduation rates, etc.) is of poor quality, it can lead to inaccurate assessments and can impact the institution's accreditation status. Furthermore, poor data quality makes it difficult for an institution to accurately track its performance and identify areas where it needs to improve to meet accreditation standards.


Data Quality Challenges

Many challenges hinder data quality. The main ones are listed below.

Poor data governance

Many universities still deal with data as a byproduct of systems and not as a product or an asset that has value. They do not recognize that they have data quality issues and do not have formal programs to govern and manage their data. At best, they engage in sporadic data correction projects that are short-lived without getting to the root causes of the data quality issues.

Furthermore, compared to other industries, higher education has probably been slower to recognize the need for dedicated senior data positions such as chief data officers. Chief data officers are usually responsible for leading data management and governance initiatives and strategies as well as establishing data governance policies, procedures, standards, and guidelines.

Data entry errors

The journey of a student at a university often starts with an application process where students enter their details. Unfortunately, many data entry errors occur at this early stage. Students may not understand how to enter certain information correctly, or they may simply not bother to do so! There are also many data entry points in other university processes where data can be incorrectly entered by students, faculty, academic administrators, and staff. Bad data then affects and corrupts downstream systems and processes!

Many data sources and types

Due to the nature of their business, higher education institutions’ data comes from a variety of sources and is stored in a variety of systems. These include student information systems, learning management systems, enterprise resource planning systems, customer relationship management systems, surveys of all types, external sources, etc. Universities collect data on anything, everything, and all the time!

Add to that the fact that many types of data are being collected. The vast majority of data is no longer structured. Emails, documents, images, videos, and social media posts are unstructured data and are therefore more difficult to process, manage, and analyze.

The multitude of data sources and data types can lead to data quality issues.

Poor data integration

This is related to the previous challenge. Although universities engage in occasional initiatives to integrate systems and reduce data redundancies, many still lack the needed data and system integration expertise. This is made more challenging given that most core systems used in higher education are commercial systems. Vendors are sometimes unable or unwilling to do what is necessary to enable the integration of their systems with other existing ones. This results in silos and fragmented systems, often hosting overlapping details about the same entities (e.g., students). This can get messier if the owners of the different systems employ different business definitions of their data. For example, they may disagree on what constitutes an "active student" or how to define "student retention".
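
The effect of differing definitions is easy to demonstrate. The Python sketch below applies two plausible definitions of "active student" to the same four records. Both definitions, the 30-day window, and the records themselves are assumptions made up for this illustration.

```python
from datetime import date

# Hypothetical records combining registration status and last LMS login.
students = [
    {"id": "S001", "registered_this_term": True, "last_lms_login": date(2024, 3, 1)},
    {"id": "S002", "registered_this_term": True, "last_lms_login": date(2023, 9, 15)},
    {"id": "S003", "registered_this_term": True, "last_lms_login": date(2023, 12, 1)},
    {"id": "S004", "registered_this_term": False, "last_lms_login": date(2024, 3, 10)},
]
as_of = date(2024, 3, 15)


def active_by_registration(student):
    """One possible definition: registered in the current term."""
    return student["registered_this_term"]


def active_by_engagement(student, as_of, max_days=30):
    """Another possible definition: logged in to the LMS within the last max_days."""
    return (as_of - student["last_lms_login"]).days <= max_days


by_registration = [s["id"] for s in students if active_by_registration(s)]
by_engagement = [s["id"] for s in students if active_by_engagement(s, as_of)]
```

Here the first definition yields three active students and the second yields two, with only one student in common. Neither answer is wrong; they simply answer different questions, which is why agreed-on business definitions matter.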

Outdated data

A major data quality challenge at any university is outdated or obsolete data. Outdated data may persist in a university because of a lack of processes, systems, and/or resources to update data when and where needed. This problem accumulates with time and makes it very difficult to use the data in any useful fashion. Common examples include student contact information, department codes, course codes, and student academic statuses.
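
One simple way to surface outdated records is to track when each record was last verified and flag those older than an agreed threshold. The Python sketch below assumes a hypothetical last_verified field and a one-year threshold; both are illustrative choices, not a standard.

```python
from datetime import date, timedelta


def stale_records(rows, as_of, max_age_days):
    """Return the IDs of records last verified before the cutoff date."""
    cutoff = as_of - timedelta(days=max_age_days)
    return [row["student_id"] for row in rows if row["last_verified"] < cutoff]


# Hypothetical contact records with a last-verified date.
contacts = [
    {"student_id": "S001", "last_verified": date(2024, 1, 10)},
    {"student_id": "S002", "last_verified": date(2021, 6, 1)},
]

stale = stale_records(contacts, as_of=date(2024, 3, 15), max_age_days=365)
```

A list like this could feed a periodic request asking students or alumni to confirm their details.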

Traditional IT structures and processes

Nowadays, the IT department in any university plays a major role in enabling the university’s mission and objectives, and delivering services to students, faculty, and staff. The IT department is responsible for building, managing, and maintaining the technological infrastructure of the university, including computing and data storage. The IT department is also responsible for software solutions, whether they are built in-house, procured, or leased.

However, if the IT department lacks a deep understanding of the data and data management principles, this can lead to data quality issues. Examples include not accounting for data modeling in the Software Development Life Cycle (SDLC), poor data integration, non-compliance with data governance (if any), wrong practices such as direct changes in the databases, wrong data cleansing practices, workarounds to make end users happy at the expense of data quality, and insufficient data quality monitoring.


AI and Data Quality

We cannot talk about data without talking about AI! But the focus here is only on how AI can be used to support data quality work. (Of course, AI can also be used to identify patterns and trends in data, help universities make informed decisions, identify potential risks, etc.)

AI can play a significant role in ensuring data quality:

  • AI-based algorithms can help with data profiling, including detecting and correcting data errors, inconsistencies, and duplicates. For example, AI can be used in record matching and entity resolution.
  • AI can be used to validate data by comparing it to predefined rules, such as data types, ranges, and dependencies.
  • AI can help in integrating data from multiple sources and formats in a way that ensures data quality.
  • AI-based tools can help monitor data quality in real-time and raise alerts when issues are detected.
  • AI can help with the classification of data, making it easier to protect sensitive data.
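
As a rough illustration of record matching, the Python sketch below flags likely duplicate person records by comparing normalized names. It uses plain string similarity from the standard library's difflib module, not a trained AI model, so it is only a simplified stand-in for the entity resolution techniques mentioned above. The names, IDs, and the 0.85 threshold are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name):
    """Lowercase, replace common punctuation with spaces, and collapse whitespace."""
    cleaned = name.lower()
    for char in ".,-":
        cleaned = cleaned.replace(char, " ")
    return " ".join(cleaned.split())


def similarity(a, b):
    """String similarity between two names, from 0.0 (different) to 1.0 (identical)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def likely_duplicates(records, threshold=0.85):
    """Return ID pairs whose names are at least as similar as the threshold."""
    pairs = []
    for left, right in combinations(records, 2):
        if similarity(left["name"], right["name"]) >= threshold:
            pairs.append((left["id"], right["id"]))
    return pairs


# Hypothetical records from two systems describing the same person differently.
people = [
    {"id": "A1", "name": "Mohammed Al-Hassan"},
    {"id": "A2", "name": "Mohammad Al Hassan"},
    {"id": "A3", "name": "Sara Lee"},
]

matches = likely_duplicates(people)
```

In practice, flagged pairs would go to a person for review rather than being merged automatically, and matching would usually draw on more fields than the name alone.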


What Can Be Done?

Universities must address the challenges listed above. However, it starts with data governance, as data governance informs and guides all data management activities and work, which should lead to high-quality data that is fit for purpose. Data quality efforts can be performed without data governance, but data quality cannot be sustained without data governance.

Data governance helps to ensure data is entered correctly, and defined, codified, and interpreted consistently throughout the institution. Without data governance, there will not be uniform rules for data collection, storage, integration, protection, retrieval, analysis, and disposal, and business users will not trust the data.

High data quality, informed and guided by sound data governance, will have a positive influence on any higher education institution in terms of student enrollment, student success, efficiency, productivity, innovation, etc. Instead of wasting time and resources checking if the data can be trusted, users can spend more time using and extracting value from the data.

With sound data governance, a university is expected to:

  • Have a culture that fosters cooperation and collaboration between all relevant stakeholders, including the IT department, data management staff, academic departments, and business units.
  • Focus on its critical data, where data quality investments will impact organizational outcomes.
  • Understand that data quality management is a continuous, culture-enabled process that is the responsibility of all staff.
  • Have a single source of truth for any critical data.
  • Remember that preventing data quality issues is less costly and more sustainable than only correcting those issues.
  • Have quality control over data input to stop poor-quality data from causing issues to downstream operations.
  • Define and implement standardization across the different data sources with clear, agreed-on, and documented business definitions.
  • Include data management requirements as a key part of any project and SDLC, and do so from the early stages.
  • Invest in proper system and data integration to eliminate data redundancies and discrepancies.
  • Utilize automated quality management tools, while noting that tools without sound data governance will not help.
  • Recognize the need for having dedicated data quality professionals, separate from its IT department.
  • Boost its overall data management capacity through awareness and training programs.


Concluding Remarks

Data quality is critical for universities to be able to make correct and smart decisions about their programs, students, faculty, and resources. Without high-quality data, universities might make serious mistakes, miss opportunities to improve, or even risk losing their accreditation status. Bad data can also harm their reputation. On the other hand, good data can help improve student outcomes, optimize resource utilization, and inform decision-making.

Robust data governance is needed for sustained data quality, and universities must recognize this need. Universities must begin treating data as an asset that has value and can be used to create value. Universities are encouraged to invest the necessary resources and expertise to achieve this goal.


What do you think?



#highereducation #highered #data #dataquality #dataqualitymanagement #datagovernance #universities #academia

Dr. Mohammed Ali Akour

Higher Education Consultant, Executive Member-Learning Ideas Conference, Member of the International Advisory Council

1 year ago

Excellent article with well-thought-out information. AI and Big Data will be a major concern for competitors in the near future.

"Mohammed Rasol" Al Saidat

Awarded CIO, Digital Transformation Leader, Kaizen 改善

1 year ago

The scenarios mentioned in the article are amazing, Ra'ed Awdeh, PhD, and it is crucial that universities address these issues to ensure better student success, operational efficiency, and informed strategies. It is high time that universities prioritize data quality and invest in measures to improve it.
