You Keep Using That Word, I Do Not Think It Means What You Think It Means

Not the snappiest title ever, but any reference to The Princess Bride has to be cut a certain amount of slack! As must I, given the previous blog post was published some eighteen months ago. In my defence, the last couple of years have been crazy busy, which has also generated a whole load of content I’m keen to share.

Starting with an oldie but a goodie, chosen only because I finally fixed the Business Glossary template on my website linked to in this article. If you’ve not read that, it offers lots of good advice on the how and why of generating, maintaining and using a glossary as part of your data governance (DG) efforts.

Today, we’re going to focus a bit more on the generation phase, specifically on common mistakes and a little best practice. The picture at the head of this post explains the problem. Language is messy. It has evolved over many generations, and while there are agreed rules and taxonomies, many words and phrases fall outside of agreed norms. I’m no expert on language evolution, but from a DG point of view, two common themes emerge:

  1. Synonyms: ‘We all know what we mean by a student, don’t we?’ That’ll be a no then! There’s a danger in thinking that what we call something is universally adopted or even understood. Reasons are legion: different experiences, cultures, specialities, nationalities and even age. From our picture we can see Villa has four valid synonyms for a roofed structure people mingle in*. The precise definition for each one, however, is very different. This is the classic ‘describe a fork’ problem, to which someone answers ‘a piece of cutlery to eat with’, leaving us to ponder if they mean spoon, fork, knife, spork, etc.
  2. Cohorts: ‘How many students do we have?’ Never have the words ‘it depends’ done quite so much heavy lifting! There is much complexity in how we count things, and that has implications for a trusted glossary. We need agreed definitions to help us determine cohorts: ‘part time, full time, funded, domicile, learning path, withdrawn, deferred, etc, etc, etc’. But it’s easy to fall into the trap of subdivision by edge case. FTE is another great example. We need enough definitions to get us through the day, but not so many that they become unusable.

This is not a definitive list! Antonyms are far less common but no less problematic. Reverse engineering data dictionary terms brings many issues when trying to consolidate them into a single definition. Attempting to manage external and internal definitions of ‘the same thing’ is not trivial. Even knowing who the right people are to craft a definition isn’t easy.
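To make the ‘How many students do we have?’ problem concrete, here is a minimal Python sketch. Everything in it is invented for illustration: the four student records, the field names and the three cohort rules are assumptions, not definitions from any regulator or from the glossary template. It simply shows the same data giving a different answer for each definition you pick.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    """Hypothetical student record; field names are illustrative only."""
    student_id: str
    mode: str      # "full_time" or "part_time"
    status: str    # "active", "withdrawn" or "deferred"
    fte: float     # full-time equivalent load


# Made-up sample data, not real figures.
STUDENTS = [
    Student("S1", "full_time", "active", 1.0),
    Student("S2", "part_time", "active", 0.5),
    Student("S3", "full_time", "withdrawn", 1.0),
    Student("S4", "part_time", "deferred", 0.5),
]

# Each named cohort is a rule that decides whether a student is counted.
COHORT_DEFINITIONS = {
    "headcount_all": lambda s: True,
    "headcount_active": lambda s: s.status == "active",
    "headcount_full_time_active": lambda s: s.status == "active"
    and s.mode == "full_time",
}


def count(definition_name):
    """Count students matching one named cohort definition."""
    rule = COHORT_DEFINITIONS[definition_name]
    return sum(1 for s in STUDENTS if rule(s))


def active_fte():
    """Sum FTE rather than heads, for active students only."""
    return sum(s.fte for s in STUDENTS if s.status == "active")


if __name__ == "__main__":
    for name in COHORT_DEFINITIONS:
        print(name, count(name))
    print("active_fte", active_fte())
```

Four records, four defensible answers (4, 2, 1 and 1.5). None of them is wrong, which is exactly why the definition has to be agreed and written down before anyone starts counting.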

Some best practice can help here.

  1. Prioritise. We’ll be back to the importance and process of identifying material or critical data in a future post. Right now, it’s enough to know we need to pick those terms that are going to help us with problems we already have, or projects we’re embarking on. Data Futures is a great example of both of those! It’s also a useful exercise in understanding when to use an external or internal definition. We advocate using the external definition UNLESS there is a good reason not to. It’s often a cohort or counting issue, but not always. It is important to record every synonym and version of a term, and where each one is used.
  2. Write great definitions. Sounds obvious, but it isn’t easy. There is a lot of advice online, from dodgy YouTube videos to full-on paid-for courses. Our advice is to understand all the places a term is used, get those people in a room and generate something that works for you. There are rules and good practice (eg do not use system terms, avoid statuses, be unambiguous, etc) but mostly it’s about getting something that’s good enough.
  3. Consult widely but set a deadline. Business Glossaries fail for many reasons. Two of the most common are a lack of prioritisation and death by edge case. Not every use case can be accommodated, so make that clear in the definitions and move on. We try to publish iteratively on, say, a monthly cycle, prioritising as we go. Sign-off should be with the Data Owner, and definitions are the purview of the Stewards. Trying to create a glossary without recognised DG roles (eg built out of technology groups) is rarely successful.
  4. Publish and be damned. Or ‘Sunlight is the best disinfectant’. Two key points here. First, publish with a robust and transparent challenge process. It is far easier to manage disagreement than apathy. Second, do not abandon all your good work to a spreadsheet. There are so many ‘free’ tools we have access to now that offer far better interfaces and access: SharePoint, wikis, whatever is new in Teams this week. The tool isn’t important, but its prominence and usability are.
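For anyone wondering what a glossary entry might actually hold once the four steps above have been followed, here is a rough Python sketch. It is an assumption about a sensible shape, not the structure of the Business Glossary template mentioned earlier: the field names, the example term and the role names are all hypothetical. It records the synonyms, the external or internal source, the Owner and Steward, where the term is used and what it deliberately does not cover, and it refuses to let one label point at two different terms.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GlossaryTerm:
    """Hypothetical glossary entry; every field name is illustrative."""
    name: str
    definition: str
    source: str          # "external" or "internal"
    data_owner: str      # signs off the definition
    data_steward: str    # drafts and maintains the definition
    synonyms: List[str] = field(default_factory=list)
    used_in: List[str] = field(default_factory=list)
    exclusions: List[str] = field(default_factory=list)  # known edge cases left out


def build_index(terms: List[GlossaryTerm]) -> Dict[str, GlossaryTerm]:
    """Map every name and synonym (case-insensitive) to its term.

    Raises ValueError if the same label is claimed by two different terms,
    so clashes surface at publication time rather than in a report.
    """
    index: Dict[str, GlossaryTerm] = {}
    for term in terms:
        for label in [term.name, *term.synonyms]:
            key = label.strip().lower()
            if key in index and index[key] is not term:
                raise ValueError(
                    f"'{label}' is used by both "
                    f"{index[key].name!r} and {term.name!r}"
                )
            index[key] = term
    return index


# Invented example entry, not an agreed sector definition.
student = GlossaryTerm(
    name="Student",
    definition="A person registered on a programme of study.",
    source="internal",
    data_owner="Registrar",
    data_steward="Student Records",
    synonyms=["Learner", "Registrant"],
    used_in=["Statutory return", "Finance dashboard"],
    exclusions=["Applicants who have not enrolled"],
)

index = build_index([student])

if __name__ == "__main__":
    print(index["learner"].name)
```

Looking up ‘learner’ or ‘Registrant’ lands on the same Student entry, and the exclusions field is where ‘not every use case can be accommodated’ gets written down rather than argued about again next month.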

This is a big topic and we’ve barely scratched the surface here. Hopefully, though, it provides the start of a framework to deal with the inevitable issues of creating a university-wide asset that has stakeholders with many different views. Trust us here, it is worth the effort!

A respected business glossary is one of the four pillars of unleashing data’s superpower, that of utility: create once, use many. The other three are quality, literacy and culture. We’ll be back to all of those later this year!

*Except for Villa from Blake’s 7. That’s just me being geeky!

John Britton

Consultant - Education, Regulation, and Local Government

2 years ago

Pertinent as ever! Today I'm working with a lovely piece of American software that keeps wanting to enforce a change in my UK local authorities source data from ‘Wirral’ to ‘Wirral and Enniskillen’. Slightly frustrating that it thinks a Canadian parish with a population of 204 should supplant a local authority with a population of 320-odd thousand in a data set that is otherwise composed of UK local authorities (other than the other ones it's tried to change to random places!).

Andy Youell

Higher education data, systems and regulation; advisory, training and interim.

2 years ago

Great article - of course!

Andrew Reynolds CMgr FCMI

HE Statutory Reporting, Business Intelligence and Project Management

2 years ago

Missed it last time. What about Pancho? Also, Happy New Year, Alex.
