Go Digital: Measurements
Chris Leong, FHCA
Director | Advisory & Delivery | Change & Transformation | GRC & Digital Ethics | All views my own
Measurement is something we have been doing for a long time, across all disciplines of life. Given Wikipedia's classical definition of measurement, 'the quantification of attributes of an object or event, which can be used to compare with other objects or events', we can equate these attributes to the data that is collected as a result.
In the digital world, we have data about everything that is represented within it. What we do with these measurement data depends on what they are intended to be compared against.
Measuring performance
When I first came across the concept of Lean Six Sigma a couple of decades ago, I learnt that we cannot improve what we cannot measure. Within the context and aspiration of continuous improvement and quality, this makes perfect sense in every setting: across industries, sports, well-being and our daily lives! Let's look at some performance-related examples.
In all forms of elite and professional sport, performance-related data is captured by a myriad of devices. In Formula One, telemetry data captured in real time from each car provides insights to teams and drivers on how they are performing, as well as valuable data for diagnosing below-par performance so that adjustments can be made. In professional football (or soccer), more and more data is being captured from video content so that in-depth analysis of players' performance can be undertaken.
Within a business, performance is typically measured periodically: key metrics derived from a predetermined set of data are compared against those reported for the previous period, within the context of set targets and objectives.
Such measurements allow strategic and operational execution to be validated, and provide indicators for areas of concern where performance is degrading. As organisations become more digitalised, data availability opens up possibilities for performance to be measured more frequently, where it is beneficial to do so. In an automated and dynamic business environment such as an e-commerce retail platform, you would expect performance to be constantly measured via key metrics, since the underlying data is available, to ensure that customers consistently get the service they expect throughout the customer lifecycle. Any degradation in performance is likely to impact the customer experience and should therefore be dealt with by exception as and when it occurs.
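To make the idea concrete, here is a minimal sketch of exception-based performance monitoring. The metric names and thresholds are illustrative assumptions, not taken from any particular platform.

```python
# A minimal sketch of exception-based performance monitoring for an
# e-commerce platform. Metric names and thresholds are hypothetical.

from dataclasses import dataclass

@dataclass
class MetricReading:
    name: str
    value: float

# Hypothetical service-level floors; breaching one raises an alert.
THRESHOLDS = {
    "order_fulfilment_rate": 0.98,   # fraction of paid orders fulfilled on time
    "checkout_success_rate": 0.995,  # fraction of checkouts that complete
}

def check_metrics(readings: list[MetricReading]) -> list[str]:
    """Return alerts for any metric that falls below its floor."""
    alerts = []
    for r in readings:
        floor = THRESHOLDS.get(r.name)
        if floor is not None and r.value < floor:
            alerts.append(f"ALERT: {r.name}={r.value:.3f} below floor {floor}")
    return alerts

if __name__ == "__main__":
    today = [MetricReading("order_fulfilment_rate", 0.93),
             MetricReading("checkout_success_rate", 0.997)]
    for alert in check_metrics(today):
        print(alert)  # degradation surfaced as it occurs, not next quarter
```

Each reading is compared against its floor as it arrives, so degradation surfaces as an exception at the moment it occurs rather than in the next periodic report.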
I have, unfortunately, had below-par customer experiences on many of these e-commerce retail platforms, which suggests that performance is not being measured despite the underlying data being present. A case in point: orders placed and paid for were not fulfilled or delivered due to breaks in the workflow, yet were not dealt with within a timeframe the platform could commit to, despite complaints. The only resolution offered was a refund.
When algorithmic, AI and autonomous systems are deployed, their performance should be monitored for any deviations from their intended scope, context, nature and purpose. If adequate governance was in place when these systems were approved for deployment, we can expect their outcomes to be fair, accurate, reliable, explainable/transparent, compliant/lawful, safe and secure. We also know that these systems do not produce predictable outcomes all the time, so their performance needs to be continuously monitored.
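One way to operationalise this continuous monitoring is to compare the distribution of a model's live outputs against the distribution observed when the system was approved, for example with the population stability index (PSI). This is a sketch under assumptions: the bin count and the 0.2 alert threshold are common rules of thumb rather than prescribed values, and the beta-distributed scores are synthetic.

```python
# A minimal sketch of monitoring a deployed model for deviation from its
# approved behaviour, using the population stability index (PSI) over the
# model's output scores.

import numpy as np

def psi(baseline: np.ndarray, live: np.ndarray, bins: int = 10) -> float:
    """Population stability index between two score distributions."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    b_frac = np.histogram(baseline, bins=edges)[0] / len(baseline)
    l_frac = np.histogram(live, bins=edges)[0] / len(live)
    # Floor the fractions to avoid division by zero and log(0).
    b_frac = np.clip(b_frac, 1e-6, None)
    l_frac = np.clip(l_frac, 1e-6, None)
    return float(np.sum((l_frac - b_frac) * np.log(l_frac / b_frac)))

rng = np.random.default_rng(0)
baseline_scores = rng.beta(2, 5, size=10_000)  # scores at approval time
live_scores = rng.beta(2.6, 4, size=10_000)    # scores seen in production

drift = psi(baseline_scores, live_scores)
if drift > 0.2:  # common rule-of-thumb threshold for material drift
    print(f"PSI={drift:.3f}: investigate deviation from intended behaviour")
```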
Measuring quality
There are industries where quality is paramount. Often these industries are also regulated, as people and society can be adversely impacted if the acceptable level of quality is not met at all times.
The aircraft industry is a prime example. Mechanical components, software and systems that collectively make up an aircraft must individually and interdependently satisfy stringent quality standards and controls set by the regulator before the aircraft is certified to fly. Thereafter, routine checks must be carried out to ensure the aircraft continues to meet those standards; an operational failure in flight resulting from a failure to check would amount to gross negligence on the part of the airline. Air accidents have, unfortunately, resulted from mechanical, software or systems failures; however, the standards set by regulators are so high that failure rates have been extremely low by comparison.
Quality is also rigorously measured in other regulated industries such as life sciences, where failure can have a direct impact on human lives. Consequently, we have robust and stringent scrutiny of the quality and effectiveness of products before they are certified for use on humans.
The enforcement of quality standards varies from industry to industry. It would be reasonable to suggest that this has been largely driven by the severity of potential impacts on humans.
The financial services industry is about as far from the aircraft industry as you can get. Whilst it is also regulated, the severity of a failure at one investment firm is not comparable to the severity of an airliner failing in flight. There is wiggle room in how regulations can be interpreted, resulting in variations in the way firms are set up to operate. As more financial services organisations digitalise and transform into digital-first organisations, the need to measure quality across these organisations increases by default. Going digital enables seamless and instantaneous engagement, which provides an enhanced user experience; however, any failure is likely to impact the user immediately. Hence the need for higher quality standards to be met.
In my article ‘What does Quality mean?’, I explored the role quality can play in mitigating downside risks from algorithmic, AI and autonomous systems.
Measuring risk
While we measure performance to improve, we also need to measure risks to manage the likelihood of failures or outcomes that can adversely impact us personally, our families, our society, our organisations, our businesses, our environment, our world, and our being.
Various risks are measured, assessed, monitored, managed and mitigated daily and by many. Yet, some risks are not measured, let alone monitored, assessed, managed and mitigated.
You could argue that if we do not know what a risk looks like, how can we measure it? Many have said this about emerging risks related to the use of transformative technologies.
I would argue that any responsible use of such technologies, especially when they have been researched, experimented with and trialled, albeit in specific use cases, should also have considered the downside risks, informed by diverse inputs and multi-stakeholder feedback, before deployment in a public and uncontrolled environment.
Any organisation deploying algorithmic, AI and autonomous systems has a duty of care to ensure that the outcomes from these systems do not adversely impact humans.
There is no reason why downside risks associated with algorithmic, AI and autonomous systems that process personal data cannot be identified, measured, assessed, monitored against prescribed thresholds, managed and mitigated through internal risk, compliance and governance structures, especially when the data is available.
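As an illustration of "monitored against prescribed thresholds", here is a minimal risk-register sketch. The risk identifier, metric, threshold and owner are all hypothetical.

```python
# A minimal sketch of a risk-register entry monitored against a prescribed
# threshold. Names, metrics and thresholds are illustrative only.

from dataclasses import dataclass, field
from datetime import date

@dataclass
class RiskEntry:
    risk_id: str
    description: str
    metric: str        # the measurement used as the risk indicator
    threshold: float   # prescribed tolerance for that metric
    owner: str         # accountable person
    history: list = field(default_factory=list)

    def record(self, when: date, value: float) -> None:
        """Record a measurement and escalate if the threshold is breached."""
        breached = value > self.threshold
        self.history.append((when, value, breached))
        if breached:
            print(f"{self.risk_id}: {self.metric}={value} exceeds "
                  f"{self.threshold}; escalate to {self.owner}")

entry = RiskEntry(
    risk_id="AI-007",
    description="Model declines loans for a protected group at excess rate",
    metric="decline_rate_gap",
    threshold=0.05,
    owner="Chief Risk Officer",
)
entry.record(date(2021, 8, 16), 0.08)  # breach triggers escalation
```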
I accept that some of these systems will not process personal data, and that where they do, the processing may be limited in scope, nature, context and purpose; consequently, their outcomes could be regarded as more consistently predictable. Some will have transparent models and will therefore be more explainable. Many would argue that the downside risks from these systems are minimal; however, they should still be examined, documented and accounted for, in case these risks subsequently manifest into issues.
There will also be systems with black-box models that are difficult to explain. In those cases, the likelihood of them inferring results that are not aligned to their intended scope, nature, context and purpose is material enough to warrant careful examination, incorporating diverse inputs and multi-stakeholder feedback, to identify downside risks. These risks must then be documented and accounted for through measurement, assessment, monitoring, management and mitigation, with adequate controls and governance established.
Furthermore, the accuracy of inferences needs to be accounted for. Inferred data is not fact; hence it is critical that any subsequent decision made from inferred data undergoes risk and impact assessments for the likelihood of unintended consequences materialising.
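A simple way to act on "inferred data is not fact" is to gate automated decisions on the model's confidence and route everything below a floor to human review. This is a sketch under assumptions: the 0.9 floor is hypothetical and would itself come out of a risk and impact assessment.

```python
# A minimal sketch of confidence gating: inferences below a floor are
# routed to human review instead of driving an automated decision.

from typing import NamedTuple

class Inference(NamedTuple):
    subject_id: str
    label: str
    confidence: float  # the model's own estimate; itself only an inference

CONFIDENCE_FLOOR = 0.9  # hypothetical; set via risk and impact assessment

def route(inference: Inference) -> str:
    """Decide whether an inference may drive an automated decision."""
    if inference.confidence >= CONFIDENCE_FLOOR:
        return "automated-decision"  # still logged for auditability
    return "human-review"            # unintended consequences too likely

print(route(Inference("cust-42", "high-churn-risk", 0.74)))  # human-review
```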
The need to comply
For any organisation or business operating in a regulated environment or industry, the need to comply with the respective regulations is part and parcel of being in business. Let's consider the following scenarios:
· Any decisions made by accountable persons following risk and impact assessments of inferred output from algorithmic, AI and autonomous systems need to be documented for auditability and future reference (a minimal audit-record sketch follows this list).
· In instances where risk and impact assessments were conducted, the accountable persons can choose to do nothing, having received advice from their control functions on managing and mitigating the downside risks. This decision to accept the risks will also need to be documented for auditability.
· Finally, in instances where no risk and impact assessments were conducted on the algorithmic, AI and autonomous systems deployed internally within their organisations, or by their third-party providers of information, services and solutions, whose outcomes impact humans, the internal audit function needs to identify this as a risk for the attention of accountable persons.
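Here is a minimal sketch of what such documentation could look like as an append-only audit record, covering the scenarios above. The field names and file-based log are illustrative assumptions, not a prescribed format.

```python
# A minimal sketch of documenting accountable decisions for auditability.
# Field names and the log destination are illustrative only.

import json
from datetime import datetime, timezone

def log_decision(system: str, accountable_person: str, assessment_done: bool,
                 decision: str, rationale: str) -> str:
    """Append an audit record and return the serialised entry."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "system": system,
        "accountable_person": accountable_person,
        "risk_assessment_conducted": assessment_done,
        "decision": decision,    # e.g. "mitigate" or "accept-risk"
        "rationale": rationale,
    }
    line = json.dumps(entry)
    with open("ai_decision_audit.log", "a") as log:
        log.write(line + "\n")
    return line

print(log_decision("credit-scoring-model", "Head of Lending", True,
                   "accept-risk", "Residual bias within approved tolerance"))
```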
Existing regulations such as the GDPR and, for financial services firms operating in the UK, the SM&CR impact organisations that deploy algorithmic, AI and autonomous systems that process personal data. Whilst the GDPR focuses on privacy and data protection, the SM&CR lists conduct rules which 'are intended to improve standards of individual behaviour in financial services from the top-down and the bottom-up.'
Emma Parry, Culture and Conduct Senior Advisor at ChangeGap, provides her views on what financial services firms leveraging AI and personal data to engage with their customers need to do within the context of the SM&CR:
“One of the core principles of SM&CR is that of 'reasonable steps'. So, if for example, there were to be an escalation to the relevant accountable executive concerning AI / algorithmic outputs which could cause detriment to customers, then the next 'reasonable step' would be for the accountable executive to take appropriate actions. Then, and just as importantly, to ensure that those actions are recorded and managed (via a clear audit trail) to completion.
Of course, there may be instances where an escalation to the accountable executive is not forthcoming from the relevant team. Perhaps there is a culture where raising 'issues' is not encouraged; perhaps the team is scared of being blamed for making a mistake. Hence, on a broader point, what we need to see are cultures where there is psychological safety to speak up.
What we also need across the industry are robust and effective whistleblowing mechanisms and for those processes to operate without retribution. Sadly, there are too many examples where employees have raised a whistleblowing case only to be penalised as a result - in the worst cases, losing their jobs.
Top performing firms are those that have strong and effective governance. They are measuring and tracking the right metrics, but they also have cultures that encourage inclusive and diverse insights, opinions and constructive challenge, and certainly, that have open cultures that actively encourage speaking up.”
The trouble with AI
The trouble with AI is that it is, at present, far from accurate and reliable in complex settings, even when its scope, context, nature and purpose are well-defined.
Despite Tesla being ahead of the pack when it comes to autonomous vehicles, their loyal owners should take note of the news this week that 'The US federal agency in charge of road safety is opening an official investigation into Tesla's "self-driving" Autopilot system.'
The key for any organisation choosing to deploy algorithmic, AI and autonomous systems is first to understand their limitations, and to conduct proper due diligence on whether these technologies are necessary and appropriate for solving the problem at hand while delivering value to all stakeholders.
Organisations that use any of these systems in services that interact with humans should seriously consider providing explanations of, and transparency into, how results were reached. For example, where psychometric assessments are used early in the screening of candidates for roles, a summary of the candidate's profile based on their responses to a set of prescribed questions is provided, but without any correlation to the profile sought for the role. Whilst this benefits the recruiter, it provides no value to candidates who have invested their time and effort but were not selected to progress.
Measurements are critical indicators that we can and must continue to use and derive insights from, to maintain a high standard of quality, manage downside risks and achieve the performance levels we strive for, especially in the digital world where data is everywhere.
ForHumanity examines the application of AI and autonomous systems when they present a risk to humans, the environment, or societal systems. It advocates the adoption of its Independent Audit of AI Systems (IAAIS), which can be adapted and adopted for the Governance, Accountability and Oversight of AI and Autonomous Systems, providing an infrastructure of trust and a comprehensive risk management framework.
Innovation is not about needing to use the latest technology in the first instance; it is very much about solving the problem in a simpler and better way that is also transparent, explainable, compliant, safe and secure for consumers, society, humans and the environment. You can read my article about responsible innovation here.
If algorithmic, AI and autonomous systems are necessary, you will need to understand how to deploy them responsibly, ethically, lawfully, transparently, with explainability, reliably, accurately, sustainably, safely and securely so that you can deliver trustworthy solutions that benefit rather than disadvantage, discriminate or harm humans.
A comment from a reader working in emerging technology and quality infrastructure adds:
"Agree, although I would add that the actual measurements are well understood (ROC, AUC, F1, FP, FN, TPR, FPR, etc.) and you can find these in ISO/IEC DIS 23053 or, from January, in ISO/IEC TR 24027 (Bias in AI). You can combine these with measures of fairness from the literature. Where it gets complicated is in understanding how those apply to different groups, or why. An example of this I am currently pondering is where full-body computer vision technology applies differently to people walking with a different gait. It's easy to measure the accuracy overall, but in order to measure how it is affected by gait, I need to first classify and determine each person's gait in my test set. Another example is in determining the representativeness of a dataset. There are many use-case-specific factors in determining this that are not obvious."
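The disaggregation point in this comment can be illustrated with synthetic data: overall accuracy can look healthy while a group-level view, which first requires each test example to be labelled with its group (here, a hypothetical gait class), reveals a disparity. All data below is simulated for illustration.

```python
# A minimal sketch of disaggregated accuracy: overall accuracy hides a
# disparity that only appears once each example is labelled with its
# group (a hypothetical gait class). All data is synthetic.

import numpy as np

rng = np.random.default_rng(1)
n = 1_000
y_true = rng.integers(0, 2, n)
gait = rng.choice(["typical", "atypical"], n, p=[0.9, 0.1])
# Simulate a model that is worse on the under-represented gait class.
err = np.where(gait == "atypical", 0.25, 0.05)
y_pred = np.where(rng.random(n) < err, 1 - y_true, y_true)

print(f"overall accuracy: {(y_pred == y_true).mean():.3f}")
for g in np.unique(gait):
    mask = gait == g
    print(f"{g:>8}: accuracy {(y_pred[mask] == y_true[mask]).mean():.3f} "
          f"(n={mask.sum()})")
```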