Doing things the way it's always been done, but better (Qualified)
Scientists study the Harwell computer tape. Copyright: Express & Star Newspaper Ltd

Acknowledgements: With kind thanks to Linda Raftree, Rick Davies and Kim Forss for their encouragement, support, feedback and critique.

If you would like to catch up with the previous sections, please follow the links...

~ Part One ~ Computation? Evaluate it!

~ Part Two ~ Distance Still Matters

This article is part three of a series exploring the links between evaluation and technologies. It explores the gulfs between the digital data-centric and analogue data-centric worlds, visible not only in workflows but in the very fabric of the buildings where we work. In the evaluation world we collect and analyse data, but arguably it has not occurred to us that the massive body of work we have produced is itself a dataset, one whose sheer unstructuredness repels new technologies' attempts to mine it. This calls for cooperation between social and data scientists to make use of it.

With apologies to Mr. Adams...

“I love paradigm shifts. I love the whooshing noise they make as they go by.”

Qualifying it...

One of the things that strikes me as interesting, when reflecting on how we use and apply technology, is how fast things move and how messy things get. To illustrate this, I found it helpful to think about how some bleeding-edge technology companies organise. If you have the opportunity to chat with folks who work in computer games development, or high-end film and TV post-production, you can discern that the work pipeline is all-important. The facilities and the organisation of work are built around the application of skills and techniques through the pipeline. Much like the factory model of manufacture, data is collected, generated, documented, processed, evaluated, refined, modelled, organised, reported and stored (amongst other things). And whilst we may not think of the material of that work as data, it actually is data. In some cases the building is specially built, or at the very least its fabric significantly adapted, to ensure that the networked computing infrastructure can do what it needs to, much more so than the basic needs of an average office. It demonstrates how important computing is to these businesses.

Equally, the pipeline of tools which are adapted to each project provides an end-to-end data system that includes everything to facilitate the production of the digital products. There are teams of programmers and engineers, building, maintaining, refining and developing within each project, returning lessons learned and innovation back to the wider company. Having had a peek inside one or two of these facilities, and having friends who work within them, I have glimpsed a working environment that holds data as central to the work.

This (poor) description of an industry I barely comprehend gives me pause for thought in relation to what my work and sector look like. Whilst there is obviously variation, I would wager that most organisations and companies in the international development sector (with the exception of some very large providers and niche innovators) do not, most of the time, resemble post-production studios. More likely, we would describe a fairly basic office environment, with laptops, some servers, some systems and workflows, and probably overworked IT staff, scratching their heads in bemusement and exercising unimaginable patience with colleagues (like me, sometimes). I would posit that (for obvious reasons) our sector is not so data- and tech-centric by design.

Switching downwards, from buildings and infrastructure to workflows, before I delve a little more into machine learning, I will explore a specific aspect of qualitative practice. And I will further illustrate how we have probably missed out a bit in our work practices (or maybe I am just re-framing a pet annoyance): we don't code well enough! By this I mean we don't code the literature and reports that we create. We are (probably) all familiar with coding when conducting literature reviews, and some of us are moving towards newer tools and processes. In that sense we are coding other people's work for our own purposes. When was the last time you found a document that the author had coded as part of their process, making your life that much easier? Of course, I appreciate that when we code documents for our own specific inquiry, we cannot expect the author to have predicted what we would look for and anticipated our coding needs. But an author's coding efforts would make the document a lot easier to explore.

OK, so it is not completely true that we have nothing to go on - the table of contents does provide an index of sorts, so it can be considered structural coding to a point. And there is usually a list of acronyms, and a bibliography, plus some annexes, which gives us the basic 'academic' report layout and provides us with something to work with.[58] But it is easy to envisage a blockchain-based publication system where original documents retain their veracity, displaying their chain of custody of data, and the community of practice can upload and link parallel codings and derivative data and documents, along with citations that can be easily tracked and mapped. The adoption of a networked, multi-layered publication format that is both human- and machine-readable seems pertinent and timely.
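
To make the idea concrete, here is a minimal, purely hypothetical sketch of what a machine-readable, multi-layered coded report fragment could look like, expressed as a Python data structure. Every field name, code label and reviewer identifier is my own illustrative invention, not an existing standard:

    # A hypothetical sketch of a coded report fragment; all field names
    # and code labels here are invented for illustration only.
    coded_report = {
        "doc_id": "eval-2019-001",
        "sections": [
            {
                "heading": "Findings",
                "text": "Interview data were cross-checked against survey results...",
                "author_codes": ["evidence", "triangulation"],
                # Parallel codings layered on later by the community of practice:
                "community_codes": {"reviewer-42": ["data-quality"]},
            },
        ],
    }

The particular format matters far less than the principle: author-applied and community-applied codes travelling with the document, readable by both humans and machines.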

So why am I reflecting on this? I design and direct a number of research and evaluation quality assurance systems. I am often looking for evidence within an evaluation product - evidence of triangulation, for example - so that I can qualify its quality. It would be super useful if triangulation were labelled 'triangulation', evidence labelled 'evidence', and so on. In fact, I would like reports to be very structured, and, if coming from the same organisation, consistently structured across all of the organisation's projects. More than that, I would like the data structure to be evident throughout the project lifecycle, from the Terms of Reference, through inception and mid-term, all the way to the final evaluation report. It would be the most marvellous experience to be able to track the qualities, developments, methods, evidence, evolutions, changes and derivations all the way through the process, and to do so coherently and easily for all concerned. Well designed and consistent coding alone, in my view, would revolutionise this work. If you have ever tried to prep evaluation reports for machine learning, you might find, like me, that the transformation of a PDF document into usable data is hugely frustrating, and anything that structures the data and makes it easier to process is highly welcome.
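
As a small illustration of that PDF frustration, here is a minimal sketch of the very first preparation step - bulk-extracting raw text from a folder of PDF reports - using the open-source pdfminer.six library. The folder names are assumptions for the example:

    from pathlib import Path

    from pdfminer.high_level import extract_text  # pip install pdfminer.six

    def pdf_reports_to_text(report_dir: str, out_dir: str) -> None:
        """Extract raw text from each PDF evaluation report for later analysis."""
        out = Path(out_dir)
        out.mkdir(parents=True, exist_ok=True)
        for pdf in sorted(Path(report_dir).glob("*.pdf")):
            text = extract_text(str(pdf))  # headings, tables and layout arrive as flat text
            (out / f"{pdf.stem}.txt").write_text(text, encoding="utf-8")

    pdf_reports_to_text("evaluation_reports", "extracted_text")

Even this step throws structure away: headings, tables and footnotes all come out as undifferentiated text, which is precisely why consistent up-front coding would help.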

This seems particularly relevant when looking at a body of work where the outputs form a portfolio of project cases. Cross-case analysis then becomes more feasible, less arduous and probably of better quality. A well designed and consistent workflow also enables structured machine learning to return better results, along with a whole host of other opportunities.

Exploring this helps me elucidate my main point: technology development is not just about hyped frontier technologies, but about the much needed investment in people, skills and facilities that hold the ability to effectively manage and deliver our development initiatives. The 'obvious reasons' I mention above, in stating that our sector is not tech-centric by design, are part and parcel of the goals and visions of the international development sector. We are often trying to help others develop their communities, very often with rudimentary technologies to meet basic needs like access to clean water. In the current operating environment, resources are limited and the business case for spending is usually based on (strong or some semblance of) assurances of results.

The cost of technology development and the tech model of business is usually contingent on factors such as mass user take-up, venture capital and advertising revenues. If an NGO, government or multilateral were spending vast amounts of money on technology and facilities, their constituents would raise not only eyebrows but questions in boardrooms, parliaments and assemblies. The traditional development industry doesn't have the money or the remit; the main option is public-private partnerships. Our current business model is small innovation units, start-ups or partnerships with established players, and the results can be mixed. We are working with two models of development - technology development and traditional international development. We have begun to play well together, and the tension and friction are creative at the moment, but the two models have fundamental paradigmatic paradoxes.

The critical flaw in my perfect data world, as sketched briefly above, is the messiness in the world that I mentioned. Even if we try to standardise workflows, ensure best practice of data integrity across all project aspects, and substantially improve our lot, we will still have different systems disaggregated across different geographies, languages, thematic areas and so on. We will still need to accommodate the uniqueness of every context, project and evaluation. Even well designed data workflows that translate across group systems will still have flaws and errors, and we will certainly add noise to the dataset in numerous ways.

So, in qualitative research and analysis terms, we get to where machine learning has already made great strides. Technology-assisted review (TAR) / eDiscovery / predictive coding has come on in leaps and bounds in the legal sector, for example.[59] The previous gold standard of human-only review has been exposed as a myth where it has been pitted against machine learning solutions.[60] By having experienced analysts carefully code documents for structured machine learning, legal document review is now potentially far better, and comes at a much lower cost to the client (notwithstanding considerable investments by the company), presumably leading to better outcomes. Based on the concepts of Natural Language Processing (NLP), we seemingly find something near to a holy grail when we can train a machine to accurately match queries, find patterns and generally make very large language datasets easy to interrogate.
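
To give a flavour of what predictive coding involves under the hood, here is a minimal sketch using scikit-learn: analysts label a set of passages, a classifier learns from them, and the model then scores unseen passages. The toy passages and the 'triangulation' labels are my own invention; real TAR systems learn from thousands of reviewed documents:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # A toy training set: 1 = analyst coded the passage as evidence of triangulation.
    passages = [
        "Findings were triangulated across interviews, surveys and site visits.",
        "The project was launched in 2015 with three implementing partners.",
        "Qualitative claims were cross-checked against the monitoring data.",
        "The budget was disbursed in two tranches over eighteen months.",
    ]
    labels = [1, 0, 1, 0]

    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    model.fit(passages, labels)

    # Score a new, unreviewed passage.
    print(model.predict_proba(
        ["Results were corroborated using two independent data sources."]
    ))

In production systems the human stays in the loop, reviewing the model's most uncertain predictions and feeding the corrections back into training.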

Whilst we can look to the legal sector's development and use of TAR, both as a practice and as a pipeline of work and skills, we can also look to the bleeding edge of NLP to see where this is heading (and where it already is). NLP seems to have taken some great steps towards better language modelling last year, with the release of a number of technologies that greatly improve its accuracy.[61] I am not nearly expert enough to explain the intricacies of these technologies (I am currently struggling to train a model to recognise triangulation in evaluation reports). But I observe that the approaches start from a pre-trained language model, trained over months on large corpora of unlabelled text. As a learning process this amounts to what some are describing as an ImageNet moment for the field.[62] These forays have yielded record-beating approaches in testing, showing significant improvement on the previous generation of machine learning.[63] With these models made available, we can input clean data and gain semantic, contextualised results. So it will certainly begin to help with my triangulation training model.
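
As a sketch of what 'pre-trained' means in practice, the snippet below pulls a contextual sentence vector out of a published BERT model via the Hugging Face transformers library. The model name is just one commonly available choice, and the sentence is invented:

    import torch
    from transformers import AutoModel, AutoTokenizer  # pip install transformers torch

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")  # months of pre-training, downloaded in minutes

    inputs = tokenizer(
        "The evaluation triangulated interview and survey data.", return_tensors="pt"
    )
    with torch.no_grad():
        outputs = model(**inputs)

    # One contextual vector for the sentence, averaged over its tokens.
    sentence_vector = outputs.last_hidden_state.mean(dim=1)
    print(sentence_vector.shape)  # torch.Size([1, 768])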

For example, a semantic context search engine is now easily available to us. With a pre-trained, ready-to-use model we can get great results.[64] But equally, on a project, sector or organisation basis, we can also invest in training our own model on our existing corpus of texts. This gives us the externally provided general pre-trained model, plus a specific pre-trained model of our own, with which we can interrogate documents and perform auto-coding and other pattern recognition and NLP tasks. If you have followed along with my poor explanations... this is a really quite huge step forward, and it is here, now.
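
A minimal semantic search sketch, assuming the sentence-transformers library and one of its published general-purpose models; the corpus sentences are invented for illustration:

    from sentence_transformers import SentenceTransformer, util  # pip install sentence-transformers

    model = SentenceTransformer("all-MiniLM-L6-v2")  # an externally provided pre-trained model

    corpus = [
        "Findings were cross-checked against the household survey.",
        "The inception report was delivered three weeks late.",
        "Multiple data sources were compared to validate the conclusions.",
    ]
    corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

    # The query matches on meaning, not keywords - a highly ranked sentence
    # need not contain the word 'triangulation' at all.
    query_embedding = model.encode("evidence of triangulation", convert_to_tensor=True)
    hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)

    for hit in hits[0]:
        print(round(hit["score"], 2), corpus[hit["corpus_id"]])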

Given that our sectors have done a very good job of reporting, recording and publishing our work for decades, we have produced a (really) huge corpus of publicly available knowledge that is only basically structured. It is unfeasible (but not impossible) that we would employ armies of document coders to go back through these mountains of work.[65] But with structured, semi-structured and unstructured auto-coding, it might very well be possible, or already happening. With so much of our output transparently available on the net, that data is available to the AIs that can and do crawl the web and process what they find. Unlike the for-profit corporations, we do not guard our published data or tightly control access and use. Like the giant tech companies, we aim to provide services to the world. Unlike the giant tech companies, we mostly do not monetise access to the data and resources we have published.

That data now has a marketplace, and data has a price and a cost. We seem to need the technologists to help us process and utilise the data that we ourselves have gathered and analysed. We also need to access data that other organisations and people have produced. A private enterprise that places a satellite in space has a means to recoup its investment by selling access to the raw data. For a donor organisation operating for the social good, the return on investment is in the development and progress of the communities it works with; data, in that case, is most often a product made available for all. We have data. They have tech. We give our data away freely, and they process it probably better than we can. As I alluded to, because we have not established structured pipelines and data standards that can future-proof our work, even within single organisations, our ability to utilise our own data as fully as we might is limited. We need to respond to this together. Significant questions remain over data ownership, access and usage, as well as over access to tech and technologists, and over investment in the varying business models of the development sector. If the global networking project increasingly becomes (as it seems to have) both public utility and renewable resource,[66] and therefore both a 'right' (formerly) and a 'raw material' (latterly), we need to think about what that means for people and their big and wider data.[67]

What's next?

Part 4 considers how we deal with quantitative data, and asks whether we can make better use of data for decision making. The tensions for evaluators arise with the introduction of algorithms, and mysterious "black boxes". But we also need to think about how humans’ assumptions and biases leak into our technologies, whether digital or analogue.

If you would like to catch up with the previous sections, please follow the links...

~ Part One ~ Computation? Evaluate it!

~ Part Two ~ Distance Still Matters

About the author

Jo Kaybryn is an international development consultant, currently directing evaluation frameworks, evaluation quality assurance services and leading evaluations for UN agencies and INGOs. “All thoughts presented are my opinion only. Mentions of colleagues past and present should be taken as recommendations, as I have always gained much from working with them. No one should take investment advice or suggestions in this series: none is intended, the series aims to present reflections, and add to the conversations in and around evaluation and frontier technology.”

References

[58] It was brought to my attention that there is a controversial internet meme relating to coding currently trending. I do not engage with much of the internet and social media, so I don't fully understand what it means. However, I am told that 'learn to code' is being used in reference to the technologization of jobs and industry (and to act unkindly). So it may be tangentially relevant, but no reference to any meme is intended.

[59] Thomas C. Gricks III and Robert J. Ambrogi, A Brief History of Technology Assisted Review, 2015

[60] Maura R. Grossman & Gordon V. Cormack, Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, Richmond Journal of Law & Technology, Vol 17, Issue 3, 2011

[61] Jay Alammar, The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning), 2018

[62] Andrey Kurenkov, Eric Wang, and Aditya Ganesh (eds.), NLP's ImageNet moment has arrived, 2018

[63] Andrey Kurenkov, Eric Wang, and Aditya Ganesh (eds.), ibid.

[64] Josh Taylor, ELMo: Contextual language embedding, 2019 

[65] A question would be: would we even want to, considering the enormous variability in quality? Thanks to Dr. Rick Davies for pointing out that it’s not necessarily a pile of gold that we’re looking at.

[66] “To be sure there are always sound business reasons for hiding the location of your gold mine. In Google’s case, the hiding strategy accrued to its competitive advantage, but there were other reasons for concealment and obfuscation. What might the response have been back then if the public were told that Google’s magic derived from its exclusive capabilities in unilateral surveillance of online behaviour and its methods specifically designed to override individual decision rights?” Shoshana Zuboff, 2019, ibid.

[67] Antonella Bonanni, Why Exploring Thick Data Helps to Understand Human Motivation, 2019
