Facebook releases a test bench that pits AI against humans
The standard practice for machine learning models is to train and evaluate on existing data sets such as ImageNet, a public data set of more than 14 million images used to test image recognition algorithms, MNIST for handwritten digit recognition, or GLUE (General Language Understanding Evaluation) for natural-language processing. Benchmarks like these have helped lead to breakthrough language models such as GPT-3.
A fixed target soon gets overtaken. ImageNet is being updated and GLUE has been replaced by SuperGLUE, a set of harder linguistic tasks. Still, sooner or later researchers will report that their AI has reached superhuman levels, outperforming people in this or that challenge. And that’s a problem if we want benchmarks to keep driving progress.
To overcome the limited data set problem, Facebook has come up with a solution that uses people to interrogate AIs. It is just like trying to measure human intelligence: you need to talk to people and ask them questions to test their IQ. The result is called Dynabench.
It is a first-of-its-kind platform for dynamic data collection and benchmarking in artificial intelligence. It uses both humans and models together “in the loop” to create challenging new data sets that will lead to better, more flexible AI.
Dynabench radically rethinks AI benchmarking, using a novel procedure called dynamic adversarial data collection to evaluate AI models. It measures how easily AI systems are fooled by humans, which is a better indicator of a model’s quality than current static benchmarks provide. Ultimately, this metric will better reflect the performance of AI models in the circumstances that matter most: when interacting with people, who behave and react in complex, changing ways that can’t be reflected in a fixed set of data points.
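To make the idea of measuring "how easily AI systems are fooled by humans" concrete, here is a minimal sketch of one collection round in Python. It is not Dynabench's actual code or API: `toy_model`, `collect_round`, and the sample sentences are all hypothetical, and the "model" is just a keyword rule standing in for a real classifier.

```python
from typing import Callable, List, Tuple

# One human attempt: (text written by the person, the label the person says is correct)
Example = Tuple[str, str]


def toy_model(text: str) -> str:
    """Hypothetical stand-in model: says 'positive' whenever 'good' appears."""
    return "positive" if "good" in text.lower() else "negative"


def collect_round(
    model: Callable[[str], str], attempts: List[Example]
) -> Tuple[List[Example], float]:
    """Keep the examples that fooled the model and report the share that did."""
    fooled = [(text, label) for text, label in attempts if model(text) != label]
    rate = len(fooled) / len(attempts) if attempts else 0.0
    return fooled, rate


# Hypothetical attempts by human annotators trying to trip up the model
attempts = [
    ("This film is good.", "positive"),
    ("Not good at all.", "negative"),
    ("An absolute delight.", "positive"),
    ("Terrible pacing.", "negative"),
]

fooled, rate = collect_round(toy_model, attempts)
print(f"Fooled on {len(fooled)} of {len(attempts)} attempts ({rate:.0%})")
```

In this sketch the fooling examples would become the harder data set for the next round, so the benchmark keeps moving as models improve instead of staying fixed.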
More detailed information is available at this link.