Aria ranks #1 among open-source models in VideoAutoArena, a great arena-style benchmark for evaluating open-ended video analysis!
Co-founder & Multimodal Chief Scientist @Rhymes.AI | Ex Senior Research Manager @Salesforce | Aria and BLIP series
Introducing VideoAutoArena: an automated and scalable arena-style benchmark for video understanding! Unlike previous benchmarks that mostly rely on multiple-choice questions, VideoAutoArena explores open-ended, complex video analysis while aiming to align closely with human judgment.

What makes VideoAutoArena unique?
- A scalable user-simulation framework to generate user-centric, open-ended questions.
- A fault-driven evolution strategy to increase question complexity.
- Insights into strengths and weaknesses of state-of-the-art LMMs in video understanding.

Project page: https://lnkd.in/gyShtv3p
Paper: https://lnkd.in/gpDqhfrX

Led by Ziyang Luo, joint work with Haoning Wu, Dongxu Li, Jing Ma, and Mohan Kankanhalli