2020 For Course5 Artificial Intelligence Labs: The Year In Review
Tamal Chowdhury, Ph.D.
CTO | Computer Scientist & Mathematician | Artificial Intelligence R&D | Autonomous Computing | Product Engineering
2020 has been one of the most tumultuous years in modern history. The global economy has already contracted by over 4%, human lives and businesses have been severely disrupted, and, at the same time, digital adoption is steadily accelerating in most industries. A high degree of uncertainty still prevails globally, and even after the situation stabilizes, the new world order will be distinctly different from the pre-Covid one.
Amidst this global turbulence, the engineers, researchers and scientists at Course5 AI Labs worked extremely hard to ensure that planned product releases, client commitments, and other deliverables were not impacted. Our efforts were predominantly focused on AI research & development, and on engineering the company's flagship products & platforms. This paper discusses the major technical areas that received most of our focus in 2020, and briefly shares our plans for 2021.
Major AI Research & Engineering Areas In 2020
Seven key focus areas were identified for the year 2020. While the first four were largely new areas of research and development, the others were a continuation of our 2019 efforts.
- Advanced Object Detection
- Human Action & Emotion Recognition
- Neural-Symbolic Reasoning
- Anomaly & Causality Discovery In Noisy Temporal Data
- Cloud-Native AI Development
- Efficiencies In AI Operationalization
- Model Interpretability & Explainability
Focus Area 1: Advanced Object Detection
One of our major focus areas in 2020 was to expand the capabilities of our existing object detection systems. Four important aspects of our R&D in this area are highlighted below.
- Small-object and 3D-object detection: While modern object detectors perform well on regular-sized 2D objects, they often perform poorly on smaller objects. Similarly, 3D objects do not follow any specific orientation, which poses considerable challenges in detecting them. These limitations are compounded in video data by the added complexity of temporal dependencies, and our research was primarily focused on addressing these challenges.
- Low-shot/Few-shot object detection: Most deep learning-based detectors require large corpora of labeled/annotated data for high performance, which is expensive, inefficient and time-consuming. Low-shot/few-shot detection techniques help to address this. Our work largely focused on semi-supervised and weakly-supervised approaches, and addressing class, scale and spatial imbalances.
- Anchor-free methods: Anchor-free detection techniques (e.g., keypoint-based or center-based) eliminate many of the limitations of anchor-based methods, such as the need for anchor-related hyperparameter setup. However, most anchor-free detectors today exhibit average to mediocre performance during production inference, particularly for large-scale workloads. Our research focused on addressing these problems (a minimal decoding sketch follows this list.)
- Deep contextual encoding: Most object detectors do not efficiently capture contexts in computer vision data (e.g., the relationship between different objects, spatial correlations of objects, or the object-text similarities in video frames.) Context-awareness is critical to building efficient AI systems, and our research focused on capturing and encoding deep contexts in video data.
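To make the anchor-free idea concrete, below is a minimal PyTorch sketch of center-based detection decoding, in the spirit of keypoint/center-based detectors. The function name, shapes, and the max-pooling trick used as a cheap non-maximum suppression are illustrative assumptions, not our production implementation.

```python
import torch
import torch.nn.functional as F

def decode_centers(heatmap: torch.Tensor, k: int = 100):
    """Decode the top-k detections from a (num_classes, H, W) center heatmap."""
    # 3x3 max-pooling acts as a cheap non-maximum suppression:
    # a cell survives only if it is the local maximum of its neighborhood.
    pooled = F.max_pool2d(heatmap.unsqueeze(0), 3, stride=1, padding=1).squeeze(0)
    peaks = heatmap * (heatmap == pooled).float()

    num_classes, h, w = heatmap.shape
    scores, idx = peaks.flatten().topk(k)
    cls = torch.div(idx, h * w, rounding_mode="floor")
    ys = torch.div(idx % (h * w), w, rounding_mode="floor")
    xs = idx % w
    return cls, ys, xs, scores  # class ids, center coordinates, confidences
```

In a full detector, each surviving peak would be paired with regressed box sizes and offsets; the sketch stops at center extraction.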
Focus Area 2: Human Action & Emotion Recognition
Understanding human actions and facial expressions/emotions is an emerging area of AI research. The past few years have witnessed important innovations, such as two-stream & multi-stream networks, 3D-CNN architectures, and others. Some of the key aspects of our work in these areas are highlighted below:
- Skeleton-based action recognition through attention-based and graph-based architectures that capture both short-term and long-term temporal information in videos (a minimal attention-pooling sketch follows this list.)
- Focus on early action recognition (i.e., recognizing actions before they are completed) in videos to extract maximum information from temporal data, and reinforce the predictive power of the models.
- Adaptive learning-based emotion detection to capture different emotional states as well as the emotional intensity of those states.
- Sophisticated Java-based and Hadoop-based backend systems that enable distributed and parallel processing of heavy-duty workloads, high concurrency, low latency, etc.
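As a flavor of the attention-based approach mentioned in the first bullet, here is a minimal PyTorch sketch of attention pooling over per-frame skeleton features. The module name, layer sizes, and the assumption that skeleton keypoints have already been embedded into per-frame feature vectors are all illustrative.

```python
import torch
import torch.nn as nn

class TemporalAttentionPool(nn.Module):
    """Score each frame, then pool frames by their learned importance."""
    def __init__(self, feat_dim: int, num_actions: int):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)          # per-frame importance
        self.head = nn.Linear(feat_dim, num_actions)

    def forward(self, frames: torch.Tensor):         # (batch, T, feat_dim)
        weights = torch.softmax(self.score(frames), dim=1)  # (batch, T, 1)
        pooled = (weights * frames).sum(dim=1)       # weighted temporal average
        return self.head(pooled)                     # action logits

# Example: 17 joints x 2 coordinates = 34-dim features per frame, 30 frames.
logits = TemporalAttentionPool(feat_dim=34, num_actions=10)(torch.randn(8, 30, 34))
```

Because the attention weights are available per frame, the same mechanism also lends itself to early action recognition: the model can emit a prediction before the full clip has been observed.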
Focus Area 3: Neural-Symbolic Reasoning
This was the most complex area of research for us in 2020. While neural-symbolic reasoning is often explored for interpretability purposes, our research stemmed from the need to build a cognitive solution for automating highly unstructured manual processes that could not be effectively addressed through regular machine learning or deep learning techniques.
Despite recent advances in deep learning, building cognitive systems for large corpora of unstructured, multi-level hierarchical data is still a big challenge. This is especially true for data with negligible patterns, or where even abundant labeled data cannot capture all variations in patterns. Our research was primarily aimed at building complex human-like logic generation capabilities for one of our flagship products.
The goal was to create an end-to-end AI architecture where (i) knowledge is represented in a symbolic form, (ii) machine learning components are built to learn from that knowledge, and (iii) a reasoning system is built to generate (and apply) complex logic based on the learnings of the machine learning components. Our work is still in the initial phases, and early signs have been encouraging.
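As a toy illustration of this three-part loop (and emphatically not our product's architecture), the snippet below keeps knowledge in a symbolic rule table, takes predicate scores of the kind a learned component might produce, and applies a simple forward-chaining reasoning step. The rules and predicate names are invented for the example.

```python
# (i) Knowledge in symbolic form: concept <- set of required predicates.
RULES = {
    "invoice": {"has_amount", "has_vendor"},
    "receipt": {"has_amount", "has_timestamp"},
}

def reason(predicate_probs: dict, threshold: float = 0.5) -> list:
    """(iii) Forward-chaining over predicates that a neural component (ii)
    has scored, e.g. {'has_amount': 0.93, 'has_vendor': 0.81, ...}."""
    facts = {p for p, prob in predicate_probs.items() if prob >= threshold}
    return [concept for concept, premises in RULES.items() if premises <= facts]

print(reason({"has_amount": 0.93, "has_vendor": 0.81, "has_timestamp": 0.2}))
# -> ['invoice']
```

The real challenge, of course, lies in learning and composing such rules at scale rather than hand-writing them, which is where the machine learning components come in.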
Focus Area 4: Anomaly and Causality Discovery In Noisy Temporal Data
Real-world, multivariate time series data of many domains are characterized by high degrees of noise, complex abnormal patterns, and unstable distributions. Under such circumstances, commonly-used anomaly and causality discovery techniques often become ineffective, particularly on account of two reasons:
- the absence of a consistent understanding of anomalies, especially as time progresses; coupled with the absence of adequate labeled data in many cases
- the difficulties in encoding both the long-term temporal dependencies within each time-series, and the complex inter-correlations between the different time-series
Many existing anomaly detection systems focus primarily on the identification of anomalies, and pay limited attention to diagnosing or explaining their root causes. This becomes a problem in many real-world applications where the first-order, second-order, and higher-order causal factors that create the anomalies also need to be accurately understood. Furthermore, this causality discovery should account for confounding and instantaneous effects, and, more importantly, the time delays between the root causes and the occurrences of their effects.
The above reasons necessitated the development of an unsupervised anomaly and causality discovery system that could be effectively applied to noisy domains. Additionally, this integrated system was designed to determine the severity or impact of each anomaly for better decision-making. Our work involved experimenting with multiple approaches, such as reconstruction-based methods, generative modeling, and convolutional-based architectures.
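To illustrate the reconstruction-based family mentioned above, here is a minimal PyTorch sketch: an autoencoder over sliding windows of a multivariate series, where windows the model reconstructs poorly receive high anomaly scores. Layer sizes and the windowing scheme are illustrative assumptions.

```python
import torch
import torch.nn as nn

class WindowAutoencoder(nn.Module):
    """Compress and reconstruct flattened (window x n_series) slices."""
    def __init__(self, window: int, n_series: int, latent: int = 8):
        super().__init__()
        dim = window * n_series
        self.enc = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(), nn.Linear(64, dim))

    def forward(self, x):                    # x: (batch, window * n_series)
        return self.dec(self.enc(x))

def anomaly_scores(model: WindowAutoencoder, x: torch.Tensor) -> torch.Tensor:
    # High reconstruction error = the window deviates from learned normality.
    with torch.no_grad():
        return ((model(x) - x) ** 2).mean(dim=1)
```

The magnitude of the score also gives a crude severity ranking, echoing the severity/impact determination mentioned above; causality discovery requires additional machinery on top.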
Focus Area 5: Cloud-Native AI Development
Our AI systems do not operate in isolation but as the core drivers of our enterprise products and solutions. This requires them to be seamlessly integrated into the products as regular software components, which implies that they need to be architected, designed, and developed in line with modern engineering practices, particularly as cloud-native systems. We established a two-pronged approach to achieve this.
i. Building Cloud-Native Everything
An important architectural decision that we took a few years back was to design and develop all our products and solutions as cloud-native in order to 'build for the future'. This involves the adoption of design patterns and engineering practices that enable high application scalability, extensibility & maintainability; allow seamless portability from one infrastructure ecosystem to another, as well as interoperability with other applications; ensure high performance under traffic from thousands of concurrent users; and provide high availability and safety mechanisms against system failures and security vulnerabilities.
DevOps, distributed engineering, microservices, function-based development, and test-driven development are the standard industry strategies that we deploy.
Moreover, API design and lifecycle management receive significant focus, including critical requirements like API security, forward & backward compatibility, and gateway design. Some key aspects of our API strategy were (and remain) as follows:
- While REST remains our de-facto standard, we also explored gRPC and GraphQL for certain specific requirements (a minimal endpoint sketch follows this list.)
- We prioritize API security, particularly for those services that drive the core runtime capabilities, over the ease of API creation & configuration.
- Multi-purpose APIs are leveraged for traditional systems & components, while single-purpose APIs are the norm for complex systems.
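For flavor, here is a minimal sketch of a versioned REST inference endpoint, assuming FastAPI; the route, request schema, and the hard-coded response (standing in for a real model call) are all hypothetical.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ScoreRequest(BaseModel):
    text: str

@app.post("/v1/score")  # version in the path eases forward/backward compatibility
def score(req: ScoreRequest):
    # A real service would invoke a deployed model here.
    return {"label": "positive", "confidence": 0.9}

# Run with an ASGI server, e.g.: uvicorn service:app (module name assumed)
```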
ii. Migrating Older AI Systems to Cloud-Native Architectures
Our efforts were focused on re-designing, re-factoring and migrating the older AI systems to cloud-native architectures. Special attention was paid to detecting and remediating problems such as common smells, complex glue codes, configuration-related issues, pipeline jungles, and undeclared consumers. This also involved re-writing our existing caching mechanisms, optimizing circuit-breakers, reducing system resource consumption, replacing older libraries with newer ones, upgrading our asynchronous development strategies, and other tasks.
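As one small example of the resilience work, below is a minimal circuit-breaker sketch of the kind referenced above: after a run of consecutive failures, calls to a flaky dependency are short-circuited for a cool-down period instead of being retried immediately. The thresholds and class name are illustrative.

```python
import time

class CircuitBreaker:
    """Open after max_failures consecutive errors; retry after reset_after seconds."""
    def __init__(self, max_failures: int = 5, reset_after: float = 30.0):
        self.max_failures, self.reset_after = max_failures, reset_after
        self.failures, self.opened_at = 0, None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: dependency unavailable")
            self.opened_at = None        # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
            self.failures = 0            # success closes the circuit
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
```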
Focus Area 6: Efficiencies In AI Operationalization
Efficient operationalization is critical to the success of any product or solution. This is especially true for AI applications, where deployment and production management are fraught with challenges. We adopted a three-pronged strategy to address this.
i. Integrated DataOps - ModelOps - DevOps
In 2020, we significantly invested in transitioning from our traditional DevOps structure to an integrated DataOps-ModelOps-DevOps one. This covers the entire spectrum of machine learning & software development, deployment, and production management, ranging from:
- efficient data pipeline creation to large-scale data orchestration,
- automated code analysis to CI & CD,
- ML metadata-stores to ML feature-stores,
- low-latency model serving to production model performance evaluation,
- schedule-based & on-demand model re-training/revamp to online machine learning.
The new integrated structure allows us to rapidly and efficiently prototype, build, test, deploy and maintain our AI products and solutions.
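As a small, hedged illustration of the ModelOps slice, the snippet below assumes MLflow for experiment tracking; the run name, parameter, and metric are placeholders rather than our actual pipeline.

```python
import mlflow

with mlflow.start_run(run_name="example-model-v2"):
    mlflow.log_param("max_depth", 6)        # training configuration
    mlflow.log_metric("val_auc", 0.91)      # production-relevant evaluation
    # The trained artifact would be logged/registered here for serving.
```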
ii. Automated Machine Learning
Open-source and proprietary AutoML libraries and tools do not always provide enterprise-grade output, particularly for tasks that involve complex learning, deep feature engineering, or high explainability. The need for better inference, higher scalability, lower costs, and greater integration with our DevOps structure necessitated our AutoML efforts, particularly for Computer Vision and NLP workloads in our problem-domain. Two key aspects of our AutoML engineering were (and remain):
- Addressing the highly compute-intensive nature and instability problems of traditional Neural Architecture Search (NAS) frameworks (a toy search-loop sketch follows this list.)
- Building end-to-end AutoML pipelines that directly integrate with our product development setup.
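For intuition, here is a toy random-search baseline over a small architecture space. Real NAS frameworks are far more elaborate (weight sharing, differentiable search, and so on), and the search space and function names below are invented for the example.

```python
import random

# A made-up search space of architecture "knobs".
SPACE = {"layers": [2, 3, 4], "width": [64, 128, 256], "dropout": [0.0, 0.2, 0.5]}

def sample_architecture() -> dict:
    return {knob: random.choice(options) for knob, options in SPACE.items()}

def search(evaluate, trials: int = 20):
    """`evaluate` trains a candidate and returns its validation score."""
    best, best_score = None, float("-inf")
    for _ in range(trials):
        arch = sample_architecture()
        score = evaluate(arch)
        if score > best_score:
            best, best_score = arch, score
    return best, best_score
```

The compute cost is dominated by `evaluate`, which is precisely why mitigating the compute-intensive nature of NAS matters.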
iii. Deep Neural Network Compression
Deep neural networks are compute-, memory- and power-intensive, and this often creates problems when deploying AI systems with multiple deep learning models on edge devices, or even in regular CPU environments. As a result, sophisticated compression strategies are needed to optimize these networks for greater production efficiency. As our AI products kept growing in scope and scale, the significance of compression kept increasing as well.
Our compression techniques are based on four approaches: Knowledge Distillation, Low-Rank Matrix Factorization, Network Pruning, and Quantization. We explored and exploited various state-of-the-art and emerging techniques pertaining to these approaches.
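As one concrete instance, here is a minimal PyTorch sketch of the standard knowledge-distillation objective: a weighted sum of the KL divergence against the teacher's softened outputs and the usual hard-label cross-entropy. The temperature and mixing weight are illustrative defaults.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 4.0, alpha: float = 0.7):
    """Soft-target KL term (teacher guidance) + hard-label cross-entropy term."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2      # rescale so gradients match the hard term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

Training a small student network against this loss lets it absorb much of a large teacher's behavior at a fraction of the inference cost.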
Focus Area 7: Model Interpretability & Explainability
Our strategies for model interpretability & explainability rest on four pillars.
Model Explainability: Linear proxy approaches (e.g., LIME, or Local Interpretable Model-Agnostic Explanations), Shapley additive explanations, and logic-based/rule-extracted explanations form most of our explainability methods. Our 2020 focus was to explore more sophisticated techniques, such as network dissection and explainable neural networks (e.g., DeepLIFT.)
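As a minimal usage sketch of the Shapley additive explanations mentioned above, the snippet below assumes the open-source `shap` library and trains a small tree ensemble purely as a stand-in model.

```python
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# A small stand-in model on a public dataset.
X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)          # efficient for tree ensembles
shap_values = explainer.shap_values(X[:100])   # per-feature attribution per row
```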
Interpretable or White-Box Modeling: We have generally leveraged decision trees for explanations, and explainable boosting machines to address this area. In 2020, we primarily focused on improving the implementation of these techniques.
Visual Interpretation: We have traditionally relied on accumulated local effects, partial dependence/residual plots, correlation network graphs, and conditional expectation visualizations. Our 2020 focus was to further improve the way we deploy these methods.
Sensitivity Modeling: Our focus in 2020 was to deploy adversarial-based techniques for our computer vision workloads.
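One basic instance of such adversarial-based sensitivity probing is the Fast Gradient Sign Method (FGSM), sketched below in PyTorch for a generic vision classifier; the epsilon value is an illustrative default.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, images, labels, epsilon: float = 0.01):
    """Perturb inputs in the direction that maximally increases the loss."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    return (images + epsilon * images.grad.sign()).detach()

# A model whose predictions flip under such tiny perturbations is highly sensitive.
```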
Special Mention: Transformers
While this was not an explicit focus area, it receives a mention because the transformer architecture was the overwhelming theme of our NLP work in 2020. We studied, explored and exploited multiple types of transformers: GPT-2, BERT, ALBERT, XLNet, DistilBERT, ELECTRA, LongFormer, Reformer, RoBERTa, StructBERT, T5, and others. Moreover, we built the preliminary version of our own (proprietary) transformer for AI components where existing open-source transformers failed to provide the desired results.
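For readers unfamiliar with how quickly these models can be put to work, here is a minimal usage sketch assuming the open-source Hugging Face `transformers` library; the model choice and example sentence are illustrative, and our proprietary transformer is not shown.

```python
from transformers import pipeline

# Masked-language-model inference with a pre-trained checkpoint.
fill_mask = pipeline("fill-mask", model="distilbert-base-uncased")
print(fill_mask("The quarterly revenue [MASK] expectations."))
# Returns candidate tokens (e.g., 'exceeded') with confidence scores.
```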
What's Next?
As we continue our journey in 2021 (and beyond), our goal is to keep strengthening our capabilities and offerings in different technologies related to computer vision, speech recognition, natural language processing, and machine/deep learning. Some of our existing focus areas are expected to gain additional momentum, particularly the ones on human action and emotion detection, and neural-symbolic reasoning. New areas of AI research and development are planned as well.
We expect our Narrow AI Strategy to keep generating significant value for our customers. Complex AI transformations need domain-specific data, strong functional knowledge, and 'precisely-engineered solutions' for critical problems. As a result, focused AI systems that are specifically designed, developed and optimized to address domain-specific or complex problems will continue to be more efficient than generic industry solutions.
An important emerging global trend is the shift away from traditional AI development approaches that are either inefficient (e.g., building task-specific models) or constraint-laden (e.g., high reliance on labeled data.) We expect this trend to gain greater momentum this year, and to continue influencing our internal AI development strategies. This implies more innovations in areas like Active Learning, Multi-Task Learning, One/Few-Shot Learning, and in semi-supervised and self-supervised learning methods. As the open-source ecosystem continues to get stronger, external innovations will keep augmenting our internal ones.
Designing AI applications as truly complex, adaptive systems will be a critical aspect of our architectural evolution this year. As our products are enhanced with advanced features and more deep learning components, the emergent behaviors of the overall systems are expected to increase as well. Hence, state-of-the-art reinforcement learning systems will be designed and developed to address the increased emergent behaviors of our AI products. Similarly, cognitive architectures, particularly those pertaining to the symbolic paradigm, will witness greater adoption in our AI workloads. These architectures enable the modeling of core cognitive abilities such as attention, dynamic action selection, learning, memory, perception, and reasoning.
Another important area that is expected to witness key innovations in 2021 is multi-modal AI development. Intelligence from multiple sources will be integrated to gain a greater understanding of the subjects of interest. For instance, our emotion detection systems will be enhanced by integrating video-based emotions, speech-based emotions, and emotions from conversational systems. Furthermore, we expect increased adoption of advanced optimization techniques, such as evolutionary/genetic algorithms.
Finally, 2020 has been a reasonably decent year for Course5 AI Labs in terms of innovations and new releases, and I am extremely proud of what the team has achieved. We executed more R&D experiments, built more algorithms, wrote (and debugged) more code, shipped more releases, and managed more large-scale production systems than in any preceding year. We also faced setbacks and challenges, but the team had enough grit to keep progressing. We expect 2021 to deliver 1.5x to 2x of all of this.