Well-Implemented Data and AI: A Top Competitive Advantage for Large Companies


Survival of the Fittest in Markets

Surviving and thriving in a competitive market has a lot in common with Darwin's survival of the fittest: when two organisms compete in the wild, it is often the one that can process information and respond more swiftly and effectively that survives and thrives. The same principle applies to large organisations in the business world.


The Challenge of Large Organisations

As companies expand from dozens to thousands of employees, they often face a critical challenge: diminishing agility. This loss stems from an inefficient flow of information within the organisation. Crucial data, whether from customers, competitors, markets, or the organisation itself, frequently gets lost in transmission or moves with glacial slowness. When information does arrive, it is often inaccurate or degraded. This breakdown in information flow can severely impair a company's ability to respond quickly and effectively to changing conditions.

Conversely, a company that effectively manages information flow can not only survive but thrive, quickly rising to the top of its industry. The most critical information at risk of delay or loss typically comes from customers. Nearly as crucial is the data exchanged between various functions and departments within the organisation. When companies struggle to process this information swiftly and accurately, their decision-making degrades and slows. This leads to delayed responses to customer needs and requests, ultimately resulting in lost business to more agile competitors.


The Competitive Advantage of Well-Implemented Data and AI

A well-implemented data and AI system serves as the company's crucial nervous system, enhancing its intelligence and driving its growth to the top of the industry. 'Well-implemented' in this context means a system that enables the company to process information rapidly and accurately, facilitating swift and effective responses at all organisational levels. This enhanced responsiveness is key to outperforming competitors and achieving market leadership.

When I was at Lazada, my data science team consistently delivered in two weeks results that typically take other organisations six or more months, demonstrating the transformative nature of an optimised approach to data and AI.

While well-implemented data and AI is a top competitive advantage, it is certainly not the only one that matters. Other significant competitive advantages include a good culture and optimised processes. Data and AI can't replace these, but it can make them dramatically more effective.


Strategy and Tactics

Drawing from my experience at AWS and elsewhere, I've observed a critical misconception in data and AI initiatives: confusing tactics with strategy. Data lakes, warehouses, and MLOps are not strategy; they are tactics. The core strategy should be enhancing an organisation's ability to rapidly process market and internal information, enabling swift and effective responses. This agility translates to faster time-to-market, quicker insights, improved customer retention, and cost reduction.

Tactics such as data lakes and MLOps are merely tools to achieve this overarching goal. The key isn't just deploying these tactics, but executing them in a way that genuinely accelerates information processing and decision-making across the organisation. A well-implemented tactic amplifies the company's responsiveness; a poorly executed one creates bottlenecks.

Remember: your north star should be the strategy of rapid information processing and agile response. Every tactic should be evaluated based on how it enhances this organisational capability, not as a mere checkbox exercise.


Optimising Tactics According to Strategy

This brings me to what I have learnt about an optimised approach to data and AI, one that helps an organisation process information accurately and respond swiftly and effectively.

I will break these tactics into two parts: one for data analytics systems and another for AI systems.

  1. Data Analytics Optimised Approach
  2. AI Optimised Approach


Data Analytics Optimised Approach

These approaches help an organisation process information accurately and respond swiftly and effectively at all levels, whether operational or strategic.

  1. Early Integration of DevOps/DataOps - Treat DataOps as a critical need from project inception. This improves project agility, enabling faster iteration and automated deployment of analytics solutions, significantly reducing time-to-deployment across projects and saving the cost of repeated effort. It also removes much of the heavy technical lifting from repeated iterations, since those complexities are automated.
  2. Implement Automated Data Quality Workflows - This addresses the critical 'garbage in, garbage out' problem in data projects. Design and implement automated data quality checks and correction workflows: an automatically detected quality issue triggers a workflow in a work management system such as Jira or ServiceDesk to manage the correction. This automation sustains improvement in a way that manually tracked quality initiatives fail to, and it significantly reduces data-related project failures, a key risk in data initiatives (a minimal sketch follows this list).
  3. Standardise and Simplify Analytics Processes - Implement SQL-based solutions (Spark, Snowflake, Databricks, Redshift, EMR) to simplify analytics for 80% of end users, and integrate a GenAI development assistant to streamline coding. Shield most users from advanced concepts (e.g., buckets, Delta, Iceberg, PySpark) to improve accessibility, and deploy GenAI for automated code review of SQL and Python scripts. Standardising methods ensures efficiency across projects, and such simplification can significantly increase user adoption of analytics tools.
  4. Champion Productivity Tools - Champion the use of dbt and other analytics productivity tools.
  5. Implement a Collaborative Business Catalog for Data Asset Management - Deploy a modern, collaborative solution for publishing, discovering, and subscribing to data assets across the organisation. This reduces the time to discover trustworthy datasets from months to minutes and significantly accelerates time-to-market for analytics projects.
  6. Implement an Automated Access Provisioning Workflow - Design and deploy an automated system for provisioning data access permissions. This eliminates manual admin steps after access approval, reducing waiting time for data access and accelerating experimentation cycles while maintaining robust governance (see the sketch after this list).
  7. Champion a Modular Analytics and AI Infrastructure Library using CDK/CDKTF - Create reusable, high-level infrastructure-as-code libraries in Python or Node.js to improve productivity. Standardise best practices for performance, cost-effectiveness, and security in cloud environments, and automate the majority of regulatory compliance checks. This ensures consistent application of organisational standards, significantly reduces infrastructure deployment time, and improves consistency across projects (a sketch of such a construct follows this list).
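
To make tactic 2 concrete, here is a minimal sketch in Python of a quality check that automatically opens a ticket when it fails. The Jira site, project key, table, column, threshold, and S3 path are all hypothetical placeholders, not a prescribed setup.

    # Sketch: a failed data quality check opens a Jira ticket so the owning
    # team tracks the correction workflow. All names below are hypothetical.
    import os

    import pandas as pd
    import requests

    JIRA_URL = "https://example.atlassian.net/rest/api/2/issue"  # hypothetical site
    JIRA_AUTH = (os.environ["JIRA_USER"], os.environ["JIRA_API_TOKEN"])

    def null_rate(df: pd.DataFrame, column: str) -> float:
        """Share of rows where the given column is null."""
        return float(df[column].isna().mean())

    def open_quality_ticket(table: str, column: str, rate: float) -> None:
        """File a Jira task describing the detected issue."""
        payload = {
            "fields": {
                "project": {"key": "DQ"},  # hypothetical project key
                "summary": f"Data quality: {table}.{column} is {rate:.1%} null",
                "description": "Auto-created by the nightly data quality job.",
                "issuetype": {"name": "Task"},
            }
        }
        response = requests.post(JIRA_URL, json=payload, auth=JIRA_AUTH, timeout=30)
        response.raise_for_status()

    # Example: flag the table owner if more than 5% of customer emails are missing.
    orders = pd.read_parquet("s3://example-bucket/orders/")  # hypothetical path
    rate = null_rate(orders, "customer_email")
    if rate > 0.05:
        open_quality_ticket("orders", "customer_email", rate)

The point of the sketch is the trigger: the check and the ticket are wired together, so nothing depends on someone remembering to file or chase the issue.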
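
For tactic 6, a sketch of what removing the manual admin step can look like, assuming an AWS Lake Formation setup; the role ARN, database, and table names are hypothetical.

    # Sketch: once an access request is approved (e.g., via a ServiceDesk
    # webhook), this grant runs automatically instead of waiting on an admin.
    import boto3

    lakeformation = boto3.client("lakeformation")

    def grant_read_access(principal_arn: str, database: str, table: str) -> None:
        """Grant SELECT on a single table to the approved principal."""
        lakeformation.grant_permissions(
            Principal={"DataLakePrincipalIdentifier": principal_arn},
            Resource={"Table": {"DatabaseName": database, "Name": table}},
            Permissions=["SELECT"],
        )

    grant_read_access(
        principal_arn="arn:aws:iam::123456789012:role/analyst",  # hypothetical
        database="sales",
        table="orders",
    )

Because the grant is code, it is also auditable, which is how the efficiency gain coexists with robust governance.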
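
And for tactic 7, a sketch of a reusable high-level construct (AWS CDK in Python) that bakes organisational defaults into a single class; the class name and the particular defaults are illustrative, not a complete compliance library.

    # Sketch: an org-standard S3 bucket construct. Teams instantiate this
    # instead of re-deciding encryption, TLS, and public-access settings.
    from aws_cdk import RemovalPolicy
    from aws_cdk import aws_s3 as s3
    from constructs import Construct

    class AnalyticsBucket(Construct):
        """Org-standard bucket: encrypted, TLS-only, never public."""

        def __init__(self, scope: Construct, construct_id: str) -> None:
            super().__init__(scope, construct_id)
            self.bucket = s3.Bucket(
                self,
                "Bucket",
                encryption=s3.BucketEncryption.S3_MANAGED,  # encryption at rest
                enforce_ssl=True,                           # TLS-only access
                block_public_access=s3.BlockPublicAccess.BLOCK_ALL,
                versioned=True,                             # recover from bad writes
                removal_policy=RemovalPolicy.RETAIN,        # keep data on stack deletion
            )

Every project that uses the construct passes the same checks by construction, which is where the deployment-time and consistency gains come from.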


AI Optimised Approach

These approaches help an organisation use AI to respond to information swiftly and effectively at all levels, operational and strategic.

  1. Avoid 'Kaggle Syndrome' in Production Data Science - Implement proper alignment between DataOps and MLOps to overcome the 'quick fix' mentality often seen in data science projects. This syndrome, reminiscent of Kaggle competitions, involves: 1) downloading one-time datasets from siloed internal systems to S3 buckets, 2) conducting experiments on stale data, and 3) lacking automated processes for data refresh, quality workflows, and governance. Such proofs-of-concept often fail because there is not enough agility to iterate model experiments quickly. Proper implementation ensures sustainable, production-ready data science solutions (a minimal sketch of a scheduled refresh pipeline follows this list).
  2. Structured Approach to Model Development - Guide the data science team to avoid 'trigger-happy' model development. Establish a process for aligning business context and clear business outcomes with stakeholders, and develop a framework for aligning ML optimisation metrics (e.g., sales figures, customer engagement) with them. Create a system for measuring incremental model improvements in production, and ensure model development begins only after thorough preparation and alignment. Aligning technical and business assumptions in this way reduces project failures and can significantly increase model success rates (a sketch of measuring incremental uplift follows this list).
  3. Champion Improved User Experience of ML Tools - Adopt AutoML tools and upgrade the notebook/Jupyter environment to more modern, user-friendly versions. Integrate CI/CD and MLOps into the notebook environment, and automate integration complexities around networking, security, and performance. This can significantly improve adoption of ML tools.
  4. Champion Early Integration of DevOps/MLOps - Treat MLOps as a critical need from project inception. This improves project agility, enabling faster iteration and automated deployment of ML solutions, significantly reducing time-to-deployment across projects and saving the cost of repeated effort. Implement MLflow for experiment tracking, model packaging, model versioning, model serving, and a model registry, giving better management and traceability of models (a minimal sketch follows this list).
  5. Implement Automated Data Quality Workflows - As described in item 2 of the data analytics approach; the same 'garbage in, garbage out' risks apply to AI, and the same automated checks and correction workflows mitigate them.
  6. Implement a Collaborative Business Catalog for Data Asset Management - As described in item 5 of the data analytics approach.
  7. Implement an Automated Access Provisioning Workflow - As described in item 6 of the data analytics approach.
  8. Champion a Modular Analytics and AI Infrastructure Library using CDK/CDKTF - As described in item 7 of the data analytics approach.
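
To illustrate the antidote to the 'Kaggle syndrome' in tactic 1, here is a minimal sketch of a scheduled refresh pipeline, using Airflow as one possible orchestrator; the DAG id, task bodies, and schedule are hypothetical.

    # Sketch: training data is rebuilt on a schedule by a governed pipeline,
    # so experiments always run against current, quality-checked data rather
    # than a one-off dump to an S3 bucket.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract_features() -> None:
        """Pull fresh source data and rebuild the feature table."""
        ...  # e.g., run the governed SQL that materialises the features

    def run_quality_checks() -> None:
        """Reuse the same automated data quality workflow as tactic 5."""
        ...

    with DAG(
        dag_id="refresh_training_data",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # data never goes stale between experiments
        catchup=False,
    ) as dag:
        extract = PythonOperator(task_id="extract_features", python_callable=extract_features)
        checks = PythonOperator(task_id="quality_checks", python_callable=run_quality_checks)
        extract >> checks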
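
For the 'measuring incremental model improvements in production' part of tactic 2, a tiny sketch of the kind of comparison stakeholders can agree on up front; the traffic split and counts are purely illustrative.

    # Sketch: compare a business metric (conversion rate here) between the
    # incumbent and candidate models on a live traffic split, rather than
    # relying on offline accuracy alone. All numbers are made up.
    def conversion_rate(conversions: int, sessions: int) -> float:
        return conversions / sessions

    incumbent = conversion_rate(conversions=1_840, sessions=50_000)  # 3.68%
    candidate = conversion_rate(conversions=2_010, sessions=50_000)  # 4.02%

    uplift = (candidate - incumbent) / incumbent
    print(f"Relative uplift: {uplift:.1%}")  # ~9.2%, the number stakeholders track

Agreeing on this metric before development starts is what prevents the 'trigger-happy' pattern: the model either moves the agreed number or it does not.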
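
And for the MLflow piece of tactic 4, a minimal sketch of experiment tracking and registry versioning; the experiment name, model, and metric are hypothetical.

    # Sketch: every run logs its parameters, metrics, and model artefact,
    # and registering the model gives it a traceable version number.
    import mlflow
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1_000, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    mlflow.set_experiment("churn-model")  # hypothetical experiment name

    with mlflow.start_run():
        model = RandomForestClassifier(n_estimators=200, random_state=42)
        model.fit(X_train, y_train)
        mlflow.log_param("n_estimators", 200)
        mlflow.log_metric("test_accuracy", model.score(X_test, y_test))
        # Registering under a name creates or increments a registry version.
        mlflow.sklearn.log_model(model, "model", registered_model_name="churn-model")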


