Why scaling IaC at enterprise level is still a problem
Rushabh Shah
Lead Product Manager | Mastercard | Ex-JPMC | Ex-TCS | Product Strategy & Development | Platform Products | B2B & B2C SaaS | AI, Analytics, Cloud, Security
Problem Discovery
Problem statement
We need a standardized, streamlined process for managing enterprise IT infrastructure, which is the backbone of every business function.
In the current state, we see nine sub-problems that any organization will face when adopting Infrastructure as Code (IaC) while keeping security, governance, and compliance as the de facto benchmark.
Sub-problems
1. Learning curve: Expertise in IaC tools at Enterprise level
- Switching from an imperative programming model to a declarative one is always challenging. Most teams still manage cloud infrastructure manually through the console provided by their cloud service provider (CSP).
- Enterprises need to set aside dedicated time to train employees on new technologies such as Terraform, Pulumi, and Crossplane.
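To make the imperative-versus-declarative shift concrete, here is a minimal sketch in Python. It does not use any real IaC tool: the "cloud" is an in-memory dictionary, and the resource names and the `reconcile` helper are hypothetical.

```python
# Toy illustration only: the "cloud" is an in-memory dict, not a provider API.

def provision_imperatively(cloud: dict) -> None:
    """Imperative style: the author spells out each step and its order."""
    cloud["vpc-main"] = {"cidr": "10.0.0.0/16"}
    cloud["subnet-a"] = {"cidr": "10.0.1.0/24", "vpc": "vpc-main"}


# Declarative style: the author only describes the desired end state.
DESIRED = {
    "vpc-main": {"cidr": "10.0.0.0/16"},
    "subnet-a": {"cidr": "10.0.1.0/24", "vpc": "vpc-main"},
}


def reconcile(desired: dict, actual: dict) -> list[str]:
    """Work out and apply the actions that move `actual` to `desired`."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(f"create {name}")
        elif actual[name] != spec:
            actions.append(f"update {name}")
        else:
            continue  # already in the desired state
        actual[name] = dict(spec)
    for name in sorted(set(actual) - set(desired)):
        actions.append(f"delete {name}")
        del actual[name]
    return actions


cloud = {"legacy-vm": {"size": "small"}}
print(reconcile(DESIRED, cloud))  # first run creates and deletes resources
print(reconcile(DESIRED, cloud))  # second run finds nothing left to do
```

The second call returning no actions is the property teams have to get used to: a declarative run is safe to repeat, whereas an imperative script has to guard each step itself.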
2. Tools selection and proliferation
- The larger the organization, the harder it is to select the right IaC tools, and the selection process eventually starts creating silos within the organization.
- Pros and cons are hard to use as selection criteria when several people within the organization are decision makers.
- Tools are designed with particular personas in mind and favor either an imperative or a declarative style, so the developers' say in tool selection inevitably carries the most weight.
- Expertise becomes siloed across a large enterprise, and different tools end up being used in different ways across the organization.
3. Lack of cloud expertise
- People who develop IaC features do not necessarily have expertise in cloud, cloud best practices, or cloud security practices.
- Collaboration between cloud technology experts and IaC developers is necessary.
- How cloud services should be configured, how they connect to other components, and in what order they should be provisioned are unknown to the majority of IaC developers.
- Multi-cloud makes it much harder to manage everything through a single control plane.
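The provisioning-order point above can be sketched as a dependency graph. The example below is an illustration under stated assumptions: the resource names and their dependencies are hypothetical, and the ordering comes from Python's standard-library `graphlib`, not from any particular IaC tool.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical resources; each key maps to the resources it depends on.
DEPENDS_ON = {
    "vpc": set(),
    "subnet": {"vpc"},
    "security_group": {"vpc"},
    "database": {"subnet", "security_group"},
    "app_server": {"subnet", "security_group", "database"},
}


def provisioning_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return an order in which every resource follows its dependencies."""
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as exc:
        # The second element of args holds the detected cycle.
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc


order = provisioning_order(DEPENDS_ON)
print(order)  # "vpc" always comes before "subnet", and so on
```

Several valid orders can exist for the same graph, so the exact list printed may vary; what is guaranteed is that no resource appears before something it depends on.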
4. Enterprise governance and standardization
- Enforcing compliance and governance policies across the organization will disrupt existing workflows that currently run smoothly.
- Policy checks will act as a gatekeeper within your CI/CD pipelines, where a failed check results in a failed build or deployment.
- We may need exception management in place before going live across the enterprise, so that teams can take exceptions for their portfolios while keeping the risk registry up to date.
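A minimal sketch of such a gate, assuming policy violations have already been produced by some scanner. The `Violation` type, the policy IDs, and the portfolio names are all hypothetical; the only point is that a waived violation does not fail the pipeline while an unwaived one does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    policy_id: str
    resource: str
    portfolio: str


def gate(violations: list[Violation],
         waived: set[tuple[str, str]]) -> tuple[int, list[Violation]]:
    """Return (exit_code, blocking violations).

    `waived` holds (policy_id, portfolio) pairs with an approved exception.
    """
    blocking = [v for v in violations
                if (v.policy_id, v.portfolio) not in waived]
    return (1 if blocking else 0), blocking


violations = [
    Violation("ENCRYPT-AT-REST", "bucket.logs", "payments"),
    Violation("NO-PUBLIC-IP", "vm.batch01", "analytics"),
]
waived = {("NO-PUBLIC-IP", "analytics")}

exit_code, blocking = gate(violations, waived)
for v in blocking:
    print(f"BLOCKED {v.policy_id} on {v.resource} ({v.portfolio})")
print("exit code:", exit_code)
```

In a real pipeline step the script would end with `sys.exit(exit_code)`, so that a non-zero code fails the build or deployment.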
5. Security concerns
- Developers may hardcode secrets, passwords, or URLs in IaC code, which calls for stringent security testing of that code.
- Resources may be misconfigured, for example by leaving ports open to the internet or over-provisioning user access through a template.
- We need significant IaC scanning with tools such as Terrascan, Wiz, Snyk, Sysdig, and PingSafe as part of the development and testing pipelines.
- We need to enforce zero trust for each and every privilege provisioned.
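As a rough illustration of the two checks described above (hardcoded secrets and ports open to the internet), here is a small Python sketch. It is not how any of the named scanners work; the regular expression, the rule format, and the sample template are assumptions made for the example.

```python
import re

# Flags assignments such as db_password = "hunter2". Values containing "$"
# are skipped on the (simplifying) assumption that they are interpolations.
SECRET_ASSIGNMENT = re.compile(
    r'\w*(password|secret|token|api_key)\w*\s*=\s*"[^"$]+"',
    re.IGNORECASE,
)
SENSITIVE_PORTS = {22, 3389}  # SSH and RDP


def find_hardcoded_secrets(source: str) -> list[int]:
    """Return 1-based line numbers that look like hardcoded secrets."""
    return [n for n, line in enumerate(source.splitlines(), start=1)
            if SECRET_ASSIGNMENT.search(line)]


def find_open_ingress(rules: list[dict]) -> list[dict]:
    """Return rules exposing a sensitive port to the whole internet."""
    return [r for r in rules
            if r["cidr"] == "0.0.0.0/0"
            and any(r["from_port"] <= p <= r["to_port"]
                    for p in SENSITIVE_PORTS)]


# Hypothetical template text and a simplified, made-up rule format.
TEMPLATE = '''resource "example_database" "main" {
  username    = "app"
  db_password = "hunter2"
  admin_token = "${var.admin_token}"
}'''
RULES = [
    {"name": "ssh-anywhere", "cidr": "0.0.0.0/0", "from_port": 22, "to_port": 22},
    {"name": "https", "cidr": "0.0.0.0/0", "from_port": 443, "to_port": 443},
    {"name": "rdp-internal", "cidr": "10.0.0.0/8", "from_port": 3389, "to_port": 3389},
]

print(find_hardcoded_secrets(TEMPLATE))
print([r["name"] for r in find_open_ingress(RULES)])
```

A pattern match like this produces false positives and misses encoded secrets, which is one reason dedicated scanners belong in the pipeline.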
6. Versioning and Auditing
- Managing and versioning IaC under code review and a thorough code-quality scanning process will lead to conflicts.
- Cloning IaC code for different products, projects, clients, or operating units becomes necessary to cater to the organization's differing needs.
- Microservices add more complexity than a monolith. Overall, versioning, configuring, and testing the IaC codebase for different entities is a major concern.
- IaC templates and code require continuous security review, reliability work, and patching.
7. Change Management
- It is hard for IaC tools on their own to capture drift in infrastructure once it has been provisioned successfully, especially for changes made outside the tool.
- Organizations need detection, tracking, and remediation through various means, such as building a sandbox with the same configuration and checking it against a real-time snapshot, or tearing down the entire infrastructure and provisioning it again from scratch.
- We might have to build our own in-house mechanism for bookkeeping drift.
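The "check declared configuration against a real-time snapshot" idea can be sketched as a field-by-field comparison. This is a minimal illustration, assuming both sides have already been exported as plain dictionaries; the `Drift` record, the resource names, and the field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any

ABSENT = "<absent>"


@dataclass(frozen=True)
class Drift:
    resource: str
    field: str
    declared: Any
    live: Any


def detect_drift(declared: dict[str, dict], live: dict[str, dict]) -> list[Drift]:
    """Compare declared resources with a live snapshot, field by field."""
    drifts = []
    for resource in sorted(set(declared) | set(live)):
        want, have = declared.get(resource), live.get(resource)
        if want is None or have is None:
            # Resource exists on only one side: deleted or unmanaged.
            drifts.append(Drift(resource, "<resource>",
                                ABSENT if want is None else "present",
                                ABSENT if have is None else "present"))
            continue
        for field in sorted(set(want) | set(have)):
            w, h = want.get(field, ABSENT), have.get(field, ABSENT)
            if w != h:
                drifts.append(Drift(resource, field, w, h))
    return drifts


declared = {
    "sg-web": {"ingress_cidr": "10.0.0.0/8", "port": 443},
    "bucket-logs": {"versioning": True},
}
live = {
    "sg-web": {"ingress_cidr": "0.0.0.0/0", "port": 443},
    "vm-debug": {"size": "large"},
}

for d in detect_drift(declared, live):
    print(d)
```

Persisting each `Drift` record with a timestamp and an owner would be one way to start the in-house bookkeeping mentioned above.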
8. Exception Management
- It is impossible to enforce the same policies across the organization given differing skill sets, working styles, risk appetites, and regulatory compliance rules.
- We need to decide whether to build or buy an exception management workflow to keep track of compliance, governance, and security policies across the organization.
- We also need an approval workflow and risk registry management for every exception taken across the organization.
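If the decision is to build, the core of such a workflow could look roughly like the sketch below. The class names, fields, and the rule that requesters cannot approve their own exceptions are assumptions for illustration, not a description of any existing product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PolicyException:
    policy_id: str
    portfolio: str
    justification: str
    requested_by: str
    expires: date
    approved_by: Optional[str] = None

    def is_active(self, today: date) -> bool:
        """An exception counts only once approved and until it expires."""
        return self.approved_by is not None and today <= self.expires


class RiskRegistry:
    """In-memory register of requested and approved exceptions."""

    def __init__(self) -> None:
        self.entries: list[PolicyException] = []

    def request(self, entry: PolicyException) -> PolicyException:
        self.entries.append(entry)
        return entry

    def approve(self, entry: PolicyException, approver: str) -> None:
        if approver == entry.requested_by:
            raise PermissionError("requester cannot approve their own exception")
        entry.approved_by = approver

    def active(self, today: date) -> list[PolicyException]:
        return [e for e in self.entries if e.is_active(today)]


registry = RiskRegistry()
entry = registry.request(PolicyException(
    policy_id="NO-PUBLIC-IP",
    portfolio="analytics",
    justification="vendor integration requires a public endpoint",
    requested_by="dev.lead",
    expires=date(2025, 6, 30),
))
registry.approve(entry, approver="security.architect")
print(len(registry.active(date(2025, 6, 1))))  # still within the expiry date
print(len(registry.active(date(2025, 7, 1))))  # past the expiry date
```

Giving every exception an expiry date keeps the risk registry from accumulating permanent waivers.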
9. Bookkeeping for Audits & workflow orchestration
- We need to build in-house or buy a bookkeeping tool to keep track of every change, exception, and approval made while enabling IaC at enterprise level.
- This will help the organization prepare for audits such as SOC 2 and PCI DSS.
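One possible shape for an in-house bookkeeping record is an append-only log in which each entry carries a hash of the previous one, so that later edits are detectable. Hash chaining is a technique chosen here for illustration, not something the audit standards above prescribe, and the event fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash: str, event: dict) -> str:
    payload = json.dumps(event, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(log: list[dict], event: dict) -> None:
    """Append an event, linking it to the hash of the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash,
                "hash": _digest(prev_hash, event)})


def verify(log: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in log:
        if (record["prev"] != prev_hash
                or record["hash"] != _digest(prev_hash, record["event"])):
            return False
        prev_hash = record["hash"]
    return True


log: list[dict] = []
append_event(log, {"type": "change", "resource": "sg-web", "by": "dev.lead"})
append_event(log, {"type": "exception", "policy": "NO-PUBLIC-IP",
                   "approved_by": "security.architect"})
print(verify(log))

log[0]["event"]["by"] = "someone.else"  # simulate tampering
print(verify(log))
```

A bought tool would add storage, access control, and reporting on top; the sketch only shows why an auditor can trust that the recorded history has not been rewritten.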
User Personas
1. Principal architect/Security architect
2. DevSecOps engineer
3. SRE engineer
4. Support executive
5. Compliance officer
KPI
1. Deployment frequency – Increase
2. Lead time – Decrease
3. Defect escape rate – Decrease
4. Deployment success rate (successful vs. failed deployments) – Increase
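The four KPIs above can be computed from basic deployment records. The sketch below is one reasonable set of definitions, not the only one: the `Deployment` fields, the choice to count only successful deployments towards frequency and lead time, and the sample data are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass(frozen=True)
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    succeeded: bool


def deployment_frequency(deployments: list[Deployment], period_days: int) -> float:
    """Successful deployments per week over the observed period."""
    return sum(d.succeeded for d in deployments) / (period_days / 7)


def mean_lead_time(deployments: list[Deployment]) -> timedelta:
    """Average commit-to-deploy time; needs at least one successful deployment."""
    seconds = mean((d.deployed_at - d.committed_at).total_seconds()
                   for d in deployments if d.succeeded)
    return timedelta(seconds=seconds)


def defect_escape_rate(found_in_production: int, found_total: int) -> float:
    """Share of all defects that were only found in production."""
    return found_in_production / found_total if found_total else 0.0


def success_rate(deployments: list[Deployment]) -> float:
    if not deployments:
        return 0.0
    return sum(d.succeeded for d in deployments) / len(deployments)


# Made-up sample data covering a two-week period.
history = [
    Deployment(datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 15), True),
    Deployment(datetime(2024, 1, 3, 10), datetime(2024, 1, 3, 12), True),
    Deployment(datetime(2024, 1, 5, 8), datetime(2024, 1, 5, 9), False),
    Deployment(datetime(2024, 1, 8, 9), datetime(2024, 1, 8, 13), True),
]

print("deployments per week:", deployment_frequency(history, period_days=14))
print("mean lead time:", mean_lead_time(history))
print("defect escape rate:", defect_escape_rate(2, 10))
print("success rate:", success_rate(history))
```

Whatever definitions are chosen, they should be fixed before the IaC rollout so that the increase or decrease in each KPI is measured against a comparable baseline.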
Solution Discovery