
Our DevOps Journey @ ClarityAI

EN
Year: 2022
Event: DevOps Lisbon Core Talk

Description

This talk details Clarity AI's two-year evolution from siloed chapters to stream-aligned teams using Team Topologies, highlighting the shift toward managed services, self-service infrastructure, and a dedicated DevEx team to reduce toil and accelerate product delivery.

🎯 Key Learning

Moving from functional silos to a structure based on Team Topologies, supported by a self-service platform and a dedicated Developer Experience team, lets a growing organization scale by reducing toil and giving teams clear end-to-end ownership. Success in this DevOps journey comes from "making it easy to do the right thing": managed services and automated tools minimize friction in the development flow while keeping infrastructure work aligned with the product roadmap.

📋 Key Points

  • Organizational Scaling through Team Topologies: Transitioned from functional chapters (silos) where engineers reported to chapter leads to stream-aligned teams where engineers report directly to Squad Team Leads, enhancing cross-functionality and ownership.
  • Defining DevOps as Culture: Emphasized that DevOps is not a role or a team but a culture focused on end-to-end flow, small batches, fast feedback loops, and experimentation.
  • Reducing Platform Basal Cost and Toil: Focused on reducing the effort required just to maintain existing infrastructure (Basal Cost) and eliminating repetitive, manual tasks (toil) to free up time for innovation.
  • Platform Principles: Adopted core principles to buy/use managed services instead of self-building, simplify by removing unneeded systems, and ensure everything is self-service.
  • Managed Service Migration: Replaced self-managed tools with managed solutions like AWS EKS (Kubernetes), MongoDB Atlas, and Datadog (monitoring) to reduce maintenance overhead.
  • Dedicated Developer Experience (DevEx) Team: Created a DevEx team specifically to reduce friction in the product development flow and promote engineering best practices like fast feedback and quality.
  • Reinforcing Squad Ownership: Used "Acceptance Criteria" for performance, operability, and scalability to make it clear that squads, not the platform team, own the end-to-end lifecycle of their services.
  • Self-Service Automation via Platform Bot: Developed a Slack-based Platform Bot to automate credential management, deployments, and database migrations, removing the platform team as a bottleneck.
  • Improving Delivery Performance: Increased release frequency for the monolith from every 2–3 weeks to once per week with zero downtime, while enabling new services to deploy on-demand.
  • "Making it Easy to do the Right Thing": Focused on a cultural shift where the platform provides tools and paths that are so low-friction they naturally become the preferred way for developers to work.
  • Future Initiatives: Plans include increasing release frequency further and establishing a Talent Acceleration Program to train junior staff in Extreme Programming (XP) and the DevOps mindset.
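
The self-service Platform Bot mentioned in the key points can be pictured as a small command dispatcher: a chat message is parsed into a command and its arguments, then routed to an automated handler so that no platform engineer has to act on the request. The sketch below is only an illustration of that idea, not Clarity AI's implementation. The command names, handler functions, and reply strings are all hypothetical, and the Slack integration itself is left out.

```python
from typing import Callable, Dict, List

# All handlers below are hypothetical stand-ins; a real bot would call
# the organization's own credential, deployment, and migration tooling.

def rotate_credentials(args: List[str]) -> str:
    if len(args) != 1:
        return "usage: credentials <service>"
    return f"credentials rotated for {args[0]}"

def deploy(args: List[str]) -> str:
    if len(args) != 2:
        return "usage: deploy <service> <version>"
    return f"deploying {args[0]} at {args[1]}"

def migrate_db(args: List[str]) -> str:
    if len(args) != 1:
        return "usage: migrate <database>"
    return f"migration started on {args[0]}"

# Command name -> handler. Adding a new self-service action means
# registering one more entry here.
HANDLERS: Dict[str, Callable[[List[str]], str]] = {
    "credentials": rotate_credentials,
    "deploy": deploy,
    "migrate": migrate_db,
}

def handle_message(text: str) -> str:
    """Parse a chat message and route it to the matching handler."""
    available = ", ".join(sorted(HANDLERS))
    parts = text.split()
    if not parts:
        return f"available commands: {available}"
    command, args = parts[0], parts[1:]
    handler = HANDLERS.get(command)
    if handler is None:
        return f"unknown command '{command}'; available commands: {available}"
    return handler(args)
```

For example, `handle_message("deploy api 1.2.3")` is routed to the `deploy` handler, while an unrecognized command returns the list of available ones, which keeps the self-service path discoverable from inside the chat.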