{
    "version": "https://jsonfeed.org/version/1",
    "title": "Suraj Srivastav Blog",
    "home_page_url": "https://surajsrivastav.com/blog/",
    "description": "Suraj Srivastav Blog",
    "items": [
        {
            "id": "https://surajsrivastav.com/blog/designing-scalable-backend-systems/",
            "content_html": "<p>Building backend systems that scale requires more than just throwing more servers at a problem. It requires thoughtful architecture, clear abstractions, and a deep understanding of your constraints.</p>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"the-foundation-understanding-your-constraints\">The Foundation: Understanding Your Constraints<a href=\"https://surajsrivastav.com/blog/designing-scalable-backend-systems/#the-foundation-understanding-your-constraints\" class=\"hash-link\" aria-label=\"Direct link to The Foundation: Understanding Your Constraints\" title=\"Direct link to The Foundation: Understanding Your Constraints\" translate=\"no\">​</a></h2>\n<p>Before you start designing anything, understand your constraints:</p>\n<ul>\n<li class=\"\"><strong>Traffic patterns</strong>: Are you handling constant load or spiky traffic?</li>\n<li class=\"\"><strong>Data volume</strong>: How much data do you need to store and retrieve?</li>\n<li class=\"\"><strong>Latency requirements</strong>: What are your response time SLAs?</li>\n<li class=\"\"><strong>Consistency requirements</strong>: Do you need strong consistency or eventual consistency?</li>\n<li class=\"\"><strong>Team size</strong>: How many engineers will maintain this system?</li>\n</ul>\n<p>These constraints determine everything about your architecture.</p>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"key-principles-for-scalability\">Key Principles for Scalability<a href=\"https://surajsrivastav.com/blog/designing-scalable-backend-systems/#key-principles-for-scalability\" class=\"hash-link\" aria-label=\"Direct link to Key Principles for Scalability\" title=\"Direct link to Key Principles for Scalability\" translate=\"no\">​</a></h2>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"1-stateless-design\">1. 
Stateless Design<a href=\"https://surajsrivastav.com/blog/designing-scalable-backend-systems/#1-stateless-design\" class=\"hash-link\" aria-label=\"Direct link to 1. Stateless Design\" title=\"Direct link to 1. Stateless Design\" translate=\"no\">​</a></h3>\n<p>Keep your application servers stateless. This allows you to:</p>\n<ul>\n<li class=\"\">Scale horizontally by adding more servers</li>\n<li class=\"\">Route requests to any server without affinity</li>\n<li class=\"\">Handle server failures gracefully</li>\n</ul>\n<p>State should live in dedicated systems (databases, caches) designed for that purpose.</p>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"2-separation-of-concerns\">2. Separation of Concerns<a href=\"https://surajsrivastav.com/blog/designing-scalable-backend-systems/#2-separation-of-concerns\" class=\"hash-link\" aria-label=\"Direct link to 2. Separation of Concerns\" title=\"Direct link to 2. Separation of Concerns\" translate=\"no\">​</a></h3>\n<p>Different components have different scaling characteristics:</p>\n<ul>\n<li class=\"\"><strong>Compute</strong>: Scales with the number of concurrent requests</li>\n<li class=\"\"><strong>Database</strong>: Scales with data volume and query complexity</li>\n<li class=\"\"><strong>Cache</strong>: Scales with working set size</li>\n<li class=\"\"><strong>Message queue</strong>: Scales with throughput</li>\n</ul>\n<p>Separate these concerns so you can scale each independently.</p>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"3-caching-strategy\">3. Caching Strategy<a href=\"https://surajsrivastav.com/blog/designing-scalable-backend-systems/#3-caching-strategy\" class=\"hash-link\" aria-label=\"Direct link to 3. Caching Strategy\" title=\"Direct link to 3. Caching Strategy\" translate=\"no\">​</a></h3>\n<p>Caching is your best friend for scalability. But it's also a common source of complexity.</p>\n<ul>\n<li class=\"\"><strong>Cache invalidation</strong>: Invalidate strategically. 
Don't cache everything.</li>\n<li class=\"\"><strong>Cache warming</strong>: Preload hot data to avoid cache misses on startup.</li>\n<li class=\"\"><strong>Cache levels</strong>: Use multiple levels (app, Redis, database).</li>\n</ul>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"4-database-design\">4. Database Design<a href=\"https://surajsrivastav.com/blog/designing-scalable-backend-systems/#4-database-design\" class=\"hash-link\" aria-label=\"Direct link to 4. Database Design\" title=\"Direct link to 4. Database Design\" translate=\"no\">​</a></h3>\n<p>Most performance problems are database problems.</p>\n<ul>\n<li class=\"\"><strong>Indexing</strong>: Index your most common queries. Don't index everything.</li>\n<li class=\"\"><strong>Denormalization</strong>: Normalize for correctness, denormalize for performance.</li>\n<li class=\"\"><strong>Sharding</strong>: When a single instance can no longer keep up, shard strategically.</li>\n<li class=\"\"><strong>Read replicas</strong>: Use them for read-heavy workloads.</li>\n</ul>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"5-asynchronous-processing\">5. Asynchronous Processing<a href=\"https://surajsrivastav.com/blog/designing-scalable-backend-systems/#5-asynchronous-processing\" class=\"hash-link\" aria-label=\"Direct link to 5. Asynchronous Processing\" title=\"Direct link to 5. 
Asynchronous Processing\" translate=\"no\">​</a></h3>\n<p>Push slow work off the critical path:</p>\n<ul>\n<li class=\"\">Use message queues for async tasks</li>\n<li class=\"\">Process in workers, not in request handlers</li>\n<li class=\"\">Return results asynchronously when appropriate</li>\n</ul>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"real-world-example-e-commerce-platform\">Real-World Example: E-commerce Platform<a href=\"https://surajsrivastav.com/blog/designing-scalable-backend-systems/#real-world-example-e-commerce-platform\" class=\"hash-link\" aria-label=\"Direct link to Real-World Example: E-commerce Platform\" title=\"Direct link to Real-World Example: E-commerce Platform\" translate=\"no\">​</a></h2>\n<p>Consider an e-commerce platform handling millions of users:</p>\n<p><strong>The Problem</strong>: Product catalog requests are slow during peak traffic.</p>\n<p><strong>The Solution</strong>:</p>\n<ol>\n<li class=\"\">Cache product data in memory (Redis)</li>\n<li class=\"\">Invalidate cache only on inventory changes</li>\n<li class=\"\">Use database replicas for product queries</li>\n<li class=\"\">Denormalize product info to avoid joins</li>\n<li class=\"\">Queue inventory updates asynchronously</li>\n</ol>\n<p><strong>Result</strong>: Response times drop from 200ms to 10ms.</p>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"monitoring-and-observability\">Monitoring and Observability<a href=\"https://surajsrivastav.com/blog/designing-scalable-backend-systems/#monitoring-and-observability\" class=\"hash-link\" aria-label=\"Direct link to Monitoring and Observability\" title=\"Direct link to Monitoring and Observability\" translate=\"no\">​</a></h2>\n<p>You can't optimize what you can't measure. 
Instrument everything:</p>\n<ul>\n<li class=\"\"><strong>Request latency</strong>: Track p50, p95, p99</li>\n<li class=\"\"><strong>Database queries</strong>: Track slow queries</li>\n<li class=\"\"><strong>Cache hit rates</strong>: Monitor cache effectiveness</li>\n<li class=\"\"><strong>Error rates</strong>: Track errors by type</li>\n</ul>\n<p>Use these metrics to identify bottlenecks.</p>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"conclusion\">Conclusion<a href=\"https://surajsrivastav.com/blog/designing-scalable-backend-systems/#conclusion\" class=\"hash-link\" aria-label=\"Direct link to Conclusion\" title=\"Direct link to Conclusion\" translate=\"no\">​</a></h2>\n<p>Scalability is not a feature you add at the end. It's a consequence of good design decisions made from the start. Focus on:</p>\n<ul>\n<li class=\"\">Understanding your constraints</li>\n<li class=\"\">Keeping things simple</li>\n<li class=\"\">Measuring everything</li>\n<li class=\"\">Iterating based on data</li>\n</ul>\n<p>Start simple. Scale when you need to. Design for the scale you actually have, not the scale you might have.</p>",
            "url": "https://surajsrivastav.com/blog/designing-scalable-backend-systems/",
            "title": "Designing Scalable Backend Systems",
            "summary": "Building backend systems that scale requires more than just throwing more servers at a problem. It requires thoughtful architecture, clear abstractions, and a deep understanding of your constraints.",
            "date_modified": "2026-05-01T00:00:00.000Z",
            "author": {
                "name": "Suraj Srivastav",
                "url": "https://github.com/surajsrivastav"
            },
            "tags": [
                "Systems",
                "Architecture"
            ]
        },
        {
            "id": "https://surajsrivastav.com/blog/ai-agents-production-workflows/",
            "content_html": "<p>AI agents are moving from research labs to production systems. But deploying agents at scale is fundamentally different from deploying traditional applications. Let me share what we've learned.</p>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"what-are-ai-agents\">What Are AI Agents?<a href=\"https://surajsrivastav.com/blog/ai-agents-production-workflows/#what-are-ai-agents\" class=\"hash-link\" aria-label=\"Direct link to What Are AI Agents?\" title=\"Direct link to What Are AI Agents?\" translate=\"no\">​</a></h2>\n<p>An AI agent is a system that can:</p>\n<ul>\n<li class=\"\">Perceive its environment</li>\n<li class=\"\">Make decisions based on that perception</li>\n<li class=\"\">Take actions to achieve goals</li>\n<li class=\"\">Learn from feedback</li>\n</ul>\n<p>Unlike traditional chatbots that respond to user input, agents can:</p>\n<ul>\n<li class=\"\">Plan multi-step workflows</li>\n<li class=\"\">Use tools and APIs</li>\n<li class=\"\">Recover from failures</li>\n<li class=\"\">Improve through feedback</li>\n</ul>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"the-challenge-reliability-at-scale\">The Challenge: Reliability at Scale<a href=\"https://surajsrivastav.com/blog/ai-agents-production-workflows/#the-challenge-reliability-at-scale\" class=\"hash-link\" aria-label=\"Direct link to The Challenge: Reliability at Scale\" title=\"Direct link to The Challenge: Reliability at Scale\" translate=\"no\">​</a></h2>\n<p>Agents introduce new reliability challenges:</p>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"1-non-determinism\">1. Non-Determinism<a href=\"https://surajsrivastav.com/blog/ai-agents-production-workflows/#1-non-determinism\" class=\"hash-link\" aria-label=\"Direct link to 1. Non-Determinism\" title=\"Direct link to 1. Non-Determinism\" translate=\"no\">​</a></h3>\n<p>Traditional code is deterministic. Feed the same input, get the same output. 
Agents are not.</p>\n<p><strong>Solution</strong>: Design for variability</p>\n<ul>\n<li class=\"\">Add explicit error handling</li>\n<li class=\"\">Implement retry logic with backoff</li>\n<li class=\"\">Use deterministic fallbacks when agents fail</li>\n<li class=\"\">Monitor agent decisions for anomalies</li>\n</ul>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"2-hallucinations\">2. Hallucinations<a href=\"https://surajsrivastav.com/blog/ai-agents-production-workflows/#2-hallucinations\" class=\"hash-link\" aria-label=\"Direct link to 2. Hallucinations\" title=\"Direct link to 2. Hallucinations\" translate=\"no\">​</a></h3>\n<p>LLMs can confidently generate incorrect information.</p>\n<p><strong>Solution</strong>: Grounding and verification</p>\n<ul>\n<li class=\"\">Ground agents in your actual data</li>\n<li class=\"\">Verify agent decisions against trusted sources</li>\n<li class=\"\">Use agent feedback to improve prompts</li>\n<li class=\"\">Log decision chains for auditing</li>\n</ul>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"3-cost-control\">3. Cost Control<a href=\"https://surajsrivastav.com/blog/ai-agents-production-workflows/#3-cost-control\" class=\"hash-link\" aria-label=\"Direct link to 3. Cost Control\" title=\"Direct link to 3. 
Cost Control\" translate=\"no\">​</a></h3>\n<p>Running agents at scale gets expensive fast.</p>\n<p><strong>Solution</strong>: Efficient agent design</p>\n<ul>\n<li class=\"\">Use smaller models for simple tasks</li>\n<li class=\"\">Cache decisions when appropriate</li>\n<li class=\"\">Batch similar requests</li>\n<li class=\"\">Monitor token usage and costs</li>\n</ul>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"architecture-pattern-the-control-loop\">Architecture Pattern: The Control Loop<a href=\"https://surajsrivastav.com/blog/ai-agents-production-workflows/#architecture-pattern-the-control-loop\" class=\"hash-link\" aria-label=\"Direct link to Architecture Pattern: The Control Loop\" title=\"Direct link to Architecture Pattern: The Control Loop\" translate=\"no\">​</a></h2>\n<p>Here's a pattern that works well in production:</p>\n<div class=\"language-text codeBlockContainer_Ckt0 theme-code-block\" style=\"--prism-background-color:hsl(220, 13%, 18%);--prism-color:hsl(220, 14%, 71%)\"><div class=\"codeBlockContent_QJqH\"><pre tabindex=\"0\" class=\"prism-code language-text codeBlock_bY9V thin-scrollbar\" style=\"background-color:hsl(220, 13%, 18%);color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><code class=\"codeBlockLines_e6Vv\"><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">[User Request]</span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    ↓</span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">[Agent Planner] → [Tool Calls] → [Tool Execution]</span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    ↓                               ↓</span><br></div><div class=\"token-line\" 
style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    ←────────── [Reflection] ←─────</span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    ↓</span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">[Verification] → Pass? → [Action]</span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    ↓</span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">   Fail? → [Recovery Strategy]</span><br></div></code></pre></div></div>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"key-components\">Key Components:<a href=\"https://surajsrivastav.com/blog/ai-agents-production-workflows/#key-components\" class=\"hash-link\" aria-label=\"Direct link to Key Components:\" title=\"Direct link to Key Components:\" translate=\"no\">​</a></h3>\n<p><strong>Planner</strong>: The agent plans what to do. Be explicit about constraints.</p>\n<p><strong>Tools</strong>: Agents execute via tools. Make tools atomic and well-defined.</p>\n<p><strong>Reflection</strong>: After execution, agents reflect on results. This improves decisions.</p>\n<p><strong>Verification</strong>: Always verify critical decisions. 
Don't blindly trust agent output.</p>\n<p><strong>Recovery</strong>: When things fail, have explicit recovery strategies.</p>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"real-example-customer-support-agent\">Real Example: Customer Support Agent<a href=\"https://surajsrivastav.com/blog/ai-agents-production-workflows/#real-example-customer-support-agent\" class=\"hash-link\" aria-label=\"Direct link to Real Example: Customer Support Agent\" title=\"Direct link to Real Example: Customer Support Agent\" translate=\"no\">​</a></h2>\n<p>We built an agent to handle customer support tickets:</p>\n<p><strong>What it does</strong>:</p>\n<ul>\n<li class=\"\">Reads incoming support tickets</li>\n<li class=\"\">Decides if it can resolve or needs escalation</li>\n<li class=\"\">Queries knowledge base and database</li>\n<li class=\"\">Drafts responses</li>\n<li class=\"\">Escalates complex issues</li>\n</ul>\n<p><strong>How we made it reliable</strong>:</p>\n<ol>\n<li class=\"\"><strong>Grounding</strong>: Agent only uses our knowledge base + database</li>\n<li class=\"\"><strong>Guardrails</strong>: Hard limits on what agent can do (no payments, no data deletion)</li>\n<li class=\"\"><strong>Verification</strong>: Manager reviews all responses before sending</li>\n<li class=\"\"><strong>Feedback loop</strong>: Bad decisions are logged and used for fine-tuning</li>\n<li class=\"\"><strong>Escalation</strong>: Complex issues go to humans immediately</li>\n</ol>\n<p><strong>Results</strong>:</p>\n<ul>\n<li class=\"\">60% of tickets resolved automatically</li>\n<li class=\"\">Resolution time dropped 40%</li>\n<li class=\"\">Customer satisfaction: 4.2/5</li>\n<li class=\"\">Human time freed for complex issues</li>\n</ul>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"implementation-patterns\">Implementation Patterns<a href=\"https://surajsrivastav.com/blog/ai-agents-production-workflows/#implementation-patterns\" class=\"hash-link\" aria-label=\"Direct link to 
Implementation Patterns\" title=\"Direct link to Implementation Patterns\" translate=\"no\">​</a></h2>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"pattern-1-tool-based-agents\">Pattern 1: Tool-Based Agents<a href=\"https://surajsrivastav.com/blog/ai-agents-production-workflows/#pattern-1-tool-based-agents\" class=\"hash-link\" aria-label=\"Direct link to Pattern 1: Tool-Based Agents\" title=\"Direct link to Pattern 1: Tool-Based Agents\" translate=\"no\">​</a></h3>\n<p>Agents call tools (functions, APIs) to accomplish tasks.</p>\n<div class=\"language-python codeBlockContainer_Ckt0 theme-code-block\" style=\"--prism-background-color:hsl(220, 13%, 18%);--prism-color:hsl(220, 14%, 71%)\"><div class=\"codeBlockContent_QJqH\"><pre tabindex=\"0\" class=\"prism-code language-python codeBlock_bY9V thin-scrollbar\" style=\"background-color:hsl(220, 13%, 18%);color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><code class=\"codeBlockLines_e6Vv\"><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token comment\" style=\"color:hsl(220, 10%, 40%)\"># Tools the agent can call</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">tools </span><span class=\"token operator\" style=\"color:hsl(207, 82%, 66%)\">=</span><span class=\"token plain\"> </span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">[</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    lookup_customer</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">,</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    
query_database</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">,</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    send_email</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">,</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    escalate_to_human</span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\"></span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">]</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\" style=\"display:inline-block\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\"></span><span class=\"token comment\" style=\"color:hsl(220, 10%, 40%)\"># Agent decides which tools to use</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">agent </span><span class=\"token operator\" style=\"color:hsl(207, 82%, 66%)\">=</span><span class=\"token plain\"> create_agent</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">(</span><span class=\"token plain\">tools</span><span class=\"token operator\" style=\"color:hsl(207, 82%, 66%)\">=</span><span class=\"token plain\">tools</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">,</span><span class=\"token plain\"> constraints</span><span class=\"token operator\" style=\"color:hsl(207, 82%, 66%)\">=</span><span class=\"token 
punctuation\" style=\"color:hsl(220, 14%, 71%)\">[</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    </span><span class=\"token string\" style=\"color:hsl(95, 38%, 62%)\">\"Do not handle payments\"</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">,</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    </span><span class=\"token string\" style=\"color:hsl(95, 38%, 62%)\">\"Always escalate legal issues\"</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">,</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    </span><span class=\"token string\" style=\"color:hsl(95, 38%, 62%)\">\"Verify before sending emails\"</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\"></span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">]</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">)</span><br></div></code></pre></div></div>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"pattern-2-workflow-agents\">Pattern 2: Workflow Agents<a href=\"https://surajsrivastav.com/blog/ai-agents-production-workflows/#pattern-2-workflow-agents\" class=\"hash-link\" aria-label=\"Direct link to Pattern 2: Workflow Agents\" title=\"Direct link to Pattern 2: Workflow Agents\" translate=\"no\">​</a></h3>\n<p>Agents orchestrate workflows where order matters.</p>\n<div class=\"language-python codeBlockContainer_Ckt0 theme-code-block\" style=\"--prism-background-color:hsl(220, 13%, 
18%);--prism-color:hsl(220, 14%, 71%)\"><div class=\"codeBlockContent_QJqH\"><pre tabindex=\"0\" class=\"prism-code language-python codeBlock_bY9V thin-scrollbar\" style=\"background-color:hsl(220, 13%, 18%);color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><code class=\"codeBlockLines_e6Vv\"><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">workflow </span><span class=\"token operator\" style=\"color:hsl(207, 82%, 66%)\">=</span><span class=\"token plain\"> </span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">[</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    </span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">{</span><span class=\"token string\" style=\"color:hsl(95, 38%, 62%)\">\"step\"</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">:</span><span class=\"token plain\"> </span><span class=\"token string\" style=\"color:hsl(95, 38%, 62%)\">\"validate_input\"</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">,</span><span class=\"token plain\"> </span><span class=\"token string\" style=\"color:hsl(95, 38%, 62%)\">\"agent\"</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">:</span><span class=\"token plain\"> input_validator</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">}</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">,</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    </span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">{</span><span class=\"token string\" style=\"color:hsl(95, 38%, 62%)\">\"step\"</span><span 
class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">:</span><span class=\"token plain\"> </span><span class=\"token string\" style=\"color:hsl(95, 38%, 62%)\">\"process_request\"</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">,</span><span class=\"token plain\"> </span><span class=\"token string\" style=\"color:hsl(95, 38%, 62%)\">\"agent\"</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">:</span><span class=\"token plain\"> processor</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">}</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">,</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    </span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">{</span><span class=\"token string\" style=\"color:hsl(95, 38%, 62%)\">\"step\"</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">:</span><span class=\"token plain\"> </span><span class=\"token string\" style=\"color:hsl(95, 38%, 62%)\">\"verify_output\"</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">,</span><span class=\"token plain\"> </span><span class=\"token string\" style=\"color:hsl(95, 38%, 62%)\">\"agent\"</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">:</span><span class=\"token plain\"> verifier</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">}</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">,</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    </span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">{</span><span class=\"token string\" style=\"color:hsl(95, 38%, 
62%)\">\"step\"</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">:</span><span class=\"token plain\"> </span><span class=\"token string\" style=\"color:hsl(95, 38%, 62%)\">\"execute_action\"</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">,</span><span class=\"token plain\"> </span><span class=\"token string\" style=\"color:hsl(95, 38%, 62%)\">\"agent\"</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">:</span><span class=\"token plain\"> executor</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">}</span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">,</span><span class=\"token plain\"></span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\"></span><span class=\"token punctuation\" style=\"color:hsl(220, 14%, 71%)\">]</span><br></div></code></pre></div></div>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"pattern-3-hierarchical-agents\">Pattern 3: Hierarchical Agents<a href=\"https://surajsrivastav.com/blog/ai-agents-production-workflows/#pattern-3-hierarchical-agents\" class=\"hash-link\" aria-label=\"Direct link to Pattern 3: Hierarchical Agents\" title=\"Direct link to Pattern 3: Hierarchical Agents\" translate=\"no\">​</a></h3>\n<p>One agent delegates to specialized agents.</p>\n<div class=\"language-text codeBlockContainer_Ckt0 theme-code-block\" style=\"--prism-background-color:hsl(220, 13%, 18%);--prism-color:hsl(220, 14%, 71%)\"><div class=\"codeBlockContent_QJqH\"><pre tabindex=\"0\" class=\"prism-code language-text codeBlock_bY9V thin-scrollbar\" style=\"background-color:hsl(220, 13%, 18%);color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><code class=\"codeBlockLines_e6Vv\"><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">[Manager 
Agent]</span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    ├─ [Analyst Agent] - analyzes data</span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    ├─ [Writer Agent] - writes output</span><br></div><div class=\"token-line\" style=\"color:hsl(220, 14%, 71%);text-shadow:0 1px rgba(0, 0, 0, 0.3)\"><span class=\"token plain\">    └─ [Reviewer Agent] - reviews quality</span><br></div></code></pre></div></div>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"monitoring-agents\">Monitoring Agents<a href=\"https://surajsrivastav.com/blog/ai-agents-production-workflows/#monitoring-agents\" class=\"hash-link\" aria-label=\"Direct link to Monitoring Agents\" title=\"Direct link to Monitoring Agents\" translate=\"no\">​</a></h2>\n<p>Agents need metrics beyond the usual service metrics:</p>\n<ul>\n<li class=\"\"><strong>Decision quality</strong>: Are agent decisions correct?</li>\n<li class=\"\"><strong>Coverage</strong>: What percentage of tasks does the agent handle?</li>\n<li class=\"\"><strong>Escalation rate</strong>: What percentage of tasks needs human intervention?</li>\n<li class=\"\"><strong>Cost per task</strong>: Tokens used × cost per token</li>\n<li class=\"\"><strong>Latency</strong>: How long does the agent take?</li>\n<li class=\"\"><strong>Hallucination rate</strong>: How often does it generate false information?</li>\n</ul>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"the-future\">The Future<a href=\"https://surajsrivastav.com/blog/ai-agents-production-workflows/#the-future\" class=\"hash-link\" aria-label=\"Direct link to The Future\" title=\"Direct link to The Future\" translate=\"no\">​</a></h2>\n<p>Agents will become standard infrastructure for many workflows. 
But they require:</p>\n<ul>\n<li class=\"\">Explicit error handling</li>\n<li class=\"\">Verification and guardrails</li>\n<li class=\"\">Careful monitoring</li>\n<li class=\"\">Human oversight for critical decisions</li>\n</ul>\n<p>The key is treating agents as tools that augment humans, not replace them.</p>\n<p>Start small. Deploy agents for low-risk, high-value tasks. Learn from production data. Scale gradually.</p>",
            "url": "https://surajsrivastav.com/blog/ai-agents-production-workflows/",
            "title": "AI Agents in Production Workflows",
            "summary": "AI agents are moving from research labs to production systems. But deploying agents at scale is fundamentally different from deploying traditional applications. Let me share what we've learned.",
            "date_modified": "2026-04-28T00:00:00.000Z",
            "author": {
                "name": "Suraj Srivastav",
                "url": "https://github.com/surajsrivastav"
            },
            "tags": [
                "AI",
                "Systems"
            ]
        },
        {
            "id": "https://surajsrivastav.com/blog/engineering-leadership-at-scale/",
            "content_html": "<p>Scaling an engineering organization is different from building a small team. The skills that made you successful as an engineer or small team leader don't automatically work at scale.</p>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"the-three-inflection-points\">The Three Inflection Points<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#the-three-inflection-points\" class=\"hash-link\" aria-label=\"Direct link to The Three Inflection Points\" title=\"Direct link to The Three Inflection Points\" translate=\"no\">​</a></h2>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"inflection-1-from-ic-to-manager-5-15-people\">Inflection 1: From IC to Manager (5-15 people)<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#inflection-1-from-ic-to-manager-5-15-people\" class=\"hash-link\" aria-label=\"Direct link to Inflection 1: From IC to Manager (5-15 people)\" title=\"Direct link to Inflection 1: From IC to Manager (5-15 people)\" translate=\"no\">​</a></h3>\n<p><strong>Challenge</strong>: You can no longer code your way out of problems.</p>\n<p><strong>What changes</strong>:</p>\n<ul>\n<li class=\"\">Your output now comes through others</li>\n<li class=\"\">You need new skills (1-on-1s, feedback, hiring)</li>\n<li class=\"\">You spend 50% of your time in meetings</li>\n<li class=\"\">You're no longer the technical expert on every issue</li>\n</ul>\n<p><strong>How to succeed</strong>:</p>\n<ul>\n<li class=\"\">Focus on people development</li>\n<li class=\"\">Be explicit about expectations</li>\n<li class=\"\">Give feedback early and often</li>\n<li class=\"\">Don't try to remain an individual contributor</li>\n</ul>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"inflection-2-from-manager-to-manager-of-managers-30-50-people\">Inflection 2: From Manager to Manager of Managers (30-50 people)<a 
href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#inflection-2-from-manager-to-manager-of-managers-30-50-people\" class=\"hash-link\" aria-label=\"Direct link to Inflection 2: From Manager to Manager of Managers (30-50 people)\" title=\"Direct link to Inflection 2: From Manager to Manager of Managers (30-50 people)\" translate=\"no\">​</a></h3>\n<p><strong>Challenge</strong>: You can't have 1-on-1s with everyone. Systems become critical.</p>\n<p><strong>What changes</strong>:</p>\n<ul>\n<li class=\"\">You need processes (hiring, onboarding, growth)</li>\n<li class=\"\">Technical depth becomes less important</li>\n<li class=\"\">You need to trust your managers</li>\n<li class=\"\">Communication gets harder</li>\n</ul>\n<p><strong>How to succeed</strong>:</p>\n<ul>\n<li class=\"\">Document your values and principles</li>\n<li class=\"\">Create clear promotion criteria</li>\n<li class=\"\">Invest in manager development</li>\n<li class=\"\">Be obsessive about hiring</li>\n</ul>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"inflection-3-from-manager-of-managers-to-directorstaff-50-people\">Inflection 3: From Manager of Managers to Director/Staff (50+ people)<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#inflection-3-from-manager-of-managers-to-directorstaff-50-people\" class=\"hash-link\" aria-label=\"Direct link to Inflection 3: From Manager of Managers to Director/Staff (50+ people)\" title=\"Direct link to Inflection 3: From Manager of Managers to Director/Staff (50+ people)\" translate=\"no\">​</a></h3>\n<p><strong>Challenge</strong>: You're now playing a different game. 
Strategy matters more than tactics.</p>\n<p><strong>What changes</strong>:</p>\n<ul>\n<li class=\"\">You set direction and strategy</li>\n<li class=\"\">You need cross-team visibility</li>\n<li class=\"\">Politics become real</li>\n<li class=\"\">You're responsible for things you don't control</li>\n</ul>\n<p><strong>How to succeed</strong>:</p>\n<ul>\n<li class=\"\">Think in quarters and years, not weeks</li>\n<li class=\"\">Build deep relationships across the org</li>\n<li class=\"\">Document your thinking</li>\n<li class=\"\">Delegate authority, not just work</li>\n</ul>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"core-principles-for-scaling-teams\">Core Principles for Scaling Teams<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#core-principles-for-scaling-teams\" class=\"hash-link\" aria-label=\"Direct link to Core Principles for Scaling Teams\" title=\"Direct link to Core Principles for Scaling Teams\" translate=\"no\">​</a></h2>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"1-clarity-over-comfort\">1. Clarity Over Comfort<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#1-clarity-over-comfort\" class=\"hash-link\" aria-label=\"Direct link to 1. Clarity Over Comfort\" title=\"Direct link to 1. Clarity Over Comfort\" translate=\"no\">​</a></h3>\n<p>As teams grow, people need clarity on:</p>\n<ul>\n<li class=\"\">What are we building?</li>\n<li class=\"\">Why are we building it?</li>\n<li class=\"\">How does my work fit?</li>\n<li class=\"\">How will we measure success?</li>\n</ul>\n<p>This clarity needs to be documented and reinforced constantly.</p>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"2-systems-over-heroics\">2. Systems Over Heroics<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#2-systems-over-heroics\" class=\"hash-link\" aria-label=\"Direct link to 2. Systems Over Heroics\" title=\"Direct link to 2. 
Systems Over Heroics\" translate=\"no\">​</a></h3>\n<p>At small scale, heroics work. At large scale, they break:</p>\n<ul>\n<li class=\"\">You can't have key-person dependencies</li>\n<li class=\"\">You need repeatable processes</li>\n<li class=\"\">You need clear escalation paths</li>\n<li class=\"\">You need asynchronous communication</li>\n</ul>\n<p>Build systems that work without heroes.</p>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"3-trust-and-autonomy\">3. Trust and Autonomy<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#3-trust-and-autonomy\" class=\"hash-link\" aria-label=\"Direct link to 3. Trust and Autonomy\" title=\"Direct link to 3. Trust and Autonomy\" translate=\"no\">​</a></h3>\n<p>Micromanaging doesn't scale. You need:</p>\n<ul>\n<li class=\"\">Clear decision rights</li>\n<li class=\"\">Trust in your team</li>\n<li class=\"\">Retrospectives to learn</li>\n<li class=\"\">Freedom to fail (safely)</li>\n</ul>\n<p>People perform best when they have autonomy.</p>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"4-async-communication\">4. Async Communication<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#4-async-communication\" class=\"hash-link\" aria-label=\"Direct link to 4. Async Communication\" title=\"Direct link to 4. Async Communication\" translate=\"no\">​</a></h3>\n<p>With 50+ people, you can't have everyone in every meeting:</p>\n<ul>\n<li class=\"\">Document decisions in writing</li>\n<li class=\"\">Use async channels (docs, Slack)</li>\n<li class=\"\">Make synchronous meetings rare and valuable</li>\n<li class=\"\">Enforce \"no meetings Wednesday\" or similar</li>\n</ul>\n<p>Async is not the default at small scale. 
It must be intentional.</p>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"the-org-design-problem\">The Org Design Problem<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#the-org-design-problem\" class=\"hash-link\" aria-label=\"Direct link to The Org Design Problem\" title=\"Direct link to The Org Design Problem\" translate=\"no\">​</a></h2>\n<p>How you organize directly impacts:</p>\n<ul>\n<li class=\"\">Decision velocity</li>\n<li class=\"\">Communication complexity</li>\n<li class=\"\">Team morale</li>\n<li class=\"\">Hiring and retention</li>\n</ul>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"bad-org-structure-patterns\">Bad Org Structure Patterns<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#bad-org-structure-patterns\" class=\"hash-link\" aria-label=\"Direct link to Bad Org Structure Patterns\" title=\"Direct link to Bad Org Structure Patterns\" translate=\"no\">​</a></h3>\n<p><strong>Too many layers</strong>:</p>\n<ul>\n<li class=\"\">Decision-making becomes slow</li>\n<li class=\"\">Context is lost at each level</li>\n<li class=\"\">Politics increase</li>\n</ul>\n<p><strong>Unclear responsibilities</strong>:</p>\n<ul>\n<li class=\"\">Duplicate work</li>\n<li class=\"\">Gaps that fall between teams</li>\n<li class=\"\">Finger-pointing</li>\n</ul>\n<p><strong>Churn-driven reorganization</strong>:</p>\n<ul>\n<li class=\"\">People are constantly confused</li>\n<li class=\"\">Trust erodes</li>\n<li class=\"\">Productivity plummets</li>\n</ul>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"good-org-patterns\">Good Org Patterns<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#good-org-patterns\" class=\"hash-link\" aria-label=\"Direct link to Good Org Patterns\" title=\"Direct link to Good Org Patterns\" translate=\"no\">​</a></h3>\n<p><strong>Clear ownership</strong>:</p>\n<ul>\n<li class=\"\">Each team owns a domain</li>\n<li class=\"\">Clear APIs 
between teams</li>\n<li class=\"\">Decision rights are explicit</li>\n</ul>\n<p><strong>Limited hierarchy</strong>:</p>\n<ul>\n<li class=\"\">Flat is better than deep</li>\n<li class=\"\">5-8 reports per manager is healthy</li>\n<li class=\"\">Skip-level meetings matter</li>\n</ul>\n<p><strong>Cross-functional alignment</strong>:</p>\n<ul>\n<li class=\"\">Product, Engineering, Design aligned on goals</li>\n<li class=\"\">Regular sync points</li>\n<li class=\"\">Shared OKRs</li>\n</ul>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"common-scaling-mistakes\">Common Scaling Mistakes<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#common-scaling-mistakes\" class=\"hash-link\" aria-label=\"Direct link to Common Scaling Mistakes\" title=\"Direct link to Common Scaling Mistakes\" translate=\"no\">​</a></h2>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"mistake-1-premature-specialization\">Mistake 1: Premature Specialization<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#mistake-1-premature-specialization\" class=\"hash-link\" aria-label=\"Direct link to Mistake 1: Premature Specialization\" title=\"Direct link to Mistake 1: Premature Specialization\" translate=\"no\">​</a></h3>\n<p>Don't create specialized teams (DX team, ops team) until you need them. Generalists are better at small scale.</p>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"mistake-2-process-without-purpose\">Mistake 2: Process Without Purpose<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#mistake-2-process-without-purpose\" class=\"hash-link\" aria-label=\"Direct link to Mistake 2: Process Without Purpose\" title=\"Direct link to Mistake 2: Process Without Purpose\" translate=\"no\">​</a></h3>\n<p>Processes slow things down. 
Only add processes when:</p>\n<ul>\n<li class=\"\">You've felt the pain multiple times</li>\n<li class=\"\">The process solves a real problem</li>\n<li class=\"\">Someone is accountable for the process</li>\n</ul>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"mistake-3-hiring-too-fast\">Mistake 3: Hiring Too Fast<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#mistake-3-hiring-too-fast\" class=\"hash-link\" aria-label=\"Direct link to Mistake 3: Hiring Too Fast\" title=\"Direct link to Mistake 3: Hiring Too Fast\" translate=\"no\">​</a></h3>\n<p>If you hire 20 people in a quarter:</p>\n<ul>\n<li class=\"\">Culture dilutes</li>\n<li class=\"\">Onboarding breaks</li>\n<li class=\"\">Internal friction increases</li>\n<li class=\"\">Quality suffers</li>\n</ul>\n<p>Hire at a sustainable pace. Plan 6-12 months ahead.</p>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"mistake-4-ignoring-culture\">Mistake 4: Ignoring Culture<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#mistake-4-ignoring-culture\" class=\"hash-link\" aria-label=\"Direct link to Mistake 4: Ignoring Culture\" title=\"Direct link to Mistake 4: Ignoring Culture\" translate=\"no\">​</a></h3>\n<p>Culture is how you get people to do the right thing when you're not in the room. 
At scale, culture is everything.</p>\n<p>Define your values early:</p>\n<ul>\n<li class=\"\">How do we make decisions?</li>\n<li class=\"\">What do we optimize for?</li>\n<li class=\"\">How do we handle disagreement?</li>\n<li class=\"\">What behavior do we reward?</li>\n</ul>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"concrete-practices-for-scale\">Concrete Practices for Scale<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#concrete-practices-for-scale\" class=\"hash-link\" aria-label=\"Direct link to Concrete Practices for Scale\" title=\"Direct link to Concrete Practices for Scale\" translate=\"no\">​</a></h2>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"1-engineering-principles\">1. Engineering Principles<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#1-engineering-principles\" class=\"hash-link\" aria-label=\"Direct link to 1. Engineering Principles\" title=\"Direct link to 1. Engineering Principles\" translate=\"no\">​</a></h3>\n<p>Document your engineering principles:</p>\n<ul>\n<li class=\"\">Favor simplicity</li>\n<li class=\"\">Prefer monitoring over prediction</li>\n<li class=\"\">Build for operational excellence</li>\n<li class=\"\">Optimize for team velocity</li>\n</ul>\n<p>Refer back to these constantly.</p>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"2-decision-framework\">2. Decision Framework<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#2-decision-framework\" class=\"hash-link\" aria-label=\"Direct link to 2. Decision Framework\" title=\"Direct link to 2. Decision Framework\" translate=\"no\">​</a></h3>\n<p>Make a decision framework public:</p>\n<ul>\n<li class=\"\"><strong>Type 1 decisions</strong>: Irreversible. Slow, deliberate process.</li>\n<li class=\"\"><strong>Type 2 decisions</strong>: Reversible. Fast decision-making, can change course.</li>\n</ul>\n<p>Most decisions are Type 2. 
Go fast.</p>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"3-career-framework\">3. Career Framework<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#3-career-framework\" class=\"hash-link\" aria-label=\"Direct link to 3. Career Framework\" title=\"Direct link to 3. Career Framework\" translate=\"no\">​</a></h3>\n<p>Make career progression explicit:</p>\n<ul>\n<li class=\"\"><strong>IC Track</strong>: Engineer → Senior Engineer → Staff Engineer → Principal Engineer</li>\n<li class=\"\"><strong>Manager Track</strong>: Manager → Senior Manager → Director</li>\n<li class=\"\"><strong>Hybrid Tracks</strong>: Possible in some organizations</li>\n</ul>\n<p>Clear frameworks reduce ambiguity.</p>\n<h3 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"4-technical-strategy\">4. Technical Strategy<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#4-technical-strategy\" class=\"hash-link\" aria-label=\"Direct link to 4. Technical Strategy\" title=\"Direct link to 4. 
Technical Strategy\" translate=\"no\">​</a></h3>\n<p>Document your technical strategy:</p>\n<ul>\n<li class=\"\">What technologies do we use and why?</li>\n<li class=\"\">What are we not building (make this explicit)?</li>\n<li class=\"\">How do we handle tech debt?</li>\n<li class=\"\">What's our upgrade/deprecation policy?</li>\n</ul>\n<p>This prevents 50 different approaches across 50 engineers.</p>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"the-leaders-role-at-scale\">The Leader's Role at Scale<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#the-leaders-role-at-scale\" class=\"hash-link\" aria-label=\"Direct link to The Leader's Role at Scale\" title=\"Direct link to The Leader's Role at Scale\" translate=\"no\">​</a></h2>\n<p>Your job changes from \"build great stuff\" to:</p>\n<ol>\n<li class=\"\"><strong>Set direction</strong>: Where are we going?</li>\n<li class=\"\"><strong>Unblock teams</strong>: Remove obstacles</li>\n<li class=\"\"><strong>Develop talent</strong>: Hire and grow good people</li>\n<li class=\"\"><strong>Build culture</strong>: Set values and reinforce them</li>\n<li class=\"\"><strong>Make hard calls</strong>: Say no. Prioritize ruthlessly.</li>\n</ol>\n<h2 class=\"anchor anchorTargetStickyNavbar_Vzrq\" id=\"conclusion\">Conclusion<a href=\"https://surajsrivastav.com/blog/engineering-leadership-at-scale/#conclusion\" class=\"hash-link\" aria-label=\"Direct link to Conclusion\" title=\"Direct link to Conclusion\" translate=\"no\">​</a></h2>\n<p>Scaling is hard. 
The skills that made you successful at small scale often work against you at large scale.</p>\n<p>Invest in:</p>\n<ul>\n<li class=\"\">Systems over heroics</li>\n<li class=\"\">Clarity over comfort</li>\n<li class=\"\">Async communication</li>\n<li class=\"\">Explicit processes</li>\n<li class=\"\">Trust and autonomy</li>\n</ul>\n<p>Build an organization where good decisions happen at all levels, not just at the top.</p>\n<p>That's how you scale to 100+ engineers while maintaining quality and velocity.</p>",
            "url": "https://surajsrivastav.com/blog/engineering-leadership-at-scale/",
            "title": "Engineering Leadership at Scale",
            "summary": "Scaling an engineering organization is different from building a small team. The skills that made you successful as an engineer or small team leader don't automatically work at scale.",
            "date_modified": "2026-04-25T00:00:00.000Z",
            "author": {
                "name": "Suraj Srivastav",
                "url": "https://github.com/surajsrivastav"
            },
            "tags": [
                "Leadership"
            ]
        }
    ]
}