
Platform Engineering: How to Build an Internal Developer Platform

2026-03-25 · 7 min read · Filippo Spinella · DevOps, Platform Engineering, Cloud, Software Architecture

As software systems grow more complex - microservices, Kubernetes, multi-cloud, CI/CD pipelines, observability stacks - developers spend more time on infrastructure and less on building products. Platform Engineering addresses this by building an internal platform that hides much of that complexity and gives developers self-service tools to ship faster.

Gartner has predicted that by 2026, 80% of large software engineering organizations will have established platform engineering teams. In this guide, we'll explore what Platform Engineering is, why it matters, and how to build an Internal Developer Platform from scratch.

What is Platform Engineering?

Platform Engineering is the discipline of designing and building toolchains and workflows that enable self-service capabilities for software engineering organizations. The output is an Internal Developer Platform (IDP) - a layer that sits between developers and infrastructure.

Platform Engineering vs DevOps

Platform Engineering is not a replacement for DevOps - it's the next evolution. DevOps said "you build it, you run it." Platform Engineering says "we'll make building and running it effortless."

Aspect	DevOps	Platform Engineering
Focus	Culture and practices	Products and self-service
Approach	Every team manages infra	Platform team abstracts infra
Cognitive Load	High (each team learns everything)	Low (golden paths provided)
Output	CI/CD pipelines, IaC scripts	Internal Developer Platform
Users	All engineering	Platform team serves engineering

Why Platform Engineering Matters

The Cognitive Load Problem

In a typical modern organization, a developer needs to understand:

Git workflows and branching strategies
CI/CD pipeline configuration
Container building and registry management
Kubernetes manifests and Helm charts
Cloud provider services (AWS/GCP/Azure)
Networking, DNS, TLS certificates
Monitoring, logging, alerting setup
Database provisioning and migrations
Security policies and compliance

That's an enormous cognitive burden that takes focus away from the actual product.

The Golden Path

Platform Engineering introduces the concept of golden paths - opinionated, well-supported, and documented paths for common tasks. A golden path is not a mandate; developers can deviate, but the platform makes the right thing the easy thing.

Example golden path for creating a new microservice:

Developer selects "New Backend Service" in the portal
Chooses language/framework (Node.js, Go, Python)
Platform auto-creates: Git repo, CI/CD pipeline, Kubernetes namespace, monitoring dashboards, and alert rules
Developer clones the repo and starts coding immediately
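
The steps above can be sketched as a small planning function. This is only an illustration of the idea, not how any particular portal implements it: `plan_new_service`, `ServicePlan`, the `myorg` organization name, and the resource naming scheme are all assumptions made for this example.

```python
from dataclasses import dataclass, field

# Languages offered by the hypothetical "New Backend Service" template.
SUPPORTED_LANGUAGES = {"nodejs", "go", "python"}


@dataclass
class ServicePlan:
    """Everything the platform would create for one new service."""
    name: str
    language: str
    resources: list[str] = field(default_factory=list)


def plan_new_service(name: str, language: str, org: str = "myorg") -> ServicePlan:
    """Return the resources a golden path would provision (nothing is created here)."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    plan = ServicePlan(name=name, language=language)
    plan.resources = [
        f"git-repo:{org}/{name}",
        f"ci-pipeline:{name}",
        f"k8s-namespace:{name}",
        f"dashboard:{name}",
        f"alert-rules:{name}",
    ]
    return plan


plan = plan_new_service("user-service", "go")
for resource in plan.resources:
    print(resource)
```

Keeping the plan as plain data makes it easy to show the developer what will be created before anything is provisioned.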

Building an Internal Developer Platform

Layer 1: Developer Portal

The portal is the single entry point for developers. The most popular open-source option is Backstage (created by Spotify, now a CNCF project).

Key features:

Service catalog: Every service, its owner, documentation, and dependencies
Software templates: Scaffolding for new services with best practices built in
Tech docs: Documentation as code, rendered and searchable
Plugin ecosystem: Extend with custom functionality

# backstage/catalog-info.yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: user-service
  description: Manages user accounts and authentication
  annotations:
    github.com/project-slug: myorg/user-service
    backstage.io/techdocs-ref: dir:.
spec:
  type: service
  lifecycle: production
  owner: team-auth
  system: identity-platform
  dependsOn:
    - resource:postgresql-main
  providesApis:
    - user-api
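
A catalog is only useful if its entries are complete, so it can help to lint descriptors in CI before they are registered. The sketch below is a hypothetical check, not part of Backstage itself: `validate_component` and the rule set are assumptions for illustration, and the entity is written as a Python dict standing in for the parsed YAML.

```python
# Fields this sketch treats as mandatory; adjust to your own catalog policy.
REQUIRED_SPEC_FIELDS = ("type", "lifecycle", "owner")


def validate_component(entity: dict) -> list[str]:
    """Return a list of problems; an empty list means the entity passed."""
    problems = []
    if entity.get("kind") != "Component":
        problems.append("kind must be 'Component'")
    if not entity.get("metadata", {}).get("name"):
        problems.append("metadata.name is missing")
    spec = entity.get("spec", {})
    for key in REQUIRED_SPEC_FIELDS:
        if not spec.get(key):
            problems.append(f"spec.{key} is missing")
    return problems


# The descriptor from the example, as it would look after YAML parsing.
entity = {
    "apiVersion": "backstage.io/v1alpha1",
    "kind": "Component",
    "metadata": {"name": "user-service"},
    "spec": {"type": "service", "lifecycle": "production", "owner": "team-auth"},
}
print(validate_component(entity))
```

Returning every problem at once, instead of failing on the first, gives developers a single round of fixes.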

Layer 2: Infrastructure Abstraction

Developers shouldn't have to write raw Terraform or Kubernetes YAML for routine needs. The platform should expose higher-level abstractions instead.

Tools:

Crossplane: Kubernetes-native infrastructure provisioning
Terraform with modules: Pre-built, tested infrastructure modules
Pulumi: Infrastructure as real code (TypeScript, Python, Go)

# Example: the custom resource a developer submits for a database.
# A platform-defined Crossplane Composition maps it to real cloud resources.
apiVersion: database.example.com/v1
kind: PostgreSQLInstance
metadata:
  name: user-db
spec:
  size: small        # Abstraction: small = 2 vCPU, 4GB RAM
  version: "16"
  backup: daily
  team: auth-team

Instead of configuring RDS parameters, VPC subnets, security groups, and backup policies, the developer just specifies size: small and backup: daily. The platform handles the rest.
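
Behind the abstraction, the platform has to translate the short request into concrete settings. Here is a minimal sketch of that mapping. Only the `small` preset comes from the example above; the other presets, the retention values, and the `expand_database_request` function are hypothetical.

```python
# T-shirt sizes mapped to concrete capacity.
SIZE_PRESETS = {
    "small": {"vcpu": 2, "memory_gb": 4},    # from the example above
    "medium": {"vcpu": 4, "memory_gb": 16},  # hypothetical
    "large": {"vcpu": 8, "memory_gb": 32},   # hypothetical
}

# Backup keyword mapped to retention in days (all values hypothetical).
BACKUP_RETENTION_DAYS = {"none": 0, "daily": 7, "weekly": 35}


def expand_database_request(spec: dict) -> dict:
    """Turn a developer-facing spec into the settings the platform would apply."""
    size = spec["size"]
    backup = spec.get("backup", "none")
    if size not in SIZE_PRESETS:
        raise ValueError(f"unknown size: {size!r}")
    if backup not in BACKUP_RETENTION_DAYS:
        raise ValueError(f"unknown backup policy: {backup!r}")
    return {
        **SIZE_PRESETS[size],
        "engine_version": spec["version"],
        "backup_retention_days": BACKUP_RETENTION_DAYS[backup],
        "tags": {"team": spec["team"]},
    }


request = {"size": "small", "version": "16", "backup": "daily", "team": "auth-team"}
print(expand_database_request(request))
```

Rejecting unknown values early keeps the developer-facing vocabulary small and the error messages specific.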

Layer 3: CI/CD Standardization

Standardize CI/CD so teams don't each build their own pipelines.

# .github/workflows/platform-ci.yml
# Teams just include the shared workflow
name: Build and Deploy
on:
  push:
    branches: [main]

jobs:
  pipeline:
    uses: myorg/platform-workflows/.github/workflows/standard-pipeline.yml@v2
    with:
      language: node
      deploy-target: production
    secrets: inherit

Key practices:

Shared CI/CD templates that teams include (not copy)
Automatic security scanning (SAST, dependency audit)
Standardized deployment strategies (canary, blue/green)
Automatic rollback on failed health checks
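
To make the last two practices concrete, here is a minimal sketch of a canary rollout that rolls back on the first failed health check. The traffic schedule and the `run_canary` function are assumptions for illustration; in a real pipeline `set_traffic` and `is_healthy` would call your deployment tooling and monitoring.

```python
from typing import Callable

# Percent of traffic sent to the new version at each step (hypothetical schedule).
CANARY_STEPS = (10, 50, 100)


def run_canary(set_traffic: Callable[[int], None],
               is_healthy: Callable[[], bool]) -> bool:
    """Shift traffic step by step; roll back to 0% on the first failed check."""
    for percent in CANARY_STEPS:
        set_traffic(percent)
        if not is_healthy():
            set_traffic(0)  # rollback: all traffic returns to the stable version
            return False
    return True


# Simulated run: the health check passes at 10% and fails at 50%.
history: list[int] = []
checks = iter([True, False])
succeeded = run_canary(history.append, lambda: next(checks))
print(succeeded, history)
```

Passing the two callables in keeps the rollout logic testable without a cluster.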

Layer 4: Observability

Pre-configured monitoring so developers get dashboards and alerts out of the box.

Metrics: Prometheus + Grafana with standard dashboards per service
Logging: Structured logging with centralized collection (Loki, ELK)
Tracing: Distributed tracing with OpenTelemetry
Alerting: PagerDuty/Opsgenie integration with sensible defaults
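
One way to get structured logging "out of the box" is for the platform to ship a small logging helper with every service template. The sketch below uses only the Python standard library; `JsonFormatter`, `get_platform_logger`, and the chosen field names are assumptions, not a standard schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "service": self.service,
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def get_platform_logger(service: str) -> logging.Logger:
    """Return a logger that emits JSON lines tagged with the service name."""
    logger = logging.getLogger(service)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter(service))
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


log = get_platform_logger("user-service")
log.info("user created")
```

Because every service emits the same fields, the central log store can filter by `service` and `level` without per-team parsing rules.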

Measuring Success

How do you know your platform is working? Track these metrics:

DORA Metrics

Deployment frequency: How often code reaches production
Lead time for changes: Time from commit to production
Change failure rate: Percentage of deployments causing failures
Mean time to recovery: Time to restore service after an incident (recent DORA reports call this failed deployment recovery time)
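
As a rough sketch, all four metrics can be derived from a list of deployment records. The `Deployment` record shape and the `dora_metrics` function are assumptions for illustration; in practice the timestamps would come from your CI/CD system and incident tracker.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set only for failed deployments


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Compute the four metrics over a reporting period of `period_days` days."""
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times) if lead_times else None,
        "change_failure_rate": len(failures) / len(deployments) if deployments else 0.0,
        "mean_time_to_recovery": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }


start = datetime(2026, 3, 1, 9, 0)
day, hour = timedelta(days=1), timedelta(hours=1)
deployments = [
    Deployment(start, start + 2 * hour),
    Deployment(start + day, start + day + 4 * hour, failed=True,
               restored_at=start + day + 5 * hour),
    Deployment(start + 2 * day, start + 2 * day + 6 * hour),
    Deployment(start + 3 * day, start + 3 * day + 4 * hour),
]
print(dora_metrics(deployments, period_days=7))
```

The median is used for lead time here so that one unusually slow change does not dominate the number.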

Platform-Specific Metrics

Time to first deploy: How long from "new service" to first production deploy
Developer satisfaction (NPS): Survey your users regularly
Self-service ratio: % of infrastructure requests handled without tickets
Golden path adoption: % of services following the recommended path
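
These metrics are simple ratios, which makes them easy to automate. The sketch below shows the arithmetic. NPS uses the standard definition (share of 9-10 scores minus share of 0-6 scores); the function names and all sample numbers are made up for illustration.

```python
def nps(scores: list[int]) -> float:
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        raise ValueError("no survey responses")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)


def ratio(part: int, total: int) -> float:
    """Share of `part` in `total`, returning 0.0 when there is nothing to count."""
    return part / total if total else 0.0


# Hypothetical quarter: survey answers on a 0-10 scale, plus request and service counts.
survey_scores = [10, 9, 8, 7, 6, 3, 10, 9, 9, 5]
self_service_ratio = ratio(part=45, total=60)    # requests resolved without a ticket
golden_path_adoption = ratio(part=18, total=24)  # services built from the template

print(nps(survey_scores), self_service_ratio, golden_path_adoption)
```

Tracking the same figures every quarter matters more than the absolute values, since the trend shows whether the platform is actually being adopted.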

Common Mistakes

1. Building Too Much, Too Soon

Start with the biggest pain point, not a grand vision. If deployments are painful, start there. If provisioning takes weeks, start there.

2. Not Treating the Platform as a Product

The platform team needs a product manager, user research, and feedback loops. Developers are your customers - understand their needs.

3. Mandating Instead of Attracting

The best platforms are adopted voluntarily because they make developers' lives easier. If you have to mandate usage, your platform isn't good enough.

4. Ignoring Developer Experience

A platform with terrible UX won't be used. Invest in clear documentation, helpful error messages, and fast feedback loops.

Getting Started

A practical roadmap for building your first IDP:

Minimum Viable Platform

Service catalog (Backstage) - know what exists and who owns it
One service template - golden path for your most common service type
Standardized CI/CD - shared pipeline that teams include
Basic docs - how to use the platform, where to get help

As a rough estimate, plan on 2-3 months with a team of 2-3 engineers for this MVP; the actual timeline depends on how much existing tooling you can reuse.

Conclusion

Platform Engineering is not about building the perfect platform from day one. It's about incrementally reducing the cognitive load on developers so they can focus on building products. Start small, measure impact, and iterate based on developer feedback.

Organizations that invest in Platform Engineering stand to gain a real competitive advantage: faster delivery, happier developers, and more reliable systems.

Resources:

Team Topologies - the book that popularized platform teams
Backstage - Spotify's open-source developer portal
CNCF Platforms White Paper - community definition and best practices
platformengineering.org - community, events, and resources