DevOps & Cloud Learning Roadmap

Master modern DevOps practices and cloud infrastructure management

Duration: 28 weeks | 3 steps | 37 topics

Career Opportunities

  • DevOps Engineer
  • Site Reliability Engineer
  • Cloud Architect
  • Infrastructure Engineer

Step 1: Linux & Command Line

Master Linux fundamentals and essential command line tools for system administration

Time: 6 weeks | Level: beginner

  • Linux File System & Navigation (required) — Learn the Linux directory hierarchy, absolute and relative paths, and essential navigation commands like ls, cd, and pwd.
    • The root directory (/) is the top of the filesystem hierarchy
    • Key directories include /etc for configuration, /var for logs, and /home for users
    • ls -la shows detailed file listings including hidden files and permissions
    • Absolute paths start from / while relative paths start from the current directory
  • File Manipulation (required) — Master file operations with cp, mv, rm, and powerful text processing tools like find, grep, sed, and awk for everyday tasks.
    • cp copies files and directories, with -r for recursive directory copies
    • find searches the filesystem by name, type, size, or modification time
    • grep filters text using regular expressions across files and streams
    • sed performs stream editing for search-and-replace transformations
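The three search-and-transform tools above can be combined in a short, self-contained session (the directory and file names here are illustrative; `sed -i` assumes GNU sed):

```shell
# Set up a tiny sandbox to work against
mkdir -p demo/conf
printf 'host=localhost\nport=8080\n' > demo/conf/app.conf

# find: locate files by name under a directory tree
find demo -type f -name '*.conf'        # prints demo/conf/app.conf

# grep: filter lines matching a pattern
grep 'port' demo/conf/app.conf          # prints port=8080

# sed: stream edit, here replacing the port in place
sed -i 's/8080/9090/' demo/conf/app.conf
grep 'port' demo/conf/app.conf          # prints port=9090
```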
  • Users & Permissions (required) — Manage Linux users, groups, and file permissions using chmod, chown, groups, and sudo for secure system administration.
    • Permissions use read (4), write (2), and execute (1) for owner, group, and others
    • chmod changes permissions using numeric (755) or symbolic (u+x) notation
    • chown changes file ownership to a different user or group
    • sudo grants temporary root privileges for administrative commands
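A short session showing both permission notations (the file name is illustrative; `stat -c` assumes GNU coreutils):

```shell
touch report.txt
chmod 640 report.txt          # numeric: owner rw- (4+2), group r-- (4), others --- (0)
stat -c '%a %A' report.txt    # prints 640 -rw-r-----
chmod u+x report.txt          # symbolic: add execute for the owner only
stat -c '%a' report.txt       # prints 740
```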
  • Shell Scripting (required) — Write bash scripts with variables, conditionals, loops, and functions to automate repetitive system administration tasks.
    • Scripts start with #!/bin/bash (shebang) and need execute permissions
    • Variables are assigned without spaces and referenced with $ prefix
    • Conditionals use if/elif/else with test brackets for comparisons
    • Functions encapsulate reusable logic and accept positional parameters
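A minimal script sketch using all four building blocks (the threshold, function name, and sample values are illustrative):

```shell
#!/bin/bash
# Warn when a usage percentage crosses a threshold.

threshold=80                    # variable: no spaces around '='

# Function with a positional parameter ($1)
check_usage() {
  local usage=$1
  if [ "$usage" -ge "$threshold" ]; then   # conditional with test brackets
    echo "WARN: ${usage}% used"
  else
    echo "OK: ${usage}% used"
  fi
}

# Loop over sample values
for u in 42 85 97; do
  check_usage "$u"
done
```

Saved as a file, it needs execute permission (`chmod +x check.sh`) before it can be run directly.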
  • Process Management (required) — Monitor and control running processes using ps, top, and kill, manage services with systemctl, and schedule tasks with cron.
    • ps aux lists all running processes with their PID, CPU, and memory usage
    • top provides a real-time interactive view of system resource usage
    • kill sends signals to processes, with SIGTERM (15) for graceful and SIGKILL (9) for forced termination
    • cron schedules recurring tasks using the minute/hour/day/month/weekday format
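A minimal crontab sketch of the five scheduling fields (the script paths are placeholders):

```
# minute (0-59)  hour (0-23)  day-of-month  month  day-of-week (0 = Sunday)
30 2 * * 1    /usr/local/bin/backup.sh       # 02:30 every Monday
*/5 * * * *   /usr/local/bin/healthcheck.sh  # every five minutes
```

Entries are edited with `crontab -e` and listed with `crontab -l`.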
  • Networking Basics (required) — Understand TCP/IP fundamentals, DNS resolution, port management, SSH connections, and diagnostic tools like curl and netstat.
    • TCP/IP is a four-layer model: link, internet, transport, and application
    • DNS translates domain names to IP addresses through recursive resolution
    • Common ports include 22 (SSH), 80 (HTTP), 443 (HTTPS), and 5432 (PostgreSQL)
    • curl makes HTTP requests from the command line for testing APIs and endpoints
  • Package Management (recommended) — Install and manage software packages using apt, yum, and snap, and learn to compile from source when packages are unavailable.
    • apt is the default package manager for Debian/Ubuntu distributions
    • yum/dnf manages packages on RHEL, CentOS, and Fedora systems
    • snap provides containerized cross-distribution package installation
    • Compiling from source requires configure, make, and make install steps
  • Vim/Nano Editors (recommended) — Edit files directly in the terminal using Vim and Nano, including basic navigation, editing, and search/replace operations.
    • Vim has normal, insert, and command modes for different operations
    • Nano is simpler with commands shown at the bottom of the screen
    • In Vim, :wq saves and quits, :q! quits without saving
    • Vim's /pattern searches forward and :%s/old/new/g replaces globally
  • Environment Variables & Profiles (recommended) — Configure environment variables, shell profiles like .bashrc and .profile, manage the PATH variable, and use export for child processes.
    • .bashrc runs for interactive non-login shells and is the common place for aliases
    • PATH determines which directories the shell searches for executable commands
    • export makes variables available to child processes spawned from the shell
    • env lists all current environment variables in the session
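A short session showing the export boundary between the shell and its children (the variable name is illustrative):

```shell
GREETING="hello"                           # shell variable, not yet in the environment
bash -c 'echo "child sees: $GREETING"'     # child prints an empty value: not exported
export GREETING                            # now inherited by child processes
bash -c 'echo "child sees: $GREETING"'     # prints child sees: hello

echo "$PATH" | tr ':' '\n' | head -3       # PATH entries, searched left to right
```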
  • SSH & Remote Access (recommended) — Securely connect to remote servers using SSH with key-based authentication, configure SSH settings, set up tunneling, and transfer files with SCP.
    • ssh-keygen generates public/private key pairs for passwordless authentication
    • ~/.ssh/config simplifies connections with host aliases and default settings
    • SSH tunneling forwards local or remote ports securely through encrypted connections
    • SCP copies files securely between local and remote machines over SSH
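A minimal ~/.ssh/config sketch (the host name, user, and key path are placeholders):

```
Host staging
    HostName staging.example.com
    User deploy
    IdentityFile ~/.ssh/id_ed25519   # key pair created with: ssh-keygen -t ed25519
```

With this alias, `ssh staging` replaces the full command, and `scp app.tar.gz staging:/tmp/` copies a file using the same settings.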
  • Log Management (optional) — View and manage system logs using journalctl and syslog, configure log rotation, and understand centralized logging basics.
    • journalctl queries the systemd journal for service and kernel logs
    • syslog stores messages in /var/log with facility and severity levels
    • logrotate prevents log files from consuming all available disk space
    • Centralized logging aggregates logs from multiple servers for analysis
  • Linux Security Basics (optional) — Secure Linux systems using firewalls like ufw and iptables, understand SELinux policies, and apply system hardening techniques.
    • ufw provides a simplified frontend for managing iptables firewall rules
    • iptables defines packet filtering rules at the kernel level for network traffic
    • SELinux enforces mandatory access controls beyond traditional Unix permissions
    • Hardening includes disabling root SSH login, using fail2ban, and minimizing installed packages

Step 2: Containerization & CI/CD

Learn container technologies and orchestration with Docker and Kubernetes, and build CI/CD pipelines

Time: 10 weeks | Level: intermediate

  • Docker Fundamentals (required) — Understand Docker images, containers, Dockerfiles, image layers, and the build process for packaging applications.
    • Images are read-only templates built from Dockerfiles in sequential layers
    • Containers are running instances of images with their own writable layer
    • Each Dockerfile instruction creates a new layer that is cached for faster rebuilds
    • docker build, run, stop, and rm are the core lifecycle commands
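The layering described above can be sketched in a minimal Dockerfile, assuming a Node.js app (the base image, file names, and port are placeholders):

```dockerfile
FROM node:20-alpine            # each instruction below creates a cached layer
WORKDIR /app
COPY package*.json ./          # manifests first, so the install layer stays cached
RUN npm install --omit=dev
COPY . .                       # source changes most often, so it comes last
EXPOSE 3000
CMD ["node", "server.js"]
```

Typical lifecycle commands: `docker build -t myapp:1.0 .`, `docker run -d -p 3000:3000 myapp:1.0`, then `docker stop` and `docker rm` on the container.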
  • Docker Compose (required) — Define and run multi-container applications with Docker Compose using volumes, networks, and service dependency management.
    • docker-compose.yml defines services, networks, and volumes declaratively
    • Volumes persist data beyond the container lifecycle for databases and state
    • Networks isolate communication between groups of related containers
    • depends_on controls service startup order but does not wait for readiness
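A minimal docker-compose.yml sketch of these ideas (service names, image, and credentials are placeholders):

```yaml
services:
  web:
    build: .
    ports:
      - "8000:8000"
    depends_on:
      - db                 # start order only; the app must still retry until db is ready
    networks:
      - backend
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example
    volumes:
      - db-data:/var/lib/postgresql/data   # data survives container recreation
    networks:
      - backend

volumes:
  db-data:

networks:
  backend:
```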
  • Container Registries (required) — Push and pull container images from registries including Docker Hub, Amazon ECR, and Google Container Registry with proper tagging strategies.
    • Docker Hub is the default public registry for community and official images
    • Private registries like ECR and GCR store proprietary application images securely
    • Semantic version tags (v1.2.3) are preferred over mutable tags like latest
    • Image scanning detects known vulnerabilities in base images and dependencies
  • CI/CD Concepts (required) — Understand continuous integration, continuous delivery, and continuous deployment pipeline stages and best practices.
    • Continuous integration merges and tests code changes frequently, ideally multiple times per day
    • Continuous delivery ensures code is always in a deployable state through automated pipelines
    • Continuous deployment automatically releases every change that passes the pipeline to production
    • Pipeline stages typically include build, test, security scan, and deploy
  • GitHub Actions (required) — Build CI/CD workflows with GitHub Actions using workflow files, jobs, steps, secrets management, and matrix build strategies.
    • Workflows are defined in YAML files under .github/workflows/
    • Jobs run in parallel by default and can be configured with dependencies
    • Secrets are encrypted environment variables for API keys and credentials
    • Matrix builds test across multiple OS versions, language versions, or configurations
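A minimal workflow sketch under .github/workflows/ combining jobs, secrets, and a matrix (the job name and test commands are assumptions about the project):

```yaml
name: CI
on: [push, pull_request]

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest]
        node: [18, 20]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      - run: npm ci && npm test
        env:
          API_TOKEN: ${{ secrets.API_TOKEN }}   # encrypted repository secret
```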
  • Jenkins Basics (required) — Set up Jenkins CI/CD pipelines using declarative Jenkinsfiles, configure plugins, and manage build agents for distributed builds.
    • Declarative pipelines use a structured Jenkinsfile with stages and steps
    • Plugins extend Jenkins with integrations for Docker, Kubernetes, Slack, and more
    • Build agents distribute workload across multiple machines for parallel execution
    • Shared libraries promote code reuse across multiple pipeline definitions
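A minimal declarative Jenkinsfile sketch (stage names and shell commands are placeholders):

```groovy
pipeline {
    agent any                        // run on any available build agent
    stages {
        stage('Build') {
            steps { sh 'make build' }
        }
        stage('Test') {
            steps { sh 'make test' }
        }
        stage('Deploy') {
            when { branch 'main' }   // gate deployment to the main branch
            steps { sh './deploy.sh staging' }
        }
    }
}
```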
  • Container Orchestration Concepts (recommended) — Understand why Kubernetes is needed for production workloads and learn core concepts including pods, services, and deployments.
    • Kubernetes automates deployment, scaling, and management of containerized apps
    • Pods are the smallest deployable units containing one or more containers
    • Services expose pods to network traffic with stable endpoints and load balancing
    • Deployments manage pod replicas and enable rolling updates with zero downtime
  • GitOps Workflow (recommended) — Implement GitOps practices using tools like ArgoCD and Flux for declarative, Git-driven infrastructure and application management.
    • Git is the single source of truth for both application and infrastructure state
    • ArgoCD continuously syncs the cluster state with the desired state in Git
    • Pull-based deployment is more secure than push-based as the cluster pulls changes
    • Declarative configuration eliminates manual kubectl commands in production
  • Artifact Management (recommended) — Manage build artifacts using npm registries, Nexus Repository, and implement versioning strategies for reproducible builds.
    • Artifact repositories store versioned build outputs for deployment and rollback
    • Semantic versioning communicates the nature of changes in each release
    • Nexus and Artifactory support multiple package formats in a single repository
  • Testing in CI/CD (recommended) — Integrate unit tests, integration tests, and automated test suites into CI/CD pipelines for continuous quality assurance.
    • Unit tests run first and fastest, catching issues at the function level
    • Integration tests verify interactions between services and external dependencies
    • Test reports and coverage metrics should be published as pipeline artifacts
    • Flaky tests must be quarantined to maintain pipeline reliability
  • GitLab CI/CD (optional) — Configure CI/CD pipelines in GitLab using .gitlab-ci.yml, manage runners, and set up deployment environments.
    • .gitlab-ci.yml defines pipeline stages, jobs, and their execution rules
    • Runners are agents that execute CI/CD jobs on shared or dedicated infrastructure
    • Environments track deployments to staging, production, and review apps
    • Auto DevOps provides pre-configured pipelines for common project types
  • Build Tools & Strategies (optional) — Optimize container builds with multi-stage builds, layer caching, and build strategies that reduce image size and build time.
    • Multi-stage builds separate build dependencies from the final runtime image
    • Layer caching skips unchanged layers to dramatically speed up rebuilds
    • Ordering Dockerfile instructions from least to most frequently changed maximizes cache hits
    • Distroless and Alpine base images minimize the attack surface and image size
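A multi-stage Dockerfile sketch for a Go service, assuming a standard module layout (module paths and the output name are placeholders):

```dockerfile
FROM golang:1.22 AS builder        # stage 1: full toolchain for compiling
WORKDIR /src
COPY go.mod go.sum ./              # dependency files first to maximize cache hits
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /app ./cmd/server

FROM gcr.io/distroless/static      # stage 2: minimal runtime, no shell or package manager
COPY --from=builder /app /app
ENTRYPOINT ["/app"]
```

The final image contains only the compiled binary, so the Go toolchain never ships to production.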

Step 3: Cloud & Infrastructure as Code

Master cloud services and infrastructure as code with AWS, Terraform, and Kubernetes

Time: 12 weeks | Level: advanced

  • AWS Core Services (required) — Learn essential AWS services including EC2 for compute, S3 for storage, VPC for networking, IAM for access control, RDS for databases, and Lambda for serverless.
    • EC2 provides resizable virtual servers with multiple instance types for different workloads
    • S3 offers virtually unlimited object storage with 99.999999999% durability
    • VPC creates isolated network environments with subnets, route tables, and security groups
    • IAM controls access with users, roles, and policies following least-privilege principles
  • Infrastructure as Code (Terraform) (required) — Provision and manage cloud infrastructure declaratively using Terraform with providers, resources, state management, and reusable modules.
    • Providers connect Terraform to cloud platforms like AWS, GCP, and Azure
    • Resources define the infrastructure components to create and manage
    • State tracks the mapping between configuration and real-world resources
    • Modules encapsulate reusable infrastructure patterns with input variables and outputs
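A minimal Terraform sketch tying provider, resource, and output together (the region, AMI ID, and names are placeholders):

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

resource "aws_instance" "web" {
  ami           = "ami-0123456789abcdef0"   # placeholder AMI ID
  instance_type = "t3.micro"
}

output "web_public_ip" {
  value = aws_instance.web.public_ip
}
```

`terraform init` downloads the provider, `terraform plan` previews changes against state, and `terraform apply` creates the resources.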
  • Kubernetes Deep Dive (required) — Master advanced Kubernetes concepts including pod management, deployments, services, ConfigMaps, Secrets, and Ingress controllers.
    • Deployments manage ReplicaSets and enable rolling updates and rollbacks
    • ConfigMaps decouple configuration from container images for environment flexibility
    • Secrets store sensitive data like passwords and tokens as base64-encoded values (encoded, not encrypted, so access must still be restricted)
    • Ingress controllers route external HTTP/HTTPS traffic to internal services
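A minimal Deployment manifest sketch wiring in a ConfigMap and Secret (the names, image, and replica count are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                        # desired pod count, maintained by a ReplicaSet
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: myregistry/web:1.2.3
          envFrom:
            - configMapRef:
                name: web-config     # non-sensitive settings
            - secretRef:
                name: web-secrets    # sensitive values, kept out of the image
```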
  • Monitoring & Observability (required) — Implement monitoring and observability using Prometheus for metrics collection, Grafana for dashboards, and configure alerting rules.
    • Prometheus scrapes metrics from targets at configured intervals using a pull model
    • PromQL queries time-series data for alerts, dashboards, and ad-hoc analysis
    • Grafana visualizes metrics from multiple data sources in customizable dashboards
    • Alerting rules trigger notifications when metrics exceed defined thresholds
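A Prometheus alerting rule sketch (the metric and label names are assumptions about what the targets expose):

```yaml
groups:
  - name: availability
    rules:
      - alert: HighErrorRate
        expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.05
        for: 10m                 # must hold for 10 minutes before the alert fires
        labels:
          severity: critical
        annotations:
          summary: "5xx error rate above 5% for 10 minutes"
```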
  • Logging at Scale (required) — Aggregate and analyze logs at scale using the ELK/EFK stack, AWS CloudWatch, and implement structured logging practices.
    • ELK stack combines Elasticsearch, Logstash, and Kibana for log aggregation and search
    • EFK replaces Logstash with Fluentd for lighter-weight log forwarding in Kubernetes
    • CloudWatch provides native AWS log collection, monitoring, and alarming
    • Structured logging with JSON format enables efficient parsing and querying
  • Helm Charts (recommended) — Package Kubernetes applications with Helm using templates, values files, chart repositories, and manage upgrades and rollbacks.
    • Helm charts are packages of pre-configured Kubernetes resource templates
    • Values files customize chart behavior without modifying the templates directly
    • Chart repositories host and distribute charts like package registries
    • helm upgrade and rollback manage releases with version history tracking
  • AWS Advanced (recommended) — Explore advanced AWS services including ECS, EKS, CloudFormation, Route53 for DNS, and messaging services like SNS and SQS.
    • ECS runs containers on AWS with Fargate for serverless or EC2 for managed compute
    • EKS provides managed Kubernetes with AWS integration for networking and IAM
    • Route53 handles DNS routing with health checks and failover configurations
    • SNS/SQS enable decoupled architectures with pub/sub and message queue patterns
  • GCP / Azure Basics (recommended) — Compare major cloud providers including GCP and Azure, and understand multi-cloud considerations for avoiding vendor lock-in.
    • GCP excels in data analytics, machine learning, and Kubernetes (GKE)
    • Azure integrates deeply with Microsoft enterprise tools and Active Directory
    • Multi-cloud strategies reduce vendor lock-in but increase operational complexity
    • Terraform and Pulumi enable infrastructure code that works across cloud providers
  • Service Mesh (recommended) — Implement service mesh with Istio and Envoy for traffic management, mutual TLS encryption, and observability between microservices.
    • A service mesh manages service-to-service communication with sidecar proxies
    • Istio provides traffic management, security, and observability for microservices
    • Envoy proxy handles load balancing, retries, and circuit breaking transparently
    • Mutual TLS (mTLS) encrypts all service-to-service communication automatically
  • Secrets Management (recommended) — Securely store and manage secrets using HashiCorp Vault, AWS Secrets Manager, and Kubernetes Sealed Secrets.
    • Vault provides dynamic secrets, encryption as a service, and access control
    • AWS Secrets Manager rotates credentials automatically on a configured schedule
    • Sealed Secrets encrypt Kubernetes secrets safely for storage in Git repositories
    • Never store secrets in code, environment files, or container images
  • Chaos Engineering (optional) — Practice chaos engineering with tools like Chaos Monkey and Litmus, run game days, and understand blast radius management.
    • Chaos engineering proactively injects failures to discover system weaknesses
    • Start with small blast radius experiments in non-production environments
    • Game days are scheduled events where teams practice incident response with controlled chaos
    • Litmus provides Kubernetes-native chaos experiments with CRD-based workflows
  • Cost Optimization (optional) — Optimize cloud spending using FinOps practices, reserved and spot instances, and right-sizing resources to match actual workloads.
    • FinOps brings financial accountability to cloud spending through cross-team collaboration
    • Reserved instances save 30-60% over on-demand pricing for predictable workloads
    • Spot instances offer up to 90% savings for fault-tolerant and flexible workloads
    • Right-sizing matches instance types to actual resource utilization to eliminate waste
  • Platform Engineering (optional) — Build internal developer platforms using tools like Backstage to provide self-service infrastructure and improve developer experience.
    • Platform engineering builds golden paths that simplify infrastructure for developers
    • Backstage provides a unified developer portal with service catalogs and templates
    • Self-service platforms reduce time-to-deploy and dependency on platform teams
    • Internal developer platforms standardize tooling while allowing flexibility for teams