CI/CD for Embedded Firmware: GitHub Actions + PlatformIO Pipeline for ESP32

📅 April 2026 ⏳ 10 min read FSS Engineering Team

Firmware engineers have spent two decades watching their cloud colleagues enjoy luxuries we did not have – reproducible builds, automated tests, push-button releases. The excuses for that gap have evaporated. PlatformIO gives us deterministic builds across toolchains, GitHub Actions gives us free compute and a sane workflow language, and modern ESP32 and STM32 ecosystems have unit-test frameworks that actually work. This article describes the pipeline we deploy on every connected-devices project at FSS, with YAML you can lift directly. No demoware. No shortcuts that only work on the original author’s laptop.

Why Firmware Needs CI/CD More, Not Less

The case for firmware CI/CD is stronger than for cloud code, not weaker. Three reasons. First, fixes for firmware bugs are expensive to deploy, so catching them before tag-and-release is worth disproportionately more. Second, firmware ships across a matrix of board revisions, sensor variants, and customer SKUs that explodes combinatorially – no human can reliably build all of them by hand. Third, firmware integrates with hardware that can be damaged by bad code, so the cost of a regression is measured in returned units, not in a quick rollback.

The pipeline we describe below has caught issues that would have shipped to production: a partition table change that bricked OTA on one board variant but not another, a stack overflow that only manifested when an unrelated feature flag was enabled, a TLS certificate expiry that nobody noticed because the device worked fine until 2025-03-14. None of these were caught by humans reading diffs. All were caught by automation. The return on a week of pipeline work is measured in months of saved firefighting, and the compounding effect on engineering velocity is hard to overstate once the team learns to trust green checks.

Project Structure

PlatformIO projects scale cleanly when you commit to a few conventions early. Our reference layout looks like this across every customer engagement, with minor variations for organizations that have strong opinions about monorepos:

firmware/
  platformio.ini          # environments, one per board variant
  src/                    # application code, board-agnostic
  lib/                    # internal libraries, versioned
  include/                # public headers
  test/                   # Unity tests, native and embedded
    test_native/          # runs on host
    test_embedded/        # runs on hardware
  hil/                    # hardware-in-the-loop scripts
  scripts/                # build helpers, signing, packaging
  partitions/             # custom partition tables per variant
  certs/                  # CA bundles, never device keys

The platformio.ini declares one environment per shipping variant. We use a shared [env] base section, which every variant inherits automatically, to keep the file readable and avoid drift between variants:

[env]
framework = espidf
monitor_speed = 115200
build_flags = -Wall -Wextra -Werror
test_framework = unity

[env:esp32_devkit]
platform = espressif32@6.5.0
board = esp32dev

[env:esp32s3_v2]
platform = espressif32@6.5.0
board = esp32-s3-devkitc-1
board_build.partitions = partitions/v2_ota.csv

[env:stm32_l4_industrial]
platform = ststm32@17.3.0
board = nucleo_l476rg
framework = stm32cube

Pinning platform versions is not optional. A floating platform = espressif32 means your build is reproducible until Espressif releases a point version that breaks something subtle. Pin, then upgrade deliberately as a separate PR. We treat toolchain upgrades as their own change set with their own HIL validation.

The GitHub Actions Workflow

Our top-level workflow has four jobs that run in parallel where possible: lint, build matrix, native tests, and embedded tests. A fifth job – sign and release – runs only on tags. Every job caches the PlatformIO toolchain aggressively because cold installs eat several minutes per run.

name: firmware-ci
on:
  push:
    branches: [main]
    tags: ['v*']
  pull_request:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: '3.11' }
      - run: pip install platformio cpplint
      - run: pio check --skip-packages --fail-on-defect=high
      - run: cpplint --recursive src/ lib/ include/

  build:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        env: [esp32_devkit, esp32s3_v2, stm32_l4_industrial]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: '3.11' }
      - uses: actions/cache@v4
        with:
          path: |
            ~/.platformio/.cache
            ~/.platformio/packages
          key: pio-${{ matrix.env }}-${{ hashFiles('platformio.ini') }}
      - run: pip install platformio
      - run: pio run -e ${{ matrix.env }}
      - uses: actions/upload-artifact@v4
        with:
          name: firmware-${{ matrix.env }}
          path: .pio/build/${{ matrix.env }}/firmware.bin
          retention-days: 30

Two details worth highlighting. The cache key includes the hash of platformio.ini, so a platform version bump invalidates the cache automatically. And fail-fast: false means a failure on STM32 does not cancel the ESP32 builds – you get the full picture of what broke instead of having to rerun each board in sequence.

Unit Tests with Unity

Unity is the right choice for embedded C; add Ceedling on top of it if you want mocking and a more opinionated layout. The pattern that pays off most is splitting tests into test_native and test_embedded. Native tests run on the CI runner against host-compiled code with hardware peripherals stubbed. Embedded tests flash an actual board.

// test/test_native/test_telemetry.c
#include <unity.h>
#include "telemetry.h"

void setUp(void) {}
void tearDown(void) {}

void test_telemetry_packs_temperature_correctly(void) {
    uint8_t buf[32];
    size_t len = telemetry_pack(buf, sizeof(buf), 23.4f, 1024);
    TEST_ASSERT_EQUAL(12, len);
    TEST_ASSERT_EQUAL_HEX8(0xA1, buf[0]);
}

int main(void) {
    UNITY_BEGIN();
    RUN_TEST(test_telemetry_packs_temperature_correctly);
    return UNITY_END();
}

Aim for native tests to cover protocol parsers, state machines, telemetry packing, command validation, and anything pure-functional. Anything that touches a peripheral, an RTOS primitive, or real timing belongs in embedded tests. If you are still warming up to RTOS-aware testing, our walkthrough of FreeRTOS for IoT covers patterns that survive in CI.
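
Stubbing is what makes the native bucket work. A minimal sketch of a host-side stub, with a hypothetical sensor_read_temp() driver function (the names here are ours, not from any real driver):

// test/test_native/stubs/sensor_stub.c
// Host-side stand-in for the real I2C sensor driver, so telemetry
// code links and runs on the CI runner without hardware.

// Prototype normally supplied by the real driver's header.
int sensor_read_temp(float *out_celsius);

static float stub_temp_c = 23.4f;

// Tests call this to script the "sensor" before exercising telemetry.
void sensor_stub_set_temp(float celsius) { stub_temp_c = celsius; }

int sensor_read_temp(float *out_celsius) {
    *out_celsius = stub_temp_c;
    return 0;  // 0 = success, matching the real driver's contract
}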

Hardware-in-the-Loop Testing

HIL is where most firmware teams stop, and it is the highest-leverage step. The setup: a small fleet of reference boards wired into a self-hosted runner, each with a known-good peripheral (real sensors, real radios, sometimes a programmable load). The CI job flashes the build, drives a test scenario, and asserts on serial output, MQTT messages, or measured voltages.

  hil:
    needs: build
    runs-on: [self-hosted, hil-bench]
    # run on PRs, main, and tags, so sign-and-release (needs: hil) is not skipped
    if: github.event_name == 'pull_request' || github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/v')
    steps:
      - uses: actions/checkout@v4
      - uses: actions/download-artifact@v4
        with: { name: firmware-esp32s3_v2, path: ./fw }
      - run: ./scripts/hil_flash.sh ./fw/firmware.bin /dev/ttyUSB0
      - run: pytest hil/ --board=esp32s3_v2 --junitxml=hil-results.xml
      - uses: actions/upload-artifact@v4
        if: always()
        with: { name: hil-results, path: hil-results.xml }

Self-hosted runners on a Raspberry Pi or NUC are inexpensive and reliable if you discipline the bench: each board on a switched USB hub so CI can power-cycle, no humans touching cables during work hours, and a smoke test that runs every hour to confirm the bench itself is healthy. We treat HIL as part of QA and testing rather than as developer tooling, with the same SLAs.
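
The device side can meet the bench halfway by printing one structured line at boot, so the pytest assertions reduce to parsing a single string. A sketch, with a banner format and function name of our own invention:

// Hypothetical boot banner the HIL bench watches for on the serial port.
#include <stdio.h>
#include <stdbool.h>

#ifndef FW_VERSION
#define FW_VERSION "v0.0.0-dev"  // real builds inject this from the git tag
#endif

void hil_announce_boot(const char *board, bool selftest_ok) {
    // e.g. "HIL-BOOT board=esp32s3_v2 fw=v1.4.2 selftest=PASS"
    printf("HIL-BOOT board=%s fw=%s selftest=%s\n",
           board, FW_VERSION, selftest_ok ? "PASS" : "FAIL");
}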

Semantic Versioning for Firmware

SemVer applies to firmware with one twist: the public surface is not just the API but also the over-the-air upgrade compatibility. Our convention: bump the major version for anything that breaks OTA compatibility (partition layout, image format, update protocol), the minor version for backward-compatible features, and the patch version for fixes.

The version is baked into the binary at build time from the git tag, exposed in telemetry, and checked by the OTA service before delivery. A device on 1.x will not accept a 2.0 build unless the cloud explicitly opts it in. We enforce this at the manifest layer so a misconfigured rollout cannot accidentally brick a subset of the fleet.
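
The compatibility gate lives in the manifest, but a defensive device-side check is cheap. A sketch, assuming the version string is injected at build time via a -DFW_VERSION build flag (ota_version_compatible() is our hypothetical helper, not part of any SDK):

// Injected by the build, e.g. build_flags = -DFW_VERSION=\"v1.4.2\"
#ifndef FW_VERSION
#define FW_VERSION "v0.0.0-dev"
#endif

#include <stdio.h>
#include <stdbool.h>

// Extract the major component from "v1.4.2" or "1.4.2".
static int version_major(const char *v) {
    int major = 0;
    if (*v == 'v') v++;
    sscanf(v, "%d", &major);
    return major;
}

// Refuse a cross-major OTA image unless the cloud explicitly opted in.
bool ota_version_compatible(const char *offered, bool cloud_opt_in) {
    if (version_major(offered) != version_major(FW_VERSION))
        return cloud_opt_in;
    return true;
}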

Signing Binaries

Unsigned firmware is a security incident waiting to happen. ESP32 and modern STM32 both support secure boot with signed images; if you have not enabled it, that is the highest-ROI security hardening you can do, and our reference on secure boot covers the chain of trust in detail.

The signing job lives behind a manual approval and uses GitHub’s environment protection rules. The private signing key never touches the repository – it lives in Azure Key Vault or AWS KMS, and the job authenticates via OIDC so no long-lived cloud credentials sit in GitHub secrets.

  sign-and-release:
    needs: [build, hil]
    if: startsWith(github.ref, 'refs/tags/v')
    runs-on: ubuntu-latest
    environment: production-signing
    permissions:
      id-token: write
      contents: write
    steps:
      - uses: actions/download-artifact@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - run: ./scripts/sign_with_kv.sh firmware-*/firmware.bin
      - uses: softprops/action-gh-release@v2
        with:
          files: |
            firmware-*/firmware.bin
            firmware-*/firmware.bin.sig
            manifest.json

OTA Delivery Integration

The release job uploads to GitHub, but the canonical OTA artifact lives in cloud storage with a manifest that the device firmware fetches. Our manifest schema includes the version, the SHA-256, the signature, the rollout policy (canary percentage, allowed device groups), and the minimum compatible cloud schema version. The device-side updater verifies the signature before writing to the inactive OTA partition, then validates with a smoke test on first boot before marking the partition as bootable.
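
The validate-then-mark-bootable step maps directly onto ESP-IDF's app rollback API. A sketch of the first-boot check, assuming rollback is enabled in menuconfig; the esp_ota_* calls are real ESP-IDF APIs, while run_smoke_test() is our hypothetical health check:

// Called early in app_main() on every boot.
#include <stdbool.h>
#include "esp_ota_ops.h"

extern bool run_smoke_test(void);  // e.g. radio up, sensors respond, cloud reachable

void ota_first_boot_check(void) {
    const esp_partition_t *running = esp_ota_get_running_partition();
    esp_ota_img_states_t state;
    if (esp_ota_get_state_partition(running, &state) == ESP_OK &&
        state == ESP_OTA_IMG_PENDING_VERIFY) {
        if (run_smoke_test()) {
            esp_ota_mark_app_valid_cancel_rollback();        // keep this image
        } else {
            esp_ota_mark_app_invalid_rollback_and_reboot();  // revert to the old one
        }
    }
}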

Rolling out at scale without bricking devices is its own discipline; our deeper writeup on OTA firmware updates at scale covers staged rollouts, A/B partitions, and rollback semantics. The CI pipeline’s job is just to produce a signed, manifested artifact and hand it off cleanly to the OTA service.

Secrets Management

Three categories of secrets touch a firmware pipeline: signing keys, cloud credentials for OTA upload, and per-device provisioning material. The first two belong in the platform secret store (GitHub Environments, with OIDC federation to Azure or AWS for credentials). The third should never be in CI at all – per-device keys are generated on the device or at the factory test station and never leave it.
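
For the on-device path, generation can happen on first boot so the key never exists anywhere else. A sketch using ESP-IDF v5 APIs (the NVS namespace and key names are our choice; production code would add NVS encryption or use the DS peripheral):

#include <stdint.h>
#include "esp_random.h"  // esp_fill_random(); lives in esp_system.h on IDF v4
#include "nvs_flash.h"
#include "nvs.h"

esp_err_t provision_device_key(void) {
    nvs_handle_t nvs;
    esp_err_t err = nvs_open("prov", NVS_READWRITE, &nvs);
    if (err != ESP_OK) return err;

    uint8_t key[32];
    size_t len = sizeof(key);
    if (nvs_get_blob(nvs, "dev_key", key, &len) == ESP_OK) {
        nvs_close(nvs);  // already provisioned, nothing to do
        return ESP_OK;
    }

    esp_fill_random(key, sizeof(key));  // hardware RNG
    err = nvs_set_blob(nvs, "dev_key", key, sizeof(key));
    if (err == ESP_OK) err = nvs_commit(nvs);
    nvs_close(nvs);
    return err;
}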

A common antipattern is baking a single shared key into all units “to simplify provisioning.” Do not. The cost of correct per-device provisioning is one extra step at factory test and a few hundred bytes of storage. The cost of doing it wrong is a fleet-wide compromise when one unit is reverse-engineered. We cover the right pattern in our IoT security best practices guide.

Runner Choices

GitHub-hosted runners are fine for lint, build, and native tests. They are free for public repos and inexpensive for private ones, and they are reliably stateless. Use them for everything that does not need hardware.

Self-hosted runners are mandatory for HIL and useful for big build matrices. Run them on dedicated hardware, not on developer machines. Isolate them in their own network segment. Auto-update the runner agent. And wipe the workspace between jobs – actions/checkout with clean: true is not enough; we run a custom cleanup that nukes .pio caches between runs to avoid “works on Tuesday” bugs.

What This Buys You

A firmware team running this pipeline ships with confidence. Pull requests show a green check that means something. Releases are reproducible from source. Field issues can be bisected against tagged builds. New engineers can build the firmware on day one. None of this is exotic – it is what cloud teams have considered table stakes for a decade. Bringing that discipline to firmware is the single biggest productivity multiplier we have seen on customer projects.

If your firmware team is still building releases on a single engineer’s laptop, that is your highest-leverage place to spend the next two weeks. We help product teams stand up pipelines like this as part of our DevOps practice, often alongside the broader connected devices engagement. The pipeline pays for itself the first time it catches a regression that would have shipped.

Building something connected?

FSS Technology designs and builds IoT products from silicon to cloud — embedded firmware, custom hardware, and Azure backends.

Talk to our team →