OTA staged deployment — canary 5% → Stage 2 25% → full fleet, with automatic rollback on failure
Shipping firmware to 10,000 devices simultaneously is a recipe for a bricked fleet. Staged OTA rollouts with automatic rollback are not optional — they are the difference between a controlled release and a disaster.
    # Name, Type, SubType, Offset, Size
    ota_0, app, ota_0, 0x10000, 1500K
    ota_1, app, ota_1, , 1500K

    // After download + verify:
    esp_ota_set_boot_partition(update_partition);
    esp_restart(); // bootloader validates, rolls back if invalid
az iot hub configuration create --config-id "fw-2-1-0-canary" --target-condition "tags.stage='canary'" --priority 10
Full firmware images for ESP32-S3 applications typically run 1.2–2MB. Over a cellular connection at €0.05/MB, updating 5,000 devices monthly costs €300–€500 per release. Delta (binary diff) updates reduce this by 60–80% by transmitting only the bytes that changed between firmware versions. Tools like JojoDiff or bsdiff generate compact binary patches; the device applies them in-memory and verifies the result before committing.
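The cost figures can be checked with a few lines of arithmetic. The sketch below is only a calculator under the numbers quoted in this article (5,000 devices, 1.2–2MB images, €0.05/MB, 60–80% delta savings); the function name and parameters are illustrative, not part of any tool.

```python
def release_cost_eur(devices, image_mb, eur_per_mb, delta_saving=0.0):
    """Data cost of one release in euros.

    delta_saving is the fraction of bytes a delta update avoids sending
    (0.0 means a full image, 0.8 means 80% fewer bytes).
    """
    return devices * image_mb * eur_per_mb * (1.0 - delta_saving)


# Full-image releases: smallest and largest image size from the article.
full_low = release_cost_eur(5000, 1.2, 0.05)
full_high = release_cost_eur(5000, 2.0, 0.05)

# Delta releases: best case (small image, 80% saved) and
# worst case (large image, 60% saved).
delta_best = release_cost_eur(5000, 1.2, 0.05, delta_saving=0.8)
delta_worst = release_cost_eur(5000, 2.0, 0.05, delta_saving=0.6)

print(f"full image: EUR {full_low:.0f} to {full_high:.0f} per release")
print(f"delta:      EUR {delta_best:.0f} to {delta_worst:.0f} per release")
```

Under these assumptions a full-image release costs roughly €300–€500 and a delta release roughly €60–€200.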
Delta updates require more complex firmware update logic and careful management of base version compatibility — you need to know exactly which firmware version a device is running before sending a patch. Azure IoT Hub Device Twins make this straightforward: the device reports its exact firmware hash in the “reported” section, and the cloud selects the appropriate delta package for each device.
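A minimal sketch of that cloud-side selection step follows. It assumes the device reports `firmware.version` and `firmware.hash` in its twin's reported properties; that schema, the `DeltaPackage` type, the `select_package` helper, and the example URLs are all hypothetical. Falling back to the full image when no patch matches the reported base is a design choice of this sketch, not something Azure IoT Hub does for you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeltaPackage:
    base_hash: str       # hash of the firmware the patch applies on top of
    target_version: str  # version the device ends up on after patching
    url: str             # where the device downloads the patch (hypothetical)


def select_package(reported, deltas, target_version, full_image_url):
    """Pick the download for one device from its reported twin properties.

    Returns None if the device is already on the target version, the URL of
    a matching delta if one exists for the reported hash, and otherwise the
    full image URL as a safe fallback.
    """
    fw = reported.get("firmware", {})
    if fw.get("version") == target_version:
        return None
    current_hash = fw.get("hash")
    for pkg in deltas:
        if pkg.target_version == target_version and pkg.base_hash == current_hash:
            return pkg.url
    return full_image_url


FULL_URL = "https://example.com/fw/2.1.0/full.bin"
deltas = [
    DeltaPackage("abc123", "2.1.0", "https://example.com/fw/2.1.0/from-abc123.patch"),
]

known_base = {"firmware": {"version": "2.0.0", "hash": "abc123"}}
unknown_base = {"firmware": {"version": "1.9.0", "hash": "ffff00"}}

print(select_package(known_base, deltas, "2.1.0", FULL_URL))
print(select_package(unknown_base, deltas, "2.1.0", FULL_URL))
```

Matching on the exact hash rather than the version string guards against devices that carry a locally modified or partially patched image under the same version number.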
A robust rollback system requires more than just dual partitions. The device must validate the new firmware before committing to it as the boot partition, and it must have a safe mechanism to report failure even if the new firmware cannot reach the cloud. The standard pattern:
After installing new firmware, the bootloader marks the new partition as “pending verification” and boots it. The application has a grace period (typically 5 minutes) to complete a self-test, successfully connect to the cloud, and call esp_ota_mark_app_valid_cancel_rollback(). If this call is never made — because the application crashed, hung, or lost connectivity — the bootloader rolls back to the previous partition on the next boot.
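The pattern above can be modelled on a host machine as a small state machine. This is a simplified simulation, not ESP-IDF code: the state names mirror the image states ESP-IDF uses (new, pending verify, valid, aborted), but the `Device` class and its methods are hypothetical, and the grace-period timer is left out (a device that never confirms is assumed to reset eventually, for example through a watchdog).

```python
from enum import Enum


class ImgState(Enum):
    NEW = "new"                        # just installed, never booted
    PENDING_VERIFY = "pending_verify"  # booted once, not yet confirmed
    VALID = "valid"                    # application confirmed it works
    ABORTED = "aborted"                # never confirmed, rolled back


class Device:
    def __init__(self):
        self.slots = {"ota_0": ImgState.VALID}
        self.boot_slot = "ota_0"
        self.previous_slot = None

    def install(self, slot):
        """Write new firmware to a slot and select it for the next boot."""
        self.slots[slot] = ImgState.NEW
        self.previous_slot = self.boot_slot
        self.boot_slot = slot

    def boot(self):
        """Bootloader decision at reset; returns the slot that runs."""
        state = self.slots[self.boot_slot]
        if state is ImgState.NEW:
            # First boot of the new image: give it one chance to confirm.
            self.slots[self.boot_slot] = ImgState.PENDING_VERIFY
        elif state is ImgState.PENDING_VERIFY:
            # The previous run never confirmed: abort and roll back.
            self.slots[self.boot_slot] = ImgState.ABORTED
            self.boot_slot = self.previous_slot
        return self.boot_slot

    def mark_valid(self):
        """Stands in for esp_ota_mark_app_valid_cancel_rollback()."""
        if self.slots[self.boot_slot] is ImgState.PENDING_VERIFY:
            self.slots[self.boot_slot] = ImgState.VALID


# Healthy update: self-test passes, so the new slot survives a reboot.
good = Device()
good.install("ota_1")
good.boot()
good.mark_valid()
print("healthy update boots:", good.boot())

# Broken update: the application never confirms, so the next boot rolls back.
bad = Device()
bad.install("ota_1")
bad.boot()
print("broken update boots:", bad.boot())
```

The key property is that confirmation is opt-in: doing nothing is treated as failure, so a crash, hang, or lost connection all lead back to the previous image.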
At 10,000+ devices, firmware deployments need automation. We use Azure IoT Hub automatic device management configurations to target devices by tag: geography, hardware revision, customer tier. A typical release plan runs for 7 days: Day 1 canary (1%), Day 3 Stage 1 (10%), Day 5 Stage 2 (50%), Day 7 full fleet. Automated alerts trigger if error rates in any stage exceed 0.5% — the release pauses and engineering is notified before the problem scales.
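The gating logic behind such a plan can be sketched in a few lines. The stage table below copies the 7-day schedule and the 0.5% threshold from this article; the function names, the stage labels, and the idea of feeding in `updated` and `failed` counts from your own telemetry are assumptions of this sketch, not an Azure IoT Hub API.

```python
# (day, stage name, percent of fleet targeted) from the release plan above.
STAGES = [
    (1, "canary", 1),
    (3, "stage-1", 10),
    (5, "stage-2", 50),
    (7, "full", 100),
]
ERROR_THRESHOLD = 0.005  # pause if more than 0.5% of updated devices fail


def target_count(fleet_size, stage_index):
    """Number of devices the given stage should cover."""
    percent = STAGES[stage_index][2]
    return fleet_size * percent // 100


def next_action(stage_index, updated, failed):
    """Decide what the rollout should do after observing a stage.

    updated: devices in this stage that attempted the update
    failed:  devices among those that reported an error or rolled back
    """
    if updated == 0:
        return "wait"      # no data yet, do not promote blindly
    if failed / updated > ERROR_THRESHOLD:
        return "pause"     # stop and notify engineering
    if stage_index == len(STAGES) - 1:
        return "done"
    return "promote"


print(target_count(10_000, 0), "devices in canary")
print(next_action(0, updated=100, failed=0))
print(next_action(0, updated=100, failed=2))
```

With a 1% canary on a 10,000-device fleet, a single failure out of 100 devices already exceeds 0.5%, so small canary stages make the threshold very sensitive; a larger canary or an absolute minimum failure count may be worth considering.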
Build your OTA infrastructure as a first-class concern, not an afterthought. The devices you ship today will need firmware updates for 5–10 years. Retrofitting a secure, reliable OTA system onto an existing product is significantly harder than designing it in from the start.
FSS is a full-stack IoT engineering team — hardware, firmware, cloud, and mobile in one place.
FSS Technology designs and builds IoT products from silicon to cloud — embedded firmware, custom hardware, and Azure backends.