Design a Feature Flag Platform for Mobile Apps

Published:

๐ŸŽฏ Problem

Design a feature-flag system used by multiple iOS apps. Requirements:

  • Progressive rollout by percentage/cohort
  • Rule targeting by app version, region, user segment
  • Offline-safe behavior for app startup
  • Instant kill switch for critical incidents
  • Full audit trail for who changed what and when

๐Ÿงฑ High-Level Architecture

flowchart TD; Admin["Flag Admin UI"]-->FlagAPI["Flag Config API"]; FlagAPI-->FlagDB["Flag Store"]; FlagAPI-->Audit["Audit Log"]; Mobile["iOS App SDK"]-->CDN["Edge Config CDN"]; CDN-->Mobile; Mobile-->Exposure["Exposure Events"]; Exposure-->Analytics["Analytics Pipeline"];

๐Ÿ”‘ Core Design Decisions

1) Edge-read, control-plane-write

  • Control plane writes to DB and publishes config snapshots.
  • Mobile SDK reads from CDN/edge for low latency and high availability.

2) Local cache + bootstrap file

  • App ships with last-known-safe defaults.
  • SDK persists last fetched config.
  • If network fails, app still evaluates flags deterministically.

3) Deterministic bucketing

  • Use stable hashing on (user_id, flag_key) for rollout consistency.
  • Prevents users flipping in/out across sessions.

๐Ÿ“ฆ SDK Contract

protocol FeatureFlagClient {
    func bool(_ key: String, defaultValue: Bool) -> Bool
    func variant(_ key: String, defaultValue: String) -> String
    func refresh() async
}

Evaluation order:

  1. emergency override cache
  2. targeted rule match
  3. percentage rollout
  4. default value

๐Ÿšจ Failure Modes

  • Config service outage -> use cached config + defaults
  • Bad rule publish -> use rollback version immediately
  • Exposure event backlog -> buffer and batch upload

โœ… Interview Tradeoff Talking Points

  • Strong consistency is unnecessary for reads; eventual consistency with fast propagation is enough.
  • Keep evaluation client-side for latency, but keep rule authoring centralized.
  • Add schema validation for rule safety before publish.

๐Ÿงช What interviewer wants to hear

  • How you prevent โ€œflag driftโ€ between platforms
  • How kill switches bypass normal cache TTL
  • How to test targeting deterministically in CI