ML//TinyML
ML inference on microcontrollers (MCU) and ultra-low-power devices.
Models measured in kilobytes, running on Cortex-M cores with sub-milliwatt budgets.
Frameworks: TensorFlow Lite Micro, Edge Impulse.
Use cases: keyword spotting, anomaly detection, gesture recognition.
Bypasses cloud latency and connectivity requirements by keeping data and inference entirely on-device.
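Quantization: 32-bit float ⟶ 8-bit int shrinks weights ~4× (e.g. 400 KB ⟶ 100 KB). Getting from a 100 MB original down to 100 KB also needs a smaller architecture, pruning, or distillation.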
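The arithmetic behind 8-bit quantization can be sketched as below. This is a minimal illustration of affine int8 quantization, not the code of any particular framework; the names `QuantParams`, `ChooseQuantParams`, `Quantize`, `Dequantize`, and `WeightBytes` are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Affine int8 quantization: real_value = scale * (q - zero_point).
struct QuantParams {
    float scale;
    int zero_point;
};

// Map the range [min_val, max_val] onto [-128, 127].
// The range is widened to include 0 so that real 0.0 is exactly representable.
QuantParams ChooseQuantParams(float min_val, float max_val) {
    min_val = std::min(min_val, 0.0f);
    max_val = std::max(max_val, 0.0f);
    float scale = (max_val - min_val) / 255.0f;
    if (scale == 0.0f) scale = 1.0f;  // all-zero tensor: any scale works
    int zero_point = static_cast<int>(std::lround(-128.0f - min_val / scale));
    zero_point = std::min(std::max(zero_point, -128), 127);
    QuantParams p = {scale, zero_point};
    return p;
}

// Round to the nearest step and saturate to the int8 range.
std::int8_t Quantize(float x, const QuantParams& p) {
    int q = static_cast<int>(std::lround(x / p.scale)) + p.zero_point;
    q = std::min(std::max(q, -128), 127);
    return static_cast<std::int8_t>(q);
}

float Dequantize(std::int8_t q, const QuantParams& p) {
    return p.scale * static_cast<float>(static_cast<int>(q) - p.zero_point);
}

// Storage for the weights alone, ignoring metadata and activation buffers.
std::size_t WeightBytes(std::size_t num_params, std::size_t bits_per_weight) {
    return num_params * bits_per_weight / 8;
}
```

For a hypothetical 100,000-parameter model, `WeightBytes(100000, 32)` is 400,000 bytes and `WeightBytes(100000, 8)` is 100,000 bytes: the 4× saving comes purely from the narrower weight type.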
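C++ firmware takes sensor input ⟶ TensorFlow Lite Micro interpreter ⟶ model weights compiled into the firmware as a const array (e.g. const float model_weights[]).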
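The idea of weights baked into the firmware as a const array can be sketched without any framework. In TensorFlow Lite Micro the array would hold the serialized model rather than bare weights, and an interpreter would replace the hand-written loop. The layer shape, the weight values, and the names `kModelWeights`, `RunDenseLayer`, and `Detect` below are all made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical int8 weights for one dense layer with 4 inputs and 1 output.
// Constant data at namespace scope is typically placed in read-only memory
// (flash on most MCUs), so it does not occupy RAM.
constexpr std::size_t kNumInputs = 4;
constexpr std::int8_t kModelWeights[kNumInputs] = {12, -7, 3, 25};
constexpr std::int32_t kBias = -40;

// Integer-only inference: int8 products accumulated in int32.
std::int32_t RunDenseLayer(const std::int8_t (&input)[kNumInputs]) {
    std::int32_t acc = kBias;
    for (std::size_t i = 0; i < kNumInputs; ++i) {
        acc += static_cast<std::int32_t>(kModelWeights[i]) * input[i];
    }
    return acc;
}

// Positive score means "event detected" in this toy model.
bool Detect(const std::int8_t (&input)[kNumInputs]) {
    return RunDenseLayer(input) > 0;
}
```

With input `{2, 0, 0, 1}` the accumulator is 24 + 25 - 40 = 9, so `Detect` returns true; with `{1, 1, 1, 1}` it is -7 and `Detect` returns false.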
Deployment tiers
@ sensor (ASIC/MCU): pattern filtering (decision trees, CNNs); limited to the manufacturer's built-in logic
@ MCU (NPU): multiple sensor inputs; NNs
@ MPU: heavy NN; classify and predict
ML ⟶ sensor: always listening; suits brief events (e.g. vibration)
ML ⟶ MCU (NPU): required when the event is continuous (e.g. sound); the NPU stays always-on and wakes the MCU core only if needed.
ML ⟶ MPU (NPU): to avoid overloading the CPU (e.g. AirPods noise cancellation, Face ID, "Alexa")
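The "wake only if needed" pattern can be sketched as a two-stage cascade: a cheap always-on check decides whether the expensive stage runs at all. The energy measure, the threshold value, and the names `ShouldWake` and `Cascade` are assumptions for illustration; a real NPU or sensor would expose this through vendor-specific hardware, not this code.

```cpp
#include <cstddef>
#include <cstdint>

// Stage 1: cheap always-on check, here the mean absolute amplitude of a frame.
// The threshold is a hypothetical value that would be tuned per sensor.
bool ShouldWake(const std::int16_t* frame, std::size_t n, std::int32_t threshold) {
    if (n == 0) return false;
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += frame[i] < 0 ? -static_cast<std::int64_t>(frame[i]) : frame[i];
    }
    return sum / static_cast<std::int64_t>(n) > threshold;
}

// Stage 2 runs only when stage 1 fires.
struct Cascade {
    std::int32_t threshold;
    std::size_t heavy_runs;  // how many times the expensive stage ran

    explicit Cascade(std::int32_t t) : threshold(t), heavy_runs(0) {}

    // Returns true if the (stand-in) heavy stage was invoked for this frame.
    bool Process(const std::int16_t* frame, std::size_t n) {
        if (!ShouldWake(frame, n, threshold)) return false;  // main core stays asleep
        ++heavy_runs;  // placeholder for waking the core and running the real model
        return true;
    }
};
```

A quiet frame leaves `heavy_runs` unchanged, so the power cost of the larger model is paid only on frames that pass the cheap check.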
"Alexa" pipeline: DSP detects keyword on-device (3 s audio buffer in MCU RAM) ⟶ buffered audio ⟶ Wi-Fi ⟶ internet ⟶ cloud accelerator, e.g. H100 (full inference)
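The audio buffer in this pipeline behaves like a ring buffer that always holds the most recent few seconds of sound, so the audio around the keyword can be sent once the detector fires. A minimal sketch follows, assuming mono 16-bit samples at 16 kHz (the sample format is an assumption, not stated in these notes); `AudioRing` and `PreRollBytes` are hypothetical names. `std::vector` keeps the sketch short; firmware would more likely use a statically allocated array.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed-capacity ring buffer that always holds the most recent samples.
class AudioRing {
public:
    explicit AudioRing(std::size_t capacity) : buf_(capacity), head_(0), size_(0) {}

    // Overwrites the oldest sample once the buffer is full.
    void Push(std::int16_t sample) {
        if (buf_.empty()) return;
        buf_[head_] = sample;
        head_ = (head_ + 1) % buf_.size();
        if (size_ < buf_.size()) ++size_;
    }

    // Copies out the buffered samples, oldest first, e.g. to hand to the radio.
    std::vector<std::int16_t> Snapshot() const {
        std::vector<std::int16_t> out;
        if (buf_.empty()) return out;
        out.reserve(size_);
        const std::size_t start = (head_ + buf_.size() - size_) % buf_.size();
        for (std::size_t i = 0; i < size_; ++i) {
            out.push_back(buf_[(start + i) % buf_.size()]);
        }
        return out;
    }

    std::size_t size() const { return size_; }

private:
    std::vector<std::int16_t> buf_;
    std::size_t head_;  // index where the next sample will be written
    std::size_t size_;  // number of valid samples currently stored
};

// RAM needed for `seconds` of mono 16-bit audio at `sample_rate_hz`.
constexpr std::size_t PreRollBytes(std::size_t sample_rate_hz, std::size_t seconds) {
    return sample_rate_hz * seconds * sizeof(std::int16_t);
}
```

Under these assumptions, `PreRollBytes(16000, 3)` is 96,000 bytes, which shows why a 3 s buffer is a significant share of RAM on a small MCU.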