ML//TinyML

ML inference on microcontrollers (MCUs) and ultra-low-power devices.


Models measured in kilobytes, running on Cortex-M cores with sub-milliwatt budgets.

Frameworks: TensorFlow Lite Micro, Edge Impulse.

Use cases: keyword spotting, anomaly detection, gesture recognition.

Bypasses cloud latency and connectivity requirements by keeping data and inference entirely on-device.

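Quantization: converting 32-bit float weights to 8-bit integers shrinks a model about 4× (e.g. 400 KB ⟶ 100 KB). Getting from a 100 MB original down to a 100 KB model also requires a smaller architecture, pruning, or distillation.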
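The arithmetic behind 8-bit quantization can be sketched in a few lines. This is a minimal illustration of affine (scale plus zero-point) quantization, not any framework's actual converter. The function names and the 1M-parameter example are assumptions made for illustration.

```python
def quantize_int8(weights):
    """Map floats to int8 with an affine scheme: w ≈ (q - zero_point) * scale."""
    # The range is widened to include 0.0 so that zero is representable.
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    scale = (w_max - w_min) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(-128 - w_min / scale)
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return [(v - zero_point) * scale for v in q]


def model_size_bytes(num_params, bits_per_param):
    """Storage needed for the weights alone (ignores metadata and activations)."""
    return num_params * bits_per_param // 8


if __name__ == "__main__":
    w = [-1.0, -0.5, 0.0, 0.25, 1.0]
    q, scale, zp = quantize_int8(w)
    print(q, dequantize(q, scale, zp))
    # Hypothetical 1M-parameter model: float32 vs int8 storage.
    print(model_size_bytes(1_000_000, 32), model_size_bytes(1_000_000, 8))
```

The size ratio between the two formats is fixed at 4:1, so quantization alone cannot account for a thousandfold reduction.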

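C++ application takes sensor input ⟶ TensorFlow Lite Micro interpreter ⟶ weights read from a constant array compiled into the firmware (e.g. const float model_weights[]).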
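One common way to get a model into firmware is to dump the serialized model file as a C array, similar to what xxd -i produces. The sketch below assumes the model is available as raw bytes. The function name to_c_array, the array name g_model, and the 16-byte alignment are illustrative choices, not taken from these notes.

```python
def to_c_array(data: bytes, name: str = "g_model", per_line: int = 12) -> str:
    """Render raw model bytes as C/C++ source for a constant array."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("    " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    # 'const' lets the linker place the array in flash instead of RAM.
    return (
        f"alignas(16) const unsigned char {name}[] = {{\n{body}\n}};\n"
        f"const unsigned int {name}_len = {len(data)};\n"
    )


if __name__ == "__main__":
    # Placeholder bytes standing in for a real model file.
    print(to_c_array(bytes([0x1C, 0x00, 0x00, 0x00]), name="demo_model"))
```

The generated source is compiled with the C++ application, which hands the array to the interpreter at startup.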

Deployment tiers

@ sensor (ASIC/MCU): pattern filtering (Decision Trees, CNNs); limited to the manufacturer's built-in logic

@ MCU (NPU): multiple sensor inputs; NNs

@ MPU (microprocessor): heavy NNs; classification and prediction

ML ⟶ sensor: always listening; suited to brief events (e.g. vibration)

ML ⟶ MCU (NPU): required when the event is continuous (e.g. sound); the NPU runs always-on at low power and wakes the MCU only when needed.

ML ⟶ MPU (NPU): offloads inference so the CPU is not overloaded (e.g. AirPods noise cancellation, Face ID, "Alexa")
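The wake-only-if-needed idea can be sketched as a two-stage cascade: a cheap always-on check gates a more expensive classifier. The energy threshold, the function names, and the stand-in classifier below are all hypothetical; real devices implement the first stage in hardware or on the NPU.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples (the cheap first stage)."""
    return sum(s * s for s in frame) / len(frame)


def run_cascade(frames, wake_threshold, heavy_classifier):
    """Call heavy_classifier only for frames whose energy reaches wake_threshold."""
    results = []
    wakeups = 0
    for frame in frames:
        if frame_energy(frame) < wake_threshold:
            results.append(None)  # higher tier stays asleep for this frame
            continue
        wakeups += 1
        results.append(heavy_classifier(frame))
    return results, wakeups


if __name__ == "__main__":
    frames = [[0.0, 0.01], [0.9, -0.8], [0.0, 0.0]]
    # The lambda stands in for a neural network running on the higher tier.
    print(run_cascade(frames, 0.1, lambda f: "event"))
```

The power saving comes from the ratio of wakeups to total frames, which is why the first stage should be as cheap as possible.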

"Alexa" pipeline: DSP detects keyword (3 s audio buffer in MCU RAM) ⟶ audio ⟶ Wi-Fi ⟶ internet ⟶ cloud GPU, e.g. H100 (inference)
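The 3-second buffer can be modeled as a ring buffer that always holds the most recent audio. The sketch below assumes 16 kHz, 16-bit mono audio, which the notes do not state, and the class name AudioRingBuffer is illustrative.

```python
from collections import deque

SAMPLE_RATE_HZ = 16_000  # assumption: typical speech sample rate
BYTES_PER_SAMPLE = 2     # assumption: 16-bit PCM
BUFFER_SECONDS = 3       # from the notes


def buffer_bytes(seconds=BUFFER_SECONDS, rate=SAMPLE_RATE_HZ,
                 bytes_per_sample=BYTES_PER_SAMPLE):
    """RAM needed to hold `seconds` of mono audio."""
    return seconds * rate * bytes_per_sample


class AudioRingBuffer:
    """Keeps only the most recent `seconds` of samples; older ones are dropped."""

    def __init__(self, seconds=BUFFER_SECONDS, rate=SAMPLE_RATE_HZ):
        self._samples = deque(maxlen=seconds * rate)

    def push(self, samples):
        self._samples.extend(samples)

    def snapshot(self):
        """Copy of the buffered audio, oldest sample first, ready to send on."""
        return list(self._samples)


if __name__ == "__main__":
    print(buffer_bytes())  # bytes of RAM under the assumptions above
    rb = AudioRingBuffer(seconds=1, rate=4)
    rb.push(range(6))
    print(rb.snapshot())
```

Under these assumptions the buffer takes 96,000 bytes, a significant share of RAM on a typical MCU.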