Products are becoming smarter in order to provide additional value to their users. To optimize their usage, smart objects need to be aware of their environment. Artificial intelligence can interpret data from various sensors, such as accelerometers or microphones, and make that data meaningful to humans. For example, we taught a neural network to distinguish scenes (indoor, outdoor, in-vehicle) so that equipment behavior can be optimized for the user's environment. After optimization with STM32Cube.AI, the AI model can run on an ultra-low-power microcontroller, which makes it possible to embed intelligence in small, battery-powered devices. This approach can easily be adapted to many other use cases or environments by retraining the AI model with new data.

Approach

The goal of Acoustic Scene Classification (ASC) is to classify the current environment into one of three predefined classes (indoor, outdoor, in-vehicle) based on the audio captured by a single digital microphone. The demo runs on the small-form-factor SensorTile board, which comes with a smartphone application connected through Bluetooth Low Energy.

We used the FP-AI-SENSING1 function pack to build this example, running on an STEVAL-STLKT01V1 board. The ASC configuration captures audio at 16 kHz (16-bit, 1 channel) using the on-board MEMS microphone. Every millisecond, a DMA interrupt delivers the latest 16 PCM audio samples. These samples are accumulated in a sliding window of 1024 samples with a 50% overlap. Every 512 samples (i.e., every 32 ms), the buffer is passed to the ASC preprocessing for feature extraction. The ASC preprocessing extracts audio features into a LogMel (30x32) spectrogram.
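
The buffering scheme described above can be sketched in a few lines of Python. This is only an illustration of the timing (16 samples per millisecond, one 1024-sample frame every 512 new samples), not the function pack's actual C code. The SlidingWindow class and its method names are hypothetical, and the zero-filled initial buffer is an assumption.

```python
SAMPLE_RATE_HZ = 16_000
DMA_CHUNK = 16          # samples delivered by each 1 ms DMA interrupt
WINDOW = 1024           # sliding-window length in samples
HOP = WINDOW // 2       # 50% overlap: 512 new samples per frame
FRAME_PERIOD_MS = 1000 * HOP // SAMPLE_RATE_HZ   # 512 / 16 kHz = 32 ms


class SlidingWindow:
    """Hypothetical helper: accumulates DMA chunks, emits overlapping frames."""

    def __init__(self, window=WINDOW, hop=HOP):
        self.hop = hop
        self.buffer = [0] * window   # assumed to start zero-filled
        self.pending = 0             # new samples since the last emitted frame

    def push(self, chunk):
        """Append one chunk; return a full frame every `hop` samples, else None."""
        self.buffer = self.buffer[len(chunk):] + list(chunk)
        self.pending += len(chunk)
        if self.pending >= self.hop:
            self.pending -= self.hop
            return list(self.buffer)
        return None


# Usage: simulate 64 ms of audio, using the sample index as the sample value.
sw = SlidingWindow()
frames = []
for ms in range(64):
    frame = sw.push(range(ms * DMA_CHUNK, (ms + 1) * DMA_CHUNK))
    if frame is not None:
        frames.append((ms + 1, frame))   # (elapsed milliseconds, frame)
```

In this simulation a frame becomes available after 32 ms and again after 64 ms, and the second frame shares its first 512 samples with the second half of the first frame, which is what the 50% overlap means in practice.
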
For computational efficiency and memory management, the ASC preprocessing step is divided into two routines:

  • The first routine computes one of the 32 spectrogram columns from the time-domain input signal, mapping it to the Mel scale using an FFT and a filter bank (30 mel bands).
  • The second routine runs once all 32 columns have been calculated (i.e., after 1024 ms): it applies log scaling to the mel-scaled spectrogram, creating the input feature for the ASC convolutional neural network.
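
The two routines can be illustrated with a small, pure-Python sketch. It is not the function pack's implementation: the Hann window, the HTK-style mel formula, the triangular filter shape, and the 10·log10 scaling are all assumptions chosen for illustration, and every function name here is hypothetical. Only the dimensions (1024-sample frames, 30 mel bands, 32 columns, 16 kHz) come from the description above.

```python
import cmath
import math

SAMPLE_RATE_HZ = 16_000
N_FFT = 1024      # frame length in samples
HOP = 512         # 50% overlap
N_MELS = 30       # mel bands (spectrogram rows)
N_COLUMNS = 32    # spectrogram columns per classification


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)   # HTK-style formula (assumed)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filter_bank():
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(SAMPLE_RATE_HZ / 2)
    edges = [mel_to_hz(top * i / (N_MELS + 1)) for i in range(N_MELS + 2)]
    bin_hz = [b * SAMPLE_RATE_HZ / N_FFT for b in range(N_FFT // 2 + 1)]
    bank = []
    for m in range(N_MELS):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        bank.append([max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
                     for f in bin_hz])
    return bank


def mel_column(frame, bank):
    """Routine 1: one column of 30 mel energies from a 1024-sample frame."""
    n = len(frame)
    # Hann window before the FFT (assumed; the source does not name a window).
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(frame)]
    power = [abs(c) ** 2 for c in fft(windowed)[: n // 2 + 1]]
    return [sum(w * p for w, p in zip(row, power)) for row in bank]


def log_scale(columns, eps=1e-10):
    """Routine 2: log-scale all columns into a 30x32 (bands x columns) array."""
    # 10*log10 (decibel) scaling is assumed; eps avoids log(0) on silent bands.
    return [[10.0 * math.log10(col[m] + eps) for col in columns]
            for m in range(N_MELS)]


# Usage: a 1 kHz test tone, framed with 50% overlap into 32 columns.
signal = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE_HZ)
          for i in range(HOP * (N_COLUMNS + 1))]
bank = mel_filter_bank()
columns = [mel_column(signal[c * HOP: c * HOP + N_FFT], bank)
           for c in range(N_COLUMNS)]
logmel = log_scale(columns)
peak_band = max(range(N_MELS), key=lambda m: logmel[m][0])
```

Splitting the work this way means only one FFT and one filter-bank pass are needed per 32 ms frame, while the log scaling is deferred until the full 30x32 array is available.
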


Every 1024 ms, the (30x32) LogMel spectrogram is fed to the input of the ASC convolutional neural network, which classifies it into one of the output labels: indoor, outdoor, or in-vehicle.
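
As a rough illustration of the final step, the sketch below maps a vector of three network output scores to a label. It assumes the network produces one raw score per class and that the class order is indoor, outdoor, in-vehicle. Both points are assumptions, as are the function names; the actual output format depends on how the model was trained and exported.

```python
import math

# Label order is an assumption; the real order depends on the trained model.
LABELS = ("indoor", "outdoor", "in-vehicle")


def softmax(scores):
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Usage with made-up scores for one 1024 ms spectrogram.
label, confidence = classify([0.1, 2.0, -1.0])
```
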

Sensor

Digital MEMS Microphone (ref. MP34DT05-A)

Data

Data format: 22h53m of audio samples

Results

Model: ST Convolutional Neural Network, quantized
Input size: 30x32
Complexity: 517 K MACC
Memory footprint:
  • 31 KB Flash for weights
  • 18 KB RAM for activations
Performance on STM32L476 (Low Power) @ 80 MHz:
  • Use case: 1 classification/sec
  • Pre/Post-processing: 3.7 MHz
  • NN processing: 6 MHz
  • Power consumption (1.8 V)
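
To put the performance figures in perspective, the short calculation below derives the overall CPU load and an approximate cycle cost per MACC. It assumes that the "MHz" figures express the average number of CPU cycles consumed per second (in millions) at one classification per second. That reading is an interpretation, not something the figures above state explicitly.

```python
CPU_CLOCK_MHZ = 80.0          # STM32L476 clock
PRE_POST_MHZ = 3.7            # pre/post-processing workload
NN_MHZ = 6.0                  # neural network workload
MACC_PER_INFERENCE = 517_000  # model complexity
INFERENCES_PER_SEC = 1

# Assumption: "MHz" means millions of CPU cycles consumed per second.
total_mhz = PRE_POST_MHZ + NN_MHZ
cpu_load = total_mhz / CPU_CLOCK_MHZ
cycles_per_macc = NN_MHZ * 1e6 / (MACC_PER_INFERENCE * INFERENCES_PER_SEC)

print(f"Total workload: {total_mhz:.1f} MHz of {CPU_CLOCK_MHZ:.0f} MHz")
print(f"CPU load: {cpu_load:.1%}")
print(f"NN cost: {cycles_per_macc:.1f} cycles per MACC")
```

Under that assumption, the whole pipeline would occupy roughly an eighth of the CPU, leaving the rest of the time for sleep modes or other application tasks.
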

Confusion matrix

Resources

Optimized with STM32Cube.AI

X-CUBE-AI, a free STM32Cube expansion package, allows developers to automatically convert pretrained AI algorithms, such as neural networks and machine learning models, into optimized C code for STM32.

Compatible with STM32

The STM32 family of 32-bit microcontrollers based on the Arm Cortex®-M processor is designed to offer new degrees of freedom to MCU users. It offers products combining very high performance, real-time capabilities, digital signal processing, low-power / low-voltage operation, and connectivity, while maintaining full integration and ease of development.
