Mechanical sensors for sound recognition

Sound recognition systems, which generate triggers when they recognize certain types of sounds, are now used in many domains: speech, music, healthcare, traffic, construction, and more.

Typically, an always-on system continuously samples the audio signal arriving from an input device such as a microphone. Signal processing and/or machine learning algorithms then decide which type of sound is being perceived, and based on that classification the system can send out a trigger signal to take some action, enter an event in a log, and so on.

Driving such a system requires an electric power source, and power consumption must be kept as low as possible, either to avoid draining a shared power source or to get long operating times from a dedicated battery.
Where it makes sense, these systems therefore operate in two stages: a simple, low-power first stage that only detects whether the sound level exceeds a certain threshold, and a more sophisticated second stage that runs only when that threshold has been exceeded and uses more processing power to classify the sound fragment captured around that moment with a machine learning model.
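
As a rough illustration of the two-stage idea, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the RMS level check, the threshold value, and especially `toy_classifier`, which is a hypothetical stand-in for a real machine learning model and simply counts zero crossings.

```python
import math
from typing import Callable, List, Optional, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one audio frame (the cheap stage-1 measure)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def two_stage_detect(
    frames: Sequence[Sequence[float]],
    threshold: float,
    classify: Callable[[Sequence[float]], str],
) -> List[Optional[str]]:
    """Run the expensive classifier only on frames that pass the level check."""
    results: List[Optional[str]] = []
    for frame in frames:
        if rms(frame) > threshold:
            # Stage 2: only reached for sufficiently loud frames.
            results.append(classify(frame))
        else:
            # Stage 1 rejected the frame; the classifier stays idle.
            results.append(None)
    return results


def toy_classifier(frame: Sequence[float]) -> str:
    """Hypothetical stand-in for an ML model: label by zero-crossing count."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return "high-pitched" if crossings > len(frame) // 4 else "low-pitched"


if __name__ == "__main__":
    quiet = [0.001] * 8
    loud_low = [0.5] * 4 + [-0.5] * 4
    loud_high = [0.5, -0.5] * 4
    print(two_stage_detect([quiet, loud_low, loud_high], 0.1, toy_classifier))
```

The point of the structure is that the classifier is never called for quiet frames, so the costly stage only consumes power when something loud enough happens.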

Researchers at ETH Zurich have now developed a mechanical sensor that uses the vibrational energy contained in sound waves to generate a small electric pulse that can switch on an electronic device that was switched off. The sensor does not require an external power source and it can be tuned to only trigger for certain types of sounds.

The researchers claim that the prototype they developed can distinguish between the spoken words "three" and "four", and that newer variants should be able to distinguish between up to 12 different words, such as "on", "off", "up" and "down".

The sensor is a "phononic metamaterial": a set of structured, silicon-based plates connected to each other by tiny bars that act like springs. The plates have a specially designed microstructure, and it is the way the plates are connected via these "springs" that determines whether a particular sound triggers the sensor.
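
To get some intuition for why "springs" can make a structure respond to some sounds and not others, consider a toy model: a single mass on a damped spring driven by a sinusoidal force. This is not the researchers' design, which couples many plates and is far more elaborate; the function names and all parameter values below are assumptions chosen purely for illustration.

```python
import math


def resonant_frequency_hz(stiffness: float, mass: float) -> float:
    """Undamped natural frequency f0 = sqrt(k / m) / (2 * pi)."""
    return math.sqrt(stiffness / mass) / (2 * math.pi)


def steady_state_amplitude(
    drive_hz: float,
    stiffness: float,
    mass: float,
    damping: float,
    force: float = 1.0,
) -> float:
    """Steady-state amplitude of m*x'' + c*x' + k*x = F*cos(w*t)."""
    w = 2 * math.pi * drive_hz
    return force / math.sqrt((stiffness - mass * w * w) ** 2 + (damping * w) ** 2)


if __name__ == "__main__":
    # Illustrative values only (not taken from the paper).
    k, m, c = 40.0, 1e-6, 1e-4
    f0 = resonant_frequency_hz(k, m)
    on_resonance = steady_state_amplitude(f0, k, m, c)
    off_resonance = steady_state_amplitude(2 * f0, k, m, c)
    print(f"f0 = {f0:.1f} Hz, on/off amplitude ratio = {on_resonance / off_resonance:.0f}")
```

In this toy model, changing the spring stiffness shifts the frequency at which the mass responds strongly, which is the simplest version of tuning a mechanical structure to particular sounds.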

As the researchers conclude: "By demonstrating that machine learning tasks can be encoded in the response of phononic metamaterials, together with prior experimental results on passive amplitude activated switches, we illuminate a novel path toward zero-power smart devices that can intelligently respond to events."

Looking forward to seeing what comes out of this research!

Article reference:
Title: "In-Sensor Passive Speech Classification with Phononic Metamaterials"
Authors: Tena Dubček, Daniel Moreno-Garcia, Thomas Haag, Parisa Omidvar, Henrik R. Thomsen, Theodor S. Becker, Lars Gebraad, Christoph Bärlocher, Fredrik Andersson, Sebastian D. Huber, Dirk-Jan van Manen, Luis Guillermo Villanueva, Johan O.A. Robertsson, Marc Serra-Garcia
Publication date: 9 January 2024
URL: https://doi.org/10.1002/adfm.202311877