Spiking neural network for detecting speech emotion
Summary
- Became interested in spiking neural networks (SNN) for the opportunities they
offer in the real-time processing of continuous-time signals.
- Read a few papers on spiking neural networks for speech classification as
well as audio-to-spike encoding schemes.
- Implemented a completely biologically-plausible audio-to-spike encoding.
For me, the advantage to using a biologically plausible method is that
processing can now be done in real-time as the signal appears. Which happens to
be how our ears work, but not how common digital audio processing methods work.
- Implemented a spiking neural network with a basic plasticity rule “from
scratch” (using numpy). This was because available SNN libraries (e.g. NEST,
snnTorch) seemed either too complex (intended for modelling physiology) or used
deep learning methods that defeat my goal of real-time processing of a
continuous signal.
- Trained a Support Vector Machine to map readouts of the SNN to emotion labels.
- So far, accuracy is only slightly better than random chance ;_;.
- Currently looking for possible bugs + ways to improve the network’s
architecture.
Skills gained
Longer description
The deeper reason I became interested in training an SNN is that I suspect
neuromorphics will be used in future brain-computer-interfaces.
Hold on, what is a spiking neural network, and what are neuromorphics?
References