The following project uses a Sipeed M1W device (RISC-V architecture). The software would ideally be implemented in MaixPy, but C++ is also acceptable. Sound data is read from the onboard microphone and sent to a TensorFlow model based on the MobileNet V1 (or V2) architecture. The output is then displayed as an index on the screen.
For this project, there will be a training set of 12 sounds spanning 3 different categories, plus some 'baseline' examples, and a test set of 12 sounds. Training and test sounds may be sampled at different frequencies.
The assignment is as follows:
-Train a TensorFlow (TFLite) model to recognize the 3 categories and the baseline as separate indices (0 for baseline, the other categories as 1, 2, 3).
-Save the model as .hdf5/.h5.
-Deploy the model in the MobileNet framework.
-Load the model onto the Sipeed M1W device.
-Capture a window of sound (length L = 3 s) from the onboard microphone, sampled at FS = 44100 Hz.
-Downsample it to 16000 Hz.
-Convert the window to mel-frequency cepstral coefficients (MFCCs).
-Display the result on screen as a numerical index.
-Refresh every R = 1 s.
-The device should correctly identify at least 70% of the test sounds. An F1 score of 0.8 or greater is preferred.
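The capture → downsample → MFCC steps above can be sketched desktop-side in plain NumPy. This is a minimal illustration, not the device implementation: the frame length, hop, filter count, and coefficient count are common defaults I have assumed, not values from the brief.

```python
import numpy as np

FS_IN, FS_OUT = 44100, 16000  # capture rate and downsample target from the spec

def downsample(x, fs_in=FS_IN, fs_out=FS_OUT):
    """Resample by linear interpolation (a simple stand-in for a real decimator)."""
    n_out = int(len(x) * fs_out / fs_in)
    t_in = np.arange(len(x)) / fs_in
    t_out = np.arange(n_out) / fs_out
    return np.interp(t_out, t_in, x)

def _mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def _inv_mel(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def _mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters spaced evenly on the mel scale."""
    pts = _inv_mel(np.linspace(0.0, _mel(fs / 2.0), n_filters + 2))
    bins = np.floor((n_fft + 1) * pts / fs).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        lo, c, hi = bins[i - 1], bins[i], bins[i + 1]
        if c > lo:
            fb[i - 1, lo:c] = (np.arange(lo, c) - lo) / (c - lo)
        if hi > c:
            fb[i - 1, c:hi] = (hi - np.arange(c, hi)) / (hi - c)
    return fb

def mfcc(x, fs=FS_OUT, frame_len=400, hop=160, n_fft=512,
         n_filters=26, n_coeffs=13):
    """MFCCs for one analysis window: frame, FFT, mel filterbank, log, DCT-II."""
    starts = range(0, len(x) - frame_len + 1, hop)
    frames = np.stack([x[s:s + frame_len] * np.hamming(frame_len) for s in starts])
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    log_mel = np.log(power @ _mel_filterbank(n_filters, n_fft, fs).T + 1e-10)
    k, n = np.arange(n_coeffs), np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(k, 2 * n + 1) / (2 * n_filters))
    return log_mel @ dct.T  # shape: (n_frames, n_coeffs)
```

On the device itself this loop would run in MaixPy against fixed-point microphone buffers; the sketch only shows the signal-processing chain the assignment lists.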
Later on, the TensorFlow model will be replaced.
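The acceptance thresholds above (70% of test sounds identified, F1 ≥ 0.8) can be checked against the 12 test predictions with a small script like the following; the example labels are invented purely for illustration, using the indexing convention from the assignment (0 = baseline, 1–3 = categories).

```python
def accuracy(y_true, y_pred):
    """Fraction of test sounds identified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_per_class(y_true, y_pred, n_classes=4):
    """One-vs-rest F1 for each class index."""
    scores = []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return scores

# Hypothetical results for the 12 test sounds (labels 0-3):
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
y_pred = [0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 3, 0]
```

Here `accuracy` gives 10/12 ≈ 0.83, clearing the 70% bar, while the per-class F1 scores reveal which category (if any) drags the model below the preferred 0.8.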
The deliverables are:
-The software necessary to load and run the above on a Sipeed M1W (M1W dock + 2.4 inch LCD + OV2640) K210 dev board, 1st RV64: [login to view URL]
-A private Git repository containing the above software.
-The software must continue to function when I change the TensorFlow model (i.e. load a separate .hdf5/.h5 file).
-Must be functional on a fresh Sipeed M1W device.
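To support swapping in a new .h5 model, one plausible desktop-side step is converting the Keras file to a TFLite flatbuffer before preparing it for the device. A minimal sketch, assuming TensorFlow 2.x; the file names are placeholders:

```python
import tensorflow as tf

def h5_to_tflite(h5_path, tflite_path):
    """Load a Keras .h5/.hdf5 model and write a TFLite flatbuffer.

    Returns the size in bytes of the converted model.
    """
    model = tf.keras.models.load_model(h5_path)
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_bytes = converter.convert()
    with open(tflite_path, "wb") as f:
        f.write(tflite_bytes)
    return len(tflite_bytes)
```

For the K210's KPU the .tflite file would, as I understand it, still need a further conversion to .kmodel (e.g. with the nncase toolchain) before it can run on the board; that step is outside this sketch.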
If there are memory restrictions, reducing the sampling rate (to no lower than 16 kHz) is acceptable, and the sampling window can be shortened to a minimum of L = 1 s.
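The impact of that fallback can be weighed with a quick buffer-size calculation. This is purely illustrative arithmetic, assuming 16-bit PCM samples:

```python
def buffer_bytes(window_s, fs, bytes_per_sample=2):
    """Raw audio buffer size for one capture window, assuming 16-bit samples."""
    return int(window_s * fs * bytes_per_sample)

full = buffer_bytes(3, 44100)     # default: 3 s at 44.1 kHz -> 264600 bytes
reduced = buffer_bytes(1, 16000)  # fallback: 1 s at 16 kHz  ->  32000 bytes
```

Sampling directly at 16 kHz with a 1 s window cuts the capture buffer by roughly a factor of eight, and also removes the downsampling step from the on-device pipeline.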