Quaternion Neural Networks for 3D Sound Source Localization in Reverberant Environments.
The objective of this work is to build a deep quaternion neural network (DQNN) that works with First Order Ambisonics (FOA) datasets. In particular, we extend the DQNN so that it supports both the pre-existing datasets (ansim, resim, etc.) and the FOA one in a modular and efficient way. We have also added further metrics, such as the SELD score, which was mainly used to evaluate the results in the 2019 paper, and a small library for plotting the results.
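As a rough illustration of the SELD score mentioned above, here is a minimal sketch. It assumes the definition used in the DCASE 2019 Task 3 evaluation, where the SED score averages the error rate with one minus the F1 score, the DOA score averages the normalized angular error with one minus the frame recall, and the SELD score is the mean of the two. The function names are hypothetical and not taken from this repository, whose own implementation may differ.

```python
def sed_score(error_rate, f1):
    """Sound event detection score: 0 is best, 1 is worst (assumed definition)."""
    return (error_rate + (1.0 - f1)) / 2.0


def doa_score(doa_error_deg, frame_recall):
    """Direction-of-arrival score; the angular error is normalized by 180 degrees."""
    return (doa_error_deg / 180.0 + (1.0 - frame_recall)) / 2.0


def seld_score(error_rate, f1, doa_error_deg, frame_recall):
    """Joint score: the mean of the SED and DOA scores (lower is better)."""
    return (sed_score(error_rate, f1) + doa_score(doa_error_deg, frame_recall)) / 2.0


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    print(seld_score(error_rate=0.4, f1=0.8, doa_error_deg=18.0, frame_recall=0.9))
```

Under this definition a perfect system scores 0 and the worst possible system scores 1, so the score can serve as a single criterion when comparing models.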
This project can be easily executed using one of the two proposed notebooks:
The latter lets you use a pre-loaded, pre-extracted dataset (~200 GB).
A quick overview of the columns in our CSV files:
| column | description |
| --- | --- |
| A | training_loss |
| B | validation_loss |
| C | sed_loss_er |
| D | sed_loss_f1 |
| E | doa_loss_avg_accuracy |
| F | doa_loss_gt |
| G | doa_loss_pred |
| H | doa_loss_gt_cnt |
| I | doa_loss_pred_cnt |
| J | doa_loss_good_frame_cnt |
| K | sed_score |
| L | doa_score |
| M | seld_score |
| N | sed_confidence_interval_low |
| O | sed_confidence_interval__up |
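The column layout above can be turned into named fields when reading a results file. The following is a minimal sketch using only the Python standard library. It assumes that the CSV has no header row, that each row holds the fifteen columns A to O in the listed order, and that every value is numeric; none of this is confirmed by the project itself. `read_results` is a hypothetical helper, and the sample row is synthetic.

```python
import csv
import io

# Column names in spreadsheet order A..O, copied from the table above.
COLUMNS = [
    "training_loss", "validation_loss", "sed_loss_er", "sed_loss_f1",
    "doa_loss_avg_accuracy", "doa_loss_gt", "doa_loss_pred",
    "doa_loss_gt_cnt", "doa_loss_pred_cnt", "doa_loss_good_frame_cnt",
    "sed_score", "doa_score", "seld_score",
    "sed_confidence_interval_low", "sed_confidence_interval__up",
]


def read_results(handle):
    """Parse a header-less results CSV into a list of {column name: float} dicts."""
    rows = []
    for raw in csv.reader(handle):
        if not raw:
            continue  # skip blank lines
        if len(raw) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(raw)}")
        rows.append({name: float(value) for name, value in zip(COLUMNS, raw)})
    return rows


# Synthetic one-row file standing in for a real results CSV.
sample = io.StringIO(",".join(str(i / 10) for i in range(len(COLUMNS))) + "\n")
rows = read_results(sample)
print(rows[0]["seld_score"])
```

To read a real file, pass an open file object instead of the `StringIO` buffer, for example `open(path, newline="")` as recommended by the `csv` module.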