Project author: RickyMexx

Project description:
Quaternion Neural Networks for 3D Sound Source Localization in Reverberant Environments.
Language: Python
Project URL: git://github.com/RickyMexx/3D-Sound-Localization.git
Created: 2020-02-14T10:35:58Z
Project community: https://github.com/RickyMexx/3D-Sound-Localization

License: Apache License 2.0



3D-Sound-Localization

Quaternion Neural Networks for 3D Sound Source Localization: Implementation using First Order Ambisonics.

The objective of our work is to build a deep quaternion neural network (DQNN) that works with First Order Ambisonics (FOA) data sets. In particular, we extend DQNN so that it supports both the pre-existing data sets (ansim, resim, etc.) and the FOA one in a modular and efficient way. We have also added further metrics, such as the SELD score, which is mainly used to evaluate the outcomes of the 2019 paper, and a tiny library for plotting the results.

[Figures: doa, seld, seld3]

Usage

This project can be run using one of the two provided notebooks:

The latter lets you use a pre-loaded and pre-extracted dataset (~200 GB).

Model metrics CSV table

A quick view of our CSV files.




| Column | Description |
| --- | --- |
| A | training_loss |
| B | validation_loss |
| C | sed_loss_er |
| D | sed_loss_f1 |
| E | doa_loss_avg_accuracy |
| F | doa_loss_gt |
| G | doa_loss_pred |
| H | doa_loss_gt_cnt |
| I | doa_loss_pred_cnt |
| J | doa_loss_good_frame_cnt |
| K | sed_score |
| L | doa_score |
| M | seld_score |
| N | sed_confidence_interval_low |
| O | sed_confidence_interval__up |
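As a rough illustration of how the column layout above could be used, here is a minimal Python sketch that reads a metrics CSV into named fields. It rests on several assumptions that the README does not confirm: that the file has no header row, that each row holds the fifteen values in the order A to O, that every value is numeric, and that a lower `seld_score` is better. The names `METRIC_COLUMNS`, `read_metrics`, `best_row` and `SAMPLE_CSV` are hypothetical helpers, not part of the project, and the sample numbers are placeholders rather than real results.

```python
import csv
import io

# Column names in spreadsheet order A..O, copied from the table above.
METRIC_COLUMNS = [
    "training_loss", "validation_loss", "sed_loss_er", "sed_loss_f1",
    "doa_loss_avg_accuracy", "doa_loss_gt", "doa_loss_pred",
    "doa_loss_gt_cnt", "doa_loss_pred_cnt", "doa_loss_good_frame_cnt",
    "sed_score", "doa_score", "seld_score",
    "sed_confidence_interval_low", "sed_confidence_interval__up",
]


def read_metrics(handle):
    """Parse a header-less metrics CSV into a list of {column: float} dicts.

    Assumption: one row per logged step, fifteen numeric values per row.
    """
    rows = []
    for line_no, record in enumerate(csv.reader(handle), start=1):
        if not record:
            continue  # skip blank lines
        if len(record) != len(METRIC_COLUMNS):
            raise ValueError(
                f"line {line_no}: expected {len(METRIC_COLUMNS)} values, "
                f"got {len(record)}"
            )
        rows.append(dict(zip(METRIC_COLUMNS, map(float, record))))
    return rows


def best_row(rows, key="seld_score"):
    """Return (index, row) for the row with the smallest value of `key`.

    Assumption: lower is better for the chosen metric.
    """
    return min(enumerate(rows), key=lambda pair: pair[1][key])


# Placeholder values for demonstration only; these are not real results.
SAMPLE_CSV = (
    "0.9,1.0,0.6,0.5,30.0,0.0,0.0,100,90,70,0.55,0.40,0.475,0.5,0.6\n"
    "0.7,0.8,0.4,0.7,20.0,0.0,0.0,100,95,85,0.35,0.25,0.300,0.3,0.4\n"
)

if __name__ == "__main__":
    metrics = read_metrics(io.StringIO(SAMPLE_CSV))
    index, row = best_row(metrics)
    print(f"best row: {index}, seld_score: {row['seld_score']}")
```

To read an actual file, pass an open file object instead of the `io.StringIO` wrapper, for example `read_metrics(open(path, newline=""))`, where `path` points at one of the CSV files. If the files turn out to have a header row, `csv.DictReader` would be the simpler choice.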