Project author: DushyantChauhan

Project description:
Sentiment and Emotion help Sarcasm? A Multi-task Learning Framework for Multi-Modal Sarcasm, Sentiment and Emotion Analysis
Language: Python
Repository: git://github.com/DushyantChauhan/ACL-2020-MUStARD-Extension.git
Created: 2021-03-13T16:10:38Z
Project page: https://github.com/DushyantChauhan/ACL-2020-MUStARD-Extension

License: MIT License


SE-MUStARD Dataset for Multimodal Sarcasm Detection

This repository contains the dataset and code for our ACL 2020 paper:
Sentiment and Emotion help Sarcasm? A Multi-task Learning Framework for Multi-Modal Sarcasm, Sentiment and Emotion Analysis

MUStARD Dataset

The original MUStARD dataset was released in Towards Multimodal Sarcasm Detection (An Obviously Perfect Paper). It is a multimodal video corpus for research in automated sarcasm discovery, compiled from popular TV shows including Friends, The Golden Girls, The Big Bang Theory, and Sarcasmaholics Anonymous. MUStARD consists of audiovisual utterances annotated with sarcasm labels. Each utterance is accompanied by its context, which provides additional information on the scenario in which the utterance occurs.

SE-MUStARD Dataset with Sentiment and Emotion Classes

We manually annotate the multi-modal MUStARD sarcasm dataset with sentiment and emotion classes, both implicit and explicit. You can download the SE-MUStARD dataset from here (text only). For the remaining modalities, i.e., visual and acoustic, please follow this GitHub repository.

Data Format

Key                  Value
utterance            The text of the target utterance to classify.
speaker              Speaker of the target utterance.
context              List of utterances (in chronological order) preceding the target utterance.
context_speakers     Respective speakers of the context utterances.
sarcasm              Binary label for the sarcasm tag.
implicit-sentiment   Three labels for the implicit sentiment tag.
explicit-sentiment   Three labels for the explicit sentiment tag.
implicit-emotion     Nine labels for the implicit emotion tag.
explicit-emotion     Nine labels for the explicit emotion tag.
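
As a minimal sketch of how a record in this format might be checked, the snippet below validates that every key from the table is present and that context and context_speakers line up one-to-one. It assumes the annotations are stored as a JSON object mapping an ID to a record; that layout, the ID "example_id", the helper name validate_record, and all label values in the sample are placeholders invented for illustration, not taken from the dataset.

```python
from __future__ import print_function
import json

# Keys listed in the Data Format table.
EXPECTED_KEYS = [
    "utterance", "speaker", "context", "context_speakers", "sarcasm",
    "implicit-sentiment", "explicit-sentiment",
    "implicit-emotion", "explicit-emotion",
]


def validate_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = ["missing key: " + k for k in EXPECTED_KEYS if k not in record]
    context = record.get("context", [])
    speakers = record.get("context_speakers", [])
    # Each context utterance should have exactly one respective speaker.
    if len(context) != len(speakers):
        problems.append("context and context_speakers differ in length")
    return problems


# Made-up record for illustration only; the label values are placeholders
# and do not reflect the dataset's actual label encoding.
sample_json = """
{
  "example_id": {
    "utterance": "Oh great, another meeting.",
    "speaker": "PERSON_A",
    "context": ["We have a meeting at nine.", "And another one at ten."],
    "context_speakers": ["PERSON_B", "PERSON_B"],
    "sarcasm": true,
    "implicit-sentiment": "<label>",
    "explicit-sentiment": "<label>",
    "implicit-emotion": "<label>",
    "explicit-emotion": "<label>"
  }
}
"""

data = json.loads(sample_json)
for record_id, record in data.items():
    print(record_id, validate_record(record))
```

Running a check like this before feature extraction can catch truncated or misaligned records early.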

Feature Extraction

There are two setups:

(1) Speaker Dependent Setup (exMode=True)

Note: see function featuresExtraction_fastext(foldNum, exMode) in trimodal_true.py, where foldNum ranges over 0-4 and exMode = True

Note: see function featuresExtraction_original(foldNum, exMode) in trimodal_true.py, where foldNum ranges over 0-4 and exMode = True


(2) Speaker Independent Setup (exMode=False)

Note: see function featuresExtraction_fastext(foldNum, exMode) in trimodal_false.py, where foldNum = 3 and exMode = False

Note: see function featuresExtraction_original(foldNum, exMode) in trimodal_false.py, where foldNum = 3 and exMode = False

  1. Download all the features, put them into the folder **feature_extraction**, and then run the code.
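
The (foldNum, exMode) combinations described in the notes can be sketched as follows. This is only an illustration of the argument values: setup_runs and extract_stub are hypothetical helpers, and extract_stub merely stands in for featuresExtraction_fastext or featuresExtraction_original, whose bodies are not reproduced here.

```python
from __future__ import print_function


def setup_runs(exMode):
    """Return the (foldNum, exMode) pairs for one setup, per the notes."""
    if exMode:
        # Speaker dependent: foldNum ranges over 0..4 (five folds).
        return [(foldNum, True) for foldNum in range(5)]
    # Speaker independent: the notes give a single value, foldNum = 3.
    return [(3, False)]


def extract_stub(foldNum, exMode):
    # Hypothetical stand-in for the repository's feature extraction
    # functions; it only reports which arguments it received.
    return "fold=%d exMode=%s" % (foldNum, exMode)


for foldNum, exMode in setup_runs(True) + setup_runs(False):
    print(extract_stub(foldNum, exMode))
```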

Download Trained Weights

There are two setups:

(1) Speaker Dependent Setup (exMode=True):

  • Five folds Speaker dependent weights

(2) Speaker Independent Setup (exMode=False):

Run the code

  1. python2 trimodal_true.py (for speaker dependent)
  2. python2 trimodal_false.py (for speaker independent)

Citation

Please cite the following paper if you find this dataset useful in your research:

@inproceedings{chauhan-etal-2020-sentiment,
    title = "Sentiment and Emotion help Sarcasm? A Multi-task Learning Framework for Multi-Modal Sarcasm, Sentiment and Emotion Analysis",
    author = "Chauhan, Dushyant Singh and
      S R, Dhanush and
      Ekbal, Asif and
      Bhattacharyya, Pushpak",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.401",
    pages = "4351--4360",
}

Versions

python: 2.7

keras: 2.2.8

tensorflow: 1.9.0
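
A minimal sketch for comparing an environment against the versions listed here is shown below. The helper names and the hand-written installed dictionary are hypothetical; in a real environment the values could be read from platform.python_version(), keras.__version__, and tensorflow.__version__.

```python
from __future__ import print_function

# Versions as listed in this README.
EXPECTED = {"python": "2.7", "keras": "2.2.8", "tensorflow": "1.9.0"}


def version_matches(installed_version, expected_version):
    """True if the installed version starts with the expected components."""
    want = expected_version.split(".")
    have = installed_version.split(".")
    return have[:len(want)] == want


def find_mismatches(installed, expected=EXPECTED):
    """Return {name: (installed, expected)} for missing or differing versions."""
    result = {}
    for name, want in expected.items():
        have = installed.get(name)
        if have is None or not version_matches(have, want):
            result[name] = (have, want)
    return result


# Hand-written example values, not read from a real environment.
installed = {"python": "2.7.18", "keras": "2.2.8", "tensorflow": "1.12.0"}
print(find_mismatches(installed))
```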