Python implementation of Embed2Detect for event detection in social media
Embed2Detect is an event detection mechanism developed for social media data. For more details about the approach, please refer to the paper “Embed2Detect: Temporally Clustered Embedded Words for Event Detection in Social Media”.
If you use this system, please consider citing the paper; the reference details are given below.
Python 3.7 implementation of Embed2Detect
The required packages are listed in requirements.txt.
Run main.py with the required parameters:
.tsv file formatted as follows:
Once event detection completes, the results are saved to a folder named after the input file, inside the results_folder_path set in project_config.
This folder contains .txt files, one per event window, each listing that window's event words with a single word per line.
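The output layout described above can be read back with a few lines of Python. This is a minimal sketch rather than part of the project: the function name `load_event_words` is hypothetical, and it assumes only that every .txt file in the result folder holds one event word per line. How the files are named per window is not specified here, so the file name (without extension) is used as the window key.

```python
from pathlib import Path


def load_event_words(result_dir):
    """Map each event-window file name (without extension) to its event words.

    Hypothetical helper: assumes each .txt file in result_dir lists
    one event word per line, as described in this README.
    """
    events = {}
    for txt_file in sorted(Path(result_dir).glob("*.txt")):
        with txt_file.open(encoding="utf-8") as f:
            # Keep one word per non-empty line, dropping surrounding whitespace.
            events[txt_file.stem] = [line.strip() for line in f if line.strip()]
    return events
```

Calling `load_event_words` with the path of the saved result folder returns a dictionary whose keys are the window file names and whose values are the word lists.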
Depending on the target data set, the data cleaning techniques can be customised. The default flow, which
was developed for a Twitter data set, is available in the preprocessing_flow method in
data_preprocessor.py.
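As an illustration of what a customised cleaning step might look like, the sketch below removes URLs, lowercases the text and collapses whitespace. It does not reproduce the project's preprocessing_flow: the function name `custom_preprocessing_flow`, its single-string signature and the chosen cleaning steps are all assumptions made for this example, so adapt it to the interface actually used in data_preprocessor.py.

```python
import re

# Patterns used by the hypothetical cleaning steps below.
URL_PATTERN = re.compile(r"https?://\S+")
WHITESPACE_PATTERN = re.compile(r"\s+")


def custom_preprocessing_flow(text):
    """Hypothetical cleaning flow for a single post (not the project's default)."""
    text = URL_PATTERN.sub(" ", text)  # remove links
    text = text.lower()  # normalise case
    # Collapse runs of whitespace left behind by the earlier steps.
    return WHITESPACE_PATTERN.sub(" ", text).strip()
```

For a different data set, steps can be added or removed here (for example, handling of platform-specific markup) without touching the rest of the pipeline, provided the function keeps the interface the caller expects.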
General configuration details of the project, including word embedding configs, performance configs and file path configs,
are available in project_config.py.
@article{hettiarachchi2021embed2detect,
title={{E}mbed2{D}etect: temporally clustered embedded words for event detection in social media},
author={Hettiarachchi, Hansi and Adedoyin-Olowe, Mariam and Bhogal, Jagdev and Gaber, Mohamed Medhat},
journal={Machine Learning},
volume={111},
pages={49--87},
year={2022},
publisher={Springer},
doi={10.1007/s10994-021-05988-7},
url={https://doi.org/10.1007/s10994-021-05988-7},
}