Author: HaebinShin

Description:
Music Playlist Continuation for Melon
Language: Python
Repository: git://github.com/HaebinShin/melon-playlist-continuation.git
Created: 2020-08-25T15:05:59Z
Project page: https://github.com/HaebinShin/melon-playlist-continuation


Melon Playlist Continuation

This is an extra solution to the Melon Playlist Continuation Challenge by the * Team.
It was inspired by the following two papers: A hybrid two-stage recommender system for automatic playlist continuation, which won 3rd place in the RecSys Challenge ’18; and Relational Learning via Collective Matrix Factorization.

Dataset

As stated in the Challenge README, the dataset in data.tar.gz contains 150K playlists that have been created by Melon users.
To untar the dataset:

  tar -xvzf data.tar.gz

data/train.json contains complete playlists, whereas data/val.json and data/test.json are meant for submission, so they include only some of each playlist's songs and tags.
In this repository, data/val.json and data/test.json are used only as additional information.
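
One way to read "additional information" is that the partially hidden playlists still contribute their visible songs and tags as extra training rows. The following is a minimal sketch under that assumption, not the repository's actual loading code. The field names `id`, `songs`, and `tags` are assumed here, and `merge_playlists` is a hypothetical helper.

```python
import json

def merge_playlists(train, *extras):
    """Concatenate playlist lists, skipping duplicates and empty playlists."""
    seen = set()
    merged = []
    for source in (train, *extras):
        for playlist in source:
            pid = playlist["id"]
            if pid in seen:
                continue  # keep the first occurrence of each playlist id
            if not playlist.get("songs") and not playlist.get("tags"):
                continue  # nothing visible, so nothing to learn from
            seen.add(pid)
            merged.append(playlist)
    return merged

# Toy data standing in for data/train.json and data/val.json (assumed schema).
train = json.loads('[{"id": 1, "songs": [10, 11], "tags": ["rock"]}]')
val = json.loads('[{"id": 2, "songs": [10], "tags": []},'
                 ' {"id": 3, "songs": [], "tags": []}]')
merged = merge_playlists(train, val)
```

In this toy run, playlist 3 is dropped because it has no visible songs or tags.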

Solution

  • Phase 1: Extract candidates using CMF recommendation (song+tag matrix)
  • Phase 2: Re-rank candidates using Learning-To-Rank Boosting
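
To illustrate what a "song+tag matrix" could look like, here is a minimal sketch that builds a binary playlist × (song + tag) matrix with tag columns placed after song columns. This layout is an assumption for illustration. The repository may instead keep separate playlist-song and playlist-tag matrices that share playlist factors, which is how CMF is usually formulated. `build_song_tag_matrix` and the field names are hypothetical.

```python
def build_song_tag_matrix(playlists):
    """Return (rows, song_index, tag_index) for a binary playlist x (song + tag) matrix."""
    songs = sorted({s for p in playlists for s in p["songs"]})
    tags = sorted({t for p in playlists for t in p["tags"]})
    song_index = {s: i for i, s in enumerate(songs)}
    # Tag columns are placed after all song columns.
    tag_index = {t: len(songs) + i for i, t in enumerate(tags)}
    rows = []
    for p in playlists:
        row = [0] * (len(songs) + len(tags))
        for s in p["songs"]:
            row[song_index[s]] = 1
        for t in p["tags"]:
            row[tag_index[t]] = 1
        rows.append(row)
    return rows, song_index, tag_index

# Toy playlists (assumed schema).
playlists = [
    {"songs": [10, 11], "tags": ["rock"]},
    {"songs": [11], "tags": ["jazz", "rock"]},
]
rows, song_index, tag_index = build_song_tag_matrix(playlists)
```

A factorization model would then learn low-dimensional playlist, song, and tag vectors from such a matrix and score unseen songs and tags for each playlist.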

Preprocessing - Data Partitioning

For local evaluation, we create a new evaluation dataset. part2 and part3 serve as the training and validation datasets for boosting, respectively.
Each of these is divided into a question (_q) part and an answer (_a) part.
In Phase 1, we train on part1+part2_q+part3_q+evaluation_q and optionally include val.json+test.json as additional information.
In Phase 2, we use part2_q and part3_q as inputs and use part2_a and part3_a as labels, respectively.
Please refer to A hybrid two-stage recommender system for automatic playlist continuation for detailed partitioning.
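
The question/answer split can be sketched as follows. This is a simplified illustration, assuming a random hold-out with a fixed ratio. The actual partitioning follows the referenced paper and may differ. `split_question_answer`, `hide_ratio`, and the field names are hypothetical.

```python
import random

def split_question_answer(playlist, hide_ratio=0.5, seed=0):
    """Split one playlist into a visible question part and a hidden answer part."""
    rng = random.Random(seed)
    question = {"id": playlist["id"]}
    answer = {"id": playlist["id"]}
    for field in ("songs", "tags"):
        items = list(playlist[field])
        rng.shuffle(items)
        n_hidden = int(len(items) * hide_ratio)
        answer[field] = items[:n_hidden]    # held out, used as labels
        question[field] = items[n_hidden:]  # visible, used as model input
    return question, answer

playlist = {"id": 7, "songs": [1, 2, 3, 4], "tags": ["rock", "live"]}
question, answer = split_question_answer(playlist)
```

The question part plays the role of files such as part2_q.json, and the answer part plays the role of part2_a.json.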

Usage

Preprocessing

  python3 preprocess.py run ./data/train.json

After running the above, the preprocessed directory is as follows.

  preprocessed
  ├── inputs
  │   ├── part1.json
  │   ├── part2_q.json
  │   ├── part3_q.json
  │   └── evaluation_q.json
  └── labels
      ├── part2_a.json
      ├── part3_a.json
      └── evaluation_a.json

Training and Prediction

  python3 run.py --dir ./preprocessed --additional ./data/val.json ./data/test.json

The --additional flag is optional.

  python3 run.py --dir ./preprocessed

Evaluation

  python3 evaluate.py --result ./result.json --answer ./preprocessed/labels/evaluation_a.json

Score

  Music nDCG: 0.250488
  Tag nDCG: 0.413651
  Final Score: 0.274963

Final Score = Music nDCG × 0.85 + Tag nDCG × 0.15
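
The weighting can be checked with a short sketch. The `ndcg` function below is a generic binary-relevance nDCG with a log2 discount, included only for illustration. The challenge's official evaluator (and this repository's evaluate.py) may use a different log base or normalization and remains authoritative. Both function names are hypothetical.

```python
import math

def ndcg(recommended, relevant):
    """Binary-relevance nDCG of a ranked list against a set of relevant items."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    # Each hit at zero-based rank r contributes 1 / log2(r + 2).
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(recommended) if item in relevant)
    ideal_hits = min(len(relevant), len(recommended))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

def final_score(music_ndcg, tag_ndcg):
    """Weighted combination of the two nDCG values, 85% music and 15% tags."""
    return music_ndcg * 0.85 + tag_ndcg * 0.15

score = final_score(0.250488, 0.413651)
```

Plugging in the rounded nDCG values reported above reproduces the reported final score to within rounding of the inputs.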

Running Environment

We tested this implementation using Python 3.6.9 with an Intel Core i7-9700 CPU and 32GB RAM.