Kafka-based ps-lite
A light and efficient implementation of the parameter server
framework. It provides clean yet powerful APIs. For example, a worker node can
communicate with the server nodes by
Push(keys, values): push a list of (key, value) pairs to the server nodes
Pull(keys): pull the values from servers for a list of keys
Wait: wait until a push or pull has finished

A simple example:
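std::vector<uint64_t> key = {1, 3, 5};
std::vector<float> val = {1, 1, 1};
std::vector<float> recv_val;
ps::KVWorker<float> w;
// Push returns without blocking; Wait blocks until the push has finished.
w.Wait(w.Push(key, val));
// Pull fills recv_val; read it only after Wait has returned.
w.Wait(w.Pull(key, &recv_val));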
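To try the call pattern above without a running cluster, the following is a minimal single-process sketch. `ToyKVWorker`, `Launch`, and `RunToyExample` are hypothetical names and are not part of ps-lite. The in-memory map stands in for the server nodes, and the rule that a push adds to the stored value is an assumption made for this toy. It only illustrates the shape of the API: Push and Pull return a timestamp right away, and Wait blocks until that request has finished.

```cpp
// Toy in-process model of the Push/Pull/Wait pattern.
// NOT ps-lite code: every name below is hypothetical and for illustration only.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <mutex>
#include <unordered_map>
#include <utility>
#include <vector>

class ToyKVWorker {
 public:
  // Starts an asynchronous push and returns a timestamp for the request.
  // Assumption for this toy: the "server" adds pushed values to stored ones.
  int Push(const std::vector<uint64_t>& keys, const std::vector<float>& vals) {
    assert(keys.size() == vals.size());
    // Capture copies so the background task owns its input data.
    return Launch([this, keys, vals] {
      std::lock_guard<std::mutex> lock(store_mu_);
      for (std::size_t i = 0; i < keys.size(); ++i) store_[keys[i]] += vals[i];
    });
  }

  // Starts an asynchronous pull; *vals may be read only after Wait(timestamp).
  int Pull(const std::vector<uint64_t>& keys, std::vector<float>* vals) {
    return Launch([this, keys, vals] {
      std::lock_guard<std::mutex> lock(store_mu_);
      vals->clear();
      for (uint64_t k : keys) vals->push_back(store_[k]);  // missing keys read as 0
    });
  }

  // Blocks until the request identified by the timestamp has finished.
  void Wait(int timestamp) { pending_.at(timestamp).wait(); }

 private:
  // Runs the task on another thread and records its future under a new timestamp.
  template <typename F>
  int Launch(F task) {
    pending_.push_back(std::async(std::launch::async, std::move(task)));
    return static_cast<int>(pending_.size()) - 1;
  }

  std::mutex store_mu_;
  std::unordered_map<uint64_t, float> store_;  // stands in for the server nodes
  std::vector<std::future<void>> pending_;     // one future per issued request
};

// Pushes the same values twice, waits for both, then pulls them back.
std::vector<float> RunToyExample() {
  std::vector<uint64_t> key = {1, 3, 5};
  std::vector<float> val = {1, 1, 1};
  std::vector<float> recv_val;
  ToyKVWorker w;
  int first = w.Push(key, val);   // returns at once; the push may still be running
  int second = w.Push(key, val);  // a second request can be issued before waiting
  w.Wait(first);
  w.Wait(second);
  w.Wait(w.Pull(key, &recv_val));
  return recv_val;                // each key now holds 1 + 1
}
```

Calling `RunToyExample()` from a `main` function returns the three pulled values. Because the sketch uses `std::async`, build it with thread support enabled (for example `-pthread` with g++).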
More features:
ps-lite requires a C++11 compiler such as g++ >= 4.8. On Ubuntu >= 13.10, we
can install it by
sudo apt-get update && sudo apt-get install -y build-essential git
Instructions for older Ubuntu, CentOS, and Mac OS X.
Then clone and build
git clone https://github.com/dmlc/ps-lite
cd ps-lite && make -j4
ps-lite provides asynchronous communication for other projects:
We have been working on the parameter server framework since 2010.
The first generation was
designed and optimized for specific algorithms, such as logistic regression and
LDA, to handle industrial-scale machine learning tasks (hundreds of billions of
examples and features, with 10-100 TB of data).
Later we tried to build an open-source, general-purpose framework for machine learning
algorithms. The project is available at dmlc/parameter_server.
Given the growing demands from other projects, we created ps-lite,
which provides a clean data communication API and a
lightweight implementation. The implementation is based on dmlc/parameter_server,
but we refactored the job launchers, file I/O, and machine
learning algorithm code into separate projects such as dmlc-core
and wormhole.
Drawing on the experience we gained while developing
dmlc/mxnet, we further refactored the API and implementation from v1. The main
changes include