Graph Processing Framework that supports || OpenMP || CAPI
AccelGraph-CAPI is an open-source graph processing framework, designed as a modular benchmarking suite for graph processing algorithms. It provides an end-to-end evaluation infrastructure that covers both the preprocessing stage, which builds the graph structure, and the graph algorithm itself. The OpenMP part of AccelGraph-CAPI was developed on Ubuntu 18.04, with PowerPC and Intel architectures taken into account.
AccelGraph-CAPI is written in C, which gives researchers full flexibility to modify data structures and apply other algorithmic optimizations. The suite is also fully integrated with the IBM Coherent Accelerator Processor Interface (CAPI), so it can contrast the performance of shared-memory accelerators with that of parallel processors.
AccelGraph@CAPI:~$ sudo apt-get install libjudy-dev
AccelGraph@CAPI:~$ sudo apt-get install libomp-dev
`HOME` and `ALTERAPATH` depend on where you clone the repository and install ModelSim.
#quartus 18.1 env-variables
export ALTERAPATH="${HOME}/intelFPGA/18.1"
export QUARTUS_INSTALL_DIR="${ALTERAPATH}/quartus"
export LM_LICENSE_FILE="${ALTERAPATH}/licenses/psl_A000_license.dat:${ALTERAPATH}/licenses/common_license.dat"
export QSYS_ROOTDIR="${ALTERAPATH}/quartus/sopc_builder/bin"
export PATH=$PATH:${ALTERAPATH}/quartus/bin
export PATH=$PATH:${ALTERAPATH}/nios2eds/bin
#modelsim env-variables
export PATH=$PATH:${ALTERAPATH}/modelsim_ase/bin
#AccelGraph project folder
export PSLSE_ROOT="AccelGraph/01_capi_precis"
#CAPI framework env variables
export PSLSE_INSTALL_DIR="${HOME}/Documents/github_repos/${PSLSE_ROOT}/01_capi_integration/pslse"
export VPI_USER_H_DIR="${ALTERAPATH}/modelsim_ase/include"
export PSLVER=8
export BIT32=n
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$PSLSE_INSTALL_DIR/libcxl:$PSLSE_INSTALL_DIR/afu_driver/src"
#PSLSE env variables
export PSLSE_SERVER_DIR="${HOME}/Documents/github_repos/${PSLSE_ROOT}/01_capi_integration/accelerator_sim/server"
export PSLSE_SERVER_DAT="${PSLSE_SERVER_DIR}/pslse_server.dat"
export SHIM_HOST_DAT="${PSLSE_SERVER_DIR}/shim_host.dat"
export PSLSE_PARMS="${PSLSE_SERVER_DIR}/pslse.parms"
export DEBUG_LOG_PATH="${PSLSE_SERVER_DIR}/debug.log"
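The exports above only take effect if every variable is actually set in the shell that launches the tools. The following is a minimal C sketch of a pre-flight check; `count_missing_env` and `REQUIRED_ENV` are hypothetical names introduced for illustration and are not part of AccelGraph. The selection of variables treated as required is also an assumption.

```c
#include <stddef.h>
#include <stdlib.h>

/* Returns how many of the given environment variables are unset or empty. */
size_t count_missing_env(const char *const names[], size_t n)
{
    size_t missing = 0;
    for (size_t i = 0; i < n; i++) {
        const char *value = getenv(names[i]);
        if (value == NULL || value[0] == '\0')
            missing++;
    }
    return missing;
}

/* Variable names copied from the export list; which of them are
 * strictly required is an assumption of this sketch. */
const char *const REQUIRED_ENV[] = {
    "ALTERAPATH", "PSLSE_INSTALL_DIR", "VPI_USER_H_DIR",
    "PSLSE_SERVER_DIR", "PSLSE_SERVER_DAT", "SHIM_HOST_DAT", "PSLSE_PARMS",
};
```

Calling `count_missing_env(REQUIRED_ENV, sizeof REQUIRED_ENV / sizeof REQUIRED_ENV[0])` before a simulation run would report how many of these variables still need to be exported.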
AccelGraph@CAPI:~$ git clone https://github.com/atmughrabi/AccelGraph.git
AccelGraph@CAPI:~$ cd AccelGraph/
AccelGraph@CAPI:~AccelGraph$ git submodule update --init --recursive
`openmp` mode:
AccelGraph@CAPI:~AccelGraph$ make
AccelGraph@CAPI:~AccelGraph$ make run
AccelGraph@CAPI:~AccelGraph$ make run-openmp
Simulation: this step is not needed when running on real hardware; it only simulates the AFU that resides on your (CAPI-supported) FPGA:
AccelGraph@CAPI:~AccelGraph$ make run-vsim
r #recompile design
c #run simulation
For simulation with floating point IPs (Altera)
ModelSim> rc
ModelSim> rcf
Simulation: this step is not needed when running on real hardware; it only emulates the PSL that resides on your (CAPI-supported) IBM PowerPC machine:
AccelGraph@CAPI:~AccelGraph$ make run-pslse
AccelGraph@CAPI:~AccelGraph$ make run-capi-sim
AccelGraph@CAPI:~AccelGraph$ make run-capi-sim-verbose
RESPONSE_COMMANADTYPE_count
| CYCLE_count : #Cycles |
| DONE_READ_count : (#) Reads successful |
| DONE_PREFETCH_READ_count : (#) Read Prefetches |
| PAGED_count : 0 |
| FLUSHED_count : 0 |
| AERROR_count : 0 |
| DERROR_count : 0 |
| FAILED_count : 0 |
| NRES_count : 0 |
| NLOCK_count : 0 |
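When comparing runs, it can help to collapse the counters above into a single figure for responses that were not `DONE`. The sketch below is illustrative only: `struct response_stats` and `non_done_responses` are hypothetical names, the fields simply mirror the labels printed in the table, and the real framework may track additional counters.

```c
#include <stdint.h>

/* Hypothetical container; field names mirror the printed counter labels. */
struct response_stats {
    uint64_t cycle;
    uint64_t done_read;
    uint64_t done_prefetch_read;
    uint64_t paged;
    uint64_t flushed;
    uint64_t aerror;
    uint64_t derror;
    uint64_t failed;
    uint64_t nres;
    uint64_t nlock;
};

/* Sum of every listed response counter that is not a DONE response. */
uint64_t non_done_responses(const struct response_stats *s)
{
    return s->paged + s->flushed + s->aerror + s->derror
         + s->failed + s->nres + s->nlock;
}
```

In the sample output every non-`DONE` counter is 0, so this sum would also be 0 for that run.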
### FPGA
#### Synthesize
These steps require the Altera Quartus synthesis tool; any Quartus II release from 15.0 onward should work.
##### Using terminal
1. From the root directory (using terminal)
```console
AccelGraph@CAPI:~AccelGraph$ make run-capi-synth
```
AccelGraph@CAPI:~AccelGraph$ make run-capi-gui
AccelGraph@CAPI:~AccelGraph$ cd 03_capi_integration/accelerator_synth/
AccelGraph@CAPI:~AccelGraph/03_capi_integration/accelerator_synth$ make
AccelGraph@CAPI:~AccelGraph$ cd 03_capi_integration/accelerator_synth/
AccelGraph@CAPI:~AccelGraph/03_capi_integration/accelerator_synth$ make gui
AccelGraph@CAPI:~AccelGraph$ cd 03_capi_integration/accelerator_bin/
`#define DEVICE` can be modified to match your Power8 system in 00_bench/include/capi_utils/capienv.h.
AccelGraph@CAPI:~AccelGraph/03_capi_integration/accelerator_bin$ sudo capi-flash-script accel-graph_GITCOMMIT#_DATETIME.rbf
AccelGraph@CAPI:~AccelGraph$ make run-capi-fpga
This run outputs different AFU-Control stats based on the responses received from the PSL:
AccelGraph@CAPI:~AccelGraph$ make run-capi-fpga-verbose
-m, --afu-config=[DEFAULT:0x1]
CAPI FPGA integration: AFU-Control
buffers(read/write/prefetcher) arbitration 0x01
round robin 0x10 fixed priority.
-q, --cu-config=[DEFAULT:0x01]
CAPI FPGA integration: CU configurations for
requests cached/non cached/prefetcher active or
not check README for more explanation.
Usage: open-graph-openmp [OPTION...]
-f <graph file> -d [data structure] -a [algorithm] -r [root] -n
[num threads] [-h -c -s -w]
OpenGraph is an open source graph processing framework, it is designed to be a
benchmarking suite for various graph processing algorithms using pure C.
-a, --algorithm=[DEFAULT:[0]-BFS]
[0]-BFS,
[1]-Page-rank,
[2]-SSSP-DeltaStepping,
[3]-SSSP-BellmanFord,
[4]-DFS,
[5]-SPMV,
[6]-Connected-Components,
[7]-Betweenness-Centrality,
[8]-Triangle Counting,
[9-BUGGY]-IncrementalAggregation.
-b, --delta=[DEFAULT:1]
SSSP Delta value [Default:1].
-c, --convert-format=[DEFAULT:[1]-binary-edgeList]
[serialize flag must be on --serialize to write]
Serialize graph text format (edge list format) to
binary graph file on load example:-f <graph file>
-c this is specifically useful if you have Graph
CSR/Grid structure and want to save in a binary
file format to skip the preprocessing step for
future runs.
[0]-text-edgeList,
[1]-binary-edgeList,
[2]-graphCSR-binary.
-C, --cache-size=<LLC>
LLC cache size for MASK vertex reodering
-d, --data-structure=[DEFAULT:[0]-CSR]
[0]-CSR,
[1]-Grid,
[2]-Adj LinkedList,
[3]-Adj ArrayList
[4-5] same order bitmap frontiers.
-e, --tolerance=[EPSILON:0.0001]
Tolerance value of for page rank
[default:0.0001].
-f, --graph-file=<FILE>
Edge list represents the graph binary format to
run the algorithm textual format change
graph-file-format.
-F, --labels-file=<FILE>
Read and reorder vertex labels from a text file,
Specify the file name for the new graph reorder,
generated from Gorder, Rabbit-order, etc.
-g, --bin-size=[SIZE:512]
You bin vertices's histogram according to this
parameter, if you have a large graph you want to
illustrate.
-i, --num-iterations=[DEFAULT:20]
Number of iterations for page rank to converge
[default:20] SSSP-BellmanFord [default:V-1].
-j, --verbosity=[DEFAULT:[0:no stats output]
For now it controls the output of .perf file and
PageRank .stats (needs --stats enabled)
filesPageRank .stat [1:top-k results] [2:top-k
results and top-k ranked vertices listed.
-k, --remove-duplicate
Removers duplicate edges and self loops from the
graph.
-K, --Kernel-num-threads=[DEFAULT:algo-num-threads]
Number of threads for graph processing kernel
(critical-path) (graph algorithm)
-l, --light-reorder-l1=[DEFAULT:[0]-no-reordering]
Relabels the graph for better cache performance
(first layer).
[0]-no-reordering,
[1]-out-degree,
[2]-in-degree,
[3]-(in+out)-degree,
[4]-DBG-out,
[5]-DBG-in,
[6]-HUBSort-out,
[7]-HUBSort-in,
[8]-HUBCluster-out,
[9]-HUBCluster-in,
[10]-(random)-degree,
[11]-LoadFromFile (used for Rabbit order).
-L, --light-reorder-l2=[DEFAULT:[0]-no-reordering]
Relabels the graph for better cache performance
(second layer).
[0]-no-reordering,
[1]-out-degree,
[2]-in-degree,
[3]-(in+out)-degree,
[4]-DBG-out,
[5]-DBG-in,
[6]-HUBSort-out,
[7]-HUBSort-in,
[8]-HUBCluster-out,
[9]-HUBCluster-in,
[10]-(random)-degree,
[11]-LoadFromFile (used for Rabbit order).
-O, --light-reorder-l3=[DEFAULT:[0]-no-reordering]
Relabels the graph for better cache performance
(third layer).
[0]-no-reordering,
[1]-out-degree,
[2]-in-degree,
[3]-(in+out)-degree,
[4]-DBG-out,
[5]-DBG-in,
[6]-HUBSort-out,
[7]-HUBSort-in,
[8]-HUBCluster-out,
[9]-HUBCluster-in,
[10]-(random)-degree,
[11]-LoadFromFile (used for Rabbit order).
-M, --mask-mode=[DEFAULT:[0:disabled]]
Encodes [0:disabled] the last two bits of
[1:out-degree]-Edgelist-labels
[2:in-degree]-Edgelist-labels or
[3:out-degree]-vertex-property-data
[4:in-degree]-vertex-property-data with hot/cold
hints [11:HOT]|[10:WARM]|[01:LUKEWARM]|[00:COLD]
to specialize caching. The algorithm needs to
support value unmask to work.
-n, --pre-num-threads=[DEFAULT:MAX]
Number of threads for preprocessing (graph
structure) step
-N, --algo-num-threads=[DEFAULT:MAX]
Number of threads for graph processing (graph
algorithm)
-o, --sort=[DEFAULT:[0]-radix-src]
[0]-radix-src,
[1]-radix-src-dest,
[2]-count-src,
[3]-count-src-dst.
-p, --direction=[DEFAULT:[0]-PULL]
[0]-PULL,
[1]-PUSH,
[2]-HYBRID.
NOTE: Please consult the function switch table for each
algorithm.
-r, --root=[DEFAULT:0]
BFS, DFS, SSSP root
-s, --symmetrize
Symmetric graph, create a set of incoming edges.
-S, --stats
Write algorithm stats to file. same directory as
the graph.PageRank: Dumps top-k ranks matching
using QPR similarity metrics.
-t, --num-trials=[DEFAULT:[1 Trial]]
Number of trials for whole run (graph algorithm
run) [default:1].
-w, --generate-weights
Load or Generate weights. Check ->graphConfig.h
#define WEIGHTED 1 beforehand then recompile using
this option.
-x, --serialize
Enable file conversion/serialization use with
--convert-format.
-z, --graph-file-format=[DEFAULT:[1]-binary-edgeList]
Specify file format to be read, is it textual edge
list, or a binary file edge list. This is
specifically useful if you have Graph CSR/Grid
structure already saved in a binary file format to
skip the preprocessing step.
[0]-text edgeList,
[1]-binary edgeList,
[2]-graphCSR binary.
-?, --help Give this help list
--usage Give a short usage message
-V, --version Print program version
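The `--mask-mode` option describes packing a two-bit hot/cold hint into each label and notes that the algorithm must unmask values before using them. Below is a minimal C sketch of that idea. It assumes 32-bit labels and that the hint occupies the two most significant bits; the help text only says "the last two bits", so the bit position, the macro names, and the helper functions are all assumptions made for illustration, not the framework's definitions.

```c
#include <stdint.h>

/* Hint codes as listed in the help text: 11 HOT, 10 WARM, 01 LUKEWARM, 00 COLD. */
enum cache_hint {
    HINT_COLD = 0u,
    HINT_LUKEWARM = 1u,
    HINT_WARM = 2u,
    HINT_HOT = 3u
};

/* Assumed layout: hint in bits 31..30, label value in bits 29..0. */
#define HINT_SHIFT 30u
#define VALUE_MASK 0x3FFFFFFFu

/* Pack a hint into a label; the label must fit in 30 bits. */
uint32_t mask_label(uint32_t label, uint32_t hint)
{
    return (label & VALUE_MASK) | ((hint & 3u) << HINT_SHIFT);
}

/* Recover the original label (the "value unmask" step). */
uint32_t unmask_label(uint32_t masked)
{
    return masked & VALUE_MASK;
}

/* Extract the two-bit hint. */
uint32_t label_hint(uint32_t masked)
{
    return masked >> HINT_SHIFT;
}
```

Under this layout, an algorithm would call `unmask_label` on every masked label before using it as an index, which is why the option requires unmask support. The price is that two bits of the label range are given up to the hint.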
00_graph_bench
* `include` - Major function headers
* `algorithms` - supported Graph algorithms
* `capi` - CAPI integration
* `BFS.h` - Breadth First Search
* `DFS.h` - Depth First Search
* `SSSP.h` - Single Source Shortest Path
* `bellmanFord.h` - Single Source Shortest Path using Bellman Ford
* `incrementalAgreggation.h` - Incremental Aggregation for clustering
* `pageRank.h` - Page Rank Algorithm
* `SPMV.h` - Sparse Matrix Vector Multiplication
* `src` - Major function source files
* `algorithms` - supported Graph algorithms
* `capi` - CAPI integration
* `BFS.c` - Breadth First Search
* `DFS.c` - Depth First Search
* `SSSP.c` - Single Source Shortest Path
* `bellmanFord.c` - Single Source Shortest Path using Bellman Ford
* `incrementalAgreggation.c` - Incremental Aggregation for clustering
* `pageRank.c` - Page Rank Algorithm
* `SPMV.c` - Sparse Matrix Vector Multiplication
* `Makefile` - Global makefile
(work in progress)
Report bugs to atmughra@ncsu.edu