General Installation Guide for Extracting High Centrality Nodes in Distributed Graphs

Pre-Requisites

We have tested our system with Linux – Ubuntu/Debian variant (although all windows equivalent s/w are available and may be used)
MPI runtime (preferable Open MPI 1.6 or higher)
On a single machine

sudo apt-get -y install build-essential g++ python-dev autotools-dev libicu-dev libbz2-dev

sudo apt-get -y install libopenmpi-dev openmpi-bin openmpi-doc

Set up a MPI Cluster (see http://techtinkering.com/2009/12/02/setting-up-a-beowulf-cluster-using-open-mpi-on-linux/)
Boost Library (on each node of the cluster, run the following command)

sudo apt-get -y install libboost-all-dev

Metis Graph Partitioner

sudo apt-get -y install metis

Compile

Once the prerequisite are installed, download the code from the git repository mentioned earlier.

The code is a header only library so you don’t need to compile it to use it in your application. For demonstration, a sample .cpp file is provided which uses the given API. Compile it by running make in the project directory.

Execution

First the given graph (in METIS format) needs to be portioned into desired number of partitions. For example:

gpmetis ./4elt.graph 2

This will create a partition file 4elt.graph.part.2

Psuedo Distributed (single node):

Use the Open MPI’s mpi-run command to start the algorithm using the desired number of processors (NOTE: This should be equal to the number of partitions)

Usage: "mpirun -np 2 ./ebetwn <extract_cnt> <stable_check_cnt> <batch_size> <input_graph> <partition_file> <leaf_compress>"

extract_cnt (K): Number of top centrality vertices to extract (typically 10-100)

stable_check_cnt (S): is used to determine if the set of high centrality vertices is stabilized (for the given number of iterations)

batch_size (B): Number of shortest paths to calculate per iteration before checking the stability of the set

input_graph: input graph in Metis format (See below for details)

partition_file: the partition file generated by the Metis partitioner

leaf_compression: 0 or 1 depending on whether a leaf compression pre processing step should be performed. This improves performance but could potentially reduce the quality of results.

Note: Following conditions must hold 

K << N (where N is the number of vertices in the graph)
B < N
B * S < N

Example:

mpirun -np 2 ./ebetwn 10 3 5 ./4elt.graph 4elt.graph.part.2 1

On MPI Cluster

Run the same command as before from the Cluster’s head node.

API

The code is a header only library so you don’t need to compile it to use it in your application. To construct a distributed Graph use the Boost Parallel Graph Librarie’s Adjacency List data structure. See http://www.boost.org/doc/libs/1_55_0/libs/graph_parallel/doc/html/index.html for details.

We currently provide a single public API call which takes as input the distributed graph and returns the set of high centrality vertices:

void partiotined_chong_extract_high_centrality(Graph &g, CentralityMap& cm, IndexMap index_map, vertex_size_t topk, vertex_size_t bsize, vertex_size_t set_delta, vertex_size_t convergence_cnt)

Further variations and convenience functions will be provided in subsequent versions.

Data Format

We use the standard METIS graph data format for input graphs. See http://glaros.dtc.umn.edu/gkhome/fetch/sw/metis/manual.pdf for details.

To summarize, the metis format of the following

First line indicating the number of vertices and edges (and optionally the type of graph data, e.g. weighted or not)
All subsequent lines contain the data per vertex and its (undirected) edges in following format

optional_vertex_data e1 w1 e2 w2 …

For example:

See: http://people.sc.fsu.edu/~jburkardt/data/metis_graph/metis_graph.html for sample metis files.

Sample Metis File:

%  tiny_02.graph
%  A very small example of a graph
%  using weights on edges,
%  stored in the METIS graph file format.
%
%  The first non-comment line lists 
%  the number of vertices (7), edges (11) and the value of FMT.
%
%  FMT has the following meanings:
%    0  the graph has no weights (in this case, you can omit FMT);
%    1  the graph has edge weights;
%   10  the graph has vertex weights;
%   11  the graph has both edge and vertex weights.
%
%  This graph uses edge weights only.  Edge weights must be
%  integers strictly greater than 0.  
%
%  The next line notes that vertex 1 is connected to:
%    vertex 5 on an edge with weight 1, 
%    vertex 3 on an edge with weight 2, and
%    vertex 2 on an edge with weight 1.
%
%  Subsequent lines list the neighbors of successive vertices.
%
 7 11  1
%
%  Here come the (vertex_neighbor,edge_weight) pairs:
%
 5 1   3 2   2 1
 1 1   3 2   4 1
 5 3   4 2   2 2   1 2
 2 1   3 2   6 2   7 5
 1 1   3 3   6 2
 5 2   4 2   7 6
 6 6   4 5

Please see the manual for details on how to represent directed graphs.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GeneralInstallationGuide.md

GeneralInstallationGuide.md

General Installation Guide for Extracting High Centrality Nodes in Distributed Graphs

Pre-Requisites

Compile

Execution

On MPI Cluster

API

Data Format

Files

GeneralInstallationGuide.md

Latest commit

History

GeneralInstallationGuide.md

File metadata and controls

General Installation Guide for Extracting High Centrality Nodes in Distributed Graphs

Pre-Requisites

Compile

Execution

On MPI Cluster

API

Data Format