-
We have tested our system with Linux – Ubuntu/Debian variant (although all windows equivalent s/w are available and may be used)
-
MPI runtime (preferable Open MPI 1.6 or higher)
-
On a single machine
sudo apt-get -y install build-essential g++ python-dev autotools-dev libicu-dev libbz2-dev
sudo apt-get -y install libopenmpi-dev openmpi-bin openmpi-doc
-
Set up a MPI Cluster (see http://techtinkering.com/2009/12/02/setting-up-a-beowulf-cluster-using-open-mpi-on-linux/)
-
Boost Library (on each node of the cluster, run the following command)
sudo apt-get -y install libboost-all-dev
- Metis Graph Partitioner
sudo apt-get -y install metis
Once the prerequisite are installed, download the code from the git repository mentioned earlier.
The code is a header only library so you don’t need to compile it to use it in your application. For demonstration, a sample .cpp file is provided which uses the given API. Compile it by running make in the project directory.
- First the given graph (in METIS format) needs to be portioned into desired number of partitions. For example:
gpmetis ./4elt.graph 2
This will create a partition file 4elt.graph.part.2
Psuedo Distributed (single node):
Use the Open MPI’s mpi-run command to start the algorithm using the desired number of processors (NOTE: This should be equal to the number of partitions)
Usage: "mpirun -np 2 ./ebetwn <extract_cnt> <stable_check_cnt> <batch_size> <input_graph> <partition_file> <leaf_compress>"
extract_cnt (K): Number of top centrality vertices to extract (typically 10-100)
stable_check_cnt (S): is used to determine if the set of high centrality vertices is stabilized (for the given number of iterations)
batch_size (B): Number of shortest paths to calculate per iteration before checking the stability of the set
input_graph: input graph in Metis format (See below for details)
partition_file: the partition file generated by the Metis partitioner
leaf_compression: 0 or 1 depending on whether a leaf compression pre processing step should be performed. This improves performance but could potentially reduce the quality of results.
Note: Following conditions must hold
K << N (where N is the number of vertices in the graph)
B < N
B * S < N
Example:
mpirun -np 2 ./ebetwn 10 3 5 ./4elt.graph 4elt.graph.part.2 1
Run the same command as before from the Cluster’s head node.
The code is a header only library so you don’t need to compile it to use it in your application. To construct a distributed Graph use the Boost Parallel Graph Librarie’s Adjacency List data structure. See http://www.boost.org/doc/libs/1_55_0/libs/graph_parallel/doc/html/index.html for details.
We currently provide a single public API call which takes as input the distributed graph and returns the set of high centrality vertices:
void partiotined_chong_extract_high_centrality(Graph &g, CentralityMap& cm, IndexMap index_map, vertex_size_t topk, vertex_size_t bsize, vertex_size_t set_delta, vertex_size_t convergence_cnt)
Further variations and convenience functions will be provided in subsequent versions.
We use the standard METIS graph data format for input graphs. See http://glaros.dtc.umn.edu/gkhome/fetch/sw/metis/manual.pdf for details.
To summarize, the metis format of the following
-
First line indicating the number of vertices and edges (and optionally the type of graph data, e.g. weighted or not)
-
All subsequent lines contain the data per vertex and its (undirected) edges in following format
optional_vertex_data e1 w1 e2 w2 …
For example:
See: http://people.sc.fsu.edu/~jburkardt/data/metis_graph/metis_graph.html for sample metis files.
Sample Metis File:
% tiny_02.graph
% A very small example of a graph
% using weights on edges,
% stored in the METIS graph file format.
%
% The first non-comment line lists
% the number of vertices (7), edges (11) and the value of FMT.
%
% FMT has the following meanings:
% 0 the graph has no weights (in this case, you can omit FMT);
% 1 the graph has edge weights;
% 10 the graph has vertex weights;
% 11 the graph has both edge and vertex weights.
%
% This graph uses edge weights only. Edge weights must be
% integers strictly greater than 0.
%
% The next line notes that vertex 1 is connected to:
% vertex 5 on an edge with weight 1,
% vertex 3 on an edge with weight 2, and
% vertex 2 on an edge with weight 1.
%
% Subsequent lines list the neighbors of successive vertices.
%
7 11 1
%
% Here come the (vertex_neighbor,edge_weight) pairs:
%
5 1 3 2 2 1
1 1 3 2 4 1
5 3 4 2 2 2 1 2
2 1 3 2 6 2 7 5
1 1 3 3 6 2
5 2 4 2 7 6
6 6 4 5
Please see the manual for details on how to represent directed graphs.