Create a network with Graph-tool and Pandas

Graph-tool

Recently, I discovered the Python package graph-tool for network analysis. Its main advantage over the very popular networkx is speed: the core data structures and algorithms are implemented in C++, a point its creator emphasizes.

Some other really cool features (besides speed) of graph-tool are:

  • Filters and views that give you subgraphs without copying the underlying graph
  • Interactive drawing
  • Beautiful layouts
  • Topological algorithms

Cons:

  • Could use more documentation examples.

Pandas + graph-tool

It is very convenient to analyze networks using tables to look at:

  • Structural parameters (e.g. degree: mean, std, max, min)
  • Node and edge lists with attributes

The network itself can also be stored as a table, in the form of an edge list such as:

node 1   node 2   color   weight
a        b        red     2
a        c        black   5
b        c        red     1

Here the first two columns define the two nodes of each edge, and the remaining columns are edge attributes (color and weight). Pandas is a Python library for data analysis that focuses on dataframes for working with tables. Here, we will read a table using pandas and convert it to a graph-tool Graph object.
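To follow along without your own data, here is a minimal sketch that rebuilds the example table above with pandas and round-trips it through CSV text. The column names `node_1` and `node_2` are my assumptions (any names work). What matters is the column order: two node columns first, then color, then weight.

```python
import io

import pandas as pd

# Example edge list from the table above. Column names are assumptions;
# only the column order (node, node, color, weight) matters later on.
df = pd.DataFrame(
    {
        "node_1": ["a", "a", "b"],
        "node_2": ["b", "c", "c"],
        "color": ["red", "black", "red"],
        "weight": [2, 5, 1],
    }
)

# Serialize to CSV text. Calling df.to_csv("table.csv", index=False)
# would write the same content to a file instead.
csv_text = df.to_csv(index=False)

# Reading the text back gives the same rows and columns.
df_roundtrip = pd.read_csv(io.StringIO(csv_text))
print(df_roundtrip)
```

Writing the file with `index=False` keeps the pandas row index out of the CSV, so the first two columns stay the node columns.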

import pandas as pd
import graph_tool.all as gt
import numpy as np

# Load table
df = pd.read_csv("table.csv")

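# Create an empty graph (directed by default)
g = gt.Graph()

# Set property maps for edge attributes
weight = g.new_edge_property('int')
color = g.new_edge_property('string')

# Create numpy array of the edge list: columns are (node 1, node 2, color, weight)
edgelist = df.values

# Add edges. With hashed=True the node labels are mapped to vertex indices and
# the mapping is returned as a vertex property map. The order of eprops must
# match the order of the remaining columns (color, then weight).
node_id = g.add_edge_list(edgelist, hashed=True, eprops=[color, weight])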

# Access node id of each vertex
for node in range(g.num_vertices()):
    print("Node {} has id: {}".format(node, node_id[node]))
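As a quick sanity check on the loaded graph, the degree statistics mentioned earlier can also be computed straight from the edge list with pandas alone. This is a rough sketch, not part of graph-tool: it assumes the hypothetical column names `node_1` and `node_2`, and it counts total degree (incoming plus outgoing edges), ignoring direction.

```python
import pandas as pd

# Same example edge list as in the table above (column names are assumptions).
edges = pd.DataFrame(
    {
        "node_1": ["a", "a", "b"],
        "node_2": ["b", "c", "c"],
        "color": ["red", "black", "red"],
        "weight": [2, 5, 1],
    }
)

# Every appearance of a node in either endpoint column is one incident edge,
# so counting appearances gives the total degree of each node.
degree = (
    pd.concat([edges["node_1"], edges["node_2"]])
    .value_counts()
    .rename("degree")
)

# Summary statistics over all nodes.
summary = degree.agg(["mean", "std", "max", "min"])

print(degree.sort_index())
print(summary)
```

If the numbers here disagree with the degrees graph-tool reports, the usual suspects are a wrong column order or an extra index column in the CSV.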

Now we are ready to use the nice algorithms of graph-tool. Before saving your network, remember to also save the node ids or labels as an internal vertex property map of the graph:

g.vertex_properties['node_id'] = node_id

# Same with edge properties `color` and `weight`
g.edge_properties['color'] = color
g.edge_properties['weight'] = weight

# Save graph
g.save('my_network.graphml')
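Going the other way, from a saved graph back to a pandas table, could look like the sketch below. The helper `graph_to_edge_frame` is hypothetical (my own name, not a graph-tool function), and it assumes the graph carries the internal property maps `node_id`, `color` and `weight` stored above. It only relies on `g.edges()`, `e.source()`, `e.target()` and property map indexing.

```python
import pandas as pd


def graph_to_edge_frame(g):
    """Rebuild the edge-list table from a graph (hypothetical helper).

    Assumes `g` has the internal property maps 'node_id' (vertex) and
    'color' / 'weight' (edge), as saved in the example above.
    """
    node_id = g.vertex_properties["node_id"]
    color = g.edge_properties["color"]
    weight = g.edge_properties["weight"]

    # One row per edge: translate both endpoints back to their labels
    # and look up the edge attributes.
    rows = [
        (node_id[e.source()], node_id[e.target()], color[e], weight[e])
        for e in g.edges()
    ]
    # Output column names are assumptions, chosen to mirror the input table.
    return pd.DataFrame(rows, columns=["node_1", "node_2", "color", "weight"])
```

With graph-tool installed, the graph can be reloaded with `gt.load_graph('my_network.graphml')` and passed to this function. The Python-level loop over edges is fine for small networks but will likely be slow on very large ones.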