Louvain Algorithm¶
The Louvain algorithm is a fast greedy optimization method for finding communities by maximizing modularity.
Function Signature¶
Parameters¶
- graph: Undirected graph to analyze
- seed: Random seed for reproducibility (optional)
Returns¶
List of communities, where each community is a list of node IDs. Each node appears in exactly one community.
Description¶
The Louvain algorithm optimizes modularity using a greedy approach:
- Start with each node as its own community
- For each node, try moving it to neighboring communities
- Keep move if modularity increases
- Repeat until convergence
- Optionally, recursively apply to supernode graph
Time Complexity¶
O(V + E) to O(V log V) depending on implementation
Space Complexity¶
O(V + E)
Example¶
import pygraphina as pg
# Create a network with natural communities
g = pg.PyGraph()
nodes = [g.add_node(i) for i in range(34)]
# Add edges to create a graph with community structure
# (simplified karate club-like structure)
edges = [
(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (2, 3),
(5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8),
(4, 5), (3, 9), (9, 10), (10, 11)
]
for u, v in edges:
g.add_edge(nodes[u], nodes[v], 1.0)
# Detect communities
communities = pg.community.louvain(g, seed=42)
# Analyze results
print(f"Found {len(communities)} communities")
for comm_id, members in enumerate(communities):
print(f"Community {comm_id}: {len(members)} nodes - {members[:5]}...")
Advantages¶
- Very fast - suitable for large graphs
- Good quality communities
- Widely used and well-tested
- Multi-level hierarchical clustering
- Reproducible with seed parameter
Disadvantages¶
- Non-deterministic without seed
- Greedy approach not guaranteed optimal
- May produce different results on different runs without fixed seed
Parameters¶
Seed¶
The seed parameter controls randomness:
- None: Non-deterministic behavior (different results each run)
- Integer: Reproducible results with same seed value
Use a seed when you need:
- Reproducible research results
- Consistent testing
- Comparable experiments
Omit seed when you want:
- Multiple different community structures
- Robustness testing across runs
When to Use¶
- Large graphs (suitable for > 100,000 nodes)
- Need fast results
- Exploring different community resolutions
- Production systems
Reproducibility¶
For reproducible results with the same graph:
- Results are deterministic with same seed (if available)
- Try multiple resolutions
- Average results if needed
Comparison¶
| Algorithm | Speed | Quality | Scalability |
|---|---|---|---|
| Louvain | Very Fast | Good-Excellent | Excellent |
| Girvan-Newman | Slow | Excellent | Poor |
| Label Propagation | Fast | Good | Excellent |
| Spectral | Medium | Good | Medium |