Common Neighbor Centrality Link Prediction¶
Common neighbor centrality is a link prediction metric based on centrality-weighted common neighbors.
Overview¶
Common neighbor centrality combines the common neighbor principle with node centrality weights. The hypothesis: nodes with high centrality that have common neighbors are more likely to form connections.
Function Signature¶
pg.links.common_neighbor_centrality(
graph: Union[PyGraph, PyDiGraph],
alpha: float,
ebunch: Optional[List[Tuple[int, int]]] = None
) -> Dict[Tuple[int, int], float]
Parameters¶
- graph: The graph to analyze
- alpha: Weighting parameter for centrality contribution (typically 0.0 to 1.0)
- ebunch: Optional list of edges to evaluate. If None, evaluates all non-existing edges
Returns¶
Dictionary mapping edge tuples (u, v) to common neighbor centrality scores.
Description¶
The common neighbor centrality score is calculated as:
Where: - N(u), N(v) are neighbor sets - C(u), C(v) are centrality values - alpha controls how much centrality weights the score
Time Complexity¶
O(k·d²) where k is the number of edges to evaluate and d is average degree
Space Complexity¶
O(V + E)
Example¶
import pygraphina as pg
# Create a collaboration network
g = pg.PyGraph()
researchers = [g.add_node(i) for i in range(15)]
# Add collaboration edges
collaborations = [
(0,1), (0,2), (1,2), (1,3), (2,3), (3,4), (4,5), (5,6),
(6,7), (7,8), (8,9), (9,10), (10,11), (11,12), (12,13)
]
for u, v in collaborations:
g.add_edge(researchers[u], researchers[v], 1.0)
# Predict new collaborations using different alpha values
alpha_low = pg.links.common_neighbor_centrality(g, alpha=0.1)
alpha_high = pg.links.common_neighbor_centrality(g, alpha=0.9)
# Get top predictions with high centrality weighting
top_high = sorted(alpha_high.items(), key=lambda x: x[1], reverse=True)[:5]
print("Top 5 predictions (high centrality weighting):")
for (u, v), score in top_high:
if not g.contains_edge(u, v):
print(f" {u}-{v}: {score:.4f}")
# Evaluate on specific edges
candidates = [(0, 4), (2, 5), (5, 11)]
scores = pg.links.common_neighbor_centrality(g, alpha=0.5, ebunch=candidates)
print(f"\nScores for candidate edges: {scores}")
Advantages¶
- Combines local (common neighbors) and global (centrality) information
- Tunable parameter (alpha) for different use cases
- Fast computation
- Works well for networks with hub nodes
- Can focus on specific edges via ebunch parameter
Disadvantages¶
- Requires parameter tuning (alpha)
- May overweight central nodes
- Assumes centrality is predictive
- Less effective in homogeneous networks
When to Use¶
- Influence networks where centrality matters
- Networks with clear hub structure
- When combining local and global information
- Expert identification in collaboration networks
- Recommendation systems using centrality
Parameter Tuning¶
- alpha = 0.0: Pure common neighbor count (ignores centrality)
- alpha = 0.5: Balanced common neighbor and centrality
- alpha = 1.0: Strong centrality weighting
- alpha > 1.0: Very strong centrality emphasis
Comparison with Other Methods¶
| Method | Speed | Accuracy | Global Info |
|---|---|---|---|
| Common Neighbors | ⚡⚡⚡ | Good | No |
| Jaccard | ⚡⚡⚡ | Good | No |
| Common Neighbor Centrality | ⚡⚡ | Better | Yes |
| Centrality Product | ⚡ | Good | Yes |