3 Centrality
Centrality is one of the most fundamental concepts in network analysis. It provides a way to quantify the importance or prominence of individual nodes within a network. In social networks, centrality helps answer questions like: Who is the most influential person? Who controls the flow of information? Who can reach others most efficiently? There is no single answer to “who is most central” because importance depends on the structural feature we consider relevant. This chapter introduces the most commonly used centrality indices, discusses their interpretation, and demonstrates how to compute them in R.
3.1 Packages Needed for this Chapter
library(igraph)
library(netrankr)
library(networkdata)
3.2 What Is Centrality?
At its core, a centrality index assigns a numeric value to each node in a network. Higher values indicate greater centrality, but what “central” means depends on the index. Different indices capture fundamentally different notions of importance, each tied to a different structural property of the network. The four most widely used families of centrality can be mapped to distinct social intuitions:
- Degree measures activity: How many direct connections does a node have? A node with many ties is active, popular, or well-connected.
- Closeness measures efficiency: How quickly can a node reach all others? A node that is close to everyone can spread information or access resources with minimal steps.
- Betweenness measures brokerage: How often does a node sit on the shortest path between others? A node with high betweenness controls or mediates the flow between other parts of the network.
- Eigenvector centrality measures prestige: Is a node connected to other well-connected nodes? A node may have few ties, but if those ties are to important nodes, it is central by association.
These four intuitions are not interchangeable. A broker (high betweenness) need not be popular (high degree), and a prestigious node (high eigenvector) need not be efficient at reaching everyone (high closeness). This is precisely why multiple indices exist and why the choice of index should be guided by the research question at hand.
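A toy example can make this concrete. The sketch below builds a hypothetical seven-node graph (two triangles joined through a single node d; the graph and its labels are illustrative and not part of the chapter's data) and compares degree with betweenness.

```r
library(igraph)

# Hypothetical toy graph: two triangles (a-b-c and e-f-g) joined via node d
g <- graph_from_literal(a-b, a-c, b-c, c-d, d-e, e-f, e-g, f-g)

deg <- degree(g)        # number of direct ties per node
btw <- betweenness(g)   # shortest paths passing through each node

# Side-by-side view of both indices
data.frame(degree = deg, betweenness = btw)

# Node with the highest betweenness
broker <- names(which.max(btw))
broker
```

Node d has only two ties, the minimum in this graph, yet every shortest path between the two triangles runs through it, so it is the top broker without being popular.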
For a more formal treatment of what constitutes a centrality index and the structural properties that underlie different indices, see Chapter 4.
3.3 Centrality Indices in igraph
The igraph package implements a broad range of centrality indices. To illustrate them, we use the dbces11 graph from the netrankr package, shown in Figure 3.1. This small network is useful because different indices disagree on which node is most central, making the distinctions between indices visible.
data("dbces11")
3.3.1 Degree
The most straightforward centrality index is degree, which counts the number of neighbors a node has. A node with high degree is directly connected to many others, making it active or popular in a social sense.
degree(dbces11)
A B C D E F G H I J K
1 1 2 2 3 4 4 4 4 4 5
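Degree can also be read directly off the adjacency matrix: each row sum counts one node's neighbors. The following minimal sketch checks this on an illustrative star graph (an assumption made for brevity, not the dbces11 data).

```r
library(igraph)

# Illustrative graph: a star with one hub and four leaves
g <- make_star(5, mode = "undirected")

# Dense adjacency matrix: entry [i, j] is 1 if nodes i and j are tied
A <- as_adjacency_matrix(g, sparse = FALSE)

deg_manual <- rowSums(A)   # each row sum counts one node's neighbors
deg_igraph <- degree(g)

# Both approaches should agree
all(deg_manual == deg_igraph)
```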
For weighted networks, strength (also called weighted degree) sums the edge weights instead of simply counting ties. This is useful when ties carry different intensities, such as frequency of communication or volume of trade.
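As a small sketch of the difference, the hypothetical three-node network below (names and weights are made up for illustration) gives every node the same degree but different strength.

```r
library(igraph)

# Hypothetical weighted ties; the third column becomes the edge attribute "weight"
ties <- data.frame(
  from   = c("a", "a", "b"),
  to     = c("b", "c", "c"),
  weight = c(5, 1, 1)
)
g <- graph_from_data_frame(ties, directed = FALSE)

deg <- degree(g)                            # counts ties, ignores weights
str <- strength(g, weights = E(g)$weight)   # sums the weights of each node's ties

data.frame(degree = deg, strength = str)
```

All three nodes have two ties, but a and b share a heavy tie and therefore have a strength of 6, while c only reaches 2.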
# g stands for any graph with a numeric edge attribute "weight"
strength(g, weights = E(g)$weight)
3.3.2 Closeness
Closeness centrality is based on the shortest path distances between nodes. The idea is that a central node can reach all other nodes quickly. Formally, it is defined as the inverse of the sum of distances from a node to all others, so that nodes with shorter total distances receive higher scores.
closeness(dbces11)
         A          B          C          D          E          F          G
0.03703704 0.02941176 0.04000000 0.04000000 0.05000000 0.05882353 0.05263158
         H          I          J          K
0.05555556 0.05555556 0.05263158 0.05555556
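The definition can be verified by hand from the distance matrix. The sketch below does so on an illustrative star graph (an assumption chosen because it is connected and small), comparing the manual result with closeness().

```r
library(igraph)

# Illustrative connected graph: one hub, four leaves
g <- make_star(5, mode = "undirected")

D <- distances(g)              # matrix of shortest-path lengths
clo_manual <- 1 / rowSums(D)   # inverse of each node's total distance
clo_igraph <- closeness(g)

# The hub is one step from everyone (1/4); leaves need 1 + 3 * 2 = 7 steps (1/7)
all.equal(unname(clo_manual), unname(clo_igraph))
```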
Closeness centrality is only well-defined for connected networks. In disconnected networks, distances between components are infinite. igraph handles this by only considering reachable nodes, but results should be interpreted with caution. Alternatives like harmonic closeness (the sum of inverse distances) handle disconnected networks more gracefully.
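A brief sketch of harmonic closeness on a disconnected graph follows. It assumes a reasonably recent igraph release that provides harmonic_centrality(); the five-node graph is hypothetical.

```r
library(igraph)

# Hypothetical disconnected graph: a path a-b-c and a separate pair d-e
g <- graph_from_literal(a-b, b-c, d-e)

# Built-in version: unreachable nodes contribute 0 to the sum
hc <- harmonic_centrality(g)

# Manual version from the distance matrix
D <- distances(g)        # Inf between the two components
inv <- 1 / D             # 1 / Inf evaluates to 0
diag(inv) <- 0           # ignore the distance from a node to itself
hc_manual <- rowSums(inv)

rbind(igraph = hc, manual = hc_manual)
```

Because infinite distances simply add zero, every node receives a finite score even though the graph has two components.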
3.3.3 Betweenness
Betweenness centrality counts how often a node lies on the shortest path between pairs of other nodes. A node with high betweenness acts as a bridge or broker: removing it would increase distances or disconnect parts of the network. When a pair of nodes is connected by several shortest paths, each node on them is credited only with the fraction of those paths that pass through it, which is why betweenness scores need not be integers.
betweenness(dbces11)
        A         B         C         D         E         F         G         H
 0.000000  0.000000  0.000000  9.000000  3.833333  9.833333  2.666667 16.333333
        I         J         K
 7.333333  1.333333 14.666667
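Fractional scores arise whenever two nodes are joined by more than one shortest path. A four-node ring, used here purely as an illustration, is the smallest case in which this happens.

```r
library(igraph)

# Ring 1-2-3-4-1: opposite nodes are joined by two shortest paths
g <- make_ring(4)

# Each node sits on one of the two shortest paths between its two neighbors,
# so it is credited with half a path
btw <- betweenness(g)

# Normalized version: divides by the number of node pairs excluding the node
# itself, (n - 1)(n - 2) / 2 = 3 for an undirected graph with n = 4
btw_norm <- betweenness(g, normalized = TRUE)

rbind(raw = btw, normalized = btw_norm)
```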
3.3.4 Eigenvector Centrality
Eigenvector centrality extends the idea of degree by weighting each connection by the centrality of the neighbor. A node is central if it is connected to other central nodes. This recursive definition is resolved by computing the leading eigenvector of the adjacency matrix.
eigen_centrality(dbces11)$vector
        A         B         C         D         E         F         G         H
0.2259630 0.0645825 0.3786244 0.2415182 0.5709057 0.9846544 1.0000000 0.8386195
        I         J         K
0.9113529 0.9986474 0.8450304
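The recursive definition can be illustrated with power iteration: start with equal scores, repeatedly replace each node's score by the sum of its neighbors' scores, and rescale. This is a minimal sketch on a hypothetical four-node graph, not the routine igraph uses internally.

```r
library(igraph)

# Hypothetical connected graph: a triangle a-b-c with a pendant node d
g <- graph_from_literal(a-b, a-c, b-c, c-d)
A <- as_adjacency_matrix(g, sparse = FALSE)

# Power iteration: each node's new score is the sum of its neighbors' scores
x <- rep(1, vcount(g))
for (i in 1:200) {
  x <- as.vector(A %*% x)
  x <- x / max(x)          # rescale so that the largest score is 1
}

# Compare with igraph, whose scores are also scaled to a maximum of 1
ev <- unname(eigen_centrality(g)$vector)
all.equal(x, ev, tolerance = 1e-6)
```

The iteration converges here because the graph is connected and contains a triangle; on bipartite graphs plain power iteration can oscillate.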
3.3.5 Subgraph Centrality
Subgraph centrality quantifies the participation of each node in closed walks of all lengths, that is, walks that start and end at the node itself. A closed walk of length \(k\) is weighted by the inverse factorial \(1/k!\), so short walks count most. The index therefore captures how embedded a node is in the local structure of the network.
subgraph_centrality(dbces11)
       A        B        C        D        E        F        G        H
1.825100 1.595400 3.148571 2.423091 4.387127 7.807257 7.939410 6.672783
       I        J        K
7.032672 8.242124 7.389559
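Summing closed walks of length \(k\) with weights \(1/k!\) amounts to taking the diagonal of the matrix exponential of the adjacency matrix, which can be computed from its eigenvalues and eigenvectors. The sketch below checks this on a hypothetical four-node graph.

```r
library(igraph)

# Hypothetical graph: a triangle a-b-c with a pendant node d
g <- graph_from_literal(a-b, a-c, b-c, c-d)
A <- as_adjacency_matrix(g, sparse = FALSE)

# Spectral form of diag(exp(A)): sum over eigenpairs of v[i, j]^2 * exp(lambda[j])
e <- eigen(A, symmetric = TRUE)
sc_manual <- as.vector((e$vectors^2) %*% exp(e$values))

sc_igraph <- unname(subgraph_centrality(g))
all.equal(sc_manual, sc_igraph)
```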
3.3.6 Comparing Indices
Figure 3.2 shows the most central node according to each of the indices computed above. Each index picks a different node as most central, highlighting how the choice of index determines what we consider “important.”
While this is a toy example, it illustrates an important point: centrality indices can produce substantially different rankings. In empirical settings, this means the choice of index is consequential and should be driven by the research question rather than convenience.
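The top-ranked node for each index can be extracted programmatically. The sketch below assumes netrankr is installed and simply collects the five score vectors computed in this section.

```r
library(igraph)
library(netrankr)
data("dbces11")

# Collect the scores of all indices computed in this section
scores <- list(
  degree      = degree(dbces11),
  closeness   = closeness(dbces11),
  betweenness = betweenness(dbces11),
  eigenvector = eigen_centrality(dbces11)$vector,
  subgraph    = subgraph_centrality(dbces11)
)

# Name of the highest-scoring node for each index
top_node <- sapply(scores, function(x) names(which.max(x)))
top_node
```

Given the scores printed earlier, this should name K, F, H, G, and J for degree, closeness, betweenness, eigenvector, and subgraph centrality respectively.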
3.3.8 Correlation Between Indices
Because different centrality indices measure different structural properties, it is useful to examine how they relate to each other on a given network. On some networks, all indices will largely agree; on others, they will diverge considerably. We can examine this using a larger network.
data("karate")
cent_df <- data.frame(
  degree = degree(karate),
  closeness = closeness(karate),
  betweenness = betweenness(karate),
  eigen = eigen_centrality(karate)$vector
)
round(cor(cent_df), 2)
            degree closeness betweenness eigen
degree        1.00      0.77        0.91  0.92
closeness     0.77      1.00        0.72  0.90
betweenness   0.91      0.72        1.00  0.80
eigen         0.92      0.90        0.80  1.00
The correlation matrix and scatter plots reveal which indices capture similar information and where they diverge. High correlations (e.g., 0.92 between degree and eigenvector centrality) suggest that the structural features they measure overlap in this network, while lower correlations (here, 0.72 between closeness and betweenness) point to more distinct dimensions of centrality.
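Since centrality is often used to rank nodes, a rank correlation can complement the Pearson correlations. The sketch below uses Kendall's tau and, to stay self-contained, igraph's built-in Zachary karate club graph as a stand-in for the karate object; this substitution is an assumption and the numbers need not match those reported in this section.

```r
library(igraph)

# Stand-in data: igraph's built-in (unweighted) Zachary karate club network
karate_alt <- make_graph("Zachary")

cent_alt <- data.frame(
  degree      = degree(karate_alt),
  closeness   = closeness(karate_alt),
  betweenness = betweenness(karate_alt),
  eigen       = eigen_centrality(karate_alt)$vector
)

# Kendall's tau compares the orderings induced by two indices, not their raw values
rank_cor <- cor(cent_alt, method = "kendall")
round(rank_cor, 2)
```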
3.4 Normalization
Many centrality indices can be normalized to produce values that are comparable across networks of different sizes. For instance, raw degree depends on the number of nodes in the network, making it difficult to compare across networks. Normalized degree divides by the maximum possible degree (\(n - 1\)), yielding a proportion.
degree(dbces11)
A B C D E F G H I J K
1 1 2 2 3 4 4 4 4 4 5
degree(dbces11, normalized = TRUE)
  A   B   C   D   E   F   G   H   I   J   K
0.1 0.1 0.2 0.2 0.3 0.4 0.4 0.4 0.4 0.4 0.5
Normalization is essential when comparing centrality across networks of different sizes. Within a single network, the ranking of nodes is unaffected by normalization, so it only matters when you need scores on a comparable scale.
Most igraph centrality functions accept a normalized argument. For betweenness and closeness, normalization adjusts for both network size and directedness.
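The degree normalization is simple enough to reproduce by hand. The sketch below, on an illustrative star graph, divides by \(n - 1\) and confirms that the ordering of nodes is unchanged.

```r
library(igraph)

# Illustrative graph: one hub, four leaves
g <- make_star(5, mode = "undirected")
n <- vcount(g)

deg_raw  <- degree(g)
deg_norm <- deg_raw / (n - 1)   # share of the n - 1 possible neighbors

# Matches igraph's built-in normalization
all.equal(deg_norm, degree(g, normalized = TRUE))

# Dividing by a positive constant leaves the ranking untouched
all(rank(deg_raw) == rank(deg_norm))
```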
3.5 Centralization
While centrality is a node-level property, centralization is a network-level summary that captures how unequal the distribution of centrality is across all nodes. Freeman’s centralization compares the observed network to the theoretical maximum inequality, which occurs in a star graph (one node connected to all others, no other edges).
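A centralization score of 1 means the network is maximally centralized (like a star), while a score near 0 means centrality is evenly distributed (like a ring or complete graph). Note that centr_degree() by default computes the theoretical maximum for graphs that may contain self-loops (loops = TRUE), which is why the star below scores 0.8 rather than 1; with loops = FALSE it scores exactly 1.
centr_degree(make_star(10, mode = "undirected"))$centralization
[1] 0.8
centr_degree(make_ring(10))$centralization
[1] 0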
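Freeman's degree centralization can be reproduced by hand: sum the differences between the largest degree and every node's degree, then divide by the largest value that sum can take. The sketch below uses the maximum for simple graphs, \((n - 1)(n - 2)\), and compares it with centr_degree() under both settings of its loops argument.

```r
library(igraph)

g <- make_star(10, mode = "undirected")
n <- vcount(g)
deg <- degree(g)

# Observed inequality: 9 leaves, each 8 below the hub's degree of 9
observed <- sum(max(deg) - deg)

# Theoretical maximum for an undirected graph without self-loops
max_no_loops <- (n - 1) * (n - 2)
freeman <- observed / max_no_loops

c(
  manual      = freeman,
  loops_false = centr_degree(g, loops = FALSE)$centralization,
  loops_true  = centr_degree(g, loops = TRUE)$centralization
)
```

With loops = FALSE the star reaches exactly 1; allowing self-loops enlarges the denominator to \(n(n - 1)\) and lowers the score to 0.8.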
igraph provides centralization functions for degree (centr_degree()), betweenness (centr_betw()), closeness (centr_clo()), and eigenvector centrality (centr_eigen()).
c(
degree = centr_degree(karate)$centralization,
betweenness = centr_betw(karate)$centralization,
closeness = centr_clo(karate)$centralization,
eigen = centr_eigen(karate)$centralization
)
     degree betweenness   closeness       eigen
  0.3761141   0.4055572   0.2981949   0.6458497
These scores tell us not just who is central, but how centralized the network is as a whole. A highly centralized network depends heavily on a few key nodes, making it potentially vulnerable if those nodes are removed.
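The vulnerability argument can be illustrated with the two extreme cases from above. This sketch removes one node from a star and from a ring (both illustrative graphs) and counts the resulting components.

```r
library(igraph)

star <- make_star(10, mode = "undirected")
ring <- make_ring(10)

# Remove the hub of the star and an arbitrary node of the ring
hub <- which.max(degree(star))
star_after <- delete_vertices(star, hub)
ring_after <- delete_vertices(ring, 1)

# Number of connected components after the removal
n_star <- components(star_after)$no
n_ring <- components(ring_after)$no
c(star = n_star, ring = n_ring)
```

Losing the hub shatters the star into nine isolated nodes, whereas the ring merely becomes a path and stays in one piece.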
3.6 Other Centrality Packages
Beyond igraph, several R packages offer additional centrality indices. The sna package implements indices such as flow betweenness (based on maximum flow rather than shortest paths), information centrality (which takes all paths between nodes into account, giving shorter paths more weight), and stress centrality (counting all shortest paths through a node, without normalization).
The centiserve package provides the largest collection, with over 30 additional indices. Packages like CINNA, influenceR, and keyplayer offer smaller, more specialized selections.
The sheer number of available centrality indices can be overwhelming. More indices do not mean better analysis. It is generally more productive to choose one or two indices that align with your research question than to compute all available options and pick the most favorable result.
3.7 Choosing a Centrality Index
With so many indices available, how should one choose? The key is to let the research question guide the choice rather than the other way around:
- If you are interested in activity or popularity, degree (or strength for weighted networks) is the natural choice.
- If you care about efficiency of communication or independence, closeness captures how quickly a node can reach others.
- If brokerage or control is your focus, betweenness identifies nodes that bridge different parts of the network.
- If you are interested in influence through connections, eigenvector centrality or PageRank captures prestige by association.
The worst practice is to compute several indices and then selectively report whichever supports the desired narrative. In the best case, you have a substantive argument for why a specific structural property matters, apply the corresponding index, and let the result speak to your hypothesis.
When multiple indices seem equally defensible and you are uncertain which to choose, the approach introduced in Chapter 4 offers a principled alternative: rather than committing to a single index, you can analyse the partial ordering that most indices agree on.
3.8 Use Case: Florentine Families
A classic application of centrality analysis is the Florentine Families dataset, which records marriage ties among prominent Renaissance families in Florence. This network is included in the networkdata package.
data("flo_marriage")
Marriages in Renaissance Florence were strategic alliances designed to improve a family’s political standing and access to resources. The network in Figure 3.5 shows these marriage ties, with node sizes proportional to each family’s wealth. Although the Strozzi were the wealthiest family, it was ultimately the Medici who rose to become the most powerful. A centrality analysis helps explain why.
The table below shows the centrality ranking of each family across the four most commonly used indices (1 = highest rank; tied families share the average of their ranks, hence values such as 6.5).
| Family | Degree | Betweenness | Closeness | Eigenvector |
|---|---|---|---|---|
| Acciaiuoli | 13.5 | 14 | 11.5 | 12 |
| Albizzi | 6.5 | 3 | 3.5 | 9 |
| Barbadori | 10.5 | 8 | 6.5 | 10 |
| Bischeri | 6.5 | 6 | 8.0 | 6 |
| Castellani | 6.5 | 10 | 9.5 | 8 |
| Ginori | 13.5 | 14 | 13.0 | 14 |
| Guadagni | 2.5 | 2 | 5.0 | 5 |
| Lamberteschi | 13.5 | 14 | 14.0 | 13 |
| Medici | 1.0 | 1 | 1.0 | 1 |
| Pazzi | 13.5 | 14 | 15.0 | 15 |
| Peruzzi | 6.5 | 11 | 11.5 | 7 |
| Pucci | 16.0 | 14 | 16.0 | 16 |
| Ridolfi | 6.5 | 5 | 2.0 | 3 |
| Salviati | 10.5 | 4 | 9.5 | 11 |
| Strozzi | 2.5 | 7 | 6.5 | 2 |
| Tornabuoni | 6.5 | 9 | 3.5 | 4 |
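A ranking table of this kind can be produced along the following lines. This is a sketch rather than the exact code behind the table: it assumes networkdata is installed and that a recent igraph version is used, in which closeness is undefined (NaN) for an isolated node.

```r
library(igraph)
library(networkdata)   # assumed to be installed, as elsewhere in this chapter
data("flo_marriage")

cent <- data.frame(
  degree      = degree(flo_marriage),
  betweenness = betweenness(flo_marriage),
  closeness   = closeness(flo_marriage),
  eigenvector = eigen_centrality(flo_marriage)$vector
)

# rank(-x) assigns rank 1 to the largest score; ties share the average rank.
# Missing scores (e.g., NaN closeness of an isolated node) are ranked last.
ranks <- sapply(cent, function(x) rank(-x, ties.method = "average"))
rownames(ranks) <- rownames(cent)
ranks
```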
The Medici rank first on every index. Their high degree means they had the most marriage ties, making them the most active family in forming alliances. Their top betweenness ranking reveals that they occupied a critical brokerage position: many of the shortest paths between other families passed through them, giving the Medici control over the flow of information and political favors. Their high closeness means they could reach any other family through fewer intermediaries than anyone else. And their eigenvector centrality shows that they were not just well-connected, but connected to other well-connected families.
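The Strozzi, despite their wealth, did not match the Medici structurally. They tie for second place on degree and rank second on eigenvector centrality, so they were well connected, but they rank only seventh on betweenness and share sixth place on closeness. Their marriage ties gave them far fewer opportunities to broker relationships between other families or to reach the network as a whole. This case illustrates a key insight of network analysis: structural position can matter more than individual attributes like wealth.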
We can also examine centralization to characterize the network as a whole:
c(
degree = centr_degree(flo_marriage)$centralization,
betweenness = centr_betw(flo_marriage)$centralization
)
     degree betweenness
  0.2333333   0.3834921
The moderately high betweenness centralization confirms that brokerage in this network was concentrated in a few families, with the Medici at the center.
In the next chapter, we explore more advanced questions about centrality: what structural properties underlie different indices, and how to analyse centrality when the choice of index is uncertain.