3  Centrality

Centrality is one of the most fundamental concepts in network analysis. It provides a way to quantify the importance or prominence of individual nodes within a network. In social networks, centrality helps answer questions like: Who is the most influential person? Who controls the flow of information? Who can reach others most efficiently? There is no single answer to “who is most central” because importance depends on the structural feature we consider relevant. This chapter introduces the most commonly used centrality indices, discusses their interpretation, and demonstrates how to compute them in R.

3.1 Packages Needed for this Chapter

library(igraph)
library(netrankr)
library(networkdata)

3.2 What Is Centrality?

At its core, a centrality index assigns a numeric value to each node in a network. Higher values indicate greater centrality, but what “central” means depends on the index. Different indices capture fundamentally different notions of importance, each tied to a different structural property of the network. The four most widely used families of centrality can be mapped to distinct social intuitions:

  • Degree measures activity: How many direct connections does a node have? A node with many ties is active, popular, or well-connected.
  • Closeness measures efficiency: How quickly can a node reach all others? A node that is close to everyone can spread information or access resources with minimal steps.
  • Betweenness measures brokerage: How often does a node sit on the shortest path between others? A node with high betweenness controls or mediates the flow between other parts of the network.
  • Eigenvector centrality measures prestige: Is a node connected to other well-connected nodes? A node may have few ties, but if those ties are to important nodes, it is central by association.

These four intuitions are not interchangeable. A broker (high betweenness) need not be popular (high degree), and a prestigious node (high eigenvector) need not be efficient at reaching everyone (high closeness). This is precisely why multiple indices exist and why the choice of index should be guided by the research question at hand.
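The gap between brokerage and popularity can be made concrete with a small constructed example. The sketch below uses a hypothetical toy network (not one of the chapter's datasets) in which two triangles are joined only through an intermediary named M; the object name g_toy and the vertex labels are illustrative assumptions.

```r
library(igraph)

# Hypothetical toy network: two triangles (A, B, C) and (D, E, F)
# that are connected only through the intermediary M.
g_toy <- graph_from_literal(A-B-C-A, D-E-F-D, C-M-D)

deg <- degree(g_toy)
btw <- betweenness(g_toy)

# Compare the two indices side by side
data.frame(degree = deg, betweenness = btw)
```

M has only two ties, the lowest degree in the network, yet it has the highest betweenness (9, one for each of the 3 times 3 pairs that span the two triangles). C and D have the highest degree (3) but a lower betweenness of 8.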

Note

For a more formal treatment of what constitutes a centrality index and the structural properties that underlie different indices, see Chapter 4.

3.3 Centrality Indices in igraph

The igraph package implements a broad range of centrality indices. To illustrate them, we use the dbces11 graph from the netrankr package, shown in Figure 3.1. This small network is useful because different indices disagree on which node is most central, making the distinctions between indices visible.

data("dbces11")
Figure 3.1: The dbces11 example network used to illustrate centrality indices.

3.3.1 Degree

The most straightforward centrality index is degree, which counts the number of neighbors a node has. A node with high degree is directly connected to many others, making it active or popular in a social sense.

degree(dbces11)
A B C D E F G H I J K 
1 1 2 2 3 4 4 4 4 4 5 

For weighted networks, strength (also called weighted degree) sums the edge weights instead of simply counting ties. This is useful when ties carry different intensities, such as frequency of communication or volume of trade.

# g stands for any graph with a numeric edge attribute called "weight"
strength(g, weights = E(g)$weight)
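Since dbces11 is unweighted, a self-contained illustration needs its own data. The following sketch builds a hypothetical three-node network whose edge weights are invented for illustration (for example, the number of messages exchanged) and contrasts degree with strength.

```r
library(igraph)

# Hypothetical edge list; the weights are made-up illustrative values
edges <- data.frame(
  from   = c("A", "A", "B"),
  to     = c("B", "C", "C"),
  weight = c(5, 1, 2)
)
g_w <- graph_from_data_frame(edges, directed = FALSE)

# Every node has two ties, so degree cannot tell them apart
degree(g_w)

# Strength sums the weights of those ties: A = 6, B = 7, C = 3
strength(g_w, weights = E(g_w)$weight)
```

When a graph carries an edge attribute named weight, strength() uses it by default, so passing it explicitly as above mainly serves readability.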

3.3.2 Closeness

Closeness centrality is based on the shortest path distances between nodes. The idea is that a central node can reach all other nodes quickly. Formally, it is defined as the inverse of the sum of distances from a node to all others, so that nodes with shorter total distances receive higher scores.

closeness(dbces11)
         A          B          C          D          E          F          G 
0.03703704 0.02941176 0.04000000 0.04000000 0.05000000 0.05882353 0.05263158 
         H          I          J          K 
0.05555556 0.05555556 0.05263158 0.05555556 
Tip

Closeness centrality is only well-defined for connected networks. In disconnected networks, distances between components are infinite. igraph handles this by only considering reachable nodes, but results should be interpreted with caution. Alternatives like harmonic closeness (the sum of inverse distances) handle disconnected networks more gracefully.
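Both definitions can be checked by hand from the distance matrix. The sketch below uses a hypothetical five-node path (not one of the chapter's datasets) and assumes an igraph version that provides harmonic_centrality(); the object names are illustrative.

```r
library(igraph)

# Hypothetical example: a path of five nodes, 1 - 2 - 3 - 4 - 5
g_path <- make_ring(5, circular = FALSE)
d <- distances(g_path)

# Closeness: inverse of the summed shortest-path distances
manual_closeness <- 1 / rowSums(d)
all.equal(manual_closeness, closeness(g_path))

# Harmonic closeness: sum of inverse distances to all other nodes
inv_d <- 1 / d
diag(inv_d) <- 0  # a node's distance to itself does not count
manual_harmonic <- rowSums(inv_d)
all.equal(manual_harmonic, harmonic_centrality(g_path))

# An added isolated node is at infinite distance from everyone;
# the inverse of an infinite distance is 0, so all scores stay finite.
g_disc <- add_vertices(g_path, 1)
harmonic_centrality(g_disc)
```

The middle node of the path has the smallest total distance (6) and therefore the highest closeness (1/6), while the end nodes have the lowest (1/10).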

3.3.3 Betweenness

Betweenness centrality counts how often a node lies on the shortest paths between pairs of other nodes. A node with high betweenness acts as a bridge or broker: removing it would increase distances or disconnect parts of the network. When a pair of nodes is connected by several equally short paths, the pair contributes only the fraction of those paths that pass through the node, which is why the scores below are not whole numbers.

betweenness(dbces11)
        A         B         C         D         E         F         G         H 
 0.000000  0.000000  0.000000  9.000000  3.833333  9.833333  2.666667 16.333333 
        I         J         K 
 7.333333  1.333333 14.666667 
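The fractional scores arise whenever a pair of nodes has more than one shortest path. A minimal sketch on a hypothetical four-node cycle (not one of the chapter's datasets) shows the mechanism.

```r
library(igraph)

# Hypothetical example: a cycle 1 - 2 - 3 - 4 - 1
g_cycle <- make_ring(4)

# Opposite nodes (1 and 3, or 2 and 4) are joined by two shortest paths,
# one through each of the remaining nodes. Each intermediary is credited
# with half of that pair, so every node ends up with a score of 0.5.
betweenness(g_cycle)
```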

3.3.4 Eigenvector Centrality

Eigenvector centrality extends the idea of degree by weighting each connection by the centrality of the neighbor. A node is central if it is connected to other central nodes. This recursive definition is resolved by computing the leading eigenvector of the adjacency matrix.

eigen_centrality(dbces11)$vector
        A         B         C         D         E         F         G         H 
0.2259630 0.0645825 0.3786244 0.2415182 0.5709057 0.9846544 1.0000000 0.8386195 
        I         J         K 
0.9113529 0.9986474 0.8450304 
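The link to the adjacency matrix can be verified directly with base R's eigen(). The sketch below uses a hypothetical four-node network (a triangle with one pendant node) and rescales the leading eigenvector so that its maximum is 1, which matches the scaling visible in the output above; object names are illustrative.

```r
library(igraph)

# Hypothetical example: triangle A, B, C with a pendant node D attached to C
g_small <- graph_from_literal(A-B-C-A, C-D)
adj <- as_adjacency_matrix(g_small, sparse = FALSE)

# For a symmetric matrix, eigen() sorts eigenvalues in decreasing order,
# so the first column of $vectors is the leading eigenvector.
ev <- eigen(adj, symmetric = TRUE)
lead <- abs(ev$vectors[, 1])  # the sign of an eigenvector is arbitrary

# Rescale so that the most central node scores 1
manual_ec <- lead / max(lead)
names(manual_ec) <- V(g_small)$name

all.equal(manual_ec, eigen_centrality(g_small)$vector, tolerance = 1e-6)
```

In this toy network C scores highest, because it is the only node tied to all three others.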

3.3.5 Subgraph Centrality

Subgraph centrality quantifies how strongly a node participates in closed walks, that is, walks that start and end at the node. Closed walks of all lengths are counted, and a walk of length \(k\) is weighted by \(1/k!\), so short walks count far more than long ones. The index thus captures how embedded a node is in the local structure of the network.

subgraph_centrality(dbces11)
       A        B        C        D        E        F        G        H 
1.825100 1.595400 3.148571 2.423091 4.387127 7.807257 7.939410 6.672783 
       I        J        K 
7.032672 8.242124 7.389559 
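The closed-walk interpretation can be reproduced with matrix powers: the diagonal of the \(k\)-th power of the adjacency matrix counts the closed walks of length \(k\) at each node. The sketch below sums these counts with weights \(1/k!\) on a hypothetical four-node network; truncating the series at 20 terms is an assumption that is more than sufficient for a graph this small.

```r
library(igraph)

# Hypothetical example: triangle A, B, C with a pendant node D attached to C
g_small <- graph_from_literal(A-B-C-A, C-D)
adj <- as_adjacency_matrix(g_small, sparse = FALSE)

walks <- diag(nrow(adj))   # adjacency matrix to the power 0
series_sc <- diag(walks)   # the k = 0 term contributes 1 per node
for (k in 1:20) {
  walks <- walks %*% adj   # adjacency matrix to the power k
  series_sc <- series_sc + diag(walks) / factorial(k)
}

# Compare with igraph's implementation
all.equal(unname(series_sc), unname(subgraph_centrality(g_small)))
```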

3.3.6 Comparing Indices

Figure 3.2 shows the most central node according to each of the indices computed above. Each index picks a different node as most central, highlighting how the choice of index determines what we consider “important.”

Figure 3.2: Most central node for different centrality indices in the dbces11 graph. DC = degree, BC = betweenness, CC = closeness, EC = eigenvector, SC = subgraph centrality.

While this is a toy example, it illustrates an important point: centrality indices can produce substantially different rankings. In empirical settings, this means the choice of index is consequential and should be driven by the research question rather than convenience.
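One way to extract the most central node per index is sketched below. It assumes the igraph and netrankr packages from Section 3.1 are installed; the list name scores is an illustrative choice.

```r
library(igraph)
library(netrankr)
data("dbces11")

# Collect the five indices computed in this section
scores <- list(
  DC = degree(dbces11),
  BC = betweenness(dbces11),
  CC = closeness(dbces11),
  EC = eigen_centrality(dbces11)$vector,
  SC = subgraph_centrality(dbces11)
)

# Name of the top-scoring node for each index
top <- sapply(scores, function(x) names(which.max(x)))
top
```

Going by the scores printed in the previous sections, the top nodes are K for degree, H for betweenness, F for closeness, G for eigenvector centrality, and J for subgraph centrality.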

3.3.7 Directed Networks: PageRank, Hubs, and Authorities

Several centrality indices are specifically designed for directed networks. PageRank is perhaps the most well-known, originally developed to rank web pages. It assigns importance based on the number and quality of incoming links. Hub and authority scores distinguish between two roles: a good hub points to many good authorities, and a good authority is pointed to by many good hubs.

set.seed(42)
g_dir <- sample_pa(20, directed = TRUE)
page_rank(g_dir)$vector
 [1] 0.19857238 0.21486241 0.13201548 0.07961693 0.07013303 0.02948775
 [7] 0.01593933 0.01593933 0.01593933 0.01593933 0.01593933 0.01593933
[13] 0.02948775 0.01593933 0.01593933 0.04100392 0.01593933 0.02948775
[19] 0.01593933 0.01593933
hub_score(g_dir)$vector
Warning: `hub_score()` was deprecated in igraph 2.0.3.
ℹ Please use `hits_scores()` instead.
 [1] 2.140924e-19 0.000000e+00 1.000000e+00 0.000000e+00 1.000000e+00
 [6] 2.225879e-16 2.225873e-16 0.000000e+00 0.000000e+00 1.000000e+00
[11] 0.000000e+00 0.000000e+00 2.225857e-16 0.000000e+00 1.440822e-16
[16] 0.000000e+00 0.000000e+00 6.556735e-17 0.000000e+00 1.000000e+00
authority_score(g_dir)$vector
Warning: `authority_score()` was deprecated in igraph 2.1.0.
ℹ Please use `hits_scores()` instead.
 [1] 0.000000e+00 0.000000e+00 1.669389e-16 1.080591e-16 1.000000e+00
 [6] 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
[11] 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
[16] 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00

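In directed social networks, PageRank can identify individuals who receive attention or endorsement from well-connected others, while hub and authority scores can distinguish between those who actively seek out connections (hubs) and those who attract them (authorities). As the warnings in the output indicate, hub_score() and authority_score() are deprecated in recent igraph releases in favor of hits_scores(). Values on the order of 1e-16 in the output are floating-point noise and can be read as zero.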
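The deprecation warnings point to hits_scores() as the replacement. The following is a minimal sketch of that call, assuming an igraph release recent enough to provide it and assuming its result is a list with elements hub and authority; it is not guaranteed to reproduce the numbers shown above.

```r
library(igraph)

set.seed(42)
g_dir <- sample_pa(20, directed = TRUE)

# hits_scores() computes hub and authority scores in a single call
hits <- hits_scores(g_dir)

# Rounding suppresses floating-point noise on the order of 1e-16
round(hits$hub, 3)
round(hits$authority, 3)
```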

3.3.8 Correlation Between Indices

Because different centrality indices measure different structural properties, it is useful to examine how they relate to each other on a given network. On some networks, all indices will largely agree; on others, they will diverge considerably. We can examine this using a larger network.

data("karate")
cent_df <- data.frame(
  degree = degree(karate),
  closeness = closeness(karate),
  betweenness = betweenness(karate),
  eigen = eigen_centrality(karate)$vector
)
round(cor(cent_df), 2)
            degree closeness betweenness eigen
degree        1.00      0.77        0.91  0.92
closeness     0.77      1.00        0.72  0.90
betweenness   0.91      0.72        1.00  0.80
eigen         0.92      0.90        0.80  1.00
Figure 3.3: Pairwise scatter plots of four centrality indices computed on the karate club network.

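The correlation matrix and scatter plots reveal which indices capture similar information and where they diverge. High correlations (e.g., 0.92 between degree and eigenvector centrality) suggest that the structural features they measure overlap in this network, while lower correlations (e.g., 0.72 between closeness and betweenness) indicate that the indices capture more distinct aspects of a node's position. Even the lowest correlation here is fairly high, so the indices largely agree on this network.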
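Because centrality is often used to rank nodes, a rank-based correlation can complement the Pearson correlations shown above. The sketch below is self-contained under one assumption: it uses igraph's built-in make_graph("Zachary") as a stand-in for the karate data from networkdata, so its values may differ from those printed in this section.

```r
library(igraph)

# Stand-in for the karate network (assumption: igraph's built-in version)
g_karate <- make_graph("Zachary")

cent <- data.frame(
  degree      = degree(g_karate),
  closeness   = closeness(g_karate),
  betweenness = betweenness(g_karate),
  eigen       = eigen_centrality(g_karate)$vector
)

# Spearman correlation compares the rankings rather than the raw scores
rho <- cor(cent, method = "spearman")
round(rho, 2)

# Pairwise scatter plots in the spirit of Figure 3.3
pairs(cent)
```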

3.4 Normalization

Many centrality indices can be normalized to produce values that are comparable across networks of different sizes. For instance, raw degree depends on the number of nodes in the network, making it difficult to compare across networks. Normalized degree divides by the maximum possible degree (\(n - 1\)), yielding a proportion.

degree(dbces11)
A B C D E F G H I J K 
1 1 2 2 3 4 4 4 4 4 5 
degree(dbces11, normalized = TRUE)
  A   B   C   D   E   F   G   H   I   J   K 
0.1 0.1 0.2 0.2 0.3 0.4 0.4 0.4 0.4 0.4 0.5 
Tip

Normalization is essential when comparing centrality across networks of different sizes. Within a single network, the ranking of nodes is unaffected by normalization, so it only matters when you need scores on a comparable scale.

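Most igraph centrality functions accept a normalized argument. For betweenness, normalization divides by the number of node pairs that exclude the focal node, \((n - 1)(n - 2)/2\) in undirected and \((n - 1)(n - 2)\) in directed networks. For closeness in a connected network, it multiplies the score by \(n - 1\), turning the inverse total distance into an inverse average distance.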
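What the normalized argument does for betweenness and closeness can be checked by hand. The sketch below uses a hypothetical undirected star with five nodes and assumes the usual scaling factors: \((n - 1)(n - 2)/2\) node pairs for undirected betweenness and \(n - 1\) for closeness in a connected graph.

```r
library(igraph)

# Hypothetical example: an undirected star with five nodes
g_star <- make_star(5, mode = "undirected")
n <- vcount(g_star)

# Betweenness: divide by the number of pairs that exclude the focal node
pairs_excl <- (n - 1) * (n - 2) / 2
all.equal(
  betweenness(g_star, normalized = TRUE),
  betweenness(g_star) / pairs_excl
)

# Closeness: multiply by n - 1 to obtain the inverse average distance
all.equal(
  closeness(g_star, normalized = TRUE),
  closeness(g_star) * (n - 1)
)
```

The hub of the star lies between all 6 pairs of leaves and is at distance 1 from every other node, so both of its normalized scores equal 1.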

3.5 Centralization

While centrality is a node-level property, centralization is a network-level summary that captures how unequal the distribution of centrality is across all nodes. Freeman’s centralization compares the observed network to the theoretical maximum inequality, which occurs in a star graph (one node connected to all others, no other edges).

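A centralization score of 1 means the network is maximally centralized (like a star), while a score near 0 means centrality is evenly distributed (like a ring or complete graph). Note that centr_degree() by default computes the theoretical maximum for graphs that may contain self-loops (loops = TRUE), so the loop-free star graph below scores 0.8 rather than 1; with loops = FALSE it scores exactly 1.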

Figure 3.4: A star graph (left) has maximum centralization while a ring graph (right) has minimum centralization.
centr_degree(make_star(10, mode = "undirected"))$centralization
[1] 0.8
centr_degree(make_ring(10))$centralization
[1] 0
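The value of 0.8 for the star can be traced by computing Freeman's degree centralization by hand. The sketch below assumes the standard formula, the sum of differences between the maximum degree and every node's degree divided by the theoretical maximum of that sum, and contrasts the maximum for simple graphs with the one that allows self-loops.

```r
library(igraph)

g_star <- make_star(10, mode = "undirected")
d <- degree(g_star)
n <- vcount(g_star)

# Sum of differences to the most central node: 9 leaves * (9 - 1) = 72
observed <- sum(max(d) - d)

# Theoretical maximum for simple graphs (no self-loops): 72 / 72 = 1
observed / ((n - 1) * (n - 2))

# Theoretical maximum when self-loops are allowed: 72 / 90 = 0.8
observed / (n * (n - 1))

# igraph's default corresponds to the second case; loops = FALSE to the first
centr_degree(g_star)$centralization
centr_degree(g_star, loops = FALSE)$centralization
```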

igraph provides centralization functions for degree (centr_degree()), betweenness (centr_betw()), closeness (centr_clo()), and eigenvector centrality (centr_eigen()).

c(
  degree = centr_degree(karate)$centralization,
  betweenness = centr_betw(karate)$centralization,
  closeness = centr_clo(karate)$centralization,
  eigen = centr_eigen(karate)$centralization
)
     degree betweenness   closeness       eigen 
  0.3761141   0.4055572   0.2981949   0.6458497 

These scores tell us not just who is central, but how centralized the network is as a whole. A highly centralized network depends heavily on a few key nodes, making it potentially vulnerable if those nodes are removed.

3.6 Other Centrality Packages

Beyond igraph, several R packages offer additional centrality indices. The sna package implements indices such as flow betweenness (based on maximum flow rather than shortest paths), information centrality (based on all paths between nodes, weighted by their length, rather than only shortest paths), and stress centrality (counting all shortest paths through a node, without normalization).

The centiserve package provides the largest collection, with over 30 additional indices. Packages like CINNA, influenceR, and keyplayer offer smaller, more specialized selections.

Note

The sheer number of available centrality indices can be overwhelming. More indices do not mean better analysis. It is generally more productive to choose one or two indices that align with your research question than to compute all available options and pick the most favorable result.

3.7 Choosing a Centrality Index

With so many indices available, how should one choose? The key is to let the research question guide the choice rather than the other way around:

  • If you are interested in activity or popularity, degree (or strength for weighted networks) is the natural choice.
  • If you care about efficiency of communication or independence, closeness captures how quickly a node can reach others.
  • If brokerage or control is your focus, betweenness identifies nodes that bridge different parts of the network.
  • If you are interested in influence through connections, eigenvector centrality or PageRank captures prestige by association.

The worst practice is to compute several indices and then selectively report whichever supports the desired narrative. In the best case, you have a substantive argument for why a specific structural property matters, apply the corresponding index, and let the result speak to your hypothesis.

When multiple indices seem equally defensible and you are uncertain which to choose, the approach introduced in Chapter 4 offers a principled alternative: rather than committing to a single index, you can analyse the partial ordering that most indices agree on.

3.8 Use Case: Florentine Families

A classic application of centrality analysis is the Florentine Families dataset, which records marriage ties among prominent Renaissance families in Florence. This network is included in the networkdata package.

data("flo_marriage")
Figure 3.5: Marriage network among Florentine families. Node size is proportional to the wealth of each family.

Marriages in Renaissance Florence were strategic alliances designed to improve a family’s political standing and access to resources. The network in Figure 3.5 shows these marriage ties, with node sizes proportional to each family’s wealth. Although the Strozzi were the wealthiest family, it was ultimately the Medici who rose to become the most powerful. A centrality analysis helps explain why.

The table below shows the centrality ranking of each family across the four most commonly used indices (1 = highest rank).

Family        Degree  Betweenness  Closeness  Eigenvector
Acciaiuoli      13.5           14       11.5           12
Albizzi          6.5            3        3.5            9
Barbadori       10.5            8        6.5           10
Bischeri         6.5            6        8.0            6
Castellani       6.5           10        9.5            8
Ginori          13.5           14       13.0           14
Guadagni         2.5            2        5.0            5
Lamberteschi    13.5           14       14.0           13
Medici           1.0            1        1.0            1
Pazzi           13.5           14       15.0           15
Peruzzi          6.5           11       11.5            7
Pucci           16.0           14       16.0           16
Ridolfi          6.5            5        2.0            3
Salviati        10.5            4        9.5           11
Strozzi          2.5            7        6.5            2
Tornabuoni       6.5            9        3.5            4
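Fractional ranks such as 2.5 or 6.5 arise when families tie on an index: tied entries share the average of the ranks they occupy. A minimal sketch of this convention with base R's rank() follows; the scores and the labels F1 to F5 are hypothetical values chosen for illustration, not the Florentine data.

```r
# Hypothetical degree scores for five families (illustrative values only)
deg <- c(F1 = 6, F2 = 4, F3 = 4, F4 = 3, F5 = 1)

# Negate the scores so that the highest score receives rank 1.
# Ties share the average rank by default: F2 and F3 both get 2.5.
rank(-deg)

# Alternative convention: tied entries all receive the best rank (2 and 2)
rank(-deg, ties.method = "min")
```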

The Medici rank first on every index. Their high degree means they had the most marriage ties, making them the most active family in forming alliances. Their top betweenness ranking reveals that they occupied a critical brokerage position: many of the shortest paths between other families passed through them, giving the Medici control over the flow of information and political favors. Their high closeness means they could, on average, reach other families through fewer intermediaries than anyone else. And their eigenvector centrality shows that they were not just well-connected, but connected to other well-connected families.

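The Strozzi, despite their wealth, held a weaker structural position. They tie for second on degree and rank second on eigenvector centrality, so they were well connected, but they rank only seventh on betweenness and tie for sixth on closeness. Few shortest paths between other families ran through them, which limited their ability to broker relationships or control flows in the network as a whole. This case illustrates a key insight of network analysis: structural position can matter more than individual attributes like wealth.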

We can also examine centralization to characterize the network as a whole:

c(
  degree = centr_degree(flo_marriage)$centralization,
  betweenness = centr_betw(flo_marriage)$centralization
)
     degree betweenness 
  0.2333333   0.3834921 

The moderately high betweenness centralization confirms that brokerage in this network was concentrated in a few families, with the Medici at the center.

In the next chapter, we explore more advanced questions about centrality: what structural properties underlie different indices, and how to analyse centrality when the choice of index is uncertain.