14  Introduction

Using inferential network analysis, we can determine whether observed patterns — such as transitivity or homophily — are statistically significant or simply the result of random processes. This allows us to differentiate between patterns that are meaningful and those that arise by chance.

Statistical network models bridge the gap between theory and empirical reality, allowing us to test hypotheses, uncover hidden mechanisms, and build predictive models that inform policy and intervention across diverse domains. When we look at a network, we don’t just want to describe what we see. We want to understand why the connections exist and how they shape the bigger picture. To move beyond description and into prediction and explanation, statistical network modeling becomes essential. Such models allow us to ask questions like the following: What would happen if we measured the network at a different point in time, among a different set of actors, or under different environmental conditions? Statistical models help estimate expected outcomes along with their variability, providing deeper insights into the nature of social structures.

14.1 The Challenges of Network Modeling

Most traditional data analysis treats the world like a collection of independent units: people, companies, events, each with their own neatly packaged characteristics. This monadic approach assumes that what happens to one entity has no impact on another, much like strangers sitting silently in a waiting room. This assumption, that the data are independent and identically distributed (i.i.d.), underlies most statistical models.

But network data laughs in the face of independence. Here, the unit of observation isn’t the individual but the relationships between individuals, whether friendships, collaborations, or trade agreements. These relationships (or ties) don’t exist in isolation; they overlap, influence each other, and evolve in complex ways. When walking down the street, you do not let the flip of a coin determine whether you will befriend a passerby.

A defining feature of network data is that one tie can change the likelihood of another forming. Think of social circles: if Alice and Bob are both friends with Carol, odds are Alice and Bob will become friends too. This phenomenon, known as triadic closure, explains everything from friend groups to why your LinkedIn keeps suggesting you connect with your ex’s new colleague.

Unlike spatial or temporal dependencies, where values near each other in space or time are correlated, network dependencies aren’t just noise to correct for; they’re the main event. Instead of being a nuisance, these dependencies are precisely what we study to understand how networks form and evolve.

14.2 Why Statistical Network Modeling?

Networks do not emerge randomly; rather, they are shaped by underlying social processes that drive the formation of ties and structures. These processes operate at both local and global levels, leading to self-organizing patterns that define social interaction.

At the core of social networks is the idea that structural patterns emerge locally. Individual relationships form through reciprocity, trust, and shared interests, and these small-scale interactions gradually create larger, more complex structures. Network ties do not form in isolation; they are dependent on existing connections, meaning that the presence of one tie often leads to another.

Network theory provides a framework for understanding how these mechanisms shape the growth, stability, and dynamics of social networks. Moreover, inferential statistics allow us to test hypotheses based on these theories with respect to an observed network. Below are a few examples of theories that are directly connected to the social rules governing network formation:

  • Social Exchange & Reciprocity – Ties are often formed based on resource exchange, reinforcing reciprocity in network interactions (Gouldner 1960).
  • Structural Balance & Triadic Closure – Networks tend toward stable triads, following the principle of balance theory (Heider 1946; Cartwright and Harary 1956).
  • Structural Holes & Brokerage – Some individuals act as bridges between disconnected groups, gaining influence by controlling information flow. This process is linked to social influence, as brokers shape the diffusion of knowledge and innovation (Burt 1992).
  • Homophily: Selection & Influence (“Birds of a feather flock together”) – People tend to form relationships with those who share similar attributes, attitudes, and behaviors (McPherson, Smith-Lovin, and Cook 2001).
  • The Matthew Effect & Preferential Attachment – Individuals with many connections tend to accumulate even more, a process known in network theory as preferential attachment. This dynamic follows the rich-get-richer principle, reinforcing homophily (or assortativity) as well-connected individuals attract even more ties (Merton 1968; Barabási and Albert 1999).
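
To make the last mechanism in the list concrete, here is a minimal sketch of a rich-get-richer growth process in plain Python. It is a simplified illustration in the spirit of preferential attachment, not a reproduction of any specific published model: the function name `grow_network`, the choice of one tie per newcomer, the network size, and the seed are all assumptions made for this example.

```python
import random
from collections import Counter


def grow_network(n_nodes, seed=1):
    """Grow an undirected network in which each newcomer forms one tie,
    choosing its partner with probability proportional to current degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]  # start from a single tie between two actors
    # Each node appears in `stubs` once per tie it holds, so a uniform
    # draw from this list selects nodes proportionally to their degree.
    stubs = [0, 1]
    for new in range(2, n_nodes):
        target = rng.choice(stubs)
        edges.append((new, target))
        stubs.extend([new, target])
    return edges


edges = grow_network(500)  # hypothetical network size
degree = Counter(node for edge in edges for node in edge)
# The five best-connected actors and their degrees.
print(degree.most_common(5))
```

Comparing the degrees of the best-connected actors with those of typical actors gives a rough sense of how unevenly ties accumulate under this rule.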

When combined with inferential methods, network theory offers more than just structural description—it enables deep insight into how social networks emerge, stabilize, and adapt. This is vital for understanding social influence, the spread of information or disease, and the resilience of communities in the face of disruption.

Statistical models of network data can serve several major purposes. First, they explain social relations and behaviors by identifying the underlying rules and processes that govern the formation and evolution of networks. Understanding these mechanisms helps us reveal why certain network structures arise, such as the tendency for individuals to form triads or clusters.

Second, statistical models are used for predicting social relations and behaviors. By learning from observed data, models can forecast future interactions or changes in network structure, making them invaluable for applications such as organizational planning, public health interventions, and online recommendation systems.

Third, they enable the random generation of networks that resemble observed data. This is essential in fields such as algorithm engineering, where realistic network structures are needed to test algorithm performance. Moreover, simulated networks allow researchers to study processes such as information diffusion or the spread of diseases under controlled conditions.

At the heart of these purposes lies the concept of specifying realistic probability distributions for social networks. By formalizing hypothetical dependencies between ties—such as the likelihood of reciprocity or preferential attachment—researchers can sample graphs at random from these distributions. A good model will produce sampled networks that closely resemble the observed network with respect to key features of interest. This resemblance indicates that the modeled structural effects plausibly explain the emergence of the network.
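
The idea of sampling graphs from a model and comparing them with the observed network can be sketched in a few lines of plain Python. The example below uses the simplest possible model, in which every tie forms independently with the same probability, and compares the number of closed triads. The observed network, the number of simulations, and all function names are hypothetical choices for illustration only.

```python
import random
from itertools import combinations


def count_triangles(n, edges):
    """Count closed triads in an undirected graph on nodes 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Brute force over all triples; fine for small illustrative networks.
    return sum(
        1
        for a, b, c in combinations(range(n), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )


def sample_bernoulli_graph(n, p, rng):
    """Draw every possible tie independently with probability p."""
    return [pair for pair in combinations(range(n), 2) if rng.random() < p]


# Hypothetical observed network: four closed triads plus two bridging ties.
N_NODES = 12
OBSERVED = [
    (0, 1), (0, 2), (1, 2),
    (3, 4), (3, 5), (4, 5),
    (6, 7), (6, 8), (7, 8),
    (9, 10), (9, 11), (10, 11),
    (2, 3), (8, 9),
]

rng = random.Random(1)  # fixed seed so the sketch is reproducible
density = len(OBSERVED) / (N_NODES * (N_NODES - 1) / 2)
observed_triangles = count_triangles(N_NODES, OBSERVED)
simulated = [
    count_triangles(N_NODES, sample_bernoulli_graph(N_NODES, density, rng))
    for _ in range(2000)
]
# Share of sampled networks with at least as many triangles as observed.
share_at_least = sum(t >= observed_triangles for t in simulated) / len(simulated)
print(observed_triangles, sum(simulated) / len(simulated), share_at_least)
```

If the observed count sits far in the tail of the simulated counts, a model with independent ties does not reproduce the clustering in the data, which suggests that some form of dependence between ties needs to be modeled.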

To determine whether a model is a good fit for the data, we rely on parametric or non-parametric methods, depending on the assumptions we are willing (or able) to make about the data.

14.3 Parametric and Non-Parametric Methods

Parametric methods in network analysis are based on the assumption that the data conform to a specific theoretical probability distribution. This foundation enables formal statistical inference: researchers can evaluate network summary statistics—such as centrality, density, or clustering coefficients—within the framework of known distributions.

A big advantage of parametric approaches is their ability to incorporate dependencies among ties. In social networks, the presence of one tie often influences the likelihood of others forming. Parametric models, such as Exponential Random Graph Models (ERGMs) and Stochastic Actor-Oriented Models (SAOMs), are explicitly designed to account for these interdependencies. They can model complex phenomena like reciprocity (the tendency for ties to be mutual), transitivity (the tendency for friends of friends to become friends), and homophily (the preference for connecting with similar others). When the assumptions underlying parametric methods hold, these models provide powerful explanations and predictions. They allow researchers to simulate alternative network scenarios, test competing hypotheses about network formation, and predict how networks might evolve under different conditions.
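
Before fitting any model, it can help to see how two of these tendencies are measured as simple descriptive quantities. The sketch below, in plain Python, computes reciprocity and transitivity for a small directed network. It does not fit an ERGM or a SAOM; the ties, function names, and the particular definitions of the two ratios are assumptions chosen for illustration.

```python
# Hypothetical directed ties, written as (sender, receiver).
TIES = {(0, 1), (1, 0), (1, 2), (2, 1), (0, 2), (2, 3), (3, 0)}


def reciprocity(ties):
    """Share of ties whose reverse tie is also present."""
    return sum((v, u) in ties for u, v in ties) / len(ties)


def transitivity(ties):
    """Share of two-paths i -> j -> k (with i != k) closed by a tie i -> k.

    Assumes the network contains at least one such two-path.
    """
    two_paths = [
        (i, k)
        for i, j in ties
        for j2, k in ties
        if j == j2 and i != k
    ]
    return sum(pair in ties for pair in two_paths) / len(two_paths)


print(reciprocity(TIES), transitivity(TIES))
```

Descriptive ratios like these say nothing about statistical significance on their own; that is the role of the models discussed in this section.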

In contrast, non-parametric approaches avoid making strong assumptions about the underlying distribution of the data. These methods are especially valuable when working with real-world networks that are too complex, noisy, or irregular to fit neatly into a parametric mold. Rather than relying on predefined distributions, non-parametric approaches use data-driven techniques to generate reference distributions, often through permutation or resampling methods.

Non-parametric methods are particularly useful in exploratory analyses or when assessing the robustness of findings derived from parametric models. They provide a flexible framework for evaluating whether observed patterns—such as high clustering or assortative mixing—are statistically meaningful or likely to have arisen by chance.
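
As a rough illustration of how a permutation-based reference distribution is built, the sketch below tests for homophily in plain Python. It keeps the ties fixed, shuffles a node attribute across actors, and counts how often the shuffled data show at least as many same-group ties as the observed data. The network, the attribute, the number of permutations, and the function names are hypothetical; this is one simple variant of a permutation test, not the only way to set one up.

```python
import random

# Hypothetical data: undirected ties and one categorical attribute per node.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (3, 5), (1, 3)]
GROUP = ["a", "a", "a", "b", "b", "b"]


def same_group_ties(edges, group):
    """Number of ties connecting two nodes with the same attribute value."""
    return sum(group[u] == group[v] for u, v in edges)


def permutation_test(edges, group, n_perm=5000, seed=1):
    """One-sided permutation test for an excess of same-group ties."""
    rng = random.Random(seed)
    observed = same_group_ties(edges, group)
    shuffled = list(group)  # copy so the original labels stay untouched
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign attribute values to nodes at random
        if same_group_ties(edges, shuffled) >= observed:
            hits += 1
    # Count the observed arrangement itself so the p-value is never zero.
    return observed, (hits + 1) / (n_perm + 1)


observed, p_value = permutation_test(EDGES, GROUP)
print(observed, p_value)
```

Because the ties are held fixed and only the labels move, the reference distribution reflects the structure of this particular network without assuming any parametric form.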

The choice between parametric and non-parametric methods should be guided by the research objectives, the nature and structure of the data, and the strength of theoretical assumptions. In the following chapters, we will explore both approaches in greater depth, focusing on their foundational principles, how they align with different data structures, and their practical use in modeling social networks.

Note on data structure and object assignments

In the inferential section of this book, we will be working with matrix, network, and graph objects interchangeably. Pay close attention to which type of object you are working with, since some package functions only accept graph objects while others only accept network or array objects. To keep this clear, we use the suffixes g, net, and mat in object names to indicate the object type.

References