19 Relational Event Models (REMs)
Imagine you are watching a group of people at a party. At any moment, someone might start a conversation with someone else. The question is: who will start talking to whom next?
Many different interactions are possible, but only one actually happens. Understanding why that particular interaction occurs rather than any of the others is the kind of problem that event-based statistical models are designed to address.
One approach to studying such processes is Cox regression, a statistical method used to analyze events that unfold over time. Instead of focusing only on whether something happens, Cox regression models the rate at which events occur, and how different factors make an event more or less likely to happen sooner rather than later. In intuitive terms, it helps us understand which possible events are “more likely to happen next” given the current situation and the characteristics of the actors involved.
Relational Event Models (REMs) build on this logic to study social interactions as they occur in sequence. Rather than treating relationships as static connections between individuals, REMs conceptualize networks as streams of actions (such as messages, conversations, or interactions) unfolding over time.
Traditional approaches in social network analysis have typically relied on two dominant perspectives. On the one hand, static network models, such as Exponential Random Graph Models (Chapter 17), focus on long-term equilibrium patterns in aggregated networks. On the other hand, longitudinal network models, such as Stochastic Actor Oriented Models (Chapter 18), examine how relatively stable ties evolve across discrete time periods. Both approaches emphasize relationships as relatively enduring structures observed over bounded intervals.
In contrast, Relational Event Models shift attention to the fine-grained sequence and timing of discrete behavioral events. Rather than treating ties as fixed or slowly evolving entities, REMs conceptualize networks as streams of interaction. Social structure is not simply observed; it is continuously produced and reproduced through temporally ordered actions.
A key distinction between ERGMs and REMs lies in the question each approach seeks to answer. ERGMs ask whether a particular network structure is likely to emerge within a defined time window. REMs, by contrast, ask:
Given the history of prior interactions, how likely is it that actor a will direct an event toward actor b at a specific moment?
This represents a conceptual shift from modeling cumulative structures to modeling event rates. In REM, the fundamental unit of analysis is not the tie, but the event. Each event is understood as occurring within a dynamic system shaped by prior interactions. The probability (or rate) of a new event depends on the evolving history of the network.
Relational Event Models pursue two central analytic objectives:
- Prediction: Accurately predict who will send a message to whom, given the prior sequence of events.
- Explanation: Identify the factors that influence the propensity for interaction between two actors.
These factors may include:
- Node attributes (e.g., status, role, demographic characteristics)
- Dyadic attributes (e.g., similarity, shared group membership)
- Historical patterns of communication (e.g., reciprocity, repetition, transitive closure)
- Environmental or contextual factors (e.g., time constraints, institutional settings)
By incorporating these elements, REM enables researchers to test hypotheses about the mechanisms that generate interaction sequences.
This chapter introduces the theoretical foundations and methodological tools of Relational Event Models. It begins by situating REM within the broader landscape of network analysis and clarifying its event-centered logic. It then develops the core components of the model—event histories, risk sets, and rate functions—before examining how structural, attribute-based, and contextual effects are specified and interpreted.
Through this approach, we will see how REM provides a rigorous and flexible framework for modeling dynamic interaction processes, illuminating the micro-level mechanisms through which social structure emerges over time.
19.1 Packages Needed for this Chapter
In this chapter we use the relevent package to estimate relational event models. The package implements the framework introduced by Butts (2008) and provides a straightforward and widely used approach for estimating REMs from sequences of dyadic interactions.
Several alternative implementations are also available. The goldfish package provides a more modern and flexible framework for relational event modeling. In contrast to relevent, which requires users to manually construct event statistics and covariates, goldfish offers a formula-based interface that allows endogenous effects and actor attributes to be specified directly within the model. The package also supports a broader range of structural effects and is designed to scale more efficiently to larger event datasets.
Another option is the rem package, which provides a simplified implementation of relational event models with a more modern interface than relevent. In particular, rem integrates more naturally with tidyverse-style workflows, making it convenient for researchers who prefer a tidy data structure and formula-based model specification. While it supports many core REM features, it is currently less widely used in applied research than relevent or goldfish.
Despite these differences in implementation, the underlying statistical logic remains the same across packages: relational event models estimate the intensity of possible interactions and compare competing events within a risk set. For introductory applications and teaching purposes, relevent remains a convenient and transparent tool, while goldfish or rem may be preferable in more complex or larger-scale analyses.
19.2 Model Specification
When relational event data contain information about the order of events (but not their precise timestamps), the modeling problem can be framed in terms of relative event hazards. This situation is mathematically equivalent to estimating a Cox proportional hazards model (or Cox regression) over a sequence of discrete event opportunities (Butts 2008). Rather than modeling the exact timing of events, we instead model the relative rate (or intensity) at which a particular interaction occurs next, given the prior history of events.
19.2.1 Formal Specification of the Relational Event Model
Following Butts (2008), relational event models are formulated in terms of a conditional event intensity function, which represents the instantaneous rate at which a specific interaction occurs given the past event history.
Let \(\lambda_{ab}(t)\) denote the intensity of an event from actor \(a\) to actor \(b\) at time \(t\), given the history of past events \(H_t\). The model is specified as
\[ \lambda_{ab}(t \mid H_t) = \lambda_0(t)\exp(\theta^\top s_{ab}(t)) \]
where:
- \(\lambda_{ab}(t)\) is the event intensity for interaction \(a \rightarrow b\) at time \(t\),
- \(\lambda_0(t)\) is the baseline rate of events,
- \(s_{ab}(t)\) is a vector of statistics describing the event (e.g., actor attributes, dyadic covariates, or history-based structural effects),
- \(\theta\) is a vector of model parameters.
The statistics \(s_{ab}(t)\) are typically functions of the event history, meaning they evolve as new interactions occur.
When only the order of events is observed (rather than exact timestamps), inference can proceed using the partial likelihood, which compares the intensity of the observed event with the intensities of all other possible events in the risk set \(R_t\). The probability that a specific event \(a \rightarrow b\) occurs next is therefore
\[ P(a \rightarrow b \text{ occurs next}) = \frac{\exp(\theta^\top s_{ab}(t))} {\sum_{(i,j)\in R_t}\exp(\theta^\top s_{ij}(t))} \]
where \(R_t\) denotes the set of all interactions that could occur at time \(t\).
This formulation shows that relational event models explain which interaction occurs next by comparing the relative intensities of all competing events in the risk set.
Although REMs are typically written using event intensities, estimation for ordered event data relies on the same likelihood logic as Cox proportional hazards models.
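To make the partial likelihood concrete, the following base-R sketch computes the probability of each candidate event in a small risk set. The parameter values and statistics are purely illustrative assumptions, not estimates; the point is that the baseline rate \(\lambda_0(t)\) cancels, leaving a softmax over the risk set.

```r
# Hypothetical parameters theta and statistics s_ab(t) for three candidate
# events in the risk set R_t (all values are illustrative)
theta <- c(reciprocity = 1.5, sender_activity = 0.8)

# One row per candidate event (a -> b), one column per statistic
s <- rbind(
  "A->B" = c(1, 0.4),  # A would be reciprocating; A is moderately active
  "B->A" = c(0, 0.6),
  "C->A" = c(0, 0.0)
)

# Linear predictor theta' s_ab(t), then the partial-likelihood softmax
eta <- as.vector(s %*% theta)
p   <- exp(eta) / sum(exp(eta))
names(p) <- rownames(s)
round(p, 3)  # probabilities sum to 1 across the risk set
```

Here the candidate with the largest linear predictor (A->B, which would reciprocate the previous event) receives the highest probability of occurring next.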
19.2.2 Relational Event Data Structure
To apply the event-based hazard logic to social interactions, we need a specific type of data structure known as relational event data. Rather than representing relationships as static ties between actors, relational event data record individual interactions as they occur over time.
Each observation typically includes:
- a sender (the actor initiating the interaction),
- a receiver (the actor targeted by the interaction),
- and a time or order indicating when the interaction occurred.
For example, a relational event dataset describing communication might look like this:
| time | sender | receiver |
|---|---|---|
| 1 | Alice | Bob |
| 2 | Bob | Alice |
| 3 | Carol | Alice |
In this format, the network is not represented as a fixed structure but as a chronological sequence of actions. The history of past events becomes crucial, because earlier interactions influence the likelihood of future ones. For instance, a reply may be more likely after receiving a message, or individuals who communicate frequently may continue interacting.
Relational event data therefore allow researchers to study how interaction processes unfold moment by moment, rather than only observing the final network structure.
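In R, such an event sequence is naturally stored as a data frame with one row per interaction, ordered in time. The sketch below reproduces the toy table above (actor names are illustrative):

```r
# Relational event data: one row per interaction, in chronological order
events <- data.frame(
  time     = 1:3,
  sender   = c("Alice", "Bob", "Carol"),
  receiver = c("Bob", "Alice", "Alice"),
  stringsAsFactors = FALSE
)
events
```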
19.2.3 Intuition: Competing Events
One intuitive way to think about Cox regression is as a race. Suppose several runners are lined up at the starting line. Each runner represents a possible event (for example, a person sending a message to another person). Each runner runs at a different speed depending on certain characteristics:
- Some runners have better shoes (a helpful covariate).
- Some runners are tired (a negative covariate).
- Some runners have run this race before and know the route (history effects).
The runner who reaches the finish line first is the event that occurs. Cox regression does not try to predict the exact finishing time. Instead, it focuses on who is most likely to win the race, given their characteristics.
At the core of Cox regression is the idea of a hazard. The hazard represents the instantaneous chance that an event happens right now, given that it has not yet occurred. A helpful analogy is popcorn in a microwave. Imagine many popcorn kernels heating up at the same time. At any moment, each kernel has some chance of popping. Some kernels pop early, others later. Cox regression models what makes certain kernels more likely to pop sooner. In social science terms, we might ask questions such as:
- Does previous interaction make a new message more likely?
- Are people more likely to communicate with similar others?
- Does status increase the chance that someone initiates contact?
Each of these factors changes the hazard rate, increasing or decreasing the likelihood that a particular event occurs next.
One of the most elegant features of Cox regression is that it does not require specifying the overall timing pattern of events. Events might occur very quickly at the beginning and slow down later, or slowly at first and accelerate over time. Cox regression essentially says:
That’s fine. We will ignore the overall clock and focus on comparing the competitors.
Instead of modeling absolute time, the model compares relative chances:
At this moment, which event is most likely to happen?
This is achieved through the partial likelihood, which evaluates the relative hazards of all possible events at each moment in time.
So why does this work so well for social interactions? In many datasets, especially relational event data, the exact timing of events is often less important than the order in which events occur.
Consider the following sequence:
- Alice messages Bob
- Bob replies to Alice
- Carol messages Alice
At step 3, several interactions were possible, including:
- Alice → Bob
- Alice → Carol
- Bob → Alice
- Bob → Carol
- Carol → Alice
- Carol → Bob
This set of possible interactions is called the risk set (which we return to more formally in the next section).
Cox regression allows us to ask:
Given the situation and the history of interactions, why did Carol message Alice rather than another interaction occurring?
The model compares the rate (or intensity) of the observed event with those of all other possible events in the same situation.
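Under the common simplifying assumption that any actor may address any other actor at any moment, the risk set can be enumerated directly: with three actors it contains all six ordered pairs of distinct actors, as listed above. A base-R sketch:

```r
actors <- c("Alice", "Bob", "Carol")

# All ordered sender-receiver pairs, excluding self-interactions:
# this is the risk set at any given moment
risk_set <- expand.grid(sender = actors, receiver = actors,
                        stringsAsFactors = FALSE)
risk_set <- risk_set[risk_set$sender != risk_set$receiver, ]
nrow(risk_set)  # 3 actors * 2 possible targets each = 6 candidate events
```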
19.3 Modeling Procedure
Estimation of relational event models proceeds in three main steps. These steps transform the sequence of observed interactions into a dataset that can be estimated using a Cox-type likelihood.
Step 1: Constructing the Risk Sets
At each point in the event sequence, one interaction is observed, but many other interactions were possible and did not occur. All dyads that could potentially interact at that moment form the risk set.
Each event time can therefore be understood as a discrete choice situation, in which one event is selected from among several competing alternatives. The Cox partial likelihood compares the covariates of the observed event with those of all other possible events in the same risk set.
For each observed interaction, we identify all dyads that were eligible to interact at that time. We then augment the dataset by adding all unobserved but possible events corresponding to that risk set. This transforms the event history into a series of choice sets consisting of one observed event and multiple non-events at each step in the sequence.
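Assuming again that every directed dyad is always eligible to interact, Step 1 can be sketched as follows: for each observed event we attach the full risk set and flag which candidate actually occurred, yielding one choice set of one event and five non-events per step.

```r
actors <- c("Alice", "Bob", "Carol")
events <- data.frame(time = 1:3,
                     sender   = c("Alice", "Bob", "Carol"),
                     receiver = c("Bob", "Alice", "Alice"))

# Full risk set: all ordered pairs of distinct actors
risk_set <- subset(expand.grid(sender = actors, receiver = actors,
                               stringsAsFactors = FALSE),
                   sender != receiver)

# One choice set per event: 1 observed event + 5 non-events
choice_sets <- do.call(rbind, lapply(events$time, function(t) {
  cs <- risk_set
  cs$time <- t
  cs$observed <- as.integer(cs$sender   == events$sender[t] &
                            cs$receiver == events$receiver[t])
  cs
}))
table(choice_sets$observed)  # 15 non-events, 3 observed events
```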
Step 2: Computing Event Statistics
For both observed and unobserved events, we compute a set of statistics that capture factors influencing the likelihood of interaction. These statistics serve as explanatory variables in the relational event model.
The statistics can capture several types of mechanisms, including:
- Endogenous structural effects, derived from the history of previous interactions,
- Node-level attributes, describing characteristics of the actors involved,
- Dyadic covariates, capturing properties of the sender–receiver pair,
- Contextual variables, describing features of the environment in which interactions occur.
All statistics are evaluated dynamically, meaning they are recalculated at each event step to reflect the evolving interaction history.
The specific statistics included in our model are described in the following section.
Step 3: Estimating the Model
Finally, we estimate the relational event model using a Cox proportional hazards framework. Event occurrence is treated as the dependent variable, while the computed statistics serve as explanatory variables.
Estimation relies on the partial likelihood, which compares the intensity of the observed event with the intensities of all other possible events within the same risk set.
19.3.1 Endogenous Statistics
Relational event models allow us to incorporate endogenous statistics, which capture patterns that arise from the history of interactions itself. These statistics represent mechanisms through which past events influence the likelihood of future events. Some of these endogenous statistics are summarized in the following and in Table 19.1.
Let \(V=\{1,\dots,n\}\) denote the set of actors. An event at time \(t\) is represented as \(e_t=(i,j,t)\), where actor \(i\) sends an interaction to actor \(j\). Let \(Y_{ij}(t)\) be an indicator variable that equals 1 if the event \(i \rightarrow j\) occurs at time \(t\) and 0 otherwise. Further, let \(N_{ij}(t)\) denote the total number of interactions from \(i\) to \(j\) that have occurred up to time \(t\).
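The count \(N_{ij}(t)\) can be obtained from an event list by simple cross-tabulation. The sketch below uses the toy sequence from earlier; fixing the factor levels ensures that actors with no events still appear in the matrix.

```r
actors <- c("Alice", "Bob", "Carol")
events <- data.frame(sender   = c("Alice", "Bob", "Carol"),
                     receiver = c("Bob", "Alice", "Alice"))

# N[i, j] = number of interactions i -> j observed so far
N <- table(factor(events$sender,   levels = actors),
           factor(events$receiver, levels = actors))
N["Alice", "Bob"]    # Alice has messaged Bob once
N["Carol", "Alice"]  # Carol has messaged Alice once
```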
Reciprocity
Reciprocity captures the tendency for actors to immediately respond to interactions they have received. For a candidate event \(i \rightarrow j\), the statistic equals 1 if the immediately preceding event was \(j \rightarrow i\).
\[ s_{\text{reciprocity}}(i,j,t) = Y_{ji}(t-1) \]
Repetition
Repetition captures the continuation of a dyadic interaction: the same sender repeats an interaction with the same receiver. For a candidate event \(i \rightarrow j\), the statistic equals 1 if the immediately preceding event was also \(i \rightarrow j\).
\[ s_{\text{repetition}}(i,j,t) = Y_{ij}(t-1) \]
Past Interaction
This statistic measures the strength of the historical relationship between the sender and receiver. It is defined as the proportion of all past interactions involving the sender (whether sent or received) that were directed toward the receiver.
\[ s_{\text{past}}(i,j,t) = \frac{N_{ij}(t-1)} {\sum_{k=1}^{n} N_{ik}(t-1) + \sum_{k=1}^{n} N_{ki}(t-1)} \]
Sender Activity
Sender activity captures how frequently the sender has initiated interactions in the past. It is defined as the proportion of all previous events that were sent by the current sender.
\[ s_{\text{sender activity}}(i,j,t) = \frac{\sum_{k=1}^{n} N_{ik}(t-1)} {\sum_{p \neq q} N_{pq}(t-1)} \]
Sender Popularity
Sender popularity measures how often the sender has previously received interactions from others.
\[ s_{\text{sender popularity}}(i,j,t) = \frac{\sum_{k=1}^{n} N_{ki}(t-1)} {\sum_{p \neq q} N_{pq}(t-1)} \]
Receiver Activity
Receiver activity captures how frequently the receiver has initiated interactions in the past.
\[ s_{\text{receiver activity}}(i,j,t) = \frac{\sum_{k=1}^{n} N_{jk}(t-1)} {\sum_{p \neq q} N_{pq}(t-1)} \]
Receiver Popularity
Receiver popularity measures how frequently the receiver has previously been the target of interactions.
\[ s_{\text{receiver popularity}}(i,j,t) = \frac{\sum_{k=1}^{n} N_{kj}(t-1)} {\sum_{p \neq q} N_{pq}(t-1)} \]
| Statistic | Description | Formula |
|---|---|---|
| Reciprocity | Immediate response to previous interaction | \(Y_{ji}(t-1)\) |
| Repetition | Same sender repeats interaction | \(Y_{ij}(t-1)\) |
| Past interaction | Proportion of sender’s past interactions directed toward receiver | \(\frac{N_{ij}(t-1)}{\sum_k N_{ik}(t-1)+\sum_k N_{ki}(t-1)}\) |
| Sender activity | Proportion of past events sent by the sender | \(\frac{\sum_k N_{ik}(t-1)}{\sum_{p\ne q}N_{pq}(t-1)}\) |
| Sender popularity | Proportion of past events received by the sender | \(\frac{\sum_k N_{ki}(t-1)}{\sum_{p\ne q}N_{pq}(t-1)}\) |
| Receiver activity | Proportion of past events sent by the receiver | \(\frac{\sum_k N_{jk}(t-1)}{\sum_{p\ne q}N_{pq}(t-1)}\) |
| Receiver popularity | Proportion of past events received by the receiver | \(\frac{\sum_k N_{kj}(t-1)}{\sum_{p\ne q}N_{pq}(t-1)}\) |
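Given a count matrix \(N\), the normalized statistics in Table 19.1 reduce to simple ratios. The sketch below computes sender activity and past interaction for one candidate dyad; the count values are illustrative.

```r
# Toy count matrix: N[i, j] = number of past i -> j events (diagonal is zero)
N <- matrix(c(0, 3, 1,
              2, 0, 0,
              1, 0, 0), nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

i <- "A"; j <- "B"
total_events <- sum(N)  # all past events in the sequence

# Sender activity: share of all past events sent by i
sender_activity <- sum(N[i, ]) / total_events

# Past interaction: share of i's past interactions (sent or received)
# that were directed at j
past_interaction <- N[i, j] / (sum(N[i, ]) + sum(N[, i]))

c(sender_activity = sender_activity, past_interaction = past_interaction)
```

With these counts, A has sent 4 of the 7 past events (sender activity 4/7), and 3 of A's 7 total interactions were directed at B (past interaction 3/7).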
19.3.2 Exogenous Covariates
In addition to endogenous statistics derived from the event history, relational event models can also incorporate exogenous covariates. These variables capture attributes of actors or dyads that may influence the likelihood of interactions but are not themselves determined by the evolving sequence of events.
Exogenous covariates may include actor-level attributes (e.g., demographic characteristics, roles, or status), dyadic attributes (e.g., similarity or shared group membership), or other contextual variables describing the environment in which interactions occur.
Let \(c_k\) denote an attribute associated with actor \(k\). For binary attributes, this can be represented as
\[ c_k = \begin{cases} 1 & \text{if actor } k \text{ possesses the attribute} \\ 0 & \text{otherwise} \end{cases} \]
Using this attribute, several covariates can be constructed for a potential event \(i \rightarrow j\). For example:
- Sender attribute: indicates whether the sender possesses the attribute
- Receiver attribute: indicates whether the receiver possesses the attribute
- Sender–receiver interaction: indicates whether both sender and receiver possess the attribute
These covariates allow the model to test whether actor characteristics influence interaction patterns, such as whether certain actors are more likely to initiate communication, receive interactions, or interact preferentially with others sharing the same attribute.
The exogenous covariates included in the model are summarized in Table 19.2.
| Covariate | Description | Formula |
|---|---|---|
| Sender attribute | Indicates whether the sender possesses the attribute | \(c_i\) |
| Receiver attribute | Indicates whether the receiver possesses the attribute | \(c_j\) |
| Sender–receiver interaction | Indicates whether both actors possess the attribute | \(c_i c_j\) |
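For a binary attribute vector \(c\), the three covariates in Table 19.2 can be constructed as matrices indexed by sender (rows) and receiver (columns). The attribute below is a toy assumption for illustration.

```r
# Binary attribute: e.g., c[k] = 1 if actor k holds some role (toy values)
c_attr <- c(A = 1, B = 0, C = 1)
n <- length(c_attr)

sender_attr   <- matrix(c_attr, n, n)               # entry (i, j) = c_i
receiver_attr <- matrix(c_attr, n, n, byrow = TRUE) # entry (i, j) = c_j
both_attr     <- outer(c_attr, c_attr)              # entry (i, j) = c_i * c_j

both_attr["A", "C"]  # 1: both sender and receiver possess the attribute
both_attr["A", "B"]  # 0: the receiver does not
```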
19.4 REM in R
Here we estimate a relational event model for the dialogue interactions among characters in Frozen. In the relevent package, relational event models are estimated using the function rem.dyad(), which fits a dyadic relational event model based on the observed sequence of interactions.
The main arguments of the rem.dyad() function are:
- edgelist: the event sequence, provided as a matrix or data frame containing the time (or order) of each event, the sender, and the receiver.
- n: the number of actors in the network.
- effects: a vector specifying the endogenous effects to include in the model (e.g., reciprocity, repetition, or activity effects).
- covar: a list of covariate matrices corresponding to the exogenous effects specified in effects.
- ordinal: a logical argument indicating whether the event data are ordinal. Set TRUE if the data only record the order of events (without exact timestamps), and FALSE if precise event times are available.
In our case, the dataset consists of a sequence of dialogue interactions between characters in Frozen, where each event records which character speaks to another and the order in which the interaction occurs. Because the data contain only the ordering of events rather than precise timestamps, we specify ordinal = TRUE when estimating the model.
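A call to rem.dyad() might then look as follows. This is a sketch, not the chapter's full analysis: the objects frozen_events and n_actors are assumed to be loaded, and the chosen effects are one illustrative selection. The effect codes ("PSAB-BA" for the AB-BA reciprocity participation shift, "RRecSnd" and "RSndSnd" for recency effects) follow the relevent documentation.

```r
library(relevent)

# frozen_events: matrix with columns (order, sender id, receiver id);
# n_actors: number of characters in the network (both assumed loaded)
fit <- rem.dyad(edgelist = frozen_events,
                n        = n_actors,
                effects  = c("PSAB-BA",   # reciprocity (AB-BA shift)
                             "RRecSnd",   # recency of receiving from sender
                             "RSndSnd"),  # recency of sending to receiver
                ordinal  = TRUE)          # only event order is observed
summary(fit)
```

Positive coefficients indicate that the corresponding statistic raises the relative rate of a candidate event, making it more likely to occur next within its risk set.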