library(igraph)
library(networkdata)
5 Two-Mode Networks
5.1 Introduction
A two-mode network is a network that consists of two disjoint sets of nodes (like people and events). Ties connect the two sets, for example participation of people in events. Other examples are
- Affiliation networks (Membership in institutions/clubs)
- Voting/Sponsorship networks (politicians and bills)
- Citation networks (authors and papers)
- Co-Authorship networks (also authors and papers)
There are two ways of analysing a two-mode network. Either directly by using methods specifically created for such networks, or by projecting it to a regular one-mode network. The advantage of the former is that there is no information loss and the advantage of the latter is that we are working with more familiar data structures. The projection approach is more popular these days, but we will still introduce some direct methods to analyse two-mode networks. The main part of this chapter will however deal with the projection approach.
5.2 Two-mode data structure
We will discuss some methods tailored for two-mode networks via the famous “southern women” dataset consisting of 18 women who attended a series of 14 events. The network is included in the networkdata package.
data("southern_women")
southern_women
IGRAPH 1074643 UN-B 32 89 --
+ attr: type (v/l), name (v/c)
+ edges from 1074643 (vertex names):
[1] EVELYN --6/27 EVELYN --3/2 EVELYN --4/12 EVELYN --9/26
[5] EVELYN --2/25 EVELYN --5/19 EVELYN --9/16 EVELYN --4/8
[9] LAURA --6/27 LAURA --3/2 LAURA --4/12 LAURA --2/25
[13] LAURA --5/19 LAURA --3/15 LAURA --9/16 THERESA --3/2
[17] THERESA --4/12 THERESA --9/26 THERESA --2/25 THERESA --5/19
[21] THERESA --3/15 THERESA --9/16 THERESA --4/8 BRENDA --6/27
[25] BRENDA --4/12 BRENDA --9/26 BRENDA --2/25 BRENDA --5/19
[29] BRENDA --3/15 BRENDA --9/16 CHARLOTTE--4/12 CHARLOTTE--9/26
+ ... omitted several edges
igraph interprets a network as a two-mode network if it has a logical node attribute called type.
table(V(southern_women)$type)
FALSE TRUE
18 14
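As a minimal sketch of how such a `type` attribute can be set by hand, the following builds a tiny two-mode network from an edge list. The people and events are made up for illustration and are not part of the southern women data.

```r
library(igraph)

# Hypothetical toy data: three people and two events
el <- data.frame(
  person = c("ann", "ann", "bob", "cat"),
  event  = c("e1",  "e2",  "e1",  "e2")
)
g <- graph_from_data_frame(el, directed = FALSE)

# Mark the mode of each node: FALSE for people, TRUE for events
V(g)$type <- V(g)$name %in% el$event

# is_bipartite() only checks that a `type` vertex attribute exists
is_bipartite(g)
table(V(g)$type)
```

Note that igraph does not verify that ties only run between the two modes, so that check remains the responsibility of the analyst.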
The adjacency matrix of a two-mode network is referred to as the biadjacency matrix and can be obtained via as_biadjacency_matrix().
A <- as_biadjacency_matrix(southern_women)
A
6/27 3/2 4/12 9/26 2/25 5/19 3/15 9/16 4/8 6/10 2/23 4/7 11/21 8/3
EVELYN 1 1 1 1 1 1 0 1 1 0 0 0 0 0
LAURA 1 1 1 0 1 1 1 1 0 0 0 0 0 0
THERESA 0 1 1 1 1 1 1 1 1 0 0 0 0 0
BRENDA 1 0 1 1 1 1 1 1 0 0 0 0 0 0
CHARLOTTE 0 0 1 1 1 0 1 0 0 0 0 0 0 0
FRANCES 0 0 1 0 1 1 0 1 0 0 0 0 0 0
ELEANOR 0 0 0 0 1 1 1 1 0 0 0 0 0 0
PEARL 0 0 0 0 0 1 0 1 1 0 0 0 0 0
RUTH 0 0 0 0 1 0 1 1 1 0 0 0 0 0
VERNE 0 0 0 0 0 0 1 1 1 0 0 1 0 0
MYRNA 0 0 0 0 0 0 0 1 1 1 0 1 0 0
KATHERINE 0 0 0 0 0 0 0 1 1 1 0 1 1 1
SYLVIA 0 0 0 0 0 0 1 1 1 1 0 1 1 1
NORA 0 0 0 0 0 1 1 0 1 1 1 1 1 1
HELEN 0 0 0 0 0 0 1 1 0 1 1 1 0 0
DOROTHY 0 0 0 0 0 0 0 1 1 0 0 0 0 0
OLIVIA 0 0 0 0 0 0 0 0 1 0 1 0 0 0
FLORA 0 0 0 0 0 0 0 0 1 0 1 0 0 0
5.3 Direct Approach
The tnet and bipartite packages offer some methods to analyse two-mode networks directly by adapting tools for standard (one-mode) networks, like the methods described in previous sections.
library(tnet)
tnet implements a version of the clustering coefficient for two-mode networks. Remember that its one-mode equivalent is based on triangle counts, a structure that cannot exist in two-mode networks (think about it for a second).
transitivity(southern_women)
[1] 0
transitivity(southern_women, type = "local")
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
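One way to convince yourself that triangles cannot occur is to count closed walks of length three in a small example. The sketch below uses a made-up biadjacency matrix (`A_toy` is a hypothetical name) and base R only. Every tie switches modes, so a walk with an odd number of steps can never return to its starting node.

```r
# Toy biadjacency matrix: 3 nodes in the first mode, 2 in the second
A_toy <- matrix(c(1, 1,
                  1, 0,
                  0, 1), nrow = 3, byrow = TRUE)
n1 <- nrow(A_toy)
n2 <- ncol(A_toy)

# Full (n1 + n2) x (n1 + n2) adjacency matrix: no ties within a mode
M <- rbind(cbind(matrix(0, n1, n1), A_toy),
           cbind(t(A_toy), matrix(0, n2, n2)))

# The diagonal of M^3 counts closed walks of length 3 per node;
# the number of triangles is the trace divided by 6
M3 <- M %*% M %*% M
n_triangles <- sum(diag(M3)) / 6
n_triangles
```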
The version implemented in tnet is based on cycles of length 6, which involve three nodes of each mode.
el_women <- as_edgelist(southern_women, names = FALSE)
clustering_tm(el_women)
[1] 0.771897
# coefficient for first mode
clustering_local_tm(el_women)
node lc
1 1 0.766667
2 2 0.842175
3 3 0.752344
4 4 0.838791
5 5 1.000000
6 6 0.869048
7 7 0.795918
8 8 0.646259
9 9 0.670251
10 10 0.674089
11 11 0.713881
12 12 0.769556
13 13 0.746193
14 14 0.837950
15 15 0.815920
16 16 0.540741
17 17 0.580645
18 18 0.580645
# coefficient for second mode
clustering_local_tm(el_women[, 2:1])
node lc
1 1 NaN
2 2 NaN
3 3 NaN
4 4 NaN
5 5 NaN
6 6 NaN
7 7 NaN
8 8 NaN
9 9 NaN
10 10 NaN
11 11 NaN
12 12 NaN
13 13 NaN
14 14 NaN
15 15 NaN
16 16 NaN
17 17 NaN
18 18 NaN
19 19 1.000000
20 20 0.948718
21 21 0.953297
22 22 0.964497
23 23 0.962825
24 24 0.813559
25 25 0.717182
26 26 0.779158
27 27 0.735363
28 28 0.854460
29 29 0.955556
30 30 0.884477
31 31 0.870968
32 32 0.870968
Note that counting these cycles is computationally expensive. It is advisable to run this function only on fairly small networks.
The package does include some more two-mode specific functions (look for *_tm()), but the outcomes are equivalent to using their counterparts in igraph.
The bipartite package is tailored towards ecological network analysis. Relevant functions for standard two-mode networks are the same as in tnet.
5.4 Projection Approach
5.4.1 Weighted Projection
Besides analyzing a two-mode network as-is, there is also the possibility to project it to one mode. Mathematically, this is done by calculating \(AA^T\) or \(A^TA\), depending on which mode we project on. As an example, consider the southern women dataset again.
B <- A %*% t(A)
B
EVELYN LAURA THERESA BRENDA CHARLOTTE FRANCES ELEANOR PEARL RUTH
EVELYN 8 6 7 6 3 4 3 3 3
LAURA 6 7 6 6 3 4 4 2 3
THERESA 7 6 8 6 4 4 4 3 4
BRENDA 6 6 6 7 4 4 4 2 3
CHARLOTTE 3 3 4 4 4 2 2 0 2
FRANCES 4 4 4 4 2 4 3 2 2
ELEANOR 3 4 4 4 2 3 4 2 3
PEARL 3 2 3 2 0 2 2 3 2
RUTH 3 3 4 3 2 2 3 2 4
VERNE 2 2 3 2 1 1 2 2 3
MYRNA 2 1 2 1 0 1 1 2 2
KATHERINE 2 1 2 1 0 1 1 2 2
SYLVIA 2 2 3 2 1 1 2 2 3
NORA 2 2 3 2 1 1 2 2 2
HELEN 1 2 2 2 1 1 2 1 2
DOROTHY 2 1 2 1 0 1 1 2 2
OLIVIA 1 0 1 0 0 0 0 1 1
FLORA 1 0 1 0 0 0 0 1 1
VERNE MYRNA KATHERINE SYLVIA NORA HELEN DOROTHY OLIVIA FLORA
EVELYN 2 2 2 2 2 1 2 1 1
LAURA 2 1 1 2 2 2 1 0 0
THERESA 3 2 2 3 3 2 2 1 1
BRENDA 2 1 1 2 2 2 1 0 0
CHARLOTTE 1 0 0 1 1 1 0 0 0
FRANCES 1 1 1 1 1 1 1 0 0
ELEANOR 2 1 1 2 2 2 1 0 0
PEARL 2 2 2 2 2 1 2 1 1
RUTH 3 2 2 3 2 2 2 1 1
VERNE 4 3 3 4 3 3 2 1 1
MYRNA 3 4 4 4 3 3 2 1 1
KATHERINE 3 4 6 6 5 3 2 1 1
SYLVIA 4 4 6 7 6 4 2 1 1
NORA 3 3 5 6 8 4 1 2 2
HELEN 3 3 3 4 4 5 1 1 1
DOROTHY 2 2 2 2 1 1 2 1 1
OLIVIA 1 1 1 1 2 1 1 2 2
FLORA 1 1 1 1 2 1 1 2 2
This matrix can now be interpreted as a weighted network among the 18 women. Each entry corresponds to the number of times two women went to the same event.
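To make the entries of the two projections concrete, here is a small sketch on a made-up biadjacency matrix (`A_toy` and the names in it are hypothetical). An off-diagonal entry \((AA^T)_{ij} = \sum_k A_{ik}A_{jk}\) counts the events shared by rows \(i\) and \(j\), while a diagonal entry is simply the number of events a row node attended.

```r
# Hypothetical toy data: 4 people (rows) and 3 events (columns)
A_toy <- matrix(c(1, 1, 0,
                  1, 0, 1,
                  0, 1, 1,
                  1, 1, 1),
                nrow = 4, byrow = TRUE,
                dimnames = list(c("ann", "bob", "cat", "dan"),
                                c("e1", "e2", "e3")))

# Projection on the row mode (people) and on the column mode (events)
rows_proj <- A_toy %*% t(A_toy)   # 4 x 4: shared events per pair of people
cols_proj <- t(A_toy) %*% A_toy   # 3 x 3: shared attendees per pair of events

# The diagonal holds the degree of each node in the two-mode network
all(diag(rows_proj) == rowSums(A_toy))
all(diag(cols_proj) == colSums(A_toy))

# An off-diagonal entry counts co-attendance
rows_proj["ann", "dan"] == sum(A_toy["ann", ] * A_toy["dan", ])
```

When the projection is treated as a network, the diagonal is usually ignored, since it would correspond to self-loops.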
The same can be achieved with the function bipartite_projection(), which returns both projections.
projs <- bipartite_projection(southern_women)
projs
$proj1
IGRAPH 2a712ab UNW- 18 139 --
+ attr: name (v/c), weight (e/n)
+ edges from 2a712ab (vertex names):
[1] EVELYN --LAURA EVELYN --BRENDA EVELYN --THERESA EVELYN --CHARLOTTE
[5] EVELYN --FRANCES EVELYN --ELEANOR EVELYN --RUTH EVELYN --PEARL
[9] EVELYN --NORA EVELYN --VERNE EVELYN --MYRNA EVELYN --KATHERINE
[13] EVELYN --SYLVIA EVELYN --HELEN EVELYN --DOROTHY EVELYN --OLIVIA
[17] EVELYN --FLORA LAURA --BRENDA LAURA --THERESA LAURA --CHARLOTTE
[21] LAURA --FRANCES LAURA --ELEANOR LAURA --RUTH LAURA --PEARL
[25] LAURA --NORA LAURA --VERNE LAURA --SYLVIA LAURA --HELEN
[29] LAURA --MYRNA LAURA --KATHERINE LAURA --DOROTHY THERESA--BRENDA
+ ... omitted several edges
$proj2
IGRAPH 8c38084 UNW- 14 66 --
+ attr: name (v/c), weight (e/n)
+ edges from 8c38084 (vertex names):
[1] 6/27--3/2 6/27--4/12 6/27--9/26 6/27--2/25 6/27--5/19 6/27--9/16
[7] 6/27--4/8 6/27--3/15 3/2 --4/12 3/2 --9/26 3/2 --2/25 3/2 --5/19
[13] 3/2 --9/16 3/2 --4/8 3/2 --3/15 4/12--9/26 4/12--2/25 4/12--5/19
[19] 4/12--9/16 4/12--4/8 4/12--3/15 9/26--2/25 9/26--5/19 9/26--9/16
[25] 9/26--4/8 9/26--3/15 2/25--5/19 2/25--9/16 2/25--4/8 2/25--3/15
[31] 5/19--9/16 5/19--4/8 5/19--3/15 5/19--6/10 5/19--2/23 5/19--4/7
[37] 5/19--11/21 5/19--8/3 3/15--9/16 3/15--4/8 3/15--4/7 3/15--6/10
[43] 3/15--11/21 3/15--8/3 3/15--2/23 9/16--4/8 9/16--4/7 9/16--6/10
+ ... omitted several edges
As you can see, the network is weighted and very dense. In principle it is possible to analyze the network as is, but a very common step is to binarize the network. In doing so, we basically turn the network into a simple undirected one-mode network. This makes all methods we described in the first few sections applicable to the network (at least in theory).
5.4.2 Simple Binary Projections
The simplest way of binarizing a weighted projection is to define a global threshold and remove every tie whose weight does not exceed it. A popular choice is the mean edge weight (sometimes plus one or two standard deviations).
women_proj <- projs$proj1
threshold <- mean(E(projs$proj1)$weight)
women_bin <- delete_edges(women_proj, which(E(women_proj)$weight <= threshold))
women_bin <- delete_edge_attr(women_bin, "weight")
women_bin
IGRAPH 989d041 UN-- 18 46 --
+ attr: name (v/c)
+ edges from 989d041 (vertex names):
[1] EVELYN --LAURA EVELYN --BRENDA EVELYN --THERESA EVELYN --CHARLOTTE
[5] EVELYN --FRANCES EVELYN --ELEANOR EVELYN --RUTH EVELYN --PEARL
[9] LAURA --BRENDA LAURA --THERESA LAURA --CHARLOTTE LAURA --FRANCES
[13] LAURA --ELEANOR LAURA --RUTH THERESA--BRENDA THERESA--CHARLOTTE
[17] THERESA--FRANCES THERESA--ELEANOR THERESA--RUTH THERESA--PEARL
[21] THERESA--NORA THERESA--VERNE THERESA--SYLVIA BRENDA --CHARLOTTE
[25] BRENDA --FRANCES BRENDA --ELEANOR BRENDA --RUTH FRANCES--ELEANOR
[29] ELEANOR--RUTH RUTH --VERNE RUTH --SYLVIA VERNE --SYLVIA
+ ... omitted several edges
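The choice of threshold can change the result considerably. The following sketch compares the mean with the mean plus one standard deviation on a made-up weighted projection, stored as a plain matrix `W` (a hypothetical name) so that it runs in base R.

```r
# Hypothetical weighted projection among five nodes (symmetric, zero diagonal)
W <- matrix(c(0, 6, 3, 1, 0,
              6, 0, 4, 2, 1,
              3, 4, 0, 2, 1,
              1, 2, 2, 0, 5,
              0, 1, 1, 5, 0),
            nrow = 5, byrow = TRUE,
            dimnames = list(letters[1:5], letters[1:5]))

# Weights of existing ties, each undirected tie counted once
w <- W[upper.tri(W) & W > 0]

# Two candidate global thresholds
thr_mean <- mean(w)
thr_mean_sd <- mean(w) + sd(w)

# Keep only ties whose weight exceeds the threshold
bin_mean <- (W > thr_mean) * 1
bin_mean_sd <- (W > thr_mean_sd) * 1

# Number of ties that survive under each threshold
c(mean = sum(bin_mean) / 2, mean_plus_sd = sum(bin_mean_sd) / 2)
```

The stricter threshold keeps fewer ties, so any downstream result should be checked for sensitivity to this choice.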
5.4.3 Model-based Binary Projections
The global threshold method is very simple but in many cases leads to undesirable structural features. More sophisticated tools work with statistical models in the background which determine if an edge weight differs enough from the expected value of an underlying null model. If so, the edge is kept in the binary projection. Many such models are implemented in the backbone package.
library(backbone)
The idea behind all of the models is always the same:
- Create the weighted projection of interest, e.g. B <- A %*% t(A).
- Generate random two-mode networks according to a given model.
- Compare if the values B[i,j] differ significantly from the distribution of values in the random projections.
The models differ only in how the random two-mode networks are constructed:
- Fixed Degree Sequence Model fdsm(): Create random two-mode networks with the same row and column sums as A.
- Fixed Column Model fixedcol(): Create random two-mode networks with the same column sums as A.
- Fixed Row Model fixedrow(): Create random two-mode networks with the same row sums as A.
- Fixed Fill Model fixedfill(): Create random two-mode networks with the same number of ones as A.
- Stochastic Degree Sequence Model sdsm(): Create random two-mode networks with approximately the same row and column sums as A.
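The three steps can be sketched with a plain Monte Carlo simulation in base R, here for a null model that fixes only the number of ones. This is an illustration of the general idea on made-up data and is not how the backbone package computes its results. The names `n_sim`, `B_null` and `p_upper` are hypothetical.

```r
set.seed(1)

# Toy biadjacency matrix: 4 actors (rows) x 5 events (columns), made-up data
A_toy <- matrix(c(1, 1, 1, 0, 0,
                  1, 1, 0, 0, 0,
                  0, 0, 1, 1, 1,
                  0, 0, 0, 1, 1),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("a", 1:4), paste0("e", 1:5)))

# Step 1: observed weighted projection
B_obs <- A_toy %*% t(A_toy)

# Step 2: random two-mode networks with the same number of ones as A_toy
n_sim <- 2000
n_ones <- sum(A_toy)
B_null <- replicate(n_sim, {
  R <- matrix(0, nrow(A_toy), ncol(A_toy))
  R[sample(length(R), n_ones)] <- 1   # place the ones uniformly at random
  R %*% t(R)                          # projection of the random network
})                                    # array of dimension 4 x 4 x n_sim

# Step 3: empirical upper-tail p-value for each pair, i.e. the share of
# random projections with a weight at least as large as the observed one
p_upper <- apply(B_null >= as.vector(B_obs), c(1, 2), mean)
dimnames(p_upper) <- dimnames(B_obs)

# Keep a tie if its weight is significantly larger than expected
alpha <- 0.05
backbone_toy <- (p_upper < alpha) * 1
diag(backbone_toy) <- 0
round(p_upper, 2)
```

With so few nodes and events, hardly any tie can reach significance, so the sketch is only meant to show the mechanics. Swapping the null model means changing nothing but how `R` is generated in step 2.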
Before we move to an actual use case, you may ask: So which model is the right one for me? That is actually quite a tricky question. There is some guidance available but in general you can follow these rough guidelines:
- Use the model that fits your empirical setting or a known link formation process. If that link formation process dictates that row sums are fixed but column sums are not, then choose fixedrow().
- Use fdsm() if your network is small enough. Sampling from the FDSM is quite expensive.
- Use sdsm() for large networks.
Given that there is never a “ground-truth” binary projection, any choice of model is fine as long as it is motivated substantively and not merely because it fits the paper's narrative best.
To illustrate the model fitting, we use a bill cosponsorship network of the Senate from 2015. A link between a senator and a bill exists if the senator sponsored it. We are now interested in what the binary projection of senators looks like.
data("cosponsor")
cosponsor
IGRAPH 6eddec8 UN-B 3984 26392 --
+ attr: name (v/c), type (v/l), party (v/c)
+ edges from 6eddec8 (vertex names):
[1] 115s1 --Enzi, Michael B. 115s10 --Cardin, Benjamin L.
[3] 115s10 --Wicker, Roger F. 115s100 --Alexander, Lamar
[5] 115s1000--Franken, Al 115s1000--Murray, Patty
[7] 115s1000--Brown, Sherrod 115s1000--Warren, Elizabeth
[9] 115s1000--Markey, Edward J. 115s1001--Crapo, Mike
[11] 115s1001--Blumenthal, Richard 115s1001--Murphy, Christopher
[13] 115s1001--Cassidy, Bill 115s1001--Alexander, Lamar
[15] 115s1001--Bennet, Michael F. 115s1002--Moran, Jerry
+ ... omitted several edges
Given that the network is fairly large, we will use the SDSM. Note that all models create the projection for the mode where type == FALSE. If you want to project on the TRUE mode, you need to invert the type attribute.
senators <- sdsm(cosponsor, alpha = 0.05, signed = FALSE)
senators
IGRAPH ef65957 UNW- 110 1591 --
+ attr: name (v/c), party (v/c), weight (e/n), sign (e/n)
+ edges from ef65957 (vertex names):
[1] Enzi, Michael B.--Wicker, Roger F. Enzi, Michael B.--Alexander, Lamar
[3] Enzi, Michael B.--Crapo, Mike Enzi, Michael B.--Moran, Jerry
[5] Enzi, Michael B.--Scott, Tim Enzi, Michael B.--Daines, Steve
[7] Enzi, Michael B.--Perdue, David Enzi, Michael B.--Blunt, Roy
[9] Enzi, Michael B.--Inhofe, James M. Enzi, Michael B.--Barrasso, John
[11] Enzi, Michael B.--Fischer, Deb Enzi, Michael B.--Ernst, Joni
[13] Enzi, Michael B.--Rounds, Mike Enzi, Michael B.--Kennedy, John
[15] Enzi, Michael B.--Flake, Jeff Enzi, Michael B.--Hoeven, John
+ ... omitted several edges
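Inverting the type attribute, as mentioned above, is a one-liner in igraph. The sketch below uses a made-up toy network rather than the cosponsorship data, and it only shows the flip itself without calling any backbone function.

```r
library(igraph)

# Hypothetical toy network: two bills (type FALSE) and three senators (type TRUE)
g <- make_bipartite_graph(
  types = c(FALSE, FALSE, TRUE, TRUE, TRUE),
  edges = c(1, 3, 1, 4, 2, 4, 2, 5)
)
V(g)$name <- c("bill1", "bill2", "sen_a", "sen_b", "sen_c")

# Swap the two modes so that the senators become the FALSE mode
g_flipped <- g
V(g_flipped)$type <- !V(g_flipped)$type

# Nodes that a projection on the FALSE mode would now contain
V(g_flipped)$name[!V(g_flipped)$type]
```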
For signed = FALSE, a one-tailed test is performed for each edge with a non-zero weight. It yields a projection that preserves edges whose weights are significantly stronger than expected in the null model.
When signed = TRUE, a two-tailed test is performed for every pair of nodes. It yields a backbone that contains positive edges for edges whose weights are significantly stronger, and negative edges for edges whose weights are significantly weaker, than expected in the chosen null model. The projection thus becomes a signed network (see Chapter 6).
The figure below shows the not so surprising result that Democrats and Republicans do not tend to significantly cosponsor the same bills.
5.5 Notable Packages
- incidentally to create random two-mode networks with given structural features
5.6 Scientific Reading
Faust, K. (1997). Centrality in affiliation networks. Social Networks, 19(2), 157-191.
Everett, M. G., & Borgatti, S. P. (2013). The dual-projection approach for two-mode networks. Social Networks, 35(2), 204-210.
Opsahl, T. (2013). Triadic closure in two-mode networks: Redefining the global and local clustering coefficients. Social Networks, 35(2), 159-167.
Neal, Z. P. (2014). The backbone of bipartite projections: Inferring relationships from co-authorship, co-sponsorship, co-attendance, and other co-behaviors. Social Networks, 39, 84-97.
Neal, Z. P., Domagalski, R., & Sagan, B. (2021). Comparing alternatives to the fixed degree sequence model for extracting the backbone of bipartite projections. Scientific Reports, 11, 23929.