This vignette describes the importance of indirect relations on
networks, how they are used in centrality indices and how they are
implemented in the netrankr package.
Theoretical Background
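A one-mode network can be described as a dyadic variable
$x \in \mathcal{W}^{\mathcal{D}}$, where $\mathcal{W}$ is the value range
of the network (in the simple case of unweighted networks,
$\mathcal{W} = \{0, 1\}$) and $\mathcal{D} = \mathcal{N} \times \mathcal{N}$
describes the dyadic domain of actors $\mathcal{N}$.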
Observed presence or absence of ties (the value range is binary) is
usually not the relation of interest for network analytic tasks.
Instead, mostly implicitly, relations are transformed into a
new set of indirect relations on the basis of the
observed relations. As an example, consider (shortest path)
distances in the underlying graph. While they are fairly easy to derive
from an observed network of contacts, it is impossible for actors in a
network to answer the question “How far away are you from others you are
not connected with?”. We denote a generic transformed network derived
from an observed network $x$ as $\tau(x)$.
With this notion of indirect relations, we can express centrality
indices in a common framework as
$$c_\tau(i) = \sum_{t \in \mathcal{N}} \tau(x)_{it},$$
where the sum runs over all actors $t$ of the network. Degree and
closeness centrality, for instance, can be obtained by setting
$\tau = id$ and $\tau = dist$, respectively. Others need several
additional specifications, which can be found in Brandes (2016) or
Schoch & Brandes (2016).
With this framework, we can characterize centrality indices as
degree-like measures in a suitably transformed network $\tau(x)$.
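As a minimal sketch of this framework, the following R snippet aggregates two relations by row sums, using the dbces11 example graph and the indirect_relations() function that are introduced in the next section. It assumes that the igraph and netrankr packages are installed. Note that the plain sum of distances is a node's total distance to all others; the comparison with igraph's closeness() takes the reciprocal, which reflects igraph's convention and is an assumption of this sketch rather than part of the framework itself.

```r
library(igraph)
library(netrankr)

data("dbces11")
g <- dbces11

# tau = id: the relation is the adjacency matrix itself
A <- indirect_relations(g, type = "adjacency")
deg <- rowSums(A) # c_tau(i) = sum over t of tau(x)_it

# tau = dist: the relation is the shortest path distance
D <- indirect_relations(g, type = "dist_sp")
total_dist <- rowSums(D)

# compare with the igraph implementations
all.equal(unname(deg), unname(degree(g)))
all.equal(unname(1 / total_dist), unname(closeness(g, weights = NA)))
```

Both comparisons use unname() because the row sums carry the node labels of the graph.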
Indirect relations in the netrankr package
The netrankr package implements a great variety of indirect relations
that are (or could be) used for centrality-related considerations in a
network. All indirect relations can be computed with the
indirect_relations() function by specifying the type parameter.
library(igraph)
library(netrankr)

data("dbces11")
g <- dbces11
# adjacency
A <- indirect_relations(g, type = "adjacency")
# shortest path distances
D <- indirect_relations(g, type = "dist_sp")
# dyadic dependencies (as used in betweenness centrality)
B <- indirect_relations(g, type = "depend_sp")
# resistance distance (as used in information centrality)
R <- indirect_relations(g, type = "dist_resist")
# Logarithmic forest distance (parametrized family of distances)
LF <- indirect_relations(g, type = "dist_lf", lfparam = 1)
# Walk distance (parametrized family of distances)
WD <- indirect_relations(g, type = "dist_walk", dwparam = 0.001)
# Random walk distance (stored under its own name so that WD is not overwritten)
RWD <- indirect_relations(g, type = "dist_rwalk")
# See ?indirect_relations for further options
Indirect relations are represented as matrices, similar to the adjacency matrix. The matrices below show the distances based on shortest paths and the pairwise dependencies (used, e.g., for betweenness).
D
##   A B C D E F G H I J K
## A 0 5 2 4 2 2 2 3 3 3 1
## B 5 0 5 1 4 3 4 2 3 3 4
## C 2 5 0 4 1 2 2 3 2 3 1
## D 4 1 4 0 3 2 3 1 2 2 3
## E 2 4 1 3 0 2 2 2 1 2 1
## F 2 3 2 2 2 0 1 1 2 1 1
## G 2 4 2 3 2 1 0 2 1 1 1
## H 3 2 3 1 2 1 2 0 1 1 2
## I 3 3 2 2 1 2 1 1 0 1 2
## J 3 3 3 2 2 1 1 1 1 0 2
## K 1 4 1 3 1 1 1 2 2 2 0
B
##     A         B         C         D   E         F   G         H         I
## A 0.0 0.0000000 0.0000000 0.0000000 0.0 0.0000000 0.0 0.0000000 0.0000000
## B 0.0 0.0000000 0.0000000 0.0000000 0.0 0.0000000 0.0 0.0000000 0.0000000
## C 0.0 0.0000000 0.0000000 0.0000000 0.0 0.0000000 0.0 0.0000000 0.0000000
## D 1.0 9.0000000 1.0000000 0.0000000 1.0 1.0000000 1.0 1.0000000 1.0000000
## E 0.5 0.5000000 2.8333333 0.5000000 0.0 0.0000000 0.0 0.5000000 2.0000000
## F 3.5 2.8333333 1.8333333 2.8333333 0.0 0.0000000 1.0 2.8333333 0.0000000
## G 1.0 0.0000000 0.3333333 0.0000000 0.0 0.3333333 0.0 0.0000000 1.3333333
## H 2.0 8.0000000 2.0000000 8.0000000 2.0 2.3333333 2.0 0.0000000 2.3333333
## I 0.0 1.8333333 1.8333333 1.8333333 4.5 0.0000000 1.5 1.8333333 0.0000000
## J 0.0 0.3333333 0.0000000 0.3333333 0.0 0.3333333 1.0 0.3333333 0.3333333
## K 9.0 1.5000000 5.1666667 1.5000000 2.5 3.0000000 2.5 1.5000000 1.0000000
##           J   K
## A 0.0000000 0.0
## B 0.0000000 0.0
## C 0.0000000 0.0
## D 1.0000000 1.0
## E 0.3333333 0.5
## F 1.3333333 3.5
## G 1.3333333 1.0
## H 2.0000000 2.0
## I 1.3333333 0.0
## J 0.0000000 0.0
## K 1.6666667 0.0
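The printed matrices can be checked against familiar igraph quantities. The sketch below assumes that igraph and netrankr are installed, and it relies on a reading of the output above rather than on the package documentation: row v of B appears to collect the dependencies of all other nodes on v, so in an undirected graph each pair of endpoints enters the row sum twice.

```r
library(igraph)
library(netrankr)

data("dbces11")
g <- dbces11

D <- indirect_relations(g, type = "dist_sp")
B <- indirect_relations(g, type = "depend_sp")

# shortest path distances are symmetric in an undirected graph
isSymmetric(unname(D))

# dependencies are not symmetric: B depends on D, but D does not depend on B
B["D", "B"]
B["B", "D"]

# row sums of B count every unordered pair twice, hence the factor 2
bet <- betweenness(g, directed = FALSE, weights = NA)
all.equal(unname(rowSums(B)), 2 * unname(bet))
```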
The function takes an additional parameter FUN, which can be used to
pass a function that further transforms the relations. Its main use is
to obtain indirect relations based on walk counts.
# count the limit proportion of walks (used for eigenvector centrality)
W <- indirect_relations(g, type = "walks", FUN = walks_limit_prop)
# count the number of walks of arbitrary length between nodes, weighted by
# the inverse factorial of their length (used for subgraph centrality)
S <- indirect_relations(g, type = "walks", FUN = walks_exp)
Additional parameters can also be passed to calculate parameterized versions of relations.
# Calculate dist(s,t)^-alpha
D <- indirect_relations(g, type = "dist_sp", FUN = dist_dpow, alpha = 2)
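To make the role of the extra argument concrete, the following sketch compares the result of dist_dpow with alpha = 2 against a direct computation of the inverse squared distances. It assumes that dist_dpow implements dist(s,t)^-alpha, as the comment above states. The diagonal is left out of the comparison because its treatment is not described here.

```r
library(netrankr)

data("dbces11")
g <- dbces11

D_raw <- indirect_relations(g, type = "dist_sp")
D_pow <- indirect_relations(g, type = "dist_sp", FUN = dist_dpow, alpha = 2)

# compare off-diagonal entries only
off_diag <- row(D_raw) != col(D_raw)
all.equal(D_pow[off_diag], D_raw[off_diag]^-2)
```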
To view all predefined transformation functions, see
?transform_relations. The predefined functions follow the naming scheme
<relation>_<transformation>. The dist_* functions are thus only
meaningful for distance-type relations such as type = "dist_sp" or
type = "dist_resist". Likewise, the walks_* functions are meant for
type = "walks". The predefined functions are not exhaustive; they cover
only the most common transformations. It is, however, straightforward
to pass your own transformation function.
# custom transformation: rescale distances d to 1 - (d - 1) / max(d)
dist_integration <- function(x) {
  1 - (x - 1) / max(x)
}
D <- indirect_relations(g, type = "dist_sp", FUN = dist_integration)
The function dist_integration() computes $1 - \frac{x - 1}{\max x}$,
which is used in the centrality index integration defined by Valente
and Foreman (1998).
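As a small worked example, the transformation can be applied to a few distances by hand. The sketch below redefines dist_integration() so that it runs on its own and uses a hypothetical 3 x 3 distance matrix. It only illustrates the arithmetic and does not reproduce the full index of Valente and Foreman.

```r
dist_integration <- function(x) {
  1 - (x - 1) / max(x)
}

# hypothetical distances on a path a - b - c
d <- matrix(c(0, 1, 2,
              1, 0, 1,
              2, 1, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

r <- dist_integration(d)
r["a", "b"] # distance 1 maps to 1
r["a", "c"] # distance 2 maps to 1 - 1/2 = 0.5
r["a", "a"] # the zero diagonal maps to 1 + 1/2 = 1.5
```

Because the zero diagonal is mapped to a value above 1, it is probably sensible to reset or exclude the diagonal before aggregating the transformed relation.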
The computed relations can be used to build centrality indices
(e.g. with the provided RStudio addin index_builder()), but also
to derive partial rankings with positional_dominance().
Consult the respective vignettes for help.