9  Advanced Layouts

library(igraph)
library(ggraph)
library(graphlayouts)
library(ggforce)
library(networkdata)
data("got")

gotS1 <- got[[1]]

got_palette <- c(
    "#1A5878", "#C44237", "#AD8941", "#E99093",
    "#50594B", "#8968CD", "#9ACD32"
)

## compute a clustering for node colors
V(gotS1)$clu <- as.character(membership(cluster_louvain(gotS1)))

## compute degree as node size
V(gotS1)$size <- degree(gotS1)

While “stress” is the key layout algorithm in graphlayouts, there are other, more specialized layouts that can be used for different purposes. In this part, we work through some examples with concentric layouts and learn how to disentangle extreme “hairball” networks.

9.1 Large Networks

The stress layout also works well with medium to large graphs.

The network shows the biggest component of the co-authorship network of R package developers on CRAN (~12k nodes).

If you want to go beyond ~20k nodes, then you may want to switch to layout_with_pmds() or layout_with_sparse_stress() which are optimized to work with large graphs.

These can handle networks with several hundred thousand nodes.
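As a minimal sketch of how these two functions can be called directly: the graph below is a synthetic stand-in (a preferential attachment tree from igraph's sample_pa()), not the CRAN network, and the number of pivots (50) is an arbitrary value chosen for illustration.

```r
library(igraph)
library(graphlayouts)

set.seed(42)
# synthetic, connected stand-in for a large network (assumption, not the CRAN data)
g_big <- sample_pa(5000, directed = FALSE)

# both functions work from a sample of pivot nodes instead of all node pairs
xy_pmds <- layout_with_pmds(g_big, pivots = 50)
xy_sparse <- layout_with_sparse_stress(g_big, pivots = 50)

# each result is a coordinate matrix with one row per node
dim(xy_pmds)
dim(xy_sparse)
```

Increasing the number of pivots should generally trade speed for layout quality, so it is worth trying a few values.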

9.2 Concentric Layouts

Circular layouts are generally not advisable. Concentric circles, on the other hand, help to emphasize the position of certain nodes in the network. The graphlayouts package has two functions to create concentric layouts: layout_with_focus() and layout_with_centrality().

The first one allows you to focus the network on a specific node and arranges all other nodes in concentric circles around it, according to their geodesic distance from the focal node. Below we focus on the character Ned Stark.

ggraph(gotS1, layout = "focus", focus = 1) +
    geom_edge_link0(aes(edge_linewidth = weight), edge_colour = "grey66") +
    geom_node_point(aes(fill = clu, size = size), shape = 21) +
    geom_node_text(aes(filter = (name == "Ned"), size = size, label = name),
        family = "serif"
    ) +
    scale_edge_width_continuous(range = c(0.2, 1.2)) +
    scale_size_continuous(range = c(1, 5)) +
    scale_fill_manual(values = got_palette) +
    coord_fixed() +
    theme_graph() +
    theme(legend.position = "none")

The focus parameter in the first line gives the node id of the focal node. coord_fixed() keeps the aspect ratio at one, so that the circles are displayed as circles and not as ellipses.
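Since the focal node is given by id, it can help to look the id up by name rather than hard-coding it. The sketch below uses Zachary's karate club (shipped with igraph) as a stand-in graph, and the node names are made up for illustration. It also calls layout_with_focus() directly to inspect the distances that define the circles.

```r
library(igraph)
library(graphlayouts)

# stand-in graph with hypothetical node names (assumption, not the GoT data)
karate <- make_graph("Zachary")
V(karate)$name <- paste0("actor", seq_len(vcount(karate)))

# look up the node id by name
focal_id <- which(V(karate)$name == "actor34")

# returns the coordinates and the distance of every node to the focal node
foc <- layout_with_focus(karate, v = focal_id)

# number of nodes per concentric circle (0 is the focal node itself)
table(foc$distance)
```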

The function draw_circle() can be used to add the circles explicitly.

ggraph(gotS1, layout = "focus", focus = 1) +
    draw_circle(col = "#00BFFF", use = "focus", max.circle = 3) +
    geom_edge_link0(aes(width = weight), edge_colour = "grey66") +
    geom_node_point(aes(fill = clu, size = size), shape = 21) +
    geom_node_text(aes(filter = (name == "Ned"), size = size, label = name),
        family = "serif"
    ) +
    scale_edge_width_continuous(range = c(0.2, 1.2)) +
    scale_size_continuous(range = c(1, 5)) +
    scale_fill_manual(values = got_palette) +
    coord_fixed() +
    theme_graph() +
    theme(legend.position = "none")

layout_with_centrality() works in a similar way. You can specify any centrality index (or any numeric vector for that matter), and create a concentric layout where the most central nodes are put in the center and the most peripheral nodes in the biggest circle. The numeric attribute used for the layout is specified with the cent parameter. Here, we use the weighted degree of the characters.

ggraph(gotS1, layout = "centrality", cent = strength(gotS1)) +
    geom_edge_link0(aes(edge_linewidth = weight), edge_colour = "grey66") +
    geom_node_point(aes(fill = clu, size = size), shape = 21) +
    geom_node_text(aes(size = size, label = name), family = "serif") +
    scale_edge_width_continuous(range = c(0.2, 0.9)) +
    scale_size_continuous(range = c(1, 8)) +
    scale_fill_manual(values = got_palette) +
    coord_fixed() +
    theme_graph() +
    theme(legend.position = "none")

(Concentric layouts are not only helpful to focus on specific nodes, but also make for a good tool to visualize ego networks.)
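A possible way to draw an ego network with the focus layout is sketched below. It again uses Zachary's karate club as a stand-in graph. The choice of ego (node "1") and of a two-step neighbourhood are assumptions made for the example.

```r
library(igraph)
library(ggraph)
library(graphlayouts)

# stand-in graph (assumption, not the GoT data)
karate <- make_graph("Zachary")
V(karate)$name <- as.character(seq_len(vcount(karate)))

# ego network: all nodes within two steps of node "1"
ego_g <- make_ego_graph(karate, order = 2, nodes = 1)[[1]]

# the ego may have a different id in the subgraph, so look it up by name
ego_id <- which(V(ego_g)$name == "1")

p <- ggraph(ego_g, layout = "focus", focus = ego_id) +
    draw_circle(use = "focus", max.circle = 2) +
    geom_edge_link0(edge_colour = "grey66") +
    geom_node_point(shape = 21, fill = "grey25", size = 3) +
    coord_fixed() +
    theme_graph()
p
```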

9.3 Backbone Layout

layout_as_backbone() is a layout algorithm that can help emphasize hidden group structures. To illustrate the performance of the algorithm, we create an artificial network with a subtle group structure using sample_islands() from igraph.

g <- sample_islands(9, 40, 0.4, 15)
g <- simplify(g)
V(g)$grp <- as.character(rep(1:9, each = 40))

The network consists of 9 groups with 40 vertices each. The density within each group is 0.4 and there are 15 edges running between each pair of groups. Let us try to visualize the network with what we have learned so far.
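These properties can be checked on a freshly sampled network. The sketch below spells out the argument names of sample_islands(). The seed and the object names are arbitrary choices for this example.

```r
library(igraph)

set.seed(1)
g_chk <- simplify(sample_islands(
    islands.n = 9, islands.size = 40, islands.pin = 0.4, n.inter = 15
))
grp_chk <- rep(1:9, each = 40)

# density within the first group (should be close to 0.4)
edge_density(induced_subgraph(g_chk, which(grp_chk == 1)))

# number of edges between groups 1 and 2
# (at most 15, since simplify() collapses multi-edges)
el <- ends(g_chk, E(g_chk), names = FALSE)
g_from <- grp_chk[el[, 1]]
g_to <- grp_chk[el[, 2]]
n_inter_12 <- sum((g_from == 1 & g_to == 2) | (g_from == 2 & g_to == 1))
n_inter_12
```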

ggraph(g, layout = "stress") +
    geom_edge_link0(edge_colour = "black", edge_linewidth = 0.1, edge_alpha = 0.5) +
    geom_node_point(aes(fill = grp), shape = 21) +
    scale_fill_brewer(palette = "Set1") +
    theme_graph() +
    theme(legend.position = "none")

As you can see, the graph seems to be a proper “hairball” without any special structural features standing out. In this case, though, we know that there should be 9 groups of vertices that are internally more densely connected than externally. To uncover this group structure, we turn to the “backbone layout”.

bb <- layout_as_backbone(g, keep = 0.4)
E(g)$col <- FALSE
E(g)$col[bb$backbone] <- TRUE

The idea of the algorithm is as follows. For each edge, an embeddedness score is calculated which serves as an edge weight attribute. These weights are then ordered and only the edges with the highest score are kept. The number of edges to keep is controlled with the keep parameter. In our example, we keep the top 40%. The parameter usually requires some experimenting to find out what works best. Since this may result in an unconnected network, we add all edges of the union of all maximum spanning trees. The resulting network is the “backbone” of the original network and the “stress” layout algorithm is applied to this network. Once the layout is calculated, all edges are added back to the network.

The function returns the x and y coordinates of the nodes and a vector with the ids of the edges in the backbone network. In the code above, we use this vector to create a binary edge attribute that indicates whether an edge is part of the backbone.
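To get a feeling for the returned object, the following sketch inspects both components on a newly sampled islands network. The object names and the seed are chosen for this example only.

```r
library(igraph)
library(graphlayouts)

set.seed(1)
g_bb <- simplify(sample_islands(9, 40, 0.4, 15))
bb_chk <- layout_as_backbone(g_bb, keep = 0.4)

# coordinate matrix: one row per node
dim(bb_chk$xy)

# share of edges in the backbone; this can differ from `keep`
# because spanning tree edges are added to keep the backbone connected
length(bb_chk$backbone) / ecount(g_bb)
```

Comparing this share for a few values of keep is one way to approach the experimenting mentioned above.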

ggraph(g, layout = "manual", x = bb$xy[, 1], y = bb$xy[, 2]) +
    geom_edge_link0(aes(edge_colour = col), edge_linewidth = 0.1) +
    geom_node_point(aes(fill = grp), shape = 21) +
    scale_fill_brewer(palette = "Set1") +
    scale_edge_color_manual(values = c(rgb(0, 0, 0, 0.3), rgb(0, 0, 0, 1))) +
    theme_graph() +
    theme(legend.position = "none")

The groups are now clearly visible! Of course the network used in the example is specifically tailored to illustrate the power of the algorithm. Using the backbone layout in real world networks may not always result in such a clear division of groups. It should thus not be seen as a universal remedy for drawing hairball networks. Keep in mind: It can only emphasize a hidden group structure if it exists.

The plot below shows an empirical example where the algorithm was able to uncover a hidden group structure. The network shows Facebook friendships at a university in the US. Node colour corresponds to the dormitory of the students. The ordinary stress layout is on the left and the backbone layout on the right.

9.4 Longitudinal Networks

Longitudinal network data usually comes in the form of panel data, gathered at different points in time. We thus have a series of snapshots that need to be visualized in a way that individual nodes are easy to trace without the layout becoming too awkward.

For this part of the tutorial, you will need gganimate and patchwork in addition to the packages loaded so far (ggplot2 is already a dependency of ggraph).

library(gganimate)
library(ggplot2)
library(patchwork)

We will be using the 50-actor excerpt from the Teenage Friends and Lifestyle Study from the RSiena data repository as an example. The data is part of the networkdata package.

data("s50")

The dataset consists of three networks with 50 actors each and a vertex attribute for the smoking behavior of students. The function layout_as_dynamic() from graphlayouts can be used to visualize the three networks. The implemented algorithm calculates a reference layout, which is a layout of the union of all networks, and individual layouts based on stress minimization. The two are combined in a linear combination that is controlled by the alpha parameter. For alpha = 1, only the reference layout is used and all graphs have the same layout. For alpha = 0, the stress layout of each individual graph is used. Values in between interpolate between the two layouts.

xy <- layout_as_dynamic(s50, alpha = 0.2)
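The effect of alpha can be checked at its two extremes. This is a small sketch and the object names are chosen for this example. It only compares the first two waves.

```r
library(igraph)
library(graphlayouts)
library(networkdata)

data("s50")

xy_ref <- layout_as_dynamic(s50, alpha = 1) # reference layout only
xy_ind <- layout_as_dynamic(s50, alpha = 0) # individual stress layouts only

# largest coordinate difference between wave 1 and wave 2;
# following the description above, this should vanish for alpha = 1
max(abs(xy_ref[[1]] - xy_ref[[2]]))

# with alpha = 0 the waves are laid out independently and will usually differ
max(abs(xy_ind[[1]] - xy_ind[[2]]))
```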

Now you could use ggraph in conjunction with patchwork to produce a static plot with all networks side-by-side.

pList <- vector("list", length(s50))

for (i in seq_along(s50)) {
    pList[[i]] <- ggraph(s50[[i]], layout = "manual", x = xy[[i]][, 1], y = xy[[i]][, 2]) +
        geom_edge_link0(edge_linewidth = 0.6, edge_colour = "grey66") +
        geom_node_point(shape = 21, aes(fill = as.factor(smoke)), size = 6) +
        geom_node_text(label = 1:50, repel = FALSE, color = "white", size = 4) +
        scale_fill_manual(
            values = c("forestgreen", "grey25", "firebrick"),
            guide = ifelse(i != 2, "none", "legend"),
            name = "smoking",
            labels = c("never", "occasionally", "regularly")
        ) +
        theme_graph() +
        theme(legend.position = "bottom") +
        labs(title = paste0("Wave ", i))
}

wrap_plots(pList)

This is nice but of course we want to animate the changes. This is where we have to get inventive, because ggraph does not (yet) work with gganimate out of the box. For the time being, the function below provides a hacky workaround to produce a data structure that can be passed to gganimate.

aninet_data <- function(gList, alpha = 0.2, nodes = NULL) {
    # check for absent nodes and add them
    if (is.null(nodes)) {
        all_nodes <- unique(unlist(sapply(gList, function(x) vertex_attr(x, "name"))))
        for (i in 1:length(gList)) {
            idx <- which(!all_nodes %in% V(gList[[i]])$name)
            if (length(idx) > 0) {
                gList[[i]] <- add_vertices(gList[[i]], length(idx), attr = list(name = all_nodes[idx]))
            }
        }
    } else if (nodes >= 0) {
        all_nodes <- unlist(sapply(gList, function(x) vertex_attr(x, "name")))
        all_nodes <- names(which(table(all_nodes) >= nodes * length(gList)))

        for (i in 1:length(gList)) {
            idx <- which(!all_nodes %in% V(gList[[i]])$name)
            if (length(idx) > 0) {
                gList[[i]] <- add_vertices(gList[[i]], length(idx), attr = list(name = all_nodes[idx]))
            }
        }

        for (i in 1:length(gList)) {
            idx <- which(!V(gList[[i]])$name %in% all_nodes)
            if (length(idx) > 0) {
                gList[[i]] <- delete_vertices(gList[[i]], idx)
            }
        }
    }

    xy <- graphlayouts::layout_as_dynamic(gList, alpha = alpha)
    nodes_lst <- lapply(1:length(gList), function(i) {
        cbind(igraph::as_data_frame(gList[[i]], "vertices"),
            x = xy[[i]][, 1], y = xy[[i]][, 2], frame = i
        )
    })

    edges_lst <- lapply(1:length(gList), function(i) cbind(igraph::as_data_frame(gList[[i]], "edges"), frame = i))

    edges_lst <- lapply(1:length(gList), function(i) {
        edges_lst[[i]]$x <- nodes_lst[[i]]$x[match(edges_lst[[i]]$from, nodes_lst[[i]]$name)]
        edges_lst[[i]]$y <- nodes_lst[[i]]$y[match(edges_lst[[i]]$from, nodes_lst[[i]]$name)]
        edges_lst[[i]]$xend <- nodes_lst[[i]]$x[match(edges_lst[[i]]$to, nodes_lst[[i]]$name)]
        edges_lst[[i]]$yend <- nodes_lst[[i]]$y[match(edges_lst[[i]]$to, nodes_lst[[i]]$name)]
        edges_lst[[i]]$id <- paste0(edges_lst[[i]]$from, "-", edges_lst[[i]]$to)
        edges_lst[[i]]$status <- TRUE
        edges_lst[[i]]
    })

    all_edges <- do.call("rbind", lapply(gList, as_edgelist))
    all_edges <- all_edges[!duplicated(all_edges), ]
    all_edges <- cbind(all_edges, paste0(all_edges[, 1], "-", all_edges[, 2]))

    edges_lst <- lapply(1:length(gList), function(i) {
        idx <- which(!all_edges[, 3] %in% edges_lst[[i]]$id)
        if (length(idx) > 0) {
            tmp <- data.frame(from = all_edges[idx, 1], to = all_edges[idx, 2], id = all_edges[idx, 3])
            tmp$x <- nodes_lst[[i]]$x[match(tmp$from, nodes_lst[[i]]$name)]
            tmp$y <- nodes_lst[[i]]$y[match(tmp$from, nodes_lst[[i]]$name)]
            tmp$xend <- nodes_lst[[i]]$x[match(tmp$to, nodes_lst[[i]]$name)]
            tmp$yend <- nodes_lst[[i]]$y[match(tmp$to, nodes_lst[[i]]$name)]
            tmp$frame <- i
            tmp$status <- FALSE
            idy <- which(!names(edges_lst[[i]]) %in% names(tmp))
            if (length(idy) > 0) {
                tmp[names(edges_lst[[i]])[idy]] <- NA
            }
            edges_lst[[i]] <- rbind(edges_lst[[i]], tmp)
        }
        edges_lst[[i]]
    })

    edges_df <- do.call("rbind", edges_lst)
    nodes_df <- do.call("rbind", nodes_lst)
    list(nodes = nodes_df, edges = edges_df)
}
dat <- aninet_data(s50)

And that’s it in terms of data wrangling. All that is left is to plot/animate the data.

ggplot() +
  geom_segment(
    data = dat$edges,
    aes(x = x, xend = xend, y = y, yend = yend, group = id, alpha = status),
    show.legend = FALSE
  ) +
  geom_point(
    data = dat$nodes, aes(x, y, group = name, fill = as.factor(smoke)),
    shape = 21, size = 4, show.legend = FALSE
  ) +
  scale_fill_manual(values = c("forestgreen", "grey25", "firebrick")) +
  scale_alpha_manual(values = c(0, 1)) +
  ease_aes("quadratic-in-out") +
  transition_states(frame, state_length = 0.5, wrap = FALSE) +
  labs(title = "Wave {closest_state}") +
  theme_void()

9.5 Multilevel networks

In this section, you will get to know layout_as_multilevel(), a layout algorithm in the graphlayouts package which can be used to visualize multilevel networks.

A multilevel network consists of two (or more) levels with different node sets and intra-level ties. For instance, one level could be scientists and their collaborative ties, a second level could be labs and the ties among them, and the inter-level edges could be the affiliations of scientists with labs.

The graphlayouts package contains an artificial multilevel network which will be used to illustrate the algorithm.

data("multilvl_ex")

The package assumes that a multilevel network has a vertex attribute called lvl which holds the level information (1 or 2).
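To make the expected data structure concrete, here is a small sketch that builds a made-up two-level network with igraph (four hypothetical scientists and two hypothetical labs), stores the level in lvl, and classifies each edge as intra- or inter-level.

```r
library(igraph)

# hypothetical toy data: scientists (level 1) and labs (level 2)
nodes <- data.frame(
    name = c("s1", "s2", "s3", "s4", "labA", "labB"),
    lvl = c(1, 1, 1, 1, 2, 2)
)
edges <- data.frame(
    from = c("s1", "s2", "s3", "s1", "s2", "s3", "s4", "labA"),
    to = c("s2", "s3", "s4", "labA", "labA", "labB", "labB", "labB")
)
toy <- graph_from_data_frame(edges, directed = FALSE, vertices = nodes)

# classify each edge by the levels of its two endpoints
el <- ends(toy, E(toy), names = FALSE)
lvl_from <- V(toy)$lvl[el[, 1]]
lvl_to <- V(toy)$lvl[el[, 2]]
E(toy)$type <- ifelse(lvl_from != lvl_to, "inter",
    ifelse(lvl_from == 1, "intra level 1", "intra level 2")
)
table(E(toy)$type)
```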

The underlying algorithm of layout_as_multilevel() has three different versions, which can be used to emphasize different structural features of a multilevel network.

Independent of which option is chosen, the algorithm internally produces a 3D layout, where each level is positioned on a different y-plane. The 3D layout is then mapped to 2D with an isometric projection. The parameters alpha and beta control the perspective of the projection. The default values seem to work for many instances, but may not always be optimal. As a rough guideline: beta rotates the plot around the y axis (in 3D) and alpha moves the POV up or down.
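The role of the two angles can be illustrated with a generic textbook projection: rotate around the y axis by beta, rotate around the x axis by alpha, and drop the depth coordinate. The function project_iso() below is a hypothetical helper written for this illustration. It is not taken from graphlayouts and may differ from the package internals.

```r
# Rotate 3D points around the y axis (beta) and then the x axis (alpha),
# both given in degrees, and keep the first two coordinates.
project_iso <- function(xyz, alpha = 35, beta = 45) {
    a <- alpha * pi / 180
    b <- beta * pi / 180
    rot_x <- matrix(c(
        1, 0, 0,
        0, cos(a), sin(a),
        0, -sin(a), cos(a)
    ), nrow = 3, byrow = TRUE)
    rot_y <- matrix(c(
        cos(b), 0, -sin(b),
        0, 1, 0,
        sin(b), 0, cos(b)
    ), nrow = 3, byrow = TRUE)
    rotated <- t(rot_x %*% rot_y %*% t(xyz))
    rotated[, 1:2, drop = FALSE]
}

# two levels: the same unit square placed on the planes y = 0 and y = 1
square <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1))
xyz <- rbind(
    cbind(x = square[, 1], y = 0, z = square[, 2]),
    cbind(x = square[, 1], y = 1, z = square[, 2])
)

project_iso(xyz, alpha = 25, beta = 45)
```

With alpha = 0 and beta = 0 both rotations are the identity, so the projection simply returns the original x and y coordinates.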

9.5.1 Complete layout

A layout for the complete network can be computed via layout_as_multilevel() setting type = "all". Internally, the algorithm produces a constrained 3D stress layout (each level on a different y plane) which is then projected to 2D. This layout ignores potential differences in each level and optimizes only the overall layout.

xy <- layout_as_multilevel(multilvl_ex, type = "all", alpha = 25, beta = 45)

To visualize the network with ggraph, you may want to draw the edges for each level (and inter level edges) with a different edge geom. This gives you more flexibility to control aesthetics and can easily be achieved with a filter.

ggraph(multilvl_ex, "manual", x = xy[, 1], y = xy[, 2]) +
    geom_edge_link0(
        aes(filter = (node1.lvl == 1 & node2.lvl == 1)),
        edge_colour = "firebrick3",
        alpha = 0.5,
        edge_linewidth = 0.3
    ) +
    geom_edge_link0(
        aes(filter = (node1.lvl != node2.lvl)),
        alpha = 0.3,
        edge_linewidth = 0.1,
        edge_colour = "black"
    ) +
    geom_edge_link0(
        aes(filter = (node1.lvl == 2 &
            node2.lvl == 2)),
        edge_colour = "goldenrod3",
        edge_linewidth = 0.3,
        alpha = 0.5
    ) +
    geom_node_point(aes(shape = as.factor(lvl)), fill = "grey25", size = 3) +
    scale_shape_manual(values = c(21, 22)) +
    theme_graph() +
    coord_cartesian(clip = "off", expand = TRUE) +
    theme(legend.position = "none")

9.5.2 Separate layouts for both levels

In many instances, there may be different structural properties inherent to the levels of the network. In that case, two layout functions can be passed to layout_as_multilevel() to deal with these differences. In our artificial network, level 1 has a hidden group structure and level 2 has a core-periphery structure.

To use this layout option, set type = "separate" and specify two layout functions with FUN1 and FUN2. You can change internal parameters of these layout functions by passing named lists to the params1 and params2 arguments. Note that this version optimizes inter-level edges only minimally. The emphasis is on the intra-level structures.

xy <- layout_as_multilevel(multilvl_ex,
    type = "separate",
    FUN1 = layout_as_backbone,
    FUN2 = layout_with_stress,
    alpha = 25, beta = 45
)
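As a sketch of how params1 and params2 might be used: the call below passes keep to layout_as_backbone() and iter to layout_with_stress(). The specific values (0.3 and 1000) are arbitrary and only serve as an illustration. The result is stored under a separate name so that xy from above is left untouched.

```r
library(igraph)
library(graphlayouts)

data("multilvl_ex", package = "graphlayouts")

# named lists are forwarded to the respective layout function
xy_params <- layout_as_multilevel(multilvl_ex,
    type = "separate",
    FUN1 = layout_as_backbone,
    FUN2 = layout_with_stress,
    params1 = list(keep = 0.3), # argument of layout_as_backbone()
    params2 = list(iter = 1000), # argument of layout_with_stress()
    alpha = 25, beta = 45
)

dim(xy_params)
```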

Again, try to include an edge geom for each level.

cols2 <- c(
    "#3A5FCD", "#CD00CD", "#EE30A7", "#EE6363",
    "#CD2626", "#458B00", "#EEB422", "#EE7600"
)

ggraph(multilvl_ex, "manual", x = xy[, 1], y = xy[, 2]) +
    geom_edge_link0(
        aes(
            filter = (node1.lvl == 1 & node2.lvl == 1),
            edge_colour = col
        ),
        alpha = 0.5, edge_linewidth = 0.3
    ) +
    geom_edge_link0(
        aes(filter = (node1.lvl != node2.lvl)),
        alpha = 0.3,
        edge_linewidth = 0.1,
        edge_colour = "black"
    ) +
    geom_edge_link0(
        aes(
            filter = (node1.lvl == 2 & node2.lvl == 2),
            edge_colour = col
        ),
        edge_linewidth = 0.3, alpha = 0.5
    ) +
    geom_node_point(aes(
        fill = as.factor(grp),
        shape = as.factor(lvl),
        size = nsize
    )) +
    scale_shape_manual(values = c(21, 22)) +
    scale_size_continuous(range = c(1.5, 4.5)) +
    scale_fill_manual(values = cols2) +
    scale_edge_color_manual(values = cols2, na.value = "grey12") +
    scale_edge_alpha_manual(values = c(0.1, 0.7)) +
    theme_graph() +
    coord_cartesian(clip = "off", expand = TRUE) +
    theme(legend.position = "none")

9.5.3 Fix only one level

This layout can be used to emphasize one intra-level structure. The layout of the second level is calculated in a way that optimizes inter-level edge placement. Set type = "fix1" and specify FUN1 and possibly params1 to fix level 1 or set type = "fix2" and specify FUN2 and possibly params2 to fix level 2.

xy <- layout_as_multilevel(multilvl_ex,
    type = "fix2",
    FUN2 = layout_with_stress,
    alpha = 25, beta = 45
)

ggraph(multilvl_ex, "manual", x = xy[, 1], y = xy[, 2]) +
    geom_edge_link0(
        aes(
            filter = (node1.lvl == 1 & node2.lvl == 1),
            edge_colour = col
        ),
        alpha = 0.5, edge_linewidth = 0.3
    ) +
    geom_edge_link0(
        aes(filter = (node1.lvl != node2.lvl)),
        alpha = 0.3,
        edge_linewidth = 0.1,
        edge_colour = "black"
    ) +
    geom_edge_link0(
        aes(
            filter = (node1.lvl == 2 & node2.lvl == 2),
            edge_colour = col
        ),
        edge_linewidth = 0.3, alpha = 0.5
    ) +
    geom_node_point(aes(
        fill = as.factor(grp),
        shape = as.factor(lvl),
        size = nsize
    )) +
    scale_shape_manual(values = c(21, 22)) +
    scale_size_continuous(range = c(1.5, 4.5)) +
    scale_fill_manual(values = cols2) +
    scale_edge_color_manual(values = cols2, na.value = "grey12") +
    scale_edge_alpha_manual(values = c(0.1, 0.7)) +
    theme_graph() +
    coord_cartesian(clip = "off", expand = TRUE) +
    theme(legend.position = "none")

9.5.4 3D with threejs

Instead of the default 2D projection, layout_as_multilevel() can also return the 3D layout by setting project2D = FALSE. The 3D layout can then be used with e.g. threejs to produce an interactive 3D visualization.

library(threejs)
xyz <- layout_as_multilevel(multilvl_ex,
    type = "separate",
    FUN1 = layout_as_backbone,
    FUN2 = layout_with_stress,
    project2D = FALSE
)
multilvl_ex$layout <- xyz
V(multilvl_ex)$color <- c("#00BFFF", "#FF69B4")[V(multilvl_ex)$lvl]
V(multilvl_ex)$vertex.label <- V(multilvl_ex)$name

graphjs(multilvl_ex, bg = "black", vertex.shape = "sphere")