R: how to control the behaviour of edges in ggraph

Posted 2019-07-17 06:36

Question:

I'm facing the following issue. I have some data like this:

library(tidyverse)
library(tidygraph)
library(ggraph)
library(ggrepel)

edges <- data.frame(a=c('k','k','k','k','k','z','z'),
                    b=c('b','b','b','b','c','b','c'), costant = 1)
  a b costant
1 k b       1
2 k b       1
3 k b       1
4 k b       1
5 k c       1
6 z b       1
7 z c       1

Now I would like to build a graph with ggraph that has nodes and edges with weights. So I proceeded as follows:

# first I calculated the edges weights
edges1 <- edges%>% group_by(a,b) %>% summarise(weight = sum(costant))
> edges1
# A tibble: 4 x 3
# Groups:   a [?]
  a     b     weight
  <fct> <fct>  <dbl>
1 k     b          4
2 k     c          1
3 z     b          1
4 z     c          1

Then the nodes:

nodes <- rbind(data.frame(word = edges$a, n = 1), data.frame(word = edges$b, n = 1)) %>%
  group_by(word) %>%
  summarise(n = sum(n))
> nodes
# A tibble: 4 x 2
  word      n
  <fct> <dbl>
1 k         5
2 z         2
3 b         5
4 c         2

So far, everything works fine. Next, following this as an example:

tidy <- tbl_graph(nodes = nodes, edges = edges1, directed = TRUE)
tidy <- tidy %>% 
  activate(edges) %>% 
  arrange(desc(weight))

Then I plotted the graph:

ggraph(tidy, layout = "gem") + 
  geom_node_point(aes(size=n)) +
  geom_edge_link(aes(width = weight), alpha = 0.8) + 
  scale_edge_width(range = c(0.2, 2)) +
  geom_text_repel(aes(x = x, y=y , label=word)) 

But the resulting plot shows a line between k and z, and I cannot figure out why, because that edge does not exist.

Thanks in advance.

Answer 1:

This seems to happen because tbl_graph converts the factor columns of the edges1 tibble to integers with as.integer, without taking the nodes tibble into account. Integer endpoints are read as row indices into the nodes tibble, so the edges end up attached to the wrong nodes. If we convert the edge endpoints to the correct integer indices beforehand, it works as expected.

edges <- data.frame(a=c('k','k','k','k','k','z','z'),
                    b=c('b','b','b','b','c','b','c'), costant = 1)
edges1 <- edges%>% group_by(a,b) %>% summarise(weight = sum(costant))

nodes <- rbind(data.frame(word = edges$a, n = 1),data.frame(word = edges$b, n = 1)) %>%
  group_by(word) %>%
  summarise(n = sum(n))

edges2 <- edges1 # keep a copy of the edges with factor node labels in edges2
# convert the 'from' and 'to' factor columns to integer columns,
# using the matching row index of each label in the nodes tibble
edges1$a <- match(edges1$a, nodes$word) 
edges1$b <- match(edges1$b, nodes$word)

tidy <- tbl_graph(nodes = nodes, edges = edges1, directed = TRUE)
tidy <- tidy %>% 
  activate(edges) %>% 
  arrange(desc(weight))

ggraph(tidy, layout = "gem") + 
  geom_node_point(aes(size = n)) +
  geom_edge_link(aes(width = weight),
                 arrow = arrow(length = unit(4, 'mm')),
                 end_cap = circle(3, 'mm'), alpha = 0.8) + 
  scale_edge_width(range = c(0.2, 2)) +
  geom_text_repel(aes(x = x, y = y, label = word))

edges2 # compare the edges in this tibble with the plotted graph
# A tibble: 4 x 3
# Groups:   a [?]
  a     b     weight
  <fct> <fct>  <dbl>
1 k     b          4
2 k     c          1
3 z     b          1
4 z     c          1
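
The index mismatch described in this answer can be reproduced with base R alone. The following is a minimal sketch, not tidygraph's actual implementation. It assumes factor columns (the data.frame default before R 4.0.0, forced here with stringsAsFactors = TRUE), and node_names is a hypothetical vector that mirrors the row order of the nodes tibble shown in the question.

```r
# Rebuild the question's edge list with explicit factor columns.
# (Assumption: stringsAsFactors = TRUE mimics the pre-4.0.0 default.)
edges <- data.frame(a = c('k','k','k','k','k','z','z'),
                    b = c('b','b','b','b','c','b','c'),
                    stringsAsFactors = TRUE)

# Hypothetical helper vector: node labels in the row order of the nodes tibble.
node_names <- c('k', 'z', 'b', 'c')

# Each factor column has its own level set, so both use the codes 1 and 2.
levels(edges$a)  # "k" "z"
levels(edges$b)  # "b" "c"

# Plain as.integer() returns the per-column level codes.
from_codes <- as.integer(edges$a)
to_codes   <- as.integer(edges$b)

# Reading those codes as row indices into the node table attaches the
# edges to the wrong nodes: only k and z are ever reached.
wrong <- unique(data.frame(from = node_names[from_codes],
                           to   = node_names[to_codes]))

# match() looks each label up in the node table instead, which gives
# the intended endpoints.
from_idx <- match(edges$a, node_names)
to_idx   <- match(edges$b, node_names)
right <- unique(data.frame(from = node_names[from_idx],
                           to   = node_names[to_idx]))

print(wrong)
print(right)
```

In this sketch, wrong contains a k to z edge (plus self-loops on k and z), which is consistent with the unexpected line in the question's plot, while right contains only the four edges listed in edges2.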



Tags: r ggplot2 ggraph