how to plot degree distribution in R

Posted 2019-04-12 17:23

Question:

I would like to know whether the output of my script for plotting a degree distribution is correct.

The degrees of all my vertices are stored in the vector x:

x
 [1] 7 9 8 5 6 2 8 9 7 5 2 4 6 9 2 6 10 8

Each entry of x is the degree of one vertex of the network: vertex 1 has degree 7, vertex 2 has degree 9, and so on. The script is:

x <- v2
summary(x)

library(igraph)
split.screen(c(1,2))
screen(1)
plot (tabulate(x), log = "xy", ylab = "Frequency (log scale)", xlab = "Degree (log scale)", main = "Log-log plot of degree distribution")
screen(2)
y <- (length(x) - rank(x, ties.method = "first"))/length(x)
plot(x, y, log = "xy", ylab = "Fraction with min. degree k (log scale)", xlab = "Degree (k) (log scale)", main = "Cumulative log-log plot of degree distribution")
close.screen(all = TRUE)
power.law.fit(x, xmin = 50)

My problem is that the log-log plot seems to be incorrect. For instance, the degree 7 occurs 8 times overall, so shouldn't this point appear on the log-log plot at (0.845, 0.903), i.e. at (log10(7), log10(8)) as (x, y)?

Moreover, can somebody tell me how to fit a line (the power law on the log-log scale) to the plot in screen 2?

Answer 1:

I'm not familiar with the igraph package, so I can't help you with that specific package. However, here is some code for plotting distributions on a log-log plot. First, some data:

set.seed(1)
x = ceiling(rlnorm(1000, 4))

Then we need to rearrange the data to get the inverse (complementary) CDF, i.e. P(X >= x):

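tab = table(x)
## Relative frequency of each distinct value
p = as.vector(tab)/sum(tab)
## P(X >= k): cumulative sum taken from the largest value downwards
y = rev(cumsum(rev(p)))
## Distinct values, kept separate so that x still holds the raw data
k = as.numeric(names(tab))
plot(k, y, log="xy", type="l")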

This gives a line plot of P(X >= x) against x, with both axes on a log scale.
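To relate this back to the numbers in the question, here is a minimal sketch using the 18 degrees posted there. The names `deg`, `counts`, `k`, `keep` and `ccdf` are mine, chosen for illustration. It shows that `tabulate()` returns one count per degree 1, 2, ..., max, and that `log = "xy"` only rescales the axes: the tick labelled 7 sits at position log10(7), roughly 0.845, but it is still labelled 7. In this small sample degree 7 occurs twice; the count of 8 mentioned in the question presumably refers to the full data set.

```r
# The 18 degrees posted in the question
deg <- c(7, 9, 8, 5, 6, 2, 8, 9, 7, 5, 2, 4, 6, 9, 2, 6, 10, 8)

# counts[k] is the number of vertices with degree k, for k = 1..max(deg)
counts <- tabulate(deg)
k <- seq_along(counts)

# Degrees that never occur have count 0, which cannot be drawn on a log axis,
# so drop them before plotting
keep <- counts > 0
plot(k[keep], counts[keep], log = "xy",
     xlab = "Degree (log scale)", ylab = "Frequency (log scale)")

# Fraction of vertices with degree >= k (the quantity in the cumulative plot)
ccdf <- rev(cumsum(rev(counts))) / length(deg)

counts[7]   # number of vertices with degree 7 in this sample
log10(7)    # position of the tick labelled 7 on a log10 axis
```

Plotting `k` against `counts` explicitly, rather than calling `plot(tabulate(x))`, makes it clearer which axis carries the degree and which the frequency.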

Regarding your fitting question, I think the discrepancy arises because igraph uses the MLE whereas you are doing simple linear regression (which is not recommended).
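As a rough illustration of why the two approaches can disagree, the sketch below simulates continuous power-law data with a known exponent and compares the closed-form maximum-likelihood estimate with a least-squares line fitted on the log-log scale. It is a base-R sketch under my own assumptions (simulated data, a known `xmin`, and the continuous-case formula alpha = 1 + n / sum(log(x / xmin))); it is not meant to reproduce what igraph or any other package does internally, and all variable names are illustrative.

```r
set.seed(1)
n <- 10000
xmin <- 1
alpha <- 2.5   # true exponent, chosen for illustration

# Inverse-transform sampling from a continuous power law with density
# proportional to x^(-alpha) for x >= xmin
u <- runif(n)
sim <- xmin * (1 - u)^(-1 / (alpha - 1))

# Maximum-likelihood estimate of the exponent (continuous case, xmin known)
alpha_mle <- 1 + n / sum(log(sim / xmin))

# Least-squares line through the empirical complementary CDF on log-log axes;
# the CCDF of a power law has slope -(alpha - 1)
xs <- sort(sim)
ccdf <- (n:1) / n
fit <- lm(log(ccdf) ~ log(xs))
alpha_lm <- 1 - unname(coef(fit)[2])

c(mle = alpha_mle, regression = alpha_lm)
```

On real degree data the gap between the two estimates is typically larger than on clean simulated data, because the regression is sensitive to the noisy tail and to the choice of `xmin`.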


As a bit of a plug, I've started work on a package for fitting and plotting power laws. Using this package, you get:

library(poweRlaw)

## Create a displ (discrete power-law) object
m = displ$new(x)
## Estimate the cut-off
estimate_xmin(m)
## Set the cut-off and the scaling parameter
m$setXmin(105); m$setPars(2.644)

## Plot the data and the PL line
plot(m)
lines(m, col=2)