How to plot CCDF graph on a logarithmic scale?

2020-07-30 01:26发布

I want to plot a CCDF graph for some of my simulated power-law tail data on a log-log axis, below is my R code of plotting a CCDF graph on a normal axis, I used the code on the link: (How to plot a CCDF gragh?)

> load("fakedata500.Rda")
> x<-fakedata500
> f<-ecdf(x)
> f
Empirical CDF 
Call: ecdf(x)
 x[1:500] = 0.50174, 0.50307, 0.50383,  ..., 81.674, 140.63
> plot(f)

Below is the ECDF graph:

enter image description here

> plot(sort(x), 1-f(sort(x)), type="s", lwd=1)

and this command gives me the CCDF graph:

enter image description here

However, I would like to plot my CCDF graph on a log-log axis to produce a graph like the picture below: (graph from "Minimizing errors in identifying Lévy flight behaviour of organisms.")

enter image description here

Is there a way to do it in R?

If so, how to do a linear regression on the CCDF graph as well? I have tried using command below but that just not working for me.

a<-plot(sort(x), 1-f(sort(x)), type="s", lwd=1)
> a
NULL
> res=lm(a)
Error in terms.formula(formula, data = data) : 
  argument is not a valid model

Much appreciate.


UPDATE:

I used the code given by @BondedDust and successfully generated a CCDF graph:

(plot(sort(x) , 1-ecdf(x)(sort(x) ), log="xy"))

enter image description here

Below is the code of how I generated my dataset:

u<-runif(500)
fakedata500<-((2*(1-u))^(-1))

3条回答
神经病院院长
2楼-- · 2020-07-30 01:54

You should use the [poweRlaw][1] package for this.

Load the package and generate some data:

library(poweRlaw)
x = rplcon(1000, 1, 2)

Next create a continuous powerlaw object

m = conpl$new(x)

Plot

plot(m)

Estimate the scaling parameter,

p = estimate_pars(m)
m$setPars(p)

and add the fitted line to the plot

lines(m, col=2)

See the package vignettes for more details.

BTW, avoid simple linear regression on estimating the scaling parameter (all assumptions are broken).

查看更多
祖国的老花朵
3楼-- · 2020-07-30 02:00

You give no data so no regression output will be attempted, but it should be pretty easy to construct the plot:

plot(sort(x) , 1-ecdf(x)(sort(x) ), log="xy")

You do get a warning because the most extreme point will have ecdf of 1 and 1-1 cannot have a log value at it is omitted. With a random sample of lognormal variable that plot looks like. Alternate labeling of the axis can be done by suppressing axis ticks in the plot call and then adding them with the axis function. This was done with a sample of 500 realizations from the lognormal distribution with default paramaters:

enter image description here

查看更多
可以哭但决不认输i
4楼-- · 2020-07-30 02:16

Here's another way using quantiles.

library(VGAM)  # for rpareto(...)
set.seed(1)    # for reproducible example
X <- rpareto(1000,location=1,shape=1)
p <- ppoints(100)
par(mfrow=c(1,3))
plot(quantile(X,p=p),p,type="l",ylab="P(X < x)",xlab="x",main="CDF")
plot(quantile(X,p=p),1-p,type="l",ylab="P(X > x)",xlab="x",main="CCDF")
plot(log(quantile(X,p=p)),log(1-p),
     ylab="log[P(X > x)]",xlab="log(x)",main="CCDF: log-log")

And here's the regression.

df  <- data.frame(x=log(1-p),y=log(quantile(X,p=p)))
fit <- lm(y~x,df)
summary(fit)
# ...
# Coefficients:
#              Estimate Std. Error  t value Pr(>|t|)    
# (Intercept)  0.039559   0.007584    5.216 1.02e-06 ***
# x           -0.944380   0.005427 -174.028  < 2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 0.05317 on 98 degrees of freedom
# Multiple R-squared:  0.9968,  Adjusted R-squared:  0.9967 
# F-statistic: 3.029e+04 on 1 and 98 DF,  p-value: < 2.2e-16
查看更多
登录 后发表回答