How to plot CCDF graph on a logarithmic scale?

2020-07-30 02:03发布

问题:

I want to plot a CCDF graph for some of my simulated power-law tail data on a log-log axis, below is my R code of plotting a CCDF graph on a normal axis, I used the code on the link: (How to plot a CCDF gragh?)

> load("fakedata500.Rda")
> x<-fakedata500
> f<-ecdf(x)
> f
Empirical CDF 
Call: ecdf(x)
 x[1:500] = 0.50174, 0.50307, 0.50383,  ..., 81.674, 140.63
> plot(f)

Below is the ECDF graph:

> plot(sort(x), 1-f(sort(x)), type="s", lwd=1)

and this command gives me the CCDF graph:

However, I would like to plot my CCDF graph on a log-log axis to produce a graph like the picture below: (graph from "Minimizing errors in identifying Lévy flight behaviour of organisms.")

Is there a way to do it in R?

If so, how to do a linear regression on the CCDF graph as well? I have tried using command below but that just not working for me.

a<-plot(sort(x), 1-f(sort(x)), type="s", lwd=1)
> a
NULL
> res=lm(a)
Error in terms.formula(formula, data = data) : 
  argument is not a valid model

Much appreciate.


UPDATE:

I used the code given by @BondedDust and successfully generated a CCDF graph:

(plot(sort(x) , 1-ecdf(x)(sort(x) ), log="xy"))

Below is the code of how I generated my dataset:

u<-runif(500)
fakedata500<-((2*(1-u))^(-1))

回答1:

Here's another way using quantiles.

library(VGAM)  # for rpareto(...)
set.seed(1)    # for reproducible example
X <- rpareto(1000,location=1,shape=1)
p <- ppoints(100)
par(mfrow=c(1,3))
plot(quantile(X,p=p),p,type="l",ylab="P(X < x)",xlab="x",main="CDF")
plot(quantile(X,p=p),1-p,type="l",ylab="P(X > x)",xlab="x",main="CCDF")
plot(log(quantile(X,p=p)),log(1-p),
     ylab="log[P(X > x)]",xlab="log(x)",main="CCDF: log-log")

And here's the regression.

df  <- data.frame(x=log(1-p),y=log(quantile(X,p=p)))
fit <- lm(y~x,df)
summary(fit)
# ...
# Coefficients:
#              Estimate Std. Error  t value Pr(>|t|)    
# (Intercept)  0.039559   0.007584    5.216 1.02e-06 ***
# x           -0.944380   0.005427 -174.028  < 2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 0.05317 on 98 degrees of freedom
# Multiple R-squared:  0.9968,  Adjusted R-squared:  0.9967 
# F-statistic: 3.029e+04 on 1 and 98 DF,  p-value: < 2.2e-16


回答2:

You give no data so no regression output will be attempted, but it should be pretty easy to construct the plot:

plot(sort(x) , 1-ecdf(x)(sort(x) ), log="xy")

You do get a warning because the most extreme point will have ecdf of 1 and 1-1 cannot have a log value at it is omitted. With a random sample of lognormal variable that plot looks like. Alternate labeling of the axis can be done by suppressing axis ticks in the plot call and then adding them with the axis function. This was done with a sample of 500 realizations from the lognormal distribution with default paramaters:



回答3:

You should use the [poweRlaw][1] package for this.

Load the package and generate some data:

library(poweRlaw)
x = rplcon(1000, 1, 2)

Next create a continuous powerlaw object

m = conpl$new(x)

Plot

plot(m)

Estimate the scaling parameter,

p = estimate_pars(m)
m$setPars(p)

and add the fitted line to the plot

lines(m, col=2)

See the package vignettes for more details.

BTW, avoid simple linear regression on estimating the scaling parameter (all assumptions are broken).