I want to plot a CCDF graph for some of my simulated power-law tail data on a log-log axis, below is my R code of plotting a CCDF graph on a normal axis, I used the code on the link: (How to plot a CCDF gragh?)
> load("fakedata500.Rda")
> x<-fakedata500
> f<-ecdf(x)
> f
Empirical CDF
Call: ecdf(x)
x[1:500] = 0.50174, 0.50307, 0.50383, ..., 81.674, 140.63
> plot(f)
Below is the ECDF graph:
> plot(sort(x), 1-f(sort(x)), type="s", lwd=1)
and this command gives me the CCDF graph:
However, I would like to plot my CCDF graph on a log-log axis to produce a graph like the picture below:
(graph from "Minimizing errors in identifying Lévy flight behaviour of organisms.")
Is there a way to do it in R?
If so, how to do a linear regression on the CCDF graph as well? I have tried using command below but that just not working for me.
a<-plot(sort(x), 1-f(sort(x)), type="s", lwd=1)
> a
NULL
> res=lm(a)
Error in terms.formula(formula, data = data) :
argument is not a valid model
Much appreciate.
UPDATE:
I used the code given by @BondedDust and successfully generated a CCDF graph:
(plot(sort(x) , 1-ecdf(x)(sort(x) ), log="xy"))
Below is the code of how I generated my dataset:
u<-runif(500)
fakedata500<-((2*(1-u))^(-1))
Here's another way using quantiles.
library(VGAM) # for rpareto(...)
set.seed(1) # for reproducible example
X <- rpareto(1000,location=1,shape=1)
p <- ppoints(100)
par(mfrow=c(1,3))
plot(quantile(X,p=p),p,type="l",ylab="P(X < x)",xlab="x",main="CDF")
plot(quantile(X,p=p),1-p,type="l",ylab="P(X > x)",xlab="x",main="CCDF")
plot(log(quantile(X,p=p)),log(1-p),
ylab="log[P(X > x)]",xlab="log(x)",main="CCDF: log-log")
And here's the regression.
df <- data.frame(x=log(1-p),y=log(quantile(X,p=p)))
fit <- lm(y~x,df)
summary(fit)
# ...
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 0.039559 0.007584 5.216 1.02e-06 ***
# x -0.944380 0.005427 -174.028 < 2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.05317 on 98 degrees of freedom
# Multiple R-squared: 0.9968, Adjusted R-squared: 0.9967
# F-statistic: 3.029e+04 on 1 and 98 DF, p-value: < 2.2e-16
You give no data so no regression output will be attempted, but it should be pretty easy to construct the plot:
plot(sort(x) , 1-ecdf(x)(sort(x) ), log="xy")
You do get a warning because the most extreme point will have ecdf of 1 and 1-1 cannot have a log value at it is omitted. With a random sample of lognormal variable that plot looks like. Alternate labeling of the axis can be done by suppressing axis ticks in the plot call and then adding them with the axis
function. This was done with a sample of 500 realizations from the lognormal distribution with default paramaters:
You should use the [poweRlaw][1]
package for this.
Load the package and generate some data:
library(poweRlaw)
x = rplcon(1000, 1, 2)
Next create a continuous powerlaw object
m = conpl$new(x)
Plot
plot(m)
Estimate the scaling parameter,
p = estimate_pars(m)
m$setPars(p)
and add the fitted line to the plot
lines(m, col=2)
See the package vignettes for more details.
BTW, avoid simple linear regression on estimating the scaling parameter (all assumptions are broken).