Colleagues, I have the following panel data:
  Company year       Beta     NI   Sales  Export Hedge      FL     QR     AT Foreign
1       1 2010 -2.2052800 293000 1881000 78.6816     0 23.5158  1.289 0.6554    3000
2       1 2011 -2.2536069 316000 2647000 81.4885     0 21.7945 1.1787 0.8282   22000
3       1 2012  0.3258693 363000 2987000 82.4908     0 24.5782 1.2428  0.813  -11000
4       1 2013  0.4006030 549000 4546000 79.4325     0 31.4168 0.6038 0.7905   71000
5       1 2014 -0.4508811 348000 5376000 79.2411     0 37.1451 0.6563  0.661  -64000
6       1 2015  0.1494696 355000 5038000 77.1735     0 33.3852 0.9798 0.5483   37000
But R throws an error when I try to run the regression with the plm package:
library(plm)
panel <- read.csv("Panel.csv", header = TRUE, sep = ";")
p <- plm(Beta ~ NI, data = panel, model = "within", index = c("id", "year"))
Error in pdim.default(index[[1]], index[[2]]) :
duplicate couples (id-time)
In addition: Warning messages:
1: In pdata.frame(data, index) :
duplicate couples (id-time) in resulting pdata.frame
to find out which, use e.g. table(index(your_pdataframe), useNA = "ifany")
2: In is.pbalanced.default(index[[1]], index[[2]]) :
duplicate couples (id-time)
3: In is.pbalanced.default(index[[1]], index[[2]]) :
duplicate couples (id-time)
I searched for this error online and read that it is related to the company id and the year, but I did not find a way to avoid the problem. Also, when I run na.omit(panel) first, R does not show the error, but it is important for me to keep the NA values and all companies in the data. Please tell me what to do about this problem. Thank you.
Consider the Produc dataset in the plm package. In this dataset, information is collected over time (17 years) on the same sample units (48 US states). plm requires that each (state, year) pair be unique.
The plm command works fine with this dataset. Now suppose we duplicate one of the (state, year) pairs.
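A minimal sketch of that experiment, assuming the plm package is installed. The model formula and the names fit, Produc_dup and res are my own choices for illustration, not something taken from your code:

```r
library(plm)

# Produc ships with plm: 48 US states observed over 17 years
data("Produc", package = "plm")

# Every (state, year) pair is unique here, so the within model fits
fit <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
           data = Produc, index = c("state", "year"), model = "within")

# Work on a copy and give row 2 the same year as row 1, so the first
# state ends up with two rows for the same year
Produc_dup <- Produc
Produc_dup$year[2] <- Produc_dup$year[1]

# Refit on the modified copy; tryCatch stores the error message in res
# instead of stopping the script
res <- tryCatch(
  plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
      data = Produc_dup, index = c("state", "year"), model = "within"),
  error = function(e) conditionMessage(e)
)
print(res)
```

The only difference between the two calls is the single duplicated (state, year) pair, which is why I believe the same thing is happening in your data.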
plm then generates the same error message that you described above. Hope this helps.
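To track down the offending rows in your own data without dropping anything, one option is base R's duplicated(). The sketch below uses a small made-up data frame in place of your Panel.csv, and it assumes the identifier column is called Company, as in your printout. Since na.omit(panel) makes the error go away, it may be that the duplicates sit in rows that contain NA, but that is only a guess:

```r
# Hypothetical toy panel standing in for Panel.csv;
# rows 2 and 3 share the same (Company, year) pair
panel <- data.frame(
  Company = c(1, 1, 1, 2, 2),
  year    = c(2010, 2011, 2011, 2010, 2011),
  Beta    = c(-2.21, -2.25, NA, 0.33, 0.40)
)

# Keep only the two index columns that must be jointly unique
key <- panel[, c("Company", "year")]

# duplicated() flags second and later occurrences; combining it with
# fromLast = TRUE also flags the first occurrence of each repeated pair
is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)

# All rows involved in a duplicate, for manual inspection
dups <- panel[is_dup, ]
print(dups)
```

Once you see which rows collide, you can decide whether they are data-entry repeats, a wrong year, or a wrong company id, and fix only those rows while keeping the rest of the NA observations.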