Package compilation and relative path

Posted 2019-08-14 20:12

I must be very confused. Have looked around but cannot find a suitable answer and have a feeling I am doing something wrong.

Here is a minimal example: my function test imports a file from a folder and does subsequent analysis on that file. I have dozens of compressed files in the folder specified by path = "inst/extdata/input_data"

    test = structure(function(path, letter) {
      file = paste0(path, "/file_", letter, ".tsv.gz")
      data = read.csv(file, sep="\t", header=FALSE, quote="\"", stringsAsFactors=FALSE)

      return(mean(data$var1))

    }, ex = function(){
      path = "inst/extdata/input_data"
      m1 = test(path, "A")
    })

I am building a package with the function in the folder R/ of the package directory.

When I set the working directory to the package parent and run the example line by line, everything goes fine. However when I check the package with R CMD check it gives me the following:

cannot open file 'inst/extdata/input_data/file_A.tsv.gz': No such file or directory
Error in file(file, "rt") : cannot open the connection

I thought that when checking and building the package, the working directory is automatically set to the parent directory of the package (which in my case is "C:/Users/yuhu/R/Projects/ABCDpackage"), but that does not seem to be the case.

What is the best practice in this case? I would rather avoid converting all the data to .rda format and putting it in the data folder, as there are too many files. Is there a way to compile the package and, in the function example, set the working directory relative to where the package is located? This would also be helpful when the package is distributed (so it should not depend on my own path).

Many thanks for your help.

Tags: r path package
2 answers
ら.Afraid
#2 · 2019-08-14 20:19

I think you might just want to go with read.table... At any rate give this a try.

    fopen <- file(paste0(path,"/file_",letter,".tsv.gz"),open="rt")
    data <- read.table(fopen,sep="\t",header=F,quote="\"",stringsAsFactors=F)

Refinement:

At the end of the day I think your problem is mainly because you are using read.csv instead of read.table, which can open .gz compressed files directly. So, just to be sure, here is a little experiment I did.

Experiment:

# zip up a .csv file (in this case example_A.csv) that exists in my working directory into .gz format

    system("gzip example_A.csv")

# just wanted to pass the path as a variable like you did

    path <- getwd()

    file <- paste0(path, "/example_", "A", ".csv.gz")
    # I think these are the only options you need;
    # stringsAsFactors=FALSE is a good one.
    data <- read.table(file, sep=",", header=FALSE, stringsAsFactors=FALSE)

    data <- data[1:5,1:7] # a subset of the data

  V1       V2     V3      V4     V5      V6     V7
1 id Scenario Region    Fuel  X2005   X2010  X2015
2  1 BSE9VOG4     R1 Biomass      0  2.2986 0.8306
3  2 BSE9VOG4     R1    Coal 7.4339 13.3548 9.2918
4  3 BSE9VOG4     R1     Gas 1.9918  2.4623 2.5558
5  4 BSE9VOG4     R1     LFG 0.2111  0.2111 0.2111

At the end of the day (I say that too much) you can be certain that the problem is in either the method you used to read the zipped up files or the text string you've constructed for the file names (haven't looked into the latter). At any rate best of luck with the package. I hope it turns tides.

We Are One
#3 · 2019-08-14 20:23

When R CMD check (or the user later for that matter) runs the example, you need to provide the full path to the file! You can build that path easily with the system.file or the path.package command. If your package is called foo, the following should do the trick:

    }, ex = function(){
      path = paste0(system.file(package = "foo"), "/extdata/input_data")
      m1 = test(path,"A")
    })

You might want to add a file.path command somewhere to be OS independent.
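As a minimal sketch of that suggestion: system.file accepts the sub-directories as separate arguments, and file.path joins the pieces with the right separator. Here "foo" is still the hypothetical package name from above, and the lookup against the stats package is only there to show the behaviour on a package that ships with R.

```r
# "foo" is the hypothetical package name used in this answer.
# system.file() takes path components as separate arguments and
# returns "" when the package or the path cannot be found.
path <- system.file("extdata", "input_data", package = "foo")

# Same kind of lookup against a package that ships with R:
desc <- system.file("DESCRIPTION", package = "stats")
file.exists(desc)

# Build the file name without hard-coding "/" as the separator.
letter <- "A"
file_a <- file.path(path, paste0("file_", letter, ".tsv.gz"))
```

Because system.file returns an empty string when nothing is found, it may be worth checking nzchar(path) (or passing mustWork = TRUE) before trying to read anything.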

Since read.csv is just a wrapper for read.table, I would not expect any fundamental difference with respect to reading compressed files.
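A quick way to check this yourself, sketched with a throwaway gzip-compressed file in a temporary directory (the file contents are made up for illustration):

```r
# Write a small tab-separated file, gzip-compressed, to a temp location.
tmp <- tempfile(fileext = ".tsv.gz")
con <- gzfile(tmp, "wt")
writeLines(c("1\ta", "2\tb", "3\tc"), con)
close(con)

# Both functions open the compressed file directly from its path.
via_csv   <- read.csv(tmp, sep = "\t", header = FALSE, stringsAsFactors = FALSE)
via_table <- read.table(tmp, sep = "\t", header = FALSE, stringsAsFactors = FALSE)

# Compare the two results and use the first column.
all.equal(via_csv, via_table)
mean(via_csv$V1)

unlink(tmp)
```

So if the read fails in R CMD check, the path is the more likely culprit than the choice of reader.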

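Comment: R drops the "inst/" part of the path when the package is installed: everything under inst/ in the source is copied to the top level of the installed package directory, so inst/extdata/input_data becomes extdata/input_data. This thread has a discussion on the inst directory.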
