How to avoid using a global variable when using nftw

Posted 2020-02-06 00:49

I want to use nftw to traverse a directory structure in C.

However, given what I want to do, I don't see a way around using a global variable.

The textbook examples of using (n)ftw all involve doing something like printing out a filename. I want, instead, to take the pathname and file checksum and place those in a data structure. But I don't see a good way to do that, given the limits on what can be passed to nftw.

The solution I'm using involves a global variable. The function called by nftw can then access that variable and add the required data.

Is there any reasonable way to do this without using a global variable?

Here's the exchange in a previous post on Stack Overflow in which someone suggested I post this as a follow-up.

3 answers
#2 · 2020-02-06 01:20

Using ftw can be really, really bad. Internally, an implementation may save the function pointer that you pass in; if another thread then starts a different walk, it can overwrite that function pointer.

Horror scenario:

thread 1:  count billions of files
thread 2:  delete some files
thread 1:  ---oops, it is now deleting billions of 
              files instead of counting them.

In short, you are better off using fts_open.
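As a rough illustration of the fts approach in C, here is a minimal sketch that totals the sizes of regular files without any global state. The function name total_file_bytes is made up for this example, and it assumes a platform that ships <fts.h> (glibc and the BSDs do; it is not part of POSIX).

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fts.h>
#include <stddef.h>

/* Hypothetical helper: sums the sizes of regular files under root.
 * Returns 0 on success, -1 if fts_open() fails. */
int total_file_bytes(const char *root, unsigned long long *total)
{
    /* fts_open() takes a NULL-terminated array of start paths. */
    char *paths[] = { (char *)root, NULL };
    FTS *tree = fts_open(paths, FTS_PHYSICAL | FTS_NOCHDIR, NULL);
    if (tree == NULL)
        return -1;

    *total = 0;
    FTSENT *entry;
    /* The loop body sees the caller's data directly, so no callback
     * and no global variable are needed. */
    while ((entry = fts_read(tree)) != NULL) {
        if (entry->fts_info == FTS_F)   /* regular file */
            *total += (unsigned long long)entry->fts_statp->st_size;
    }
    fts_close(tree);
    return 0;
}
```

Because the traversal is an ordinary loop, the same shape should work for collecting pathnames and checksums into a caller-owned structure.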

If you still want to use nftw, then my suggestion (in C++) is to put the "global" variable in an anonymous namespace and mark it as "thread_local". You should be able to adjust this to your needs.

/* in some .cpp file */
#include <cstddef>
#include <string>
#include <ftw.h>
#include <sys/stat.h>

namespace {
thread_local std::size_t gTotalBytes{0};  // one accumulator per thread

int GetSize(const char* path, const struct stat* statPtr, int currentFlag, struct FTW* internalFtwUsage) {
    gTotalBytes += statPtr->st_size;
    return 0;  // 0 tells nftw to continue
}
} // namespace

std::size_t RecursiveFolderDiskUsed(const std::string& startPath) {
    const int flags = FTW_DEPTH | FTW_MOUNT | FTW_PHYS;
    const int maxFileDescriptorsToUse = 1024; // or whatever
    gTotalBytes = 0;  // reset so repeated calls on this thread do not accumulate
    const int result = nftw(startPath.c_str(), GetSize, maxFileDescriptorsToUse, flags);

    // log or something if result == -1
    return gTotalBytes;
}
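Since the question is about C, here is a rough C11 equivalent of the same idea. It is only a sketch: struct walk_ctx, count_entry and walk_tree are names invented for this example, and it assumes a compiler that supports _Thread_local. Instead of making the accumulator itself thread-local, it stores a thread-local pointer to a context owned by the caller, which works as a stand-in for the user parameter that nftw lacks. Note that this only protects your own data; POSIX does not require nftw itself to be thread-safe.

```c
#define _XOPEN_SOURCE 700   /* must precede every #include to expose nftw() */
#include <ftw.h>
#include <sys/stat.h>
#include <stddef.h>

/* Hypothetical context type: whatever the callback needs to update. */
struct walk_ctx {
    size_t files;
    unsigned long long bytes;
};

/* One pointer per thread; only set for the duration of a walk. */
static _Thread_local struct walk_ctx *tl_ctx;

static int count_entry(const char *path, const struct stat *sb,
                       int typeflag, struct FTW *ftwbuf)
{
    (void)path;
    (void)ftwbuf;
    if (typeflag == FTW_F) {            /* regular files only */
        tl_ctx->files++;
        tl_ctx->bytes += (unsigned long long)sb->st_size;
    }
    return 0;                           /* 0 tells nftw() to continue */
}

/* Returns nftw()'s result: 0 on success, -1 on error. */
int walk_tree(const char *root, struct walk_ctx *ctx)
{
    ctx->files = 0;
    ctx->bytes = 0;
    tl_ctx = ctx;                       /* hand the context to the callback */
    int rc = nftw(root, count_entry, 16, FTW_PHYS);
    tl_ctx = NULL;
    return rc;
}
```

A caller would declare a struct walk_ctx on its own stack, pass its address to walk_tree, and read the fields afterwards.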
冷血范
#3 · 2020-02-06 01:20

The data is best given static linkage (i.e. file-scope) in a separate module that includes only functions required to access the data, including the function passed to nftw(). That way the data is not visible globally and all access is controlled. It may be that the function that calls nftw() is also part of this module, enabling the function passed to nftw() to also be static, and thus invisible externally.

In other words, you should do what you are probably doing already, but use separate compilation and static linkage judiciously to make the data only visible via access functions. Data with static linkage is accessible by any function within the same translation unit, and you avoid the problems associated with global variables by only including functions in that translation unit that are creators, maintainers or accessors of that data.

The general pattern is:

datamodule.h

#if !defined DATAMODULE_INCLUDE
#define DATAMODULE_INCLUDE
<type> create_data( <args> ) ;
<type> get_data( <args> ) ;
#endif

datamodule.c

#include <ftw.h>
#include "datamodule.h"

static <type> my_data ;

static int nftwfunc(const char *filename, const struct stat *statptr, int fileflags, struct FTW *pftw)
{
    // update/add to my_data
    ...
    return 0 ;  // 0 tells nftw() to continue the walk
}


<type> create_data( const char* path, <other args>)
{
    ...

    int ret = nftw( path, nftwfunc, fd_limit, flags);

    ... 
}

<type> get_data( <args> )
{
    // Get requested data from my_data and return it to caller
}
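To make the pattern concrete, here is a minimal sketch of such a module in C. All names (struct file_record, records_scan, records_count, records_get, records_clear) are invented for this example, and it records each file's size where the question would store a checksum. The array and its counters are static, so only the functions in this file can touch them.

```c
#define _XOPEN_SOURCE 700   /* must precede every #include to expose nftw() */
#include <ftw.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

struct file_record {
    char *path;
    long long size;          /* stand-in for a checksum */
};

/* Visible only inside this translation unit. */
static struct file_record *records;
static size_t n_records;
static size_t cap_records;

static int add_record(const char *path, const struct stat *sb,
                      int typeflag, struct FTW *ftwbuf)
{
    (void)ftwbuf;
    if (typeflag != FTW_F)
        return 0;                       /* skip directories, links, etc. */
    if (n_records == cap_records) {
        size_t new_cap = cap_records ? cap_records * 2 : 16;
        struct file_record *grown = realloc(records, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;                  /* non-zero stops the walk */
        records = grown;
        cap_records = new_cap;
    }
    char *copy = strdup(path);          /* nftw reuses its path buffer */
    if (copy == NULL)
        return -1;
    records[n_records].path = copy;
    records[n_records].size = (long long)sb->st_size;
    n_records++;
    return 0;
}

void records_clear(void)
{
    for (size_t i = 0; i < n_records; i++)
        free(records[i].path);
    free(records);
    records = NULL;
    n_records = cap_records = 0;
}

/* Returns nftw()'s result: 0 on success, non-zero on failure. */
int records_scan(const char *root)
{
    records_clear();
    return nftw(root, add_record, 16, FTW_PHYS);
}

size_t records_count(void) { return n_records; }

const struct file_record *records_get(size_t i)
{
    return i < n_records ? &records[i] : NULL;
}
```

The header for this sketch would declare only struct file_record and the four non-static functions, so callers never see the underlying storage.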
淡お忘
#4 · 2020-02-06 01:33

No. nftw doesn't offer any user parameter that could be passed to the function, so you have to use global (or static) variables in C.

GCC offers a "nested function" extension. A nested function can access the variables of its enclosing scope, so it could be used like this:

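#define _XOPEN_SOURCE 500  /* exposes nftw() in <ftw.h> */
#include <ftw.h>
#include <sys/stat.h>

void f(void)
{
  int i = 0;
  /* GCC nested function: it can read and write f's local variable i */
  int fn(const char *path,
    const struct stat *sb, int typeflag, struct FTW *ftwbuf) {
    i++;
    return 0;
  }
  nftw("path", fn, 10, 0);
}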