Is it bad style to redefine non-S3 base functions

Posted 2019-02-16 20:03

Question:

So I'm working on an R package that uses S3 classes, and it would be really nice if I could use sample as a method for one of my classes. However, base already declares sample as a non-S3 function, so what I wonder is:

Is it bad style to redefine a non-S3 base function such as sample as an S3 function? And could this mess things up for users of my package?

One way to redefine sample while keeping the base function working is:

sample.default <- base::sample
sample <- function(x, ...) {
  UseMethod("sample")
}
# This allows me to define a new sample method for my_special_class
sample.my_special_class <- function(...) {...}

But what I'm uncertain about is whether this will cause any trouble or namespace issues, for example, when loading other packages. I've also noticed that not many packages redefine sample, for example, dplyr uses sample_n and igraph uses sample_, and I thought that there might be some reason for this...

Answer 1:

Whether or not it's bad "style" is primarily opinion-based. But Writing R Extensions, Section 7.1 - Adding new generics tells you how to add new generics that mask base/recommended functions. That said, your proposed solution is explicitly cautioned against in that section:

...a package can take over a function in the base package and make it generic by something like

 foo <- function(object, ...) UseMethod("foo")
 foo.default <- function(object, ...) base::foo(object)

Earlier versions of this manual suggested assigning foo.default <- base::foo. This is not a good idea, as it captures the base function at the time of installation and it might be changed as R is patched or updated.
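Applied to sample, the pattern quoted above could look roughly like the following. This is a minimal sketch, not the manual's own code: forwarding ... to base::sample is an adaptation, and the my_special_class method with its values field is a hypothetical stand-in for whatever the class actually stores.

```r
# Generic that takes over the name `sample`.
sample <- function(x, ...) UseMethod("sample")

# Default method: look up base::sample at call time instead of
# copying it at install time (so base updates are picked up).
sample.default <- function(x, ...) base::sample(x, ...)

# Hypothetical method: assumes the object keeps its data in `values`.
sample.my_special_class <- function(x, size = 1L, ...) {
  base::sample(x$values, size = size, ...)
}

obj <- structure(list(values = c(10, 20, 30)), class = "my_special_class")
draw <- sample(obj, size = 2)  # dispatches to the class method
perm <- sample(1:5)            # falls through to sample.default
```

In a package, the generic would additionally be exported and the methods registered with S3method() directives in NAMESPACE.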

One problem others may encounter is if another package also registers sample as a generic. Then packages depending on your package or the other package need to decide which generic to register their method with.

Not many packages redefine sample because it's usually not a good idea to mask base/recommended functions. Doing so creates the potential for users to get different behavior at the top level depending on whether or not your package is loaded.
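An alternative that avoids masking altogether is to give the generic its own name, in the spirit of sample_n and sample_. The sketch below assumes a hypothetical generic called sample_from and a hypothetical values field on the class; neither comes from an existing package.

```r
# Hypothetical generic with a distinct name; base::sample is not masked.
sample_from <- function(x, ...) UseMethod("sample_from")

# Hypothetical method: draws `size` elements from the `values` field.
# Indexing via sample.int avoids base::sample's special handling of a
# single number as the range 1:x.
sample_from.my_special_class <- function(x, size = 1L, ...) {
  vals <- x$values
  vals[base::sample.int(length(vals), size = size, ...)]
}

obj <- structure(list(values = c(10, 20, 30)), class = "my_special_class")
picked <- sample_from(obj, size = 2)
```

Users then keep the usual behavior of sample at the top level whether or not the package is attached.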

One thing you certainly want to avoid is masking a generic in a base/recommended (or widely used) package. Doing so can prevent method dispatch at the top level, causing headaches for users who expect the generic to work (e.g. dplyr::lag).
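The dispatch problem can be reproduced without loading any package. In this sketch a plain function named lag stands in for a masking package, and the my_series class and its method are hypothetical.

```r
# Hypothetical class with a method for the stats::lag generic.
obj <- structure(list(), class = "my_series")
lag.my_series <- function(x, ...) "my_series method"

# The generic is reached explicitly here, so the method is dispatched.
before <- stats::lag(obj)

# Simulate a package that masks the generic with a non-generic function.
lag <- function(x, n = 1L, ...) "plain function, no dispatch"

# A bare call now hits the mask; the method is never consulted.
after <- lag(obj)

# Calling the generic by its namespace still works.
qualified <- stats::lag(obj)
```

Here before and qualified are "my_series method", while after is "plain function, no dispatch".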



Tags: r oop