
What does “S3 methods” mean in R?

Posted 2019-01-07 03:01

Question:

Since I am fairly new to R, I do not know what S3 methods and objects are. I found that there are S3 and S4 object systems, and some recommend using S3 over S4 if possible (http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html). However, I do not know the exact definition of S3 methods/objects.

Answer 1:

Most of the relevant information can be found by looking at ?S3 or ?UseMethod, but in a nutshell:

S3 refers to a scheme of method dispatching. If you've used R for a while, you'll notice that there are print, predict and summary methods for a lot of different kinds of objects.

In S3, this works by:

  • setting the class of objects of interest (e.g. the return value of a call to the function glm has class c("glm", "lm"))
  • providing a method named with the general name (e.g. print), then a dot, and then the class name (e.g. print.glm)
  • the general name (print) must itself be a generic function, i.e. one whose body calls UseMethod, for this to work; if you're simply adding methods for existing generics, that is already taken care of (see the help pages I referred to earlier if you need to create your own).
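The three steps above can be sketched as follows. This is only a minimal illustration: the class name mykindoffit, its fields, and the describe generic are made up for the example and do not come from any package.

```r
# Hypothetical constructor: returns a list tagged with the S3 class "mykindoffit".
mykindoffit <- function(x, y) {
  fit <- list(coef = mean(y) / mean(x), n = length(x))
  class(fit) <- "mykindoffit"
  fit
}

# Method for the existing generic print(): generic name, a dot, then the class name.
print.mykindoffit <- function(x, ...) {
  cat("<mykindoffit> n =", x$n, " coef =", format(x$coef), "\n")
  invisible(x)
}

# A brand-new generic needs the "preparation" step: a function that calls UseMethod().
describe <- function(object, ...) UseMethod("describe")
describe.default <- function(object, ...) "some R object"
describe.mykindoffit <- function(object, ...) {
  paste("fit based on", object$n, "points")
}

myfit <- mykindoffit(x = 1:4, y = c(2, 4, 6, 8))
print(myfit)      # dispatches to print.mykindoffit
describe(myfit)   # dispatches to describe.mykindoffit
describe(42)      # no method for "numeric", so describe.default is used
```

Typing myfit at the prompt would auto-print through print.mykindoffit as well.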

To the eye of the beholder, and particularly, the user of your newly created funky model fitting package, it is much more convenient to be able to type predict(myfit, type="class") than predict.mykindoffit(myfit, type="class").

There is quite a bit more to it, but this should get you started. There are quite a few disadvantages to this way of dispatching methods based upon an attribute (class) of objects (and C purists probably lie awake at night in horror of it), but for a lot of situations, it works decently. With the current version of R, newer ways have been implemented (S4 and reference classes), but most people still (only) use S3.
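One of those disadvantages can be shown in a few lines. In this sketch (the area generic and the circle class are hypothetical), nothing checks that an object labelled with a class actually has the structure its methods expect:

```r
# Hypothetical generic and method, for illustration only.
area <- function(shape, ...) UseMethod("area")
area.circle <- function(shape, ...) pi * shape$radius^2

good <- structure(list(radius = 1), class = "circle")
bad  <- structure(list(side = 2), class = "circle")  # mislabelled, but R accepts it

area(good)  # works as intended
area(bad)   # no error: shape$radius is NULL, so the result is an empty numeric vector
```

Because the class is just an attribute, the mistake only shows up (if at all) further downstream.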



Answer 2:

To get you started with S3, look at the code for the median function. Typing median at the command prompt reveals that it has one line in its body, namely

UseMethod("median")

That means that median is an S3 generic function: calling it dispatches to a method chosen by the class of its argument. In other words, you can have a different median function for different S3 classes. To list all the possible median methods, type

methods(median) #actually not that interesting.  

In this case, there's only one method, the default, which is called for anything. You can see the code for that by typing

median.default
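To see what "a different median function for different S3 classes" might look like, here is a small sketch. The weighted_sample class and its fields are invented for the example; only median itself comes from R.

```r
# Hypothetical class: distinct values plus integer frequency weights.
weighted_sample <- function(values, freq) {
  structure(list(values = values, freq = freq), class = "weighted_sample")
}

# A method for the existing generic median(); the signature mirrors the generic's.
median.weighted_sample <- function(x, na.rm = FALSE, ...) {
  expanded <- rep(x$values, times = x$freq)  # repeat each value freq times
  median(expanded, na.rm = na.rm, ...)       # plain vector, so median.default runs
}

ws <- weighted_sample(values = c(1, 2, 10), freq = c(1, 1, 3))
median(ws)           # uses median.weighted_sample
median(c(1, 2, 10))  # uses median.default
```

After defining the method, methods(median) would list median.weighted_sample alongside the default.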

A much more interesting example is the print function, which has many different methods.

methods(print)  #very exciting

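Notice that some of the methods have *s next to their name. That means that they are not exported from their package's namespace. Use find to find out which package the related function is in (or getAnywhere("print.acf") to locate the method itself), then use ::: to view the code. For example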

find("acf")  #it's in the stats package
stats:::print.acf
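If you don't know the package in advance, the utils package (attached by default) offers two helpers that look a method up directly. A short sketch, using the same print.acf method as above:

```r
# getS3method() returns the method function even when it is not exported.
m <- getS3method("print", "acf")
is.function(m)

# getAnywhere() searches attached packages and loaded namespaces by name.
found <- getAnywhere("print.acf")
found$where   # character vector describing where the object was found
```

Printing found at the prompt also shows the source of the method.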


Answer 3:

From http://adv-r.had.co.nz/OO-essentials.html:

R’s three OO systems differ in how classes and methods are defined:

  • S3 implements a style of OO programming called generic-function OO. This is different from most programming languages, like Java, C++ and C#, which implement message-passing OO. With message-passing, messages (methods) are sent to objects and the object determines which function to call. Typically, this object has a special appearance in the method call, usually appearing before the name of the method/message: e.g. canvas.drawRect("blue"). S3 is different. While computations are still carried out via methods, a special type of function called a generic function decides which method to call, e.g., drawRect(canvas, "blue"). S3 is a very casual system. It has no formal definition of classes.

  • S4 works similarly to S3, but is more formal. There are two major differences to S3. S4 has formal class definitions, which describe the representation and inheritance for each class, and has special helper functions for defining generics and methods. S4 also has multiple dispatch, which means that generic functions can pick methods based on the class of any number of arguments, not just one.

  • Reference classes, called RC for short, are quite different from S3 and S4. RC implements message-passing OO, so methods belong to classes, not functions. $ is used to separate objects and methods, so method calls look like canvas$drawRect("blue"). RC objects are also mutable: they don’t use R’s usual copy-on-modify semantics, but are modified in place. This makes them harder to reason about, but allows them to solve problems that are difficult to solve with S3 or S4.

There’s also one other system that’s not quite OO, but it’s important to mention here:

  • base types, the internal C-level types that underlie the other OO systems. Base types are mostly manipulated using C code, but they’re important to know about because they provide the building blocks for the other OO systems.
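Two of the points in the quote, multiple dispatch in S4 and in-place modification in RC, can be sketched like this. All class, generic, and field names below (Canvas, Pen, Brush, drawWith, Counter) are hypothetical and chosen only for illustration.

```r
library(methods)

# S4: formal class definitions and a generic that dispatches on two arguments.
setClass("Canvas", representation(name = "character"))
setClass("Pen", representation(color = "character"))
setClass("Brush", representation(color = "character"))

setGeneric("drawWith", function(surface, tool) standardGeneric("drawWith"))
setMethod("drawWith", signature("Canvas", "Pen"),
          function(surface, tool) paste("thin", tool@color, "line on", surface@name))
setMethod("drawWith", signature("Canvas", "Brush"),
          function(surface, tool) paste("broad", tool@color, "stroke on", surface@name))

cv <- new("Canvas", name = "sheet")
drawWith(cv, new("Pen", color = "blue"))    # method chosen by both argument classes
drawWith(cv, new("Brush", color = "blue"))

# RC: methods belong to the class, and objects are modified in place.
Counter <- setRefClass("Counter",
  fields = list(count = "numeric"),
  methods = list(
    increment = function() {
      count <<- count + 1   # <<- updates the field of this object
      invisible(.self)
    }
  )
)

a <- Counter$new(count = 0)
b <- a          # no copy is made: b refers to the same object as a
b$increment()
a$count         # also changed, because a and b are one object
```

With an ordinary list or S3 object, the assignment b <- a would behave like a copy, and modifying b would leave a untouched.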


Answer 4:

I came to this question mostly wondering where the names came from. It appears from this Wikipedia article that the names refer to versions of the S programming language, which R is based on. The method dispatching schemes described in the other answers come from S and are labelled according to the version that introduced them.



Answer 5:

Try

methods(residuals)

which lists, among others, "residuals.lm" and "residuals.glm". This means that when you have fitted a linear model, m, and type residuals(m), residuals.lm will be called. When you have fitted a generalized linear model, residuals.glm will be called.

It's kind of the C++ object model turned upside down. In C++, you define a base class having virtual functions, which are overridden by derived classes. In R you define a virtual (aka generic) function and then you decide which classes will override this function (aka define a method). Note that the classes doing this do not need to be derived from one common super class.

I would not agree to generally prefer S3 over S4. S4 has more formalism (= more typing) and this may be too much for some applications. S4 classes, however, can be defined like a class or struct in C++. You can specify that an object of a certain class is made up of a string and two numbers, for example:

setClass("myClass", representation(label = "character", x = "numeric", y = "numeric"))

Methods that are called with an object of that class can rely on the object having those members. That's very different from S3 objects, which are usually just a list of elements with a class attribute attached and nothing that enforces their contents.
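The difference can be seen by trying to build an invalid object. The sketch below reuses the myClass definition from above; the S3 class name myClass3 is made up for the comparison.

```r
library(methods)

setClass("myClass", representation(label = "character", x = "numeric", y = "numeric"))

obj <- new("myClass", label = "point", x = 1, y = 2)
obj@x   # slots are accessed with @

# A slot of the wrong type is rejected when the object is created.
bad_s4 <- tryCatch(
  new("myClass", label = 42, x = 1, y = 2),
  error = function(e) conditionMessage(e)
)
bad_s4  # the error message names the offending slot

# The S3 counterpart accepts anything, since the class is only an attribute.
bad_s3 <- structure(list(label = 42), class = "myClass3")
class(bad_s3)
```

So a method written for myClass can assume obj@label is a character vector, while a method for myClass3 would have to check this itself.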

With S3 and S4, you call a member function by fun(object, args) and not by object$fun(args). If you are looking for something like the latter, have a look at the proto package.



Tags: r r-faq r-s3