What prevents me from including a class in Ruby?

Posted 2020-07-23 08:31

Question:

I'm trying to understand some Ruby internals:

Attempting to include a class instead of a module results in a TypeError (that's by design):

class C
end

class Foo
end

Foo.include(C)
#=> TypeError: wrong argument type Class (expected Module)

I would like to know how that type check works "under the hood".

Since classes are modules, I assumed Ruby checks whether the argument is an actual instance of Module:

C.is_a?(Module)        #=> true
C.instance_of?(Module) #=> false

Sounds reasonable, doesn't it?

But when I define my own Module subclass and create an instance of that subclass, it works just fine:

class Klass < Module
end

K = Klass.new

Foo.include(K)
# no error

But K is an instance of Klass, just like C is an instance of Class. And Klass is a subclass of Module, just like Class:

K.is_a?(Module)        #=> true
K.instance_of?(Module) #=> false

K.class #=> Klass
C.class #=> Class

Klass.superclass #=> Module
Class.superclass #=> Module

So what does that type check in include actually do?

Is there a hidden property that allows Ruby to tell modules from classes?

Since this is implementation specific: I'm especially interested in YARV/MRI.

Answer 1:

As @Stefan commented, Module#include calls the macro Check_Type(module, T_MODULE). You can see this in the source shown at https://ruby-doc.org/core-2.6/Module.html#method-i-include

Digging further into the source code, you can find that the header file ruby.h contains the line

#define Check_Type(v,t) rb_check_type((VALUE)(v),(t))

so Check_Type is just a handy alias for rb_check_type, and you can find the definition of rb_check_type in error.c:

void
rb_check_type(VALUE x, int t)
{
    int xt;

    if (x == Qundef) {
        rb_bug(UNDEF_LEAKED);
    }

    xt = TYPE(x);
    if (xt != t || (xt == T_DATA && RTYPEDDATA_P(x))) {
        unexpected_type(x, xt, t);
    }
}

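The int t is the internal type tag that is expected (here T_MODULE), and int xt is the tag of the actual type of x. The condition if (xt != t || ...) rejects any value whose tag differs, so Check_Type tests for type equality, not the is-a relation. A class carries the tag T_CLASS, so it fails the check even though Class inherits from Module.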
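The same distinction can be approximated from plain Ruby. This is only a minimal sketch: includable? is a hypothetical helper, not part of Ruby, and it assumes that "a Module but not a Class" matches the T_MODULE tag for ordinary objects, which should hold because Class cannot be subclassed.

```ruby
# Hypothetical helper: approximates Check_Type(obj, T_MODULE) from Ruby.
# Module#=== is used instead of is_a? so an overridden is_a? cannot interfere.
def includable?(obj)
  Module === obj && !(Class === obj)
end

# An instance of an anonymous Module subclass, like K in the question.
custom = Class.new(Module).new

includable?(Comparable)  #=> true  (a plain module)
includable?(custom)      #=> true  (instance of a Module subclass)
includable?(String)      #=> false (a class)
includable?(42)          #=> false (not a module at all)
```

Unlike the C check, this helper never reads the internal tag, so it only mirrors the outcome, not the mechanism.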

TL;DR

Ruby checks that the argument's internal type is exactly T_MODULE, so a class (T_CLASS) is rejected.



Answer 2:

I'm answering my own question here.


Is there a hidden property that allows Ruby to tell modules from classes?

Indeed there is. Internally, all Ruby objects start with a structure called RBasic:

struct RBasic {
    VALUE flags;
    const VALUE klass;
};

Within RBasic we have flags and those flags contain type information:

enum ruby_value_type {
    RUBY_T_NONE = 0x00,

    RUBY_T_OBJECT = 0x01,
    RUBY_T_CLASS = 0x02,
    RUBY_T_MODULE = 0x03,
    RUBY_T_FLOAT = 0x04,
    RUBY_T_STRING = 0x05,
    // ...

    RUBY_T_MASK = 0x1f
};

And that's what Ruby ultimately checks for when doing the type check:

#define RB_BUILTIN_TYPE(x) (int)(((struct RBasic*)(x))->flags & RUBY_T_MASK)
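To make the masking step concrete, here is a pure-Ruby model of that macro. The flag words are made up for illustration and do not come from a real object; only the tag values 0x02 and 0x03 and the mask 0x1f are taken from the enum above.

```ruby
# Pure-Ruby model of RB_BUILTIN_TYPE: keep only the low five bits of flags.
# The default mask is RUBY_T_MASK (0x1f) from the enum above.
def builtin_type(flags, mask = 0x1f)
  flags & mask
end

# Hypothetical flag words: a type tag in the low bits plus unrelated higher bits.
module_flags = 0x2000 | 0x03   # tag RUBY_T_MODULE
class_flags  = 0x0120 | 0x02   # tag RUBY_T_CLASS

builtin_type(module_flags)  #=> 3
builtin_type(class_flags)   #=> 2
```

Whatever the higher bits hold, the mask discards them, which is why the type tag can share one word with other flags.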

RB_BUILTIN_TYPE is also used by Marshal to dump type information:

module M ; end
class C ; end

Marshal.dump(M) #=> "\x04\bm\x06M"
Marshal.dump(C) #=> "\x04\bc\x06C"
Marshal.dump(4) #=> "\x04\bi\t"
#                          ^
#              m = module, c = class, i = integer
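The tag byte can also be read programmatically. A small sketch, assuming a hypothetical helper named marshal_type_tag and the 4.8 dump format shown above, where the first two bytes are the version and the third is the tag of the top-level object:

```ruby
# Hypothetical helper: returns the one-character type tag from a Marshal dump.
# Bytes 0 and 1 hold the format version; byte 2 tags the top-level object.
def marshal_type_tag(obj)
  Marshal.dump(obj)[2]
end

marshal_type_tag(Kernel)  #=> "m"
marshal_type_tag(String)  #=> "c"
marshal_type_tag(4)       #=> "i"
```

This only works for named classes and modules, since anonymous ones cannot be dumped, and objects that carry instance variables are wrapped in an extra leading tag.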

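From within Ruby we can inspect the internal type via Fiddle. This relies on object_id << 1 being the object's memory address, which holds for MRI before 2.7; later versions assign object IDs independently of the address: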

require 'fiddle'

def type(obj)
  struct = Fiddle::Pointer.new(obj.object_id << 1)
  flags = struct[0]
  flags & 0x1f
end

module M ; end
class C ; end

type(M) #=> 3   (RUBY_T_MODULE = 0x03)
type(C) #=> 2   (RUBY_T_CLASS = 0x02)
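A variant that does not depend on object_id is sketched below. It assumes MRI, a little-endian platform (only the first byte of flags is read), and that Fiddle.dlwrap returns the address of a heap-allocated object; internal_type is a hypothetical name.

```ruby
require 'fiddle'

# Sketch: reads the low five bits of RBasic.flags for a heap-allocated object.
# Assumes MRI and little-endian byte order; do not pass immediates such as Integers.
def internal_type(obj)
  address = Fiddle.dlwrap(obj)            # object's address, independent of object_id
  Fiddle::Pointer.new(address)[0] & 0x1f  # first byte of flags, masked with RUBY_T_MASK
end

internal_type(Kernel)  #=> 3   (RUBY_T_MODULE = 0x03)
internal_type(String)  #=> 2   (RUBY_T_CLASS = 0x02)
```

Like the version above, this reads raw memory, so a wrong address will crash the interpreter rather than raise an exception.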

And since Fiddle also allows modifying the underlying data, we could probably turn a class into a module by changing its flags accordingly ...

Let's give it a try:

class C
  def hello
    'hello from class'
  end
end

class Foo
end

Foo.include(C)
#=> TypeError: wrong argument type Class (expected Module)

Now let's change the type from 0x02 (class) to 0x03 (module):

require 'fiddle'

struct = Fiddle::Pointer.new(C.object_id << 1)
struct[0] = (struct[0] & ~0x1f) | 0x03

Foo.include(C)
# NoMethodError: undefined method `append_features' for C:Class

Still an error, but Ruby doesn't complain about the type anymore!

Apparently, Class undefines Module#append_features because the method doesn't make much sense for classes. Let's redefine it for C:

C.define_singleton_method(:append_features, Module.instance_method(:append_features))

Foo.include(C)
# no error!

Foo.ancestors
#=> [Foo, C, Object, BasicObject, Object, Kernel, BasicObject]

Foo.new.hello
#=> "hello from class"

And there we go: a class included in another class.
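The observation that Class undefines Module#append_features can be checked without touching any internals. A minimal sketch, using a hypothetical helper that relies on instance_method raising NameError for methods that cannot be resolved:

```ruby
# Hypothetical helper: true if `owner` resolves the instance method `name`
# (public, protected, or private); instance_method raises NameError otherwise.
def resolves_instance_method?(owner, name)
  owner.instance_method(name)
  true
rescue NameError
  false
end

resolves_instance_method?(Module, :append_features)  #=> true
resolves_instance_method?(Class,  :append_features)  #=> false
```

Because Class inherits from Module, a false result for Class means the method was explicitly undefined there rather than simply never defined.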

Note: I'm fiddling with Ruby's internals here. Don't use these kinds of hacks in production. You've been warned.