I'm trying to understand some Ruby internals. Attempting to include a class instead of a module results in a TypeError (that's by design):
class C
end
class Foo
end
Foo.include(C)
#=> TypeError: wrong argument type Class (expected Module)
I would like to know how that type check works "under the hood".
Since classes are modules, I assumed Ruby checks whether the argument is an actual instance of Module:
C.is_a?(Module) #=> true
C.instance_of?(Module) #=> false
Sounds reasonable, doesn't it?
But when I define my own Module subclass and create an instance of that subclass, it works just fine:
class Klass < Module
end
K = Klass.new
Foo.include(K)
# no error
But K is an instance of Klass, just like C is an instance of Class. And Klass is a subclass of Module, just like Class:
K.is_a?(Module) #=> true
K.instance_of?(Module) #=> false
K.class #=> Klass
C.class #=> Class
Klass.superclass #=> Module
Class.superclass #=> Module
So what does that type check in include actually do?
Is there a hidden property that allows Ruby to tell modules from classes?
Since this is implementation specific: I'm especially interested in YARV/MRI.
As @Stefan commented, Module#include calls the macro Check_Type(module, T_MODULE). You can see this in the method's source at https://ruby-doc.org/core-2.6/Module.html#method-i-include
Digging further into the source code, the header file ruby.h contains this line:
#define Check_Type(v,t) rb_check_type((VALUE)(v),(t))
so Check_Type is just a handy alias for rb_check_type, whose definition you can find in error.c:
void
rb_check_type(VALUE x, int t)
{
    int xt;
    if (x == Qundef) {
        rb_bug(UNDEF_LEAKED);
    }
    xt = TYPE(x);
    if (xt != t || (xt == T_DATA && RTYPEDDATA_P(x))) {
        unexpected_type(x, xt, t);
    }
}
The int t is the unique "ID" of the expected type, and int xt is the ID of the actual type of x. Because the condition is if (xt != t || ...), Check_Type checks for type equality, not the is-a relation.
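To make the distinction concrete, here is a minimal sketch in plain Ruby. The variable names (plain_module, custom_module, some_class, host) are hypothetical and not from the original post. All three candidates pass an is_a?(Module) test, yet only the two that are modules internally get through include:

```ruby
# Hypothetical names, for illustration only.
plain_module  = Module.new
custom_module = Class.new(Module).new  # instance of a Module subclass
some_class    = Class.new
host          = Class.new

# For each candidate, record the is_a?(Module) result and what include does.
outcomes = [plain_module, custom_module, some_class].map do |candidate|
  begin
    host.include(candidate)
    [candidate.is_a?(Module), :included]
  rescue TypeError => e
    [candidate.is_a?(Module), e.class]
  end
end

p outcomes
#=> [[true, :included], [true, :included], [true, TypeError]]
```

The is-a answer is the same for all three, so it cannot be what include relies on.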
TL;DR
Ruby checks that the argument's internal type tag is exactly T_MODULE. A class is tagged T_CLASS, so it is rejected even though Class inherits from Module.
I'm answering my own question here
Is there a hidden property that allows Ruby to tell modules from classes?
Indeed there is. Internally, all Ruby objects start with a structure called RBasic:
struct RBasic {
    VALUE flags;
    const VALUE klass;
};
Within RBasic we have flags, and those flags contain type information:
enum ruby_value_type {
    RUBY_T_NONE   = 0x00,
    RUBY_T_OBJECT = 0x01,
    RUBY_T_CLASS  = 0x02,
    RUBY_T_MODULE = 0x03,
    RUBY_T_FLOAT  = 0x04,
    RUBY_T_STRING = 0x05,
    // ...
    RUBY_T_MASK   = 0x1f
};
And that's what Ruby ultimately checks for when doing the type check:
#define RB_BUILTIN_TYPE(x) (int)(((struct RBasic*)(x))->flags & RUBY_T_MASK)
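As a rough model of what that macro and the comparison in rb_check_type do, the masking can be reproduced with plain integer arithmetic. The flags words below are made up for illustration; only the three constants are taken from the enum. This is a sketch of the idea, not MRI's code:

```ruby
# Values copied from the ruby_value_type enum.
t_class  = 0x02
t_module = 0x03
t_mask   = 0x1f

# Model of RB_BUILTIN_TYPE: keep only the low five bits of the flags word.
builtin_type = ->(flags) { flags & t_mask }

# Model of the comparison in rb_check_type: exact equality of type tags.
type_matches = ->(flags, expected) { builtin_type.call(flags) == expected }

# Made-up flags words: arbitrary higher bits plus a type tag in the low bits.
module_flags = 0b1_0110_0000 | t_module
class_flags  = 0b1_0110_0000 | t_class

module_ok = type_matches.call(module_flags, t_module) #=> true
class_ok  = type_matches.call(class_flags, t_module)  #=> false
```

Nothing in this comparison consults the class hierarchy, which is why inheritance from Module does not help a class here.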
RB_BUILTIN_TYPE is also used by Marshal to dump type information:
module M ; end
class C ; end
Marshal.dump(M) #=> "\x04\bm\x06M"
Marshal.dump(C) #=> "\x04\bc\x06C"
Marshal.dump(4) #=> "\x04\bi\t"
#                          ^
# m = module, c = class, i = integer
From within Ruby we can inspect the internal type via Fiddle:
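require 'fiddle'
def type(obj)
  # Fiddle.dlwrap returns the object's memory address. Shifting
  # obj.object_id left by one bit gives the same address only on Ruby 2.6
  # and earlier; since 2.7, object_id is no longer derived from the address.
  struct = Fiddle::Pointer.new(Fiddle.dlwrap(obj))
  flags = struct[0] # first byte of RBasic's flags field
  flags & 0x1f      # RUBY_T_MASK
end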
module M ; end
class C ; end
type(M) #=> 3 (RUBY_T_MODULE = 0x03)
type(C) #=> 2 (RUBY_T_CLASS = 0x02)
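A less invasive way to look at the same tag, without reading raw memory, is ObjectSpace.dump from the objspace standard library, which on MRI reports an object's internal type as a string. The following is a sketch that assumes MRI and the json library; internal_type is a hypothetical helper name. It also suggests why an instance of a Module subclass passes the check: it is allocated as a module internally.

```ruby
require 'objspace'
require 'json'

# Hypothetical helper: returns MRI's internal type name for a heap object.
def internal_type(obj)
  JSON.parse(ObjectSpace.dump(obj))['type']
end

module_type   = internal_type(Module.new)
class_type    = internal_type(Class.new)
subclass_type = internal_type(Class.new(Module).new) # instance of a Module subclass

puts module_type, class_type, subclass_type
# MODULE
# CLASS
# MODULE
```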
And since Fiddle also allows modifying the underlying data, we could probably turn a class into a module by changing its flags accordingly ...
Let's give it a try:
class C
  def hello
    'hello from class'
  end
end
class Foo
end
Foo.include(C)
#=> TypeError: wrong argument type Class (expected Module)
Now let's change the type from 0x02 (class) to 0x03 (module):
require 'fiddle'
struct = Fiddle::Pointer.new(Fiddle.dlwrap(C)) # address of C; C.object_id << 1 only works on Ruby 2.6 and earlier
struct[0] = (struct[0] & ~0x1f) | 0x03
Foo.include(C)
# NoMethodError: undefined method `append_features' for C:Class
Still an error, but Ruby doesn't complain about the type anymore!
Apparently, Class undefines Module#append_features because the method doesn't make much sense for classes. Let's redefine it for C:
C.define_singleton_method(:append_features, Module.instance_method(:append_features))
Foo.include(C)
# no error!
Foo.ancestors
#=> [Foo, C, Object, BasicObject, Object, Kernel, BasicObject]
Foo.new.hello
#=> "hello from class"
And there we go: a class included in another class.
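The observation that Class undefines append_features can be checked without touching any internals. This is a small sketch using respond_to? with its second argument set to true so that private methods are included; the variable names are hypothetical:

```ruby
# append_features is a private method on Module, so private methods
# must be included in the lookup (second argument: true).
module_has_it = Module.new.respond_to?(:append_features, true)
class_has_it  = Class.new.respond_to?(:append_features, true)

p module_has_it #=> true
p class_has_it  #=> false
```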
Note: I'm fiddling with Ruby's internals here. Don't use this kind of hack in production. You've been warned.