Subtitle: Why a MOP is not always a good idea
cperl being a perl11, i.e. 5+6=11, of course means that cperl classes are designed after perl5 and perl6 classes. perl5 does not have a builtin class keyword, but allows to add keywords to be added at runtime. cperl and perl6 of course do have a builtin class keyword.
The backcompat problem with a new builtin keyword is, that some usages
of variables, package or function names will not work anymore, because
the new keyword stepped over it. With the current cperl 5.28 release
this is indeed a problem for the existing B::class
method which
cannot be imported anymore and be used as class($op)
. Instead all
these usage have been replaced with B::class($op)
.
Technically this can be avoided by hijacking only the first token
in a statement, and let those be valid cperl terms:
$class
, sub class {}
, package class;
An example
class Test::Builder::Module {
has int $Child_Error;
has $Parent;
has $Parent_TODO;
has str $Name;
has str $Child_Name;
has Bailed_Out_Reason;
has Bailed_Out;
has bool $Have_Plan;
has $No_Plan;
has $Skip_All;
has @Test_Results;
}
class Test::More is Test::Builder::Module {
}
Just look at the Perl6 Class Tutorial and replace all “traits” behind signatures with attributes.
e.g.
method area() returns Int {}
=> method area() :Int {}
,
or
has Bool $.done is rw;
=> has Bool $done :rw;
,
leave out the new secondory sigils, e.g.
has Int $.x;
=> has Int $x;
, and you got the cperl syntax.
class Point {
has Int $x;
has Int $y;
}
class Rectangle {
has Point $lower;
has Point $upper;
method area() :Int {
($upper->x - $lower->x) * ( $upper->y - $lower->y);
}
}
# Create a new Rectangle from two Points
my $r = new Rectangle(lower => new Point(x => 0, y => 0),
upper => new Point(x => 10, y => 10));
say $r->area; # OUTPUT: «100»
The old perl5 design for this was:
package Point; use fields qw(x y);
sub new {
my str $name = shift;
bless {@_}, $name;
}
package Rectangle;
use base ('Point'); use fields qw(lower upper);
sub new {
my str $name = shift;
bless {@_}, $name;
}
sub area {
my Rectangle $self = shift;
my ($lower, $upper) = ($self->{lower}, $self->{upper});
($upper->{x} - $lower->{x}) * ($upper->{y} - $lower->{y});
}
my $r = new Rectangle(lower => new Point(x => 0, y => 0),
upper => new Point(x => 10, y => 10));
print $r->area; # OUTPUT: «100»
With the old pre-5.10 pseudo-hashes the field names upper
, lower
as hash keys where compile-time optimized to linear-time array access
to the magic @Rectangle::FIELDS
array. The hash was made restricted,
ensuring typos in the field names would lead to compile-time errors
when those keys did not exist.
With perl6 or cperl fields you’ve got the same feature; just a different, more functional implementation. “functional” meaning features are hidden between functions, not datatypes. Supporting datatypes in an API will forever restrict it’s usage to this specific datatype, you will not be able to change the underlying structures and algorithms. This was the biggest mistakes perl and python did at the beginning.
Encapsulated fields
In perl6 fields are encapsulated. “Just as a my variable cannot be accessed from outside its declared scope, fields are not accessible outside of the class. This encapsulation is one of the key principles of object oriented design.”
perl5 fields were optionally private if given a _
prefix, but you
could always use the magic @FIELDS
array and hash in the first slot
to access the private fields also.
cperl fields are encapsulated, but the trait syntax is different to perl6. You should use the method syntax, not a hash or array access syntax. Internally this method is then compiled to the most efficient op or method.
The Moose syntax is more different to perl6 than cperl. And it’s implemention is beyond naive. But theoretically this could be improved, the biggest problem is still the troublesome syntax, based on the naive implementation restrictions.
With the new cperl fields API you can inspect all defined fields at run-time.
# return-value of Mu::fields or classobj->fields
class Foo {
has $foo;
has @bar;
has %baz :const;
}
my @fields = Foo->fields;
print $fields[0]->name; # foo
With cperl classes the fields methods returns a list of fields objects, representing the has declarations of the class with all imported roles - similar to the perl6 Metamodel::AttributeContainer returning Attribute objects.
Each such returned field object supports the following methods name
,
package
, const
, type
, get_value
and set_value
. The fields
method is valid for classes and objects. Only objects do have values,
therefore {g,s}et_value
on a class field is invalid.
Types, OO
There are type systems and there are type systems. Nominal or structural, co variant/contra variant, sound or unsound, making it slower or making it faster, static or dynamic, gradual or optional, hated or beloved.
What almost nobody knows, perl5 always had room for types built-in.
my Coffee $c;
assigned the type Coffee
to the scalar variable $c
at
compile-time. The type Coffee
needed to exist already, i.e. it needed
to be a properly declared package. Internally every package (or “class”)
defines a global symbol-table names space, a hash of symbols under main.
i.e. %main::Coffee::
(called a stash, “symboltable hash”).
There are even some modules on CPAN which declares types on some of its
variables.
…
Types are compile-time guarantees and hints for the compiler and optimizer.
Types structure classes and method dispatch.
Types document code, makes code stricter, with more static guarantees.
You can gradually switch from obsessive test driven development with test suites running hours with over-architectured refactoring, to obsessive statically typed code, running in 2x faster time, and not being able to debug into compile-time errors, which were previously dynamic run-time errors.
This concept came with Common Lisp and its famously optimizing
compiler, called python.
Yes, really, the CMUCL compiler, now still alive as SBCL. Types and compiler
pragmas were purely optional, as every symbol and variable carried its
type with itself. (or (>= safety 2) (>= safety speed 1))
(defmacro my-1+ (x)
`(the fixnum (1+ (the fixnum ,x))))
Statically typed variables loose all of its types at run-time - if you strip it from its dwarf sections, but nobody does run-time type introspection via dwarf besides Stephen Kell, just via horrible C++ RTTI.
Object systems are basically classes, i.e. types, declared with fields and methods. The optimizer figures out the object layout according to the type hierarchy, the fields and methods.
MOP
A MOP (“meta object protocol”) was invented to change the default behavior for objects, methods and classes, basically to make them better and slower. It came up with the differences in LISP frames vs CLOS. In CL we had a huge slow monster CLOS, and many small elegant but limited “frames” systems.
Now we know basically three types of object systems:
classic hierarchical compile-time classes with inheritance, shared methods per class (C++),
dynamic prototypes with all the methods in the objects (javascript),
mixins with compile-time composition of classes, in contrast to run-time dispatch to parents via inheritance (flavors, CLOS, ruby include).
With a MOP you are even able to change a classic system to a prototype or mixin system, and vice versa. Ruby on rails (ab)used the MOP all over which makes it imposible to scale. With a proper OO design as in Sinatra/Dancer with delegated classes known at compile-time you can easily scale and optimize such a system. A MOP is a very poor adhoc method to workaround a proper OO design. It’s nice for prototypes, such as Moose, which is a very immature adhoc prototype, but it should never make it into a production system.
Difference from class to package
A cperl class is internally a readonly package with a CLASS flag set. A class is closed, a readonly block by default. methods and fields are fixed. If you want dynamic classes use a package. Fields are lexical members of the class, copied into objects. Fields and methods can be composed from roles, i.e. copied at compile-time. Conflicts are then detected at compile-time, and not at run-time as with dynamic packages and the ISA inheritance mechanism.
Class fields have no variable data layout as with old blessed objects,
where fields could be stored as scalar, array or hash. Class fields
are stored as offset into a not-refcounted array, similar to C structs.
In fact with a the :native
attribute class objects can be passed via
the FFI to C back and forth. An int field takes 4 byte, a double field 8 byte,
and not 4 words as a normal scalar value.
Anon classes
Intermediate classes create via role mixins (the does
keyword) are
stored in the class slot of every object and refer to class stashes.
But when you mix types or multiple classes combined via and
or
or
you cannot use a stash, you’d need a list of stashes.
perl6 solved this problem by switching from stashes to objects.
perl5 solved this via creating temporary anon classes to hold mixins, and
mro
/@ISA
to support multiple inheritance.
cperl composes mixins at compile-time, without the need to hold anon classes at all.
Multiple dispatch - polymorphism
cperl 5.28 does not support the multi keyword yet, there’s no
polymorph dispatch on methods with the same name (generics) but
varying number and type of arguments yet. polymorphism solves the problem of
generic methods, which do the same but its implemention deviates on the given
arguments. E.g. +
acts differently on double or int or string.
polymorphism is the proper solution for problems previously solved with
the overload pragma.
Internally multi methods will be stored with a name suffix, either
seperated by the public name with \0
or @
, followed by the types
of the accepted arguments. The signature is encoded into the name. This
is similar to C++ name mangling for the run-time dispatcher.
\0
is a good prefix because in cperl binary names are forbidden, for
security and performance reasons.
@
would be a good prefix because cperl adopted @
from
Devel:::NYTProf for names of anonymous subroutines. An “ANON“
import method in cperl is named “import@” instead, in Devel:::NYTProf
it would be even named “import@[package.pm,10-12]“. perl5 anonymizes some
names when the GV symbol is being thrown away to __ANON__
, esp. with
import methods.
Limitations
5.28 still has some class limitations.
The number of fields is limited, as in C.
The inliner is not yet implemented, so field index fixups with roles are not supported yet.
When copying a method from a role to a class, and the field index from the role method would
be different to a field index in the resulting class, the method is not yet fixed up to the
new indices. A temp. solution would be to change the ordering of the roles, or to use the
$self->field
method syntax in the role method. This requires the not yet finished inliner.
Currently we can only alias composed role methods and we don’t change the ordering of the fields.
eval ‘class {}’ fails
A class cannot be created in an eval block or subroutine. The pad lookup is still global and not per optional CvPADLIST. During development of cperl 5.28 I found the severe limitations of the perl5 pad design, the delegation of FAKE pads into nested scopes. upvalues are not copied or delegated to the real slot in the outer pad, but just marked as NULL FAKE pad. This led to severe compiler bugs, only fixed in 5.28.
i.e.
my @a[1];
sub { $a[1] = 1 }->();
missed the compile-time error inside the closed-over sub.
Also all uoob (compile-time out-of-bounds checks) optimizations were missing on those nested fake PADs.
So I had to add a new pad API pad_findmy_real
to find the real pad/type of a nested lexical variable.