Wednesday, March 04, 2015

Thoughts about the new meta class system MOP 2

Some may remember, MOP2 is the idea of letting Groovy have a new meta class system, since the old system has some serious internal problems. The problems are largely related to the basic concepts like seeing the meta class as something that invokes methods. But also problems with locking for caching and global structures and many many small mistakes in the MOP that become established. The result is a really complicated MOP, that makes the life of everyone challenging, that wants to use those things.

Some ideas have been born out of how to redesign things from scratch. And over the last two years I was trying things here and there, toying with ideas, preparing for a big jump. But since the day I become aware that Pivotal is pulling the plug for funding Groovy I had to realize, that a long term adoption project like I had planed before with MOP 2 might be difficult to do in the future. Why? Even if me working full-time on Groovy in the future will be no problem, it will be a new employer and there you cannot simply come with a project like that in the beginning.

On the other hand I don't want to continue doing nothing for MOP2 in the next year or two.

So I started to think about a plan to slowly migrate the current MOP into one I find better suited for a modern Groovy. As a result I made a list of some features I did plan for the MOP2 and how it fits with the current system, as well as how much I see a possible migration path here. And with migration path I mean to not break old code, and have a different behavior only if intended. Sounds difficult? Well, yes. It can be done only by not doing some things I did intend to do.

New package namespace for groovy.lang

Why a new package? Because the idea was to be able to run the old and the new runtime in parallel, maybe let them communicate. You would have something like a MOP1 compatibility jar, that if on the classpath, allows your old Groovy classes to run normally. While having both runtimes in parallel was thought as optional I think now it has to become the regular mechanism and standard. Having to have a slow migration path means for me here to mostly exchange the standard meta class

Remove MOP specific methods from GroovyObject

The MOP specific methods I am talking about here are invokeMethod, getProperty and setProperty as well as getMetaClass. Those methods have been used in the early times of Groovy as a way to speed up things and avoid reflective method calls where possible, while at the same time providing extension points for the users. I am not so much talking about how bad it is that invokeMethod is usually called as a last resort kind of method, and get/setProperty usually upfront, or that the property methods make it really difficult to keep information about the origin of a call, or the amounts of code that goes into our runtime which tries to recognize if it can safely ignore those methods and directly go to the meta class... No, in terms of a migration path I have to clearly say, that to keep binary compatibility, those methods do have to stay. And they do have to stay with a more or less similar logic as today. But a new GroovyObject class could be made, that goes in a different package... Having to create GroovyObject wrappers all the time to satisfy the needs of older APIs using that won't do though. It is a terrible strain on memory and breaks referential identity. So I guess you would have to have an annotation to switch between old and new GroovyObject.

Make the meta class similar to a Persistent Collection

MetaClassImpl was in the past already going the route of being immutable, but ExpandoMetaClass has shown there are different needs. Well, to be exact, MetaClassImpl has two states, one that allows initialization (and mutation with it) and the initialized state, in which no mutation happens anymore. ExpandoMetaClass is based on MetaClassImpl and more or less switches between states to allow for mutation. But this turned out to be a horrible mess for concurrency. Persistent Collections on the other hand are by nature immutable, which allows nice lock-free paths. SO if you want a mutation, you have to actually use a new instance. The trick is to leverage the internal structures of the old instance to save on memory and initialization time. For example if you have two list, the second one is created by using the first one and appending an element, you can easily use simply the old list, if you link the new element to the old elements. It would be a list in which the new element is before the old elements, like a stack. Of course this is just for illustration and there are solutions to make it in the other order.

Now the current API is more based on having a single reference and the instance behind that is mutated. Doing the persistent collection approach means there would have to be a new instance. But this could be solved by having a kind of dummy meta class, that will keep track of the real meta class. The real meta class would not be mutated directly anymore.

Let open blocks not use a class anymore


In current Groovy if you have code like
list.each{println it}
you will get an anonymous inner class for the open block which is also an instance of Closure. There you can set strategies and whatever. Well, in the new MOP I intend to not produce a class anymore. Instead a method will be used. It would have the same parameters as the Closure doCall method would have now, but plus a helper instance for things like the resolve strategy and such. And in cases where we do not reference any outside context (like in my example) we could omit this as well. The old Closure class would then be only a wrapper for the real representation of reference to the open block. In Java7+ code this could be a a couple of MethodHandles, in older code this would have to be MetaMethods. But of course those MetaMethods would have to be used with care, to avoid to many references to the meta class system. Otherwise you would serialize half the runtime in current Groovy if you wanted to serialize a Closure. I think a lot of the logic that can currently be found in ClosureMetaClass to have a simplified dispatch for Closures can be reused here. I guess this all can be done using the current API, but some frameworks may be based on having those inner classes and recognize them. Also the change means there will be only one meta class for every open block out there. Frameworks building on top of that would either need to change or we would need to provide an option for the old way... The main reason I would want this kind of change is to have a smaller bytecode footprint, less permgen space and of course lower memory consumption. In cases that use no outside references we could even think about reusing instances of Closure to have even less footprint.


New Meta Class DSL

The DSL provided by ExpandoMetaClass has some nasty things here and there. Fixing them would almost certainly break code, especially in Grails. Thus a new optional front end to the meta classes would allow to have a cleaner DSL. The details here really have to be worked out using actual code I guess. But I see no bigger problem.


No Meta Class subclassing anymore

Subclassing proofed to be a problem for performance improvements. Of course there still needs to be a way to add some kind of custom behavior, but that would then be done using composition and with caching in mind, rather than having the meta class as the original instance of a method call invocation, as it is now... and need to be worked around all the time. The composition approach also allows for a cleaner separation of an API. Currently we have for example the MetaClass interface, but it is almost never used for a custom meta class, instead people subclass either DelegatingMetaClass or directly MetaClassImpl. And DelegatigMetaClass is really already going in the direction of a composition model.


Realms

The idea of the Realm concept is to have different spaces for meta classes, that can exist independent from each other and not influence each other. For example in a Groovy implementation of something like DefaultGroovyMethods, you may want to call the Java version of toString, instead of the Groovy one (if existing). Then you need a way to set a Realm and switch them. But since this concept is unknown to the current meta class system, using realms in a class, would automatically mean using the new meta classes. Since old and new system exist at the same time, this will cause confusion I guess. But basically there is only one Real for the old meta classes, so if another realm is requested I guess the best would be to just give a runtime error then.


The plan would be to start implementing the new meta class system and change the old meta class system to use the new one if appropriate. That means for example the old meta classes would become mere skeletons for the real system meta class for example. We would need some annotations or other logic here and there to switch between new and old logic, with the old logic being the default of course

But overall I think this can be done and in a way that won't require people to update all their code right away. The can then migrate with their own pace.