Even More Pickling

Puzzled at my lack of success with Scala pickling I filed an issue on GitHub [1]. I will be the first to admit that as bug reports go, “I’ve got 20,000 lines of code and it doesn’t work with your library” is pretty much a lost cause. I did hope to get one or two tips on how I could help myself, or possibly to be flamed for asking a stupid question, but I didn’t expect to be completely ignored for two weeks. I did eventually receive a reply, which involved quite a lot of work to follow up.

The reply suggested that I build my code in Java 7 outside of the IntelliJ environment. Doing this (via a shell script) identified a few small issues [2] but increased my confidence that it was pickling that was broken, not my environment. This led me to read through all of the open issues, and #296 [3] turned out to be a real gem.

It turns out that pickling identifies any Scala class with a getter or setter method as a Java Bean. This leads it to do strange things and in particular it will generate a pickler at compile time for traits/base classes without full knowledge of the runtime type. This leads to a lot of lost or empty objects in the output. This is more or less a disaster for my scenario, with a lot of code converted from Java.

Fast forward another few weeks and I’ve converted most of the getters and setters to something closer to idiomatic Scala. A single feature definition produces about 12K of JSON output, which is probably not complete, but getting there.
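As a rough sketch of the kind of conversion involved: the class and field names below are hypothetical, not taken from the actual project, and the code deliberately does not touch the pickling library. It only contrasts a Java-style class with get/set methods against a plain case class.

```scala
// Java-style class, as it might look straight after conversion from Java.
// The getX/setX pairs are what make it resemble a Java Bean.
class FeatureBean {
  private var name: String = ""
  private var weight: Double = 0.0

  def getName: String = name
  def setName(value: String): Unit = { name = value }

  def getWeight: Double = weight
  def setWeight(value: Double): Unit = { weight = value }
}

// Closer to idiomatic Scala: immutable constructor fields, no get/set prefixes.
final case class Feature(name: String, weight: Double)

object BeanConversion {
  // Hypothetical helper for carrying existing instances across to the new shape.
  def fromBean(bean: FeatureBean): Feature =
    Feature(bean.getName, bean.getWeight)
}
```

The case class exposes its state as constructor fields, so there is nothing left that looks like a getter or setter.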

Pickling still isn’t usable because it falls over with a strange error:

Caused by: java.lang.RuntimeException: error: cannot find class or module with type name 'scala.collection.immutable.ListMap.Node'
full type string: 'scala.collection.immutable.ListMap.Node'

I can get past this for a single instance by capturing it as a val and declaring the concrete static type. There is no ListMap in the output. If the val is declared with the trait type instead, pickling fails. This looks like another case of selecting the wrong static pickler at compile time, when deferring to a runtime pickler would be the safer choice.
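The general effect of static versus runtime selection can be illustrated without the library. The sketch below uses a hand-rolled type class called Writer, which is a hypothetical stand-in and not the scala/pickling API; Shape, Circle and Square are likewise made up. It shows how the declared type of a val alone decides which instance the compiler picks.

```scala
// Hypothetical stand-in for a pickler type class; NOT the scala/pickling API.
trait Writer[A] { def write(a: A): String }

sealed trait Shape
final case class Circle(radius: Double) extends Shape
final case class Square(side: Double) extends Shape

object Writer {
  // Chosen when the static type is only the trait: it knows about no fields.
  implicit val shapeWriter: Writer[Shape] = new Writer[Shape] {
    def write(s: Shape): String = "{}"
  }

  // Chosen when the static type is the concrete class: it can see the field.
  implicit val circleWriter: Writer[Circle] = new Writer[Circle] {
    def write(c: Circle): String = "{\"radius\": " + c.radius + "}"
  }

  // The compiler resolves the instance from the static type A, not the runtime class.
  def render[A](a: A)(implicit w: Writer[A]): String = w.write(a)

  // Runtime alternative: inspect the actual class before choosing a writer.
  def renderDynamic(s: Shape): String = s match {
    case c: Circle => circleWriter.write(c)
    case other     => shapeWriter.write(other)
  }
}

object StaticVsRuntime {
  val concrete: Circle = Circle(1.0) // declared with the concrete type
  val asTrait: Shape = concrete      // same object, declared as the trait

  val fromConcrete: String = Writer.render(concrete)       // uses circleWriter
  val fromTrait: String = Writer.render(asTrait)           // uses shapeWriter, field is lost
  val fromRuntime: String = Writer.renderDynamic(asTrait)  // recovers the field at runtime
}
```

In this toy version the trait-typed val produces an empty object even though the underlying value is a Circle, while the runtime dispatch recovers the field. Whether pickling fails for exactly this reason is my reading of the symptoms, not something the library documents.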

It seems that the development team have the features that they need in working condition. Perhaps it is good enough to write papers about, good enough to solve some big data problems, good enough to use if you know what works and what doesn’t. I don’t think the current version is a general purpose competitor to Java Serialization. It isn’t robust enough, and the documentation doesn’t point out the areas in which valid Scala code will cause it to break. If there were a bit more activity on the GitHub project I’d have waited for the library to mature, but it looks like the current version is here to stay for a while.

What puzzles me most of all is my grim determination to force it to work. To start with it looked like a lost cause. Many hours of spare-time programming later, it looks like I might eventually trick it into working for this one line of code:

val pickled = featureDefinition.pickle.value

It seems I still have some good-will left, despite the obvious frustrations of spending so long getting nowhere.

References

[1] https://github.com/scala/pickling/issues/326
[2] I discovered a couple of Java files still lurking in the project.
[3] https://github.com/scala/pickling/issues/296