Monday, January 31, 2011

GWT Isn't a Good Environment for HTML5 Games

Last year, I made a small game in XNA. No one played it, so I've started porting it to be a web HTML5 game. Since the game was originally written in C#, I decided that the easiest way to webify it would be to rewrite the code in Java and then use GWT to translate the code into JavaScript. GWT is a set of tools from Google that let you write web code in statically typed Java. It then translates this code into cross-browser JavaScript for you.

After quickly discarding the GWT UI framework, I found the GWT experience to be much smoother and nicer than I had originally expected. Unfortunately, I've found that the current incarnation of GWT (GWT 2) doesn't work well for HTML5 games. The problem is that in development mode, GWT doesn't actually translate any of your code into JavaScript. It runs all of your code as regular Java, and then uses an intermediary layer to transfer manipulations of JavaScript objects or the DOM to a browser, where the manipulation is done and the result is transferred back to the Java world. Initially, I thought this wasn't a big deal because it only causes problems if you make an outrageous number of DOM API calls or decide to store a lot of things in JavaScript objects for some reason. Unfortunately, I found I was doing this a lot in my HTML5 game. I was drawing lots of things to the HTML5 canvas, which requires lots of API calls. I was also using JSON for my save game data, which means storing a lot of data in the form of JavaScript objects.

As a result, my game code ran really sluggishly in development mode. This was especially true of Chrome, whose sandbox design means that the Chrome GWT development plugin is particularly slow in transferring data between the browser and Java code. Doing my development with Firefox made things bearable, but I still found that I was optimizing the wrong things. Things that seemed to be slow when running in development mode (e.g. the JSON game saving code appeared so slow that it would time out the browser) were actually instantaneous when the code was properly compiled down to JavaScript. The overhead of interfacing Java code with a browser's JavaScript engine simply distorts performance information so much that it's hard for a developer to get a good feel for how a game behaves.

I understand why GWT is designed in this way. Most browsers don't expose JavaScript debugger APIs that would let a tool like GWT map lines and variables in JavaScript code to the original Java code that a programmer has written. Fortunately, browsers like Firefox are becoming mature enough platforms to have such APIs, so I'm hopeful that in the future, someone might reprogram GWT to actually translate Java code into JavaScript when in development mode and still let you properly debug it.

In the meantime, I'm going to finish coding up my game in GWT, and then go back to pure JavaScript games. I find the completely freeform, unstructured nature of JavaScript to be unproductive, but I'm wondering whether the static typing of Java is the best way to solve the problem. When I used to code in Smalltalk, everything was also dynamically typed, but it was a fairly productive environment to code in. Smalltalk organizes your code in a very structured way though, so it was easy to navigate the code and find things in it. Currently, the style of JavaScript code that I write is so freeform that even Eclipse with JSDT can't analyze it very well and can only provide me with simple ways to browse it. Perhaps I'll try writing my code in a more structured style to see if a proper code editor can extract useful structure from it, thereby allowing me to navigate and write JavaScript code more productively.

Wednesday, January 05, 2011

Rhino JavaScript security

My programming website contains a Java applet with a code interpreter for running user code. Users will not only run their own code, but possibly code from other people as well, meaning that they might be exposed to malicious code. The user is kept safe though because the code interpreter runs as part of an applet, meaning everything runs within the Java security sandbox.

For many years, I've been planning on making a standalone version of my applet that can be easily downloaded and run as an application, but I've been concerned about security issues. I want users to be able to run random code that they've found on the Internet without having to worry about the code infecting their systems with something. Without Java's applet security sandbox, my application would have to create its own sandbox. I always assumed that with Java's multiple layers of security, I would be able to cobble something together. In the end, due to a convoluted API design, it seems that Java's security system is much less flexible than I had originally thought, meaning it's not really possible to do something like lower your own security permissions or chroot yourself. I think the actual security mechanism in the VM could support this, but the APIs that Java exposes don't let you access such functionality.

The main security issue that I'm trying to protect against is that I want to let users run potentially malicious code in the interpreter. This interpreter has to call into my own code to access certain features. I'm too lazy to properly secure all of my own code, so I want to sandbox the interpreter code from my own code so that potentially malicious code can't muck around with the public fields of my objects and play with my inner classes to trick my code into doing something unsafe. So basically, I need a mechanism that allows me to take part of my own code, declare that I don't trust myself, and lower my permissions for that portion of code.

Based on what I can understand from the security documentation I've read, there are two primary mechanisms that Java uses to secure itself. The first is a namespace mechanism where different threads can be given access to only certain classes (or different versions of classes). This initially sounded like a great way of separating out my code from the interpreter code. My code would simply not be visible to the interpreter code, meaning that I wouldn't have to bother securing my own code. I would only have to create a hardened API for interfacing the interpreter with my own code. The second mechanism is a permissions mechanism where every class has an associated set of permissions. Whenever a potentially dangerous operation is being performed, the permission framework will go through the stack, find the class/code on the stack with the lowest set of permissions, and only allow the operation to proceed if those permissions are sufficient to allow it. So for my interpreter thread, as long as I could create a class with no permissions and then slip this class in at the base of the interpreter thread's stack, the interpreter wouldn't be able to do malicious things.
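To make the permissions mechanism concrete, here is a toy model of the stack check as I understand it. This is not the real java.security API; the class names, the Permission enum, and the Frame type are all made up for illustration. The only point is that one frame with no permissions anywhere on the stack is enough to block the operation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.EnumSet;
import java.util.Set;

/** Toy model of a stack-based permission check. NOT the java.security API. */
public class StackPermissionModel {
    /** Hypothetical permissions, invented for this sketch. */
    enum Permission { READ_FILE, WRITE_FILE, OPEN_SOCKET }

    /** One stack frame: the class owning the code and the permissions granted to it. */
    static final class Frame {
        final String className;
        final Set<Permission> granted;

        Frame(String className, Set<Permission> granted) {
            this.className = className;
            this.granted = granted;
        }
    }

    /** The operation is allowed only if every frame on the stack grants the permission. */
    static boolean isAllowed(Deque<Frame> stack, Permission wanted) {
        for (Frame frame : stack) {
            if (!frame.granted.contains(wanted)) {
                return false; // the least-privileged frame decides
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Deque<Frame> stack = new ArrayDeque<>();
        // Base of the thread's stack: a class that was loaded with no permissions.
        stack.push(new Frame("NoPermissionRunner", EnumSet.noneOf(Permission.class)));
        // Frames above it are fully trusted in this model.
        stack.push(new Frame("Interpreter", EnumSet.allOf(Permission.class)));
        stack.push(new Frame("LibraryFileWriter", EnumSet.allOf(Permission.class)));

        // The file write is attempted by trusted library code, but the
        // untrusted frame at the base of the stack still blocks it.
        System.out.println(isAllowed(stack, Permission.WRITE_FILE)); // prints "false"
    }
}
```

If the NoPermissionRunner frame is left off the stack, the same check returns true, which is why the class with no permissions has to sit at the base of the interpreter thread.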

So with these two mechanisms, I could use permissions to prevent the interpreter from doing anything bad and use namespaces to prevent the interpreter from tricking my own code into doing bad things. Unfortunately, although this sounds great in theory, I couldn't quite make the Java APIs do this for me. It seems like the API was mainly designed so that the Java VM and library could secure themselves in applets. If you want to use the same mechanisms to secure your own code, you have to jump through a lot of hoops. The main problem seems to be that the Java VM loads the application's code with the system class loader. This means that the application code is basically considered to be as trusted and as secure as Java library code. You can't easily create a new thread with a new namespace that has fewer classes and where existing classes are relabeled with lower permissions. It's probably possible to do some crazy classloader voodoo where my code is packaged in a separate jar and the interpreter is in its own jar and then a special bootstrap jar pieces together the other jars in some sort of secure way, but it's messy, hard to debug, and hard to distribute all these jars to end-users (I think this is how Java application servers do their security though).

If I spent enough time thinking about class loaders, I might be able to figure out a way to solve it, but I was able to put together a solution that presumably has similar security but doesn't require so much mental gymnastics. The interpreter I use is the Mozilla Rhino JavaScript engine. The interpreter has a ClassShutter which restricts which Java classes user scripts can access. Assuming that the Rhino interpreter is properly secured, setting the ClassShutter to prevent access to any Java classes should prevent user code from accessing my own insecure code except through well-defined and secured APIs. This should provide equivalent security to namespaces. I still made use of the Java permissions security mechanism, but that only required me to find a way to use class loaders to load a single class with reduced security. Basically, I created a class that implemented a proxy for java.lang.Runnable and compiled it by hand. I renamed the resulting .class file to a .bin so that the system class loader wouldn't find the class first and prevent my class loader from loading it. I then created a classloader that would intercept attempts to load that class and create a version from the .bin file instead with lower permissions. To make sure you use the version of the class loaded by the custom class loader (the one with reduced permissions) and not one from the system class loader (the one with full permissions), you have to carefully use reflection to get the custom class loader to load its version. When creating the interpreter thread, I start the thread off by running this class, thereby inserting these lower permissions at the base of the interpreter's stack.
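A rough sketch of what such a class loader might look like is below. It is not my actual code. The class name ReducedPermissionLoader, the target class name, and the .bin resource name are all placeholders, and it assumes the parent class loader is non-null and can find the .bin file as a resource. It only uses standard ClassLoader and java.security calls. Note that the reduced permissions only matter when the JVM actually has a security manager enforcing them.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.CodeSource;
import java.security.Permissions;
import java.security.ProtectionDomain;
import java.security.cert.Certificate;

/** Sketch: loads one named class from a ".bin" resource with an empty permission set. */
public class ReducedPermissionLoader extends ClassLoader {
    private final String targetClassName; // hypothetical, e.g. "SandboxedRunner"
    private final String binResource;     // hypothetical, e.g. "SandboxedRunner.bin"

    public ReducedPermissionLoader(ClassLoader parent, String targetClassName, String binResource) {
        super(parent);
        this.targetClassName = targetClassName;
        this.binResource = binResource;
    }

    /** A protection domain with no code source and no permissions at all. */
    public static ProtectionDomain noPermissionsDomain() {
        return new ProtectionDomain(new CodeSource(null, (Certificate[]) null), new Permissions());
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(targetClassName)) {
            // Everything else is delegated to the parent as usual.
            return super.loadClass(name, resolve);
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                // Intercept the target class: define it ourselves from the
                // renamed class file, attaching the empty permission set.
                byte[] bytes = readBinResource();
                c = defineClass(name, bytes, 0, bytes.length, noPermissionsDomain());
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    private byte[] readBinResource() throws ClassNotFoundException {
        try (InputStream in = getParent().getResourceAsStream(binResource)) {
            if (in == null) {
                throw new ClassNotFoundException("Missing resource: " + binResource);
            }
            return in.readAllBytes();
        } catch (IOException e) {
            throw new ClassNotFoundException("Could not read " + binResource, e);
        }
    }
}
```

With something like this, the reduced-permission class would be obtained through loader.loadClass("SandboxedRunner") and instantiated with reflection, rather than referenced directly with new, since a direct reference would be resolved by the loader of the calling code and give you the fully trusted version (or fail if only the .bin exists).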