Capture Canvas and WebGL output as video using websockets

About a month ago I got a request to create a video from one of the WebGL demos on this site (the rotating earth cube, showing population density). Of course I said yes, but creating high-quality output of this animation wasn't as easy as I thought. I looked at a couple of screen recording tools, but none of those available for Mac were free and offered both a good resolution and a high enough frame rate. To be honest, I didn't look for very long, since I thought it a good opportunity to see how I could grab the content of just the single canvas/WebGL element and export it to a movie.

After some looking around, I came across this link, where someone used websockets to export the content of the canvas and save the captures individually on the server. On the server side, ffmpeg can then convert these captures to a movie. This looked like an interesting approach, so I did pretty much the same thing, except that I used a simple vert.x backend for the websocket server.

In this short article I'll walk you through the required steps and code, so you can easily set this up yourself.

Creating the client javascript code

Let's start with the code required on the client side. The first thing we need to do here is set up a websocket:

    var ws = new WebSocket("ws://localhost:8889/");
    var frame = 0;       // number of frames sent so far
    var isOpen = false;  // true while the connection is usable
    ws.onopen = function(e) {
        console.log('opening');
        isOpen = true;
    };
    ws.onclose = function(e) {
        console.log('closing');
        console.log(e);
        isOpen = false;
    };
    ws.onerror = function(e) {
        console.log('error');
        console.log(e);
    };

Note that not all of these functions need to be defined; they just make it easier to spot issues when things go wrong. This piece of code opens a websocket connection to "ws://localhost:8889", where our websocket server will be listening. If the connection is successful, we set isOpen to true. Note that we also define a "frame" variable here. This variable keeps track of the number of frames we've sent and is used on the server side to ensure a correct sequence in the final movie. This piece of code goes at the top of our file and is executed before the rest of the javascript.
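To illustrate why the counter matters, here is a minimal sketch of how a frame number maps to a file name that sorts in send order, and how missing frames could be spotted afterwards. The helper names frameFileName and missingFrames are hypothetical and not part of the demo; the name pattern mirrors the "%08d-frame.png" format used by the server code further down, and numbering is assumed to start at 1 because the counter is incremented before each send.

```javascript
// Hypothetical helpers, not part of the original demo.

// Build the zero-padded file name for a frame number.
function frameFileName(frameNr) {
    return String(frameNr).padStart(8, '0') + '-frame.png';
}

// List the frame numbers missing between 1 and the highest number received.
function missingFrames(received) {
    var seen = new Set(received);
    var max = 0;
    received.forEach(function (n) {
        if (n > max) max = n;
    });
    var missing = [];
    for (var n = 1; n <= max; n++) {
        if (!seen.has(n)) missing.push(n);
    }
    return missing;
}

console.log(frameFileName(42));            // 00000042-frame.png
console.log(missingFrames([1, 2, 4, 7]));  // [ 3, 5, 6 ]
```

Zero padding keeps the alphabetical order of the files identical to the numeric order, which is what a numbered input pattern relies on.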

When you create animations, you usually have a render loop. The render loop for this example looked like this:

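    // render the scene
    function render() {
        var timer = Date.now() * 0.0001;

        // move the camera in a circle around the scene
        camera.position.x = (Math.cos( timer ) * 1800);
        camera.position.z = (Math.sin( timer ) * 1800);
        camera.lookAt( scene.position );

        // keep the light at the camera position; copy() also works in
        // three.js versions where position cannot be reassigned
        light.position.copy( camera.position );
        light.lookAt( scene.position );

        renderer.render( scene, camera );
        requestAnimationFrame( render );
    }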

Nothing too special. We just position the camera, move the light along with it and render the scene. Furthermore, we use requestAnimationFrame here to let the browser decide when to render the next frame. Even though we shouldn't do too much work in this rendering loop, I've added the code that sends the data to the websocket server to this function. A better approach would have been to use web workers to send the frame asynchronously, but this approach worked well enough.

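    function render() {

        renderer.render( scene, camera );
        // capture only after the scene has been rendered
        sendToServer();

        requestAnimationFrame( render );

    }

    function sendToServer() {
        if (isOpen) {
            // PNG capture of the canvas as a base64 data URL
            var asString = renderer.domElement.toDataURL();
            frame++;
            // prefix the frame number so the server can keep the order
            ws.send(str2ab(frame + asString));
        }
    }

    // convert a string to an ArrayBuffer, one byte per character
    function str2ab(str) {
        var buf = new ArrayBuffer(str.length);
        var bufView = new Uint8Array(buf);
        for (var i=0, strLen=str.length; i<strLen; i++) {
            bufView[i] = str.charCodeAt(i);
        }
        return buf;
    }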

This piece of code uses the toDataURL() function to get the data from the canvas element and then converts it to a byte array, so we can easily send it as a binary websocket message. Making it binary is a bit of overkill for this example I guess, but this was the approach I had the best results with. And that's it for the websocket client part. One last important thing to note here is that you should make sure you call "sendToServer" after the canvas is rendered. This might seem obvious, but it cost me a lot of time bug hunting. I assumed that it wouldn't matter whether I added it before or after, since there is always something rendered on the canvas and thus available to be sent to the server. Well... this isn't the case. The canvas is apparently cleared at the beginning of the render loop, which caused a lot of black screens to be sent to the server. With WebGL this is expected: unless the context is created with preserveDrawingBuffer set to true, the browser may clear the drawing buffer once the frame has been presented, so toDataURL() has to run right after the render call.
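To make the message layout concrete, here is a small sketch that builds one message and turns it back into text. It repeats str2ab so it runs on its own; buildFrameMessage and fakeDataUrl are names made up for this illustration, and the payload is just the word "hello" in base64 rather than a real PNG capture.

```javascript
// Same conversion as str2ab in the article: one byte per character.
// This is safe here because a data URL contains only ASCII characters;
// characters above 255 would not survive the Uint8Array assignment.
function str2ab(str) {
    var buf = new ArrayBuffer(str.length);
    var bufView = new Uint8Array(buf);
    for (var i = 0; i < str.length; i++) {
        bufView[i] = str.charCodeAt(i);
    }
    return buf;
}

// Hypothetical helper: frame number directly followed by the data URL.
function buildFrameMessage(frameNr, dataUrl) {
    return str2ab(frameNr + dataUrl);
}

// Stand-in for renderer.domElement.toDataURL(); not a real image.
var fakeDataUrl = 'data:image/png;base64,aGVsbG8=';
var message = buildFrameMessage(12, fakeDataUrl);
var bytes = new Uint8Array(message);

// Convert the bytes back to text to inspect what goes over the wire.
var text = String.fromCharCode.apply(null, bytes);
console.log(bytes.length);  // 32
console.log(text);          // 12data:image/png;base64,aGVsbG8=
```

The receiver can split on "data:" to recover the frame number and on "base64," to find the start of the image payload, which is what the server code below does.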

Set up the websocket server

You can use any websocket server you want. For this example I've used vert.x, since I've been playing around more and more with this great asynchronous server framework. Let's look directly at the (very easy and straightforward) code:

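        // These imports go at the top of the file; the rest of the code
        // lives inside the class, e.g. in a verticle's start() method.
        import java.io.File;
        import java.io.FileOutputStream;
        import java.io.IOException;
        import java.nio.charset.StandardCharsets;
        import org.vertx.java.core.Handler;
        import org.vertx.java.core.buffer.Buffer;
        import org.vertx.java.core.http.HttpServer;
        import org.vertx.java.core.http.ServerWebSocket;
        import sun.misc.BASE64Decoder;

        HttpServer server2 = vertx.createHttpServer().websocketHandler(new Handler<ServerWebSocket>() {
            @Override
            public void handle(ServerWebSocket ws) {
                ws.dataHandler(new Handler<Buffer>() {

                    @Override
                    public void handle(Buffer event) {
                        // crude filter: ignore anything too small to be a frame
                        if (event.length() > 100) {
                            byte[] bytes = event.getBytes(0, event.length());

                            // message layout: <frame number>data:image/png;base64,<payload>
                            String frame = new String(bytes, StandardCharsets.ISO_8859_1);
                            int frameNr = Integer.parseInt(frame.substring(0, frame.indexOf("data:")));
                            String frameData = frame.substring(frame.indexOf("base64,") + 7);
                            // sun.misc.BASE64Decoder is JDK-internal; on Java 8 and later
                            // java.util.Base64 is the supported alternative
                            BASE64Decoder decoder = new BASE64Decoder();
                            // the pngout directory must already exist
                            File f = new File("pngout/" + String.format("%08d", frameNr) + "-frame.png");
                            // try-with-resources closes the stream even if the write fails
                            try (FileOutputStream fOut = new FileOutputStream(f)) {
                                fOut.write(decoder.decodeBuffer(frameData));
                            } catch (IOException e) {
                                e.printStackTrace();
                            }
                        }
                    }
                });
            }
        });

        // allow websocket frames of up to 512000 bytes
        server2.setMaxWebSocketFrameSize(512000);
        server2.listen(8889);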

Pretty easy to follow. The handle method is called on each event that is received. Since I'm not interested in handshake events, I only handle events where we receive at least 100 bytes of data (there are other, better ways, but I went for a quick solution :). In the handle method I split the incoming data to get the number of the frame I'm working with and the actual base64-encoded data. From this data I create a byte array and store it in a file. This is done for all the frames that are received this way.
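One of those better ways could be to check the message format itself instead of its length. The following is only a sketch of that idea, written in JavaScript for Node.js to stay consistent with the client code; it is not the vert.x server code, and parseFrameMessage is a hypothetical name. It returns null for anything that does not look like a frame message.

```javascript
// Hypothetical Node.js counterpart of the parsing step in the Java handler.
// Expected layout: <frame number>data:image/png;base64,<payload>
function parseFrameMessage(message) {
    // latin1 maps each byte to one character, matching str2ab on the client
    var text = Buffer.from(message).toString('latin1');
    var prefixEnd = text.indexOf('data:');
    var marker = text.indexOf('base64,');
    if (prefixEnd <= 0 || marker < 0) {
        return null; // no frame number or no base64 payload
    }
    var frameNr = parseInt(text.substring(0, prefixEnd), 10);
    if (isNaN(frameNr)) {
        return null;
    }
    return {
        frameNr: frameNr,
        image: Buffer.from(text.substring(marker + 7), 'base64')
    };
}

// The payload here is the word "hello" in base64, not a real PNG.
var sample = Buffer.from('12data:image/png;base64,aGVsbG8=', 'latin1');
var parsed = parseFrameMessage(sample);
console.log(parsed.frameNr);                   // 12
console.log(parsed.image.toString('latin1'));  // hello
console.log(parseFrameMessage(Buffer.from('ping')));  // null
```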

One thing to note here is the setMaxWebSocketFrameSize function (which is available since vert.x 2.1M2). This sets the maximum amount of data this server can receive in a single websocket frame. Since we're sending over rather large captures, we need to increase this. If you don't do this, you get all kinds of strange error messages on the client side.
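To get a feeling for what a limit of 512000 bytes allows, here is a rough back-of-the-envelope sketch. It assumes each capture arrives as a single websocket frame and that the data URL starts with "data:image/png;base64," (PNG is the default output of toDataURL()); messageSize and maxPngBytes are hypothetical helper names. Base64 turns every 3 bytes into 4 characters, so the message is roughly a third larger than the PNG itself.

```javascript
var DATA_URL_PREFIX = 'data:image/png;base64,';

// Size in bytes of the message for a PNG of pngBytes bytes.
function messageSize(frameNr, pngBytes) {
    var prefix = String(frameNr).length + DATA_URL_PREFIX.length;
    var base64Length = 4 * Math.ceil(pngBytes / 3); // includes padding
    return prefix + base64Length;
}

// Largest PNG size in bytes whose message still fits within limit.
function maxPngBytes(frameNr, limit) {
    var prefix = String(frameNr).length + DATA_URL_PREFIX.length;
    return Math.floor((limit - prefix) / 4) * 3;
}

console.log(messageSize(1, 300000));  // 400023
console.log(maxPngBytes(1, 512000));  // 383982
```

So with this setting a capture may be about 375 KB as a PNG before the limit is hit; larger canvases or more detailed scenes would need a higher value.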

And the result

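Now you can create the output movie from the individual frames by running an ffmpeg command like this from the pngout directory:

    ffmpeg -r 60 -i %08d-frame.png -vcodec libx264 -vpre lossless_slow -threads 0 output.mp4

The -vpre option relies on the preset files that shipped with older ffmpeg builds; as far as I know, newer builds use -preset slow together with -crf 0 for a comparable lossless encode. One of the first intermediate results (ignore the crappy compression from YouTube) looks like this: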

And that's it.