Secure web applications – using JS to create a new web language

When you look around on security mailing lists you’ll probably an increase in security warnings relating to web applications… many of them based on JS code injected into a webpage.

This has lead to the uncomfortable situation where pages that are based on usercontent can not trust their users to provide JS as part of their submitted content. So now we can share video, audio and other passive media but anything interactive is out of the question.

What to do about it? The JS security system is entirely based on domain names and some providers have resorted to running all user js code on a seperate domain… but this again limits the usefulness of JS because it can only operate within the assigned iFrame. Others are trying to run the JS code through code analysis tools to find out if it is doing anything “forbidden”.

But who are we kidding? Blacklist attempts have never worked so far and the thing about web security is that even a single attack can leave data from dozens of apps exposed.

The alternative is quite simple, but to my best knowledge has never been tried: Implementing a second language in JS, running protected in a seperate sandbox, allowing only whitelisted calls and if necessary filtering the results. Is this possible? Certainly? Is it hard? Not as hard as one would imagine? Is it slow? Definately slower than true JS but still fast enough to be of use.

Let’s tackle these questions one by one:

Is it possible? Every language that can implement basic text parsing can implement it’s own parser… it’s really as simple as that. And it JS it’s even easier because we have a bunch of text processing tools like RegularExpressions that make parsing quite straight-forward and simple.

Is it hard? Not really… many of the requirements for the interpreted language can be mapped to native behaviour. For example: the garbage collector can work for the interpreted language as well if we map stacks and variables in the interpreted language back to native objects.

Is it slow? In order to answer this question we have to remember how code is usually stored in high level languages: The CodeDOM. The codedom is a simple, object-based tree structure where any number of atoms make up expressions. Once we have parsed the expressions into this DOM and inserted all implicit behaviour, executing code is really just a matter of walking this tree. So each interpreted operation means running the atom handler and following the tree. The atom handlers usually don’t change and can therefore be compiled by the JS handler and the jump to the next atom is just following a single reference. Combine that with the fact that we can replace known atom combinations with optimized functions and you’ll see that this is fast enough for the majority of simple web apps.

Just think about it what people could do if their apps were not restricted to their iFrames… youTubeOS? mySpace dynamic layouts? The sky would be the limit (That and the rules inserted into the interpreter… mySpace could opt to give users full access over the page’s elements, but not their ads and not the document and window elements).

If I had a Hammer OR Why RFID in passports is a really bad idea…

First things first: I actually do have a hammer and I know how to use it when it’s time to get my new RFID-enabled passport. It’s a fairly easy method to disable this ugly tracking device.

The more important question is why should I do it? Well there are a couple of reasons, so let’s make a list:

Let’s start with the basic problems of any encrypted data:

  1. I don’t want the state to identify me… sure they say the data is encrypted, but there was no way for officials to read it, then we wouldn’t have to carry it around… so the key is somewhere and let’s face it: If any part of our state has this key then it won’t take long until every single policestation or whatever has access to it.
  2. I don’t want others to identify me … if the key is available somewhere, then it won’t take long until it leaks out.

But are there other scenarios where the chip could reveal your presence. Even if the encryption was not compromised?
Hell yes. With RFID anybody can track you, even without the encryption key. This is by far the most interesting point. Lets assume for a moment that the data is stored 100% perecent secure and that the key is not available to anybody (I know, it’s difficult but let’s try). Then the chip is still sending out the encrypted data which may not be readable by itself, but it’s still a unique identifier. It says that person XY was last seen going to a bank, then going to a chemical supply firm and finally after a brief visit to Starbucks boarding a flight to Saudi Arabia (at least if there’s a RFID scanner at all these locations…. this probably isn’t the case now but it’s still a possiblity we’ll have to deal with). Maybe you can’t find out who person XY is, but you sure can find out what he’s been doing as XY has left the same digital fingerprint at all these locations. And if XY has used another identifier, let’s say a credit card, at at least two locations with an RFID scanner, we even know that this person is me.

Now this may all be very useful when trying to catch a criminal (eventhough it violates about every privacy law we’ve got), but this kind of information is available to anybody who can afford an RFID scanner. Let’s assume a group of stores agrees to exchange RFID information… not with any other authority, just among themselves. Sounds pretty harmless doesn’t it? But from this information alone, combined with the list of items bought while you were at the store and matched across multiple shopping sessions and some easy statistical analysis they’ll get something like this:

Usually around 1pm at store A, usually buys sweets, pizza, Coke and bathroom acessories. around 6pm either at store B or C. This is only a tiny bit of what they could derive but already they’d know where you live, where you work and what you buy, just like that.

And this would only be the “normal”, “marketing” way of analysing your data. Criminals are much more inventive…

I’m not asking you to do anything but think about it how your privacy gets a beating with RFID passports.

You’ve gotta love some quotes

Every once in a while, you come across a quote that is so absolutely stunning, you simply cannot believe that the person who wrote it was crazy enough to even put it in writing. This is just such a case.

From: Bill Gates
Sent: Saturday, December 5 1998
To: Bob Muglia, Jon DeVann, Steven Sinofsky
Subject : Office rendering

One thing we have got to change in our
strategy - allowing Office documents to
be rendered very well by other peoples
browsers is one of the most destructive
things we could do to the company.

We have to stop putting any effort into
this and make sure that Office documents
very well depends on PROPRIETARY IE
capabilities.
Anything else is suicide for our
platform. This is a case where Office
has to avoid doing something to destroy
Windows.

I would be glad to explain at a greater
length.
Likewise this love of DAV in Office /
Exchange is a huge problem. I would also
like to make sure people understand this
as well.

CSI – Client Side Includes

This has plagued me for the past few months and I simply couldn’t find any nice, simple, small solutions available. If your Javascript code consists of separate parts, you probably have them separated into any number of .js files… problem is: If these JS files, like any self-respecting library, want to include other JS files, then all you can do is write down it into a comment. Then, when you start a new project that would benefit from that library, you have to add the library itself via a script tag, look at the comments, find that library that your library needs, add it via a script tag, look at the comments of that library… you see what I’m getting at.

The alternative would be to let a PHP, or other script do that task… but that would either require you to move all your projects to your web directory or run that script every time you want to test your code. Additionally, when an error occurs during runtime, you won’t get the file and line number of your working files, but the one in the compiled file. Not very satisfactory.

Enter CSI – Client Side Includes.

CSI is something in between a library and a framework. It’s tiny (the core is <2k uncompressed) and requires only minimal modifications to your code.

For more details (as well as a demo) and the download, check here.

Smoothly fading slideshows and hover buttons

This was done per request and while the code is not really beautiful, it works just fine.

“Fade” is a small (8kb uncompressed, 2kb compressed), standalone Javascript library which provides smooth slideshows and hover buttons, while requiring only an absolute minimum on coding knowledge. Just set a few class attributes, copy the loader code and voilà. It even degrades gracefully if there’s no Javascript available.

See it in action, along with a short introduction here.

I’d love to hear some feedback, too. 🙂

(License is CC Attribution-ShareAlike 3.0)

Sam and Max

OK, so this post doesn’t have any useful information whatsoever. Big Deal? Well yes, because it’s got something so utterly useless that it stands out: A dog in a suit and a naked white rabbit.

More precisely, a guy in a dog suit wearing a suit carrying a rabbit.

In case you haven’t noticed, over at Telltale games, they’re developing a new video game series and well, these guys are simply nuts. Here’s their official report from Comic-Con. Watch it, enjoy it, vote for Max!

Spore – It’s not that scary!

I just finished reading an article over at Bona Fide Reviews and I simply can’t keep but smiling. You should read the actual article for yourself and then consider my reply. Enjoy:

OK, let’s start one by one. First of all we should clarify if we’re talking directly about programming or rather some greater logic that is the basic for the whole program. In this case this is almost certainly the logic behind the programming, not the actual language itself. In fact, it’s not even the logic behind the whole program: It’s the logic behind one subsystem, namely the model generation.

The point is that Spore simply adds a whole new level of abstraction for generating and maintaining models. Notice that the system itself is nothing new… it has been around since the first days of game programming. But maybe it’s easier to explain it using a well known example.

Consider the model generation in Half Life
You see a guy lifting his arm… what data is actually stored and which one is generated on-the-fly by Half Life, DirectX or your graphics card?

What you see is a two dimensional raster image, so does Half Life come with a image of that guy from every imaginable angle?
No, it doesn’t. This would either require a huge amount of memory or limit the number of angles. However, this was done by older games like Doom to save processing power… as a result no matter from which angle you would look at a monster, it was always one of four images (Left,Right,Front,Back).

So, we’ve established that the guy isn’t stored as a 2D raster image, so what is he made of? What data is processed into the image we see? Well, it is a vector image… an image where points and the areas between them, together with a number of parameters like the color of a area make up a description that can be used to generate a bitmap. But this doesn’t really solve our problem, there are still way too many angles.

We really have to abstract some more. What data can be used that when combined with a particular angle results in a 2D vector image? You’ve got it: a 3D vector image! We just spread the points along an additional third axes and suddenly we can generate a 2D vector image from every imaginable point. Hurray!

But wait, the guy is lifting his arm! Uh… do we have to store a 3D vector image for every single point in time while he’s lifting his arm? Well… this was done by the game on which Half Life is based (Quake) and it wasn’t a very good system… In order to keep the size of all these models down, they had to remove precission resulting in a very strange effect where parts of the model would appear to change shape. Luckily, there’s an alternative! We can use yet another form of abstraction! We simply divide the data up again, in this case in the static 3D vector image and a skeleton that defines which parts of the 3D vector image belong to which bone. So now we just have to store the 3D model of the skeleton and in order to for example lift his arm, we simply move the arm bone, and all points in the 3D vector image that belong to the arm bone will move the same way.

But this isn’t enough! There may be hundreds of guy in the game and most of them lift their arm in a similar way. Since we separated the skeletal animation data from the 3D vector image, we can use it for all guys, not just this one!

But we still have to define all individual states, so let’s abstract some more! We can simply define a number of points along which the bone should move, and voila, we only have to save two or three states and interpolate between them.

———–

This is where it ends for Half Life. But wait! There’s more!

Unreal was one of the first games to dynamically generate images to be used as textures for the areas in a 3D vector image: They would for example specify a image and then generate a whole series of images where water was flowing across it.

Quake3 brought parametric skeletal animation to the masses. There would only be one animation for walking which was modified to make it appear as if different characters walked differently. Some would spread their legs more, others less and so on.

Others, like the famous N64 game Waveracer, would generate models for a whole see using a set of rules that formed a simplistic physics model.

And finally we see games like World of Warcraft creating a near infinite number of models by combining parts of different models.

———–

But there’s one gap. We always end up with a static model that’s being deformed in various ways. And this is where Sporn offers (or appears to offer) something new. We can create arbitary models from a simple model (in this case a ball). In fact it’s not new at all. We’re simply adding better deforming algorithms and give them more freedom in that we not only allow them to not only move existing points, but also add new ones. We add routines that we’ve developed to simulate physics to make it appear a if we were actually sculpting. Then we apply all the techniques mentioned before.

It’s a great example of evolution, but it’s not a revolution.

xmlTree – minimalistic XML processor for PHP

xmlTree is a tiny, little XML processor for PHP that doesn’t even offer 10% of what the XML spec has to offer, but still manages to do almost everything a normal developer needs (and besides, it’s easy to extend). It’s meant for all those people that

  • Hate to use code that they cannot possibly understand
  • Prefer small libraries
  • Don’t expect the XML Library to validate their data
  • Don’t need the XML library to handle data that’s not trustworthy
  • Don’t need automatic character conversion, like ” to \” or <>

So, it’s really quite minimalistic. It was initially written to store configuration data for another program, but ended but managing quite a bit of the HTML code too.

Essentially, it provides the following features:

  • An XMLElement class that provides parentNode, childNodes, tagName and attributes. It also provides a value for text nodes
  • A Javascript like DOM manipulation system, featuring such gems as appendChild, setAttribute, getAttribute, removeChild, getElementsByTagName, getElementsByName,getElementById, toXML and toFile, all of which (with the exception of the last two methods) behave almost exactly like their Javascript counterparts.
  • An XMLDocument class with html, head, title and body (nothing special, but it makes life easier)
  • An XMLParser class that turns an XML string into a XMLElement tree.

You can find the source, along with some more notes here. Oh, and it’s GPL too.

Canvas 3D Renderer updated

I’ve updates the Canvas 3D Renderer with new sample data, a few performance tweaks, background image support, support for colored polygons and a few tweaks here and there.

Note however that the car you see is not a sprite, it’s simply an image put in front of the canvas for my amusement. So don’t get confused if it doesn’t behave the way you expect it to do.

Right now the sample data has around 200 faces and while the code is almost ready for shared vertices (which means that one vertex belongs to many faces, resulting in a much lower number of vertices in memory and therefore a lot less calculations) right now a face still consists of 3 vertices, meaning that it does about 500-600 rotations, projections,clipping and collision tests per frame update, with very few optimizations so far and for that the speed is (at least in my opinion) amazing.

What’s interesting to me are the test results… I didn’t time them, so I can’t give you any numbers, but
Opera 9.1 is definitely the slowest one, but with a very steady framerate, which probably means that drawing and garbage collection are very fast (as these tend to take up a variable amount of time), but arithmetic is slow.

Firefox 2 is pretty steady as well, and a lot faster than Opera.

Firefox 3 Alpha is certainly the fastest browser, but with a very unsteady framerate. I guess the new garbage collector is causing this while drawing speed is increasing thanks to Cairo.

You can still find it at http://tapper-ware.net/canvas3d/ (Note that the background gets drawn during the first screen update after the image has loaded, so it will probably take a second or two to appear the first time you load the page. Screen updates happen whenever you move around).

P.S. I’d appreciate it if you could send me mail if you want me to answer your comments. hansschmucker at gmail dot com