Everybody likes XML. Everybody, but me.

Nowadays, it seems the vast majority thinks XML is the bestestestest format ever on this and any other planet, and they really use it to serialize anything, no matter how perverted it actually may be.

It appears to be one of the first standards for the task of human readable representation of complex data, that gained popularity. Its strength does not come from its design, but from the fact, that it is a standard, and that there’s been some thought put into it, unlike many home-brew serialization you and I come across every now and then. But really, that’s it. Being reasonably good in a field at a time where there were no alternatives, doesn’t mean, it’s still to be considered good.

Personally, I do not like XML

  1. It is verbous, redundant and huge in size. The XML closing tag is the most stupid invention ever. At any point, where a closing tag may occur, it is completely determined. It doesn’t carry any information the string <//> wouldn’t carry. But no, you have to type it in.
  2. It is error-prone. The above problem (missing/misspelled closing tags) is the problem I run into most of the time, as soon as I let people edit the XMLs (which is the purpose of human readable formats). In proper markup languages, this is a pure syntax error.
  3. It has no built-in support for numerical and boolean values. These values can only be included using string representations, which means you need a contract on top of the XML standard, stating how to represent them. Is a bool true | false? TRUE | FALSE? 1|0? How about 1.12+10? Is that a Float? 1.12.2010 is not (In German and other languages, this denotes the date 2010/12/1), although you realize that only half way through, but you can’t possibly try parsing all possible data types and see which one fits the best.
  4. It’s semantics differ A LOT from the object model of about any decent language. At data level, objects have properties. Each property has a value, that’s either primitive, complex or a collection. An XML-node has attributes and children. These concepts are completely different. Sometimes, properties are represented as attributes, but that doesn’t work for complex values. It is hard to say, whether a child node represents a property, or whether it is possibly the only entry of a list, which is the actual property of the represented object.

XML is of use, but by far not the universal tool everybody believes it to be. In order for XML to be usable in as many contexts as possible, it is completely misused. SVG paths are the best example. XML does not capture the information, that the path represented is not just a string. It is not a flat attribute, such as hairColor=”black”, but XML itself provides no way to tell that.

Widely spread alternatives are JSON and YAML, the latter still being quite exotic, while at the same time being very expressive and containing the former as a subset. JSON could represent the SVG path information, that is tucked into a single string in XML, as what it is: an array (i.e. actually a list, but fair enough). It literally means JavaScript Object Notation and focuses on representing objects, while the eXtensible Markup Language focuses on extensibility, blatantly failing at the most obvious tasks.

XML done right, using schemas and within certain contexts can resolve a lot of ambiguities, but then again this makes XML even more complex and more verbose.

Actually, there is nothing, XML can do, you cannot do better in a number of other established human readable or binary serialization formats. At the end of the day, the only reason to use XML is, that many services and tools you will encounter and want to integrate, use XML. Other than that, XML just sucks.

, , , ,

4 Comments

Howto: Endless Universe …

Recently, there was a question on stackoverflow.com, about how to optimize a 2D-game in flash, where there are many stars in the background, which move relatively to a ship that is in the center of the screen. The naive approach is to put all objects into one container and move it around. The thing is, flash is really not very good at clipping,  so the whole thing stops working properly very quickly.

I myself started to write a little engine (spawn stars (s), unpause (SPACE) and navigate with your mouse) for the same reason a while ago. It is just a proof of concept, and the API is terrifying, and so on, and so forth, which is why I am not willing to release it yet, but when I have time for a rewrite, I will give it a shot again. I tried to explain the basics of the idea behind it, but it seems my explanation was too superficial, which is why I decided to make a post about it.

Let’s look at the whole thing in 1 dimension. The approach is the same, but it’s a little easier to explain and to imagine.
So we have an awful amount of stars, and we want to do something with them.

To clarify our problem:
We have an unlimited set of objects randomly distributed over an unlimited interval.
We need a good method of rendering all objects within a chosen interval (“the part of the universe we currently can see”).

universe

The idea is to somehow put them into a tree, so you can quickly find out which are to be displayed and which aren’t.

Update: please note, that if the size of the visible region is both constant and known beforehand, it is much easier to subdivide “space” into a grid, where cell size is equal to the visible region’s size.

The following should explain to you, how the tree aproach works.

Tree Structure

This tree consists of nodes, leafs and a root. Fot the lack of a better term, I will call both nodes and leafs containers.
A container always represents a given region, which in the case means, it covers an interval.

A leaf contains a maximum of m objects of our universe.  m should be reasonably big (somewhere between 50-2000).
A node contains n adjoining containers, (which all cover equal but disjunct intervals, the union of which is the overall interval represented by the node). n=2 actually worked best for me.
The root is a very special node, which can have an unlimited amount of containers. It is the represantion of space in our universe. In the beginning it is absolutely empty. All children of the root cover intervals of the same size s and do not intersect.
Please note that a node always contains a constant number of adjoining containers, whereas the root may contain completely randomly and sparsely distributed containers. For that reason, an array can be used to reference the children of a node. For the root, you should either use  vectors, or inthashes in the case of sparse universes.

In the end, it’ll look something like this:

tree

Tree Insertion

When a new element is inserted into the tree, you look in the root, trying to find a container covering the interval corresponding to that element. If there is none, a leaf is created for that interval, and then insertion is carried out.

When inserting an object into a container, there are two possibilities:
1. it’s a node. In that case, the element is inserted into the child container covering the according interval.
2. it’s a leaf. In that case, the element is inserted. If the maximum threshhold m is exceeded, the leaf is split into a new node with n leafs, redestributing its children accordingly.

Rendering

Our input is a given viewport interval, and our output should be a set of elements, that are to be rendered, and the according screen positions they should be rendered to.

So everything starts with finding the containers intersecting our viewport interval in the root.
To find the visible elements in a node, we look for visible for elements in those child containers, that intersect the interval we are searching in. To find the visible elements in a leaf, we’ll simply check for each element, whether it is in the interval or not. This costs O(m), which in the end is O(1) since m is constant.
The rendering position is the element’s coordinate minus the lower of the viewport. With p being the overall size of our universe, we will get an average cost of about O(log(p)) for the whole thing.

render

Since we need to render again and again, we need to do a little more work. We need to keep track of any visible elements and containers, which can be easily done by flagging.
At node level, we use this information as follows. We look at all children. If they are flagged visible, but no longer on screen, we hide them and any children flagged visible recursively. For leafs, this works very similarly. If it’s flagged visible, but off screen, we hide it, if it is flagged invisble, and on screen, we show it, and if it is flagged visible and on screen, we update it.

In 2D, the solution is absolutely the same, except that intervals become squares, objects have 2 coordinates and subdivision of nodes is done with n* n child containers.

Well, one day I may have the time to implement this myself. Until then, I wish you all good luck, and keep me up to date with further optimizations … 😉

, , ,

Leave a comment

ActionScript 3 Iteration

Recently I stumbled upon a question on stackoverflow.com, where someone wanted to know about the quickest way to iterate over an Array. In response, someone else pulled out a benchmark for .NET, which showed, for loops would be faster. But i was quite sure, this is not at all the case for AVM2. So I did a little research and expanded my quest to examine iteration as such, on any native AVM2 objects, that are suitable for collections. Apart from Array, Object and Dictionary, this includes Vector, which is only available for Flash Player 10. Both iteration methods provide key and value. Although it does not really make sense to use Object and Dictionary, if they contain dense numerical keys. In the end, it looks like so:

//for loop
for (key = 0; key < size; key++) value = iterable[key];
//for each loop
key = 0;
for each (value in iterable) key++;

You can grab the whole source >here<

I found, that for each loops are more than two times faster, sometimes even significantly. I also found, that speed depends on the type of the variable the collection is stored into. So here some numbers (tested on Debug FlashPlayer 10.0 r22 for Windows XP, on a Core2Duo with 2Ghz):

testing Vector as Vector.<int>
200 repetitions with collections of size 500000
	> for loops needed 48.595 msecs
	> for each loops needed 19.11 msecs
	> factor: 2.5429094714809

testing Vector as *
200 repetitions with collections of size 500000
	> for loops needed 54.65 msecs
	> for each loops needed 16.125 msecs
	> factor: 3.3891472868217054

testing Vector as Object
200 repetitions with collections of size 500000
	> for loops needed 54.44 msecs
	> for each loops needed 16.335 msecs
	> factor: 3.332721150902969

testing Array as Array
200 repetitions with collections of size 500000
	> for loops needed 50.335 msecs
	> for each loops needed 15.46 msecs
	> factor: 3.2558214747736094

testing Array as *
200 repetitions with collections of size 500000
	> for loops needed 54.19 msecs
	> for each loops needed 15.455 msecs
	> factor: 3.506308637981236

testing Array as Object
200 repetitions with collections of size 500000
	> for loops needed 54.315 msecs
	> for each loops needed 15.335 msecs
	> factor: 3.5418976198239323

testing Dictionary as Dictionary
200 repetitions with collections of size 500000
	> for loops needed 61.17 msecs
	> for each loops needed 24.16 msecs
	> factor: 2.5318708609271523

testing Dictionary as *
200 repetitions with collections of size 500000
	> for loops needed 62.395 msecs
	> for each loops needed 24.205 msecs
	> factor: 2.577773187357984

testing Dictionary as Object
200 repetitions with collections of size 500000
	> for loops needed 62.155 msecs
	> for each loops needed 23.91 msecs
	> factor: 2.599539941447093

testing Object as *
200 repetitions with collections of size 500000
	> for loops needed 64.125 msecs
	> for each loops needed 26.35 msecs
	> factor: 2.433586337760911

testing Object as Object
200 repetitions with collections of size 500000
	> for loops needed 64.09 msecs
	> for each loops needed 26.245 msecs
	> factor: 2.4419889502762433

now some explenations, where this comes from:

  • to people not from the ECMA-world: Objects, i.e. instances of the class Object, are simply hashes, if you will. someObject.someProperty and someObject[“someProperty”] are equivalent … thus array access and property access are the same.There is not a lot of difference between Arrays and Objects. Except that Arrays handle numerical keys a little differently, that is, they maintain an order, and expose a length, as well as Array manipulation functions, and now new in AS3, iteration functions. Array do have a sweet spot, performancewise, when they are dense and numerical. Then, they are faster than Objects, when it comes to array access.
  • consider a for each loop. this is some runtime internal magic, written C or C++, which runs considerably fast and retrieves the value, while you calculate the key with a simple incrementation. for the for loop in turn, you need the incrementation, which is not costy, and you need to evaluate the condition, which is AVM2 bytecode, but still ok, and to retrieve the key, you need an array access, which also consist of executing the opcodes, and the implementation of the array access itself. The Array in ActionScript is not just a block of references in memory. It’s some weird multipurpose collection, with complicated access routines, that are all encapsuleted in the array access. now for each iteration does not necessarily preserve order, only for Vectors, and Arrays, but it does not rely on the ambiguous array access, since it comes from the runtime.

it seemed a little suprising to me, that there is a performance difference between * and Object. This seemed really strange. Also, since there is no rule for that. Accessing through the exact type is faster, as you can see. This is, because there are 5 different array accesses in flash. For Object, Dictionary, Array, Vector and Proxy. If the variable is typed, the compiler probably uses this information to hardwire the right array acces, instead of looking it up at runtime. Just a guess, though. One last note: if you create collections, subclassing Proxy, then simple for loops are much faster, since the require only one call to the proxy per step.

, , , , ,

1 Comment