Posts Tagged performance
Recently, there was a question on stackoverflow.com, about how to optimize a 2D-game in flash, where there are many stars in the background, which move relatively to a ship that is in the center of the screen. The naive approach is to put all objects into one container and move it around. The thing is, flash is really not very good at clipping, so the whole thing stops working properly very quickly.
I myself started to write a little engine (spawn stars (s), unpause (SPACE) and navigate with your mouse) for the same reason a while ago. It is just a proof of concept, and the API is terrifying, and so on, and so forth, which is why I am not willing to release it yet, but when I have time for a rewrite, I will give it a shot again. I tried to explain the basics of the idea behind it, but it seems my explanation was too superficial, which is why I decided to make a post about it.
Let’s look at the whole thing in 1 dimension. The approach is the same, but it’s a little easier to explain and to imagine.
So we have an awful amount of stars, and we want to do something with them.
To clarify our problem:
We have an unlimited set of objects randomly distributed over an unlimited interval.
We need a good method of rendering all objects within a chosen interval (“the part of the universe we currently can see”).
The idea is to somehow put them into a tree, so you can quickly find out which are to be displayed and which aren’t.
Update: please note, that if the size of the visible region is both constant and known beforehand, it is much easier to subdivide “space” into a grid, where cell size is equal to the visible region’s size.
The following should explain to you, how the tree aproach works.
This tree consists of nodes, leafs and a root. Fot the lack of a better term, I will call both nodes and leafs containers.
A container always represents a given region, which in the case means, it covers an interval.
A leaf contains a maximum of m objects of our universe. m should be reasonably big (somewhere between 50-2000).
A node contains n adjoining containers, (which all cover equal but disjunct intervals, the union of which is the overall interval represented by the node). n=2 actually worked best for me.
The root is a very special node, which can have an unlimited amount of containers. It is the represantion of space in our universe. In the beginning it is absolutely empty. All children of the root cover intervals of the same size s and do not intersect.
Please note that a node always contains a constant number of adjoining containers, whereas the root may contain completely randomly and sparsely distributed containers. For that reason, an array can be used to reference the children of a node. For the root, you should either use vectors, or inthashes in the case of sparse universes.
In the end, it’ll look something like this:
When a new element is inserted into the tree, you look in the root, trying to find a container covering the interval corresponding to that element. If there is none, a leaf is created for that interval, and then insertion is carried out.
When inserting an object into a container, there are two possibilities:
1. it’s a node. In that case, the element is inserted into the child container covering the according interval.
2. it’s a leaf. In that case, the element is inserted. If the maximum threshhold m is exceeded, the leaf is split into a new node with n leafs, redestributing its children accordingly.
Our input is a given viewport interval, and our output should be a set of elements, that are to be rendered, and the according screen positions they should be rendered to.
So everything starts with finding the containers intersecting our viewport interval in the root.
To find the visible elements in a node, we look for visible for elements in those child containers, that intersect the interval we are searching in. To find the visible elements in a leaf, we’ll simply check for each element, whether it is in the interval or not. This costs O(m), which in the end is O(1) since m is constant.
The rendering position is the element’s coordinate minus the lower of the viewport. With p being the overall size of our universe, we will get an average cost of about O(log(p)) for the whole thing.
Since we need to render again and again, we need to do a little more work. We need to keep track of any visible elements and containers, which can be easily done by flagging.
At node level, we use this information as follows. We look at all children. If they are flagged visible, but no longer on screen, we hide them and any children flagged visible recursively. For leafs, this works very similarly. If it’s flagged visible, but off screen, we hide it, if it is flagged invisble, and on screen, we show it, and if it is flagged visible and on screen, we update it.
In 2D, the solution is absolutely the same, except that intervals become squares, objects have 2 coordinates and subdivision of nodes is done with n* n child containers.
Well, one day I may have the time to implement this myself. Until then, I wish you all good luck, and keep me up to date with further optimizations … 😉