back2dos


Don’t bother obfuscating code.

Posted in Thoughts On Programming on March 23, 2012

Code obfuscation is rather pointless. To understand that, let’s first examine the reasons why you want to obfuscate your code in the first place:

Security by obscurity.
“To protect your intellectual property” as people normally put it.

Both reasons are understandable, but if either of them is what you’re trying to achieve, then code obfuscation will not really cut it, except in a few select cases.

From an evil villain’s perspective, code obfuscation isn’t the hindrance you would like it to be, for a number of reasons:

Your actual source code is rarely needed to try to crack your software or to “steal” it. In most cases, your software can be decompiled or disassembled into some sort of source code. It’s not going to be pretty, but it’s going to be good enough to run it in a debugger and find suitable entry points to hook into for whatever purpose. You’re “safe” if your architecture is just a big ball of mud, but in that case you don’t have much worth protecting anyway.
With security by obscurity, you usually end up shooting yourself in the foot. You have no way to actually guarantee security. You can only hope that no one finds vulnerabilities, or that you are lucky enough to become aware of attackers and close the loopholes they were using. Most of the time, you will do much better using open, well-established and field-tested security mechanisms and guidelines. These are constantly driven forward by a large number of very smart people. This doesn’t (necessarily) apply if you work in defense or banking and have major funding to develop/buy reliable proprietary security mechanisms. But you very probably don’t.
When you look at software as “intellectual property”, protecting your source code won’t protect the software. Designing great software with a good user experience is a very hard, resource-intensive, time-consuming process that involves a lot of learning about your actual problem domain. The real value of your software lies in that design. The effective value of the product that this software constitutes lies in how well it is marketed. You have gone through this whole process of iterating a great design and actually generating/defining the demand for a product such as yours. Reverse-engineering your implementation is far easier. Selling a product to meet an existing demand is far easier. A competitor just adds some crappy extra features, cuts the price in half, and people are going to fall for it.

The kind of paranoia that leads you to want obfuscation is bad for your creativity, because it pushes you into a defensive mental stance. This will harm you, because frankly, you get the best ideas when you feel free, safe, confident and positive, and when you are undistracted. Implementing obfuscation also consumes a lot of time and energy, often yields bigger/slower results and might even impose restrictions (some obfuscation methods do not work with reflection). If you really want to stay ahead of your competition, you are going to need that creativity, that positivity, that time and that energy to focus on improving your design and on selling your product, which works a lot better with a smile on your face.

So, on the contrary: you should consider open-sourcing and GPL-ing parts of your software. If those parts are really worth using, you will get peer reviews for free, marketing for free, and an angry mob of free software lobbyists up your sleeve if your competition decides to use your code but not to publish their enhancements. Materials, training and support might in fact also present a new revenue stream.

code obfuscation, intellectual property, software value


Releasing the Minions

Posted in Projects on May 16, 2011

Last Friday I was made aware of the fact that Flash banner programming still relies on ActionScript 2 these days.

That’s presumably due to the extra 0.05% of users you can reach, who have had the same computer for 3 years and never cared to upgrade their Flash Player in all that time. Exactly the kind of people who will spend gazillions on the product advertised.

Anyway, since you can’t really alter industry superstitions, you just have to deal with it. Currently TweenNano is the best thing out there for AS2, if you want to keep things small. However, TweenNano is really quite poor in features. It lacks chaining and many other things you’d normally want. All that at 2KB for AS2.

While I do approve of the work of the greensock team and especially the effort to maintain an AS2 port of their tweening suite, I think this is one of the major problems of TweenNano for AS2: It’s a port from AS3. It is therefore unnecessarily clumsy.

Personally, when it comes to AS3 and tweening, I am a HUGE fan of eaze, which was created by Philippe (one of the creators of FlashDevelop). Why? Well it keeps its own promises: “Eaze Tween: smart, fast, chainable and compact Flash AS3 tweening library”

What more could you possibly want? That’s right: An alternative for AS2.

This is why I decided to create Minion, as in “mini animation library”. It’s still a little raw, but it packs quite a punch at only 1.5KB file size (for the core engine) and it puts the fun back into tweening, if not for its features, then for its evil nature: Just imagine you’re an evil overlord and your animations are carried out by a bunch of servile creatures. Muahaha!!

Have fun with it 😉

actionscript2, animation, announcement, as2, lightweight, release, tween


Stop Singleton Abuse!

Posted in Thoughts On Programming on April 8, 2011

There is a general misconception about how Singletons should be used.

Possibly because, if you must have global state, it is indeed better to use Singletons than plain global variables or class objects. However, I quite often encounter the claim that Singletons are intended to provide global state. I have argued against this on numerous occasions, which is why I will put it here once and for all.

Singletons do NOT justify global state!

Some people see the Singleton as a justification for global state, along the lines of “If there’s a pattern for it, it must be good”. Well no, it isn’t. Global state is considered harmful, for a number of reasons that even Singleton misuse won’t make go away, simply because:

Singletons are NOT intended to provide global state!

The Singleton is a creational pattern. It is used to enforce that a class is instantiated only once. What it basically does is give control over instantiation back to the programmer. This is what you sometimes need in languages with classical constructors (Java, C++ and such).

So while the Singleton is often used to replace someGlobalExpression with SomeClass.getInstance(), it is actually intended to replace new SomeClass().

It is evident from the name that the method is supposed to return an instance. Whether it returns the same (and thus global) instance again and again is an implementation detail. Thus using a Singleton for global access actually violates that encapsulation and unnecessarily ties clients to the implementation. And if at some point you decide to replace the implementation with a Multiton or a Pool, a lot of code could break (because the assumption that the method always returns the same instance no longer holds).

You should ultimately think of the Singleton as a special case of a Factory. Just imagine that the static getInstance method were called createInstance, and you’ll be using it just the right way.
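
To make the factory reading a bit more tangible, here is a minimal sketch in Python (not the language of the examples above, just a compact one). The names Connection, PooledConnection and run_query are made up for illustration. The client only assumes that it gets an instance, so swapping the Singleton for a small pool does not break it.

```python
class Connection:
    """Hypothetical resource whose instantiation we want to control."""

    _instance = None  # implementation detail: the one cached instance

    def __init__(self):
        self.queries = 0

    @classmethod
    def get_instance(cls):
        # Stands in for calling the constructor: the class decides what to hand out.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance


class PooledConnection(Connection):
    """Same entry point, but backed by a round-robin pool of two instances."""

    _pool = []
    _turn = 0

    @classmethod
    def get_instance(cls):
        if len(cls._pool) < 2:
            cls._pool.append(cls())
            return cls._pool[-1]
        cls._turn = (cls._turn + 1) % 2
        return cls._pool[cls._turn]


def run_query(provider):
    # A well-behaved client: it asks for *an* instance and uses it,
    # without assuming it is the same object it got last time.
    conn = provider.get_instance()
    conn.queries += 1
    return conn
```

Code that instead compares instances for identity, or stashes state on "the" instance for other modules to pick up, is exactly the code that breaks once PooledConnection is dropped in.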

design patterns, global state, instantiation, singleton


Abstraction

Posted in Thoughts On Programming on December 16, 2010

Once again, I stumbled upon the word “abstraction”, as in a recent question on programmers.stackexchange.com. Often times I see this word misunderstood and some of the answers provided embody the most common misunderstanding.

Common Misunderstanding

In programming, abstraction is often understood as generalization, i.e. generalization of a specific component.

A component has a specific design based on a number of assumptions about the problem it should solve. The straightforward and popular way of generalizing is to reduce those assumptions, leading to a more general problem, for which we can provide the design (and implementation) of a (possibly) partial solution. Java has the keyword abstract just for this, although what it actually denotes is classes and methods that do not work due to a lack of concrete context.

For example, let’s take a component called HTTPService. We “generalize” this by removing the assumption that we use HTTP for communication. We design a new class called ServiceBase, which provides nearly all the functionality of HTTPService, but doesn’t make the assumption of HTTP being used. As such, this class is useless, but by subclassing it while adding the assumption that HTTP can be used, we obtain our HTTPService. So ServiceBase is a partial solution for a more general problem. This is not too bad. But it’s not abstraction. It is what I would call explicit generalization.
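
As a rough sketch of what such explicit generalization looks like in code (Python here, with invented method names and a stubbed transport instead of a real HTTP request):

```python
from abc import ABC, abstractmethod


class ServiceBase(ABC):
    """Everything HTTPService did, minus the assumption that HTTP is used."""

    def call(self, method: str) -> str:
        return self._transmit("call " + method)

    @abstractmethod
    def _transmit(self, payload: str) -> str:
        """The missing concrete context: how the payload travels."""


class HTTPService(ServiceBase):
    """Adds the HTTP assumption back (stubbed; no real request is made)."""

    def _transmit(self, payload: str) -> str:
        return "HTTP 200: " + payload
```

Trying to instantiate ServiceBase directly raises a TypeError, which is the point made above: on its own, the generalized class cannot do anything.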

Abstraction in Human Perception

I know, this is gonna knock you off your feet: Abstraction comes from Latin :D.

To abstract (abs – from, trahere – pull, draw) an object means to draw something from that object. That something is its essence. The abstraction of an object is what is essential about it. Abstract art, for example, undertakes the effort of trying to visualize the essence of an object, bypassing its visual appearance. Sometimes this just really WTFs you, but maybe this will enlighten you:

Picasso - Le Taureau

This starts with a depiction of what a bull looks like and ends as the essence of a bull, as Picasso sees it: Big, 4 legs, 2 horns, a tail and genitalia. And you surely agree, one can easily recognize what it is. This is abstraction.

What these few lines represent can engage in a bull fight or reproduce. Our ServiceBase-component, which many would call AbstractService (in fact I know at least one framework with a class named exactly like that), cannot do anything. That is why this is not abstraction.

In our everyday life, we deal with abstractions. We usually deal with the essence of things, not with the complications of their concrete nature. When we say mouse, the essence is basically that pointer moving on the screen “clicking” things. We do not even really think much of the physical device or the mechanical, optical and electronic effects at work. We could possibly decompose the mouse’s essence into moving and clicking. But nobody would ever build a mouse that only moves or only clicks, or that is just an empty piece of plastic with a USB cable coming out at the front, and then claim to have built an abstract mouse that provides a partial solution to the generalization of the problem mice can solve.

Abstraction is about perspective. About the way we see things. Consider a mouse in a computer game.

We see the mouse as a source of movement information, which is used to control the camera.
We see the mouse as a source of discrete signals (buttons), which is used to fire a weapon or more generally for interaction with the environment.

However, on a motion-sensitive device, I can use motion for camera control, and on others I can use speech input for interaction. I can replace the mouse as a whole by devices that sum up to its abstraction. I can just as well replace one role of the mouse, e.g. moving, by a different device, but this doesn’t mean I rip out the mouse ball or tape over the laser.

Abstraction in Programming

Abstraction in programming is about understanding the essence of an object within a given context. When we abstract the HTTPService, we have an object that performs whatever “service” actually means here (probably some (possibly stateful) application layer protocol) using HTTP to do the job. If our architecture is good, then nearly every part of the application concerned with the HTTPService is only concerned with the service part of it. So a Service here is not a particular object or class, but the concept of a service, which is best reflected in an interface (as it is called in most languages). Abstraction means that all parts of our application that rely on this and only this functionality of the HTTPService rely exactly on this essential aspect, naturally an interface named IService. The process of abstraction means ensuring that components do not depend on other concrete components but on mere abstractions of those.

It is important to understand that by abstraction we do achieve generalization, only in a different way than through “explicit generalization”. Here, all components depending on the abstraction are implicitly generalized. This has nothing to do with how the concrete implementation of the abstracted component is implemented. It could be the worst spaghetti code ever on earth. It doesn’t matter, because you do not perceive it. In fact HTTPService shouldn’t even exist. There should just be a Service that executes its communication through an implementor of IConnection, HTTPConnection being one of them. But this design flaw doesn’t affect the rest of the application, because it is abstracted away.
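
A small sketch of that arrangement, again in Python and under assumptions: IService and IConnection are modeled as typing.Protocol classes, the method names call and send are invented, and FakeHTTPConnection is a stand-in that performs no network I/O.

```python
from typing import Protocol


class IConnection(Protocol):
    def send(self, payload: str) -> str: ...


class IService(Protocol):
    def call(self, method: str) -> str: ...


class FakeHTTPConnection:
    """Stand-in for an HTTPConnection; performs no real network I/O."""

    def send(self, payload: str) -> str:
        return "HTTP 200: " + payload


class Service:
    """Does the 'service' part; delegates transport to any IConnection."""

    def __init__(self, connection: IConnection) -> None:
        self._connection = connection

    def call(self, method: str) -> str:
        return self._connection.send("call " + method)


def load_profile(service: IService) -> str:
    # The client sees only the essence it needs: something it can call.
    return service.call("getProfile")
```

load_profile never learns whether HTTP is involved, so every implementor of IService works with it unchanged. That is the implicit generalization described above.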

Abstraction is mistaken not only for generalization, but also for encapsulation. Yet these two are the orthogonal parts of information hiding: The service module decides what it is willing to show and the client module decides what it is willing to see. Encapsulation is the first part and abstraction the latter. Only both together constitute full information hiding.

The dependency inversion principle is all about the importance of the less popular part of information hiding. Assuming you had the right low level components and the right high level frameworks, programming is “only” about writing adapters to stick the first into the second. Usually, the best you get is a neat component library, tightened up in a nice facade, but sooner or later you discover you need to both do modifications beyond the facade and write an adapter layer anyway.

If there is something I would like you to take with you, it is, that abstraction is a good thing, and that the right place for an abstraction is in a given context. Trying to anticipate such contexts, i.e. designing a component for specific uses is a very good thing too. But try to keep it sensible. KISS, because otherwise you’ll probably overengineer things.

abstraction, design


Free your mind from boxes

Posted in Thoughts On Programming on October 8, 2010

DISCLAIMER: I’d like to point out that the issue I am addressing here transcends the field of software development, but this would drastically go beyond the scope of what I intend this blog to cover.

Human thinking tends to revolve a lot around boxes. Why? Boxes are useful. During everyday life, we constantly bump into objects of our perception and we tend to put these into boxes. There are two reasons:

The most important “feature” of boxes is, that we can (mentally) carry around multiple objects in just one box. We can say “I like classical music”, instead of iterating all classical pieces and saying we like them.

Up to a certain point, boxes are even a boost to everyday perception. If there is an item a, and a box A with many items that share common features with a and one another, we can put a into A. We tend to infer from that, without actually inspecting a, that it is likely to also share features of other items from A. The nice word for this is “induction”, the common word is “prejudice”. It is a way of drawing conclusions that is very efficient, because it doesn’t require a thorough study of the subject, but not without risk, because it quite often yields poor conclusions.

Nonetheless, the main point is that boxes are useful, because we can put objects into them, which helps us deal with the enormous complexity of the world we live in. I suppose we all agree that, at the bottom line, the purpose of boxes is to put objects into them. I also suppose we’d agree this doesn’t imply the purpose of objects is to be put into boxes. Yet we often behave like that. We like boxes. We can label them and then put objects into them, saving us the work of labeling all items individually.

It is my belief, however, that for any kind of task that requires problem solving, having boxes is not the most important quality. Having boxes is good. Nowadays it is a popular opinion that thinking outside the box is the new “shizzle” and will solve all your problems. I don’t think it will. It will perform just as poorly as thinking inside the box, because either way, the box is the limit.

The point is to think past the box. The box is a support, a tool, and as such should not be in your way. The actual “shizzle” is the ability to build boxes and to choose clever labels. It is a more or less mechanical task to sort a set of items into a given set of boxes. The stroke of genius is choosing a certain way to box things. Thus the important quality is the art of “meta-boxing”, the tool that transcends boxes: understanding why certain boxes are chosen the way they are, and being able to replace or add boxes as the problem one is solving evolves.

The actual art of problem solving consists of looking at the problem in search of the right perspective, until this sweet moment comes, where all parts fall into their places and the solution becomes obvious. You’re less likely to find a good solution if you try to use the same approach you used on an old problem you consider similar. Probably it even isn’t, but your perspective is so poorly chosen, it looks similar. You’re also quite unlikely to succeed, if you try hard not to draw knowledge from problems you’ve already solved. The word is “why” and the task is to understand why you succeeded in solving other problems. Why your solution worked. To understand how you found the right perspective. And apply that understanding.

design, spiel


The miracle of computer applications

Posted in Thoughts On Programming on August 14, 2010

There is a process that I’d call man-made evolution, for lack of a better term. By intellectual effort, mankind can achieve new abilities.

Basically, there were two major breakthroughs until recently:

Tools. A tool transforms our force so that it can be used in a way previously impossible. Man has no claws, but could shape stones to be used the same way an animal with claws would use them.
Machines. As the next step from tools, man created machines. Machines are a step forward in two ways. For one, they allowed desired motions to be performed in a very effective way (such as using a loom, which requires only a simple, repetitive movement to perform a task that would otherwise be very time-consuming). Secondly, they also allowed harnessing sources of energy other than manpower.

And then, not so long ago, computer applications emerged (you may want to imagine some dramatic music here, a choir of angels and that kind of stuff 😉 ). Of course, computer applications require computers to run, but personally, I don’t think, the biggest gain mankind gets from computers is, that we can compute things insanely fast, but rather that we can run applications on them.

I believe applications to be the third significant breakthrough.

What a machine does is determined by the laws of physics and the arrangement of its components. A computer constitutes a world where the laws of physics are replaced by the possibilities of the hardware. What an application does is determined by those possibilities, while the physical composition of components is replaced by mere software, providing flexibility that was probably undreamed of 100 years ago. Without material or mechanical modification, a computer can be altered to perform new or other tasks, or can be improved at the tasks it is carrying out. Any application installed on a computer can be thought of as an individual machine.

Computers are a leap forward in that a single computer can be used for a multitude of entirely unrelated tasks, while an application can run on an unlimited number of computers. When computers first emerged, this was more of a theoretical possibility, but now we live in a world where computers are cheap, small, fast and amazingly reliable, and where writing portable software is feasible in a reasonable amount of time.

Truth is, computer applications are great in many ways, but they still are machines. Like any machine, they need an operator. For him to operate the machine, control elements are needed. Or an interface, as programmers would say. Computers (at least PCs) typically all have the same physical interface (screen, possibly some sort of audio output, mouse and keyboard), while applications have a more abstract interface built on top of computer interfaces. The great thing about application interfaces is that they are also just software. Again, without any material or mechanical modification, an interface can be adapted to its user. Suppose you wanted your alarm to start filling your tub when it rings, or you wanted to switch from separate taps to a mixing tap. In the world of software, an equivalent task usually requires less effort, and once you have succeeded, you can do it for any tub in the world without much effort. This would really be of great service to the inhabitants of the UK. 😛

With the advent of the internet, the potential of software development spiked. One application can run in multiple physical locations (something basically no traditional machine can accomplish reasonably). The application can be updated automatically and easily. I think we rarely appreciate or even understand what an enormous potential this is. But it is. One man can create an application that saves 10% of the broadband internet users (should be about 50 million people) 1 minute of work per day. That’s 50 million minutes saved per day. Assuming a workload of 40 hours a week (ergo about 125,000 minutes per year), this is equivalent to 400 man-years saved. Per day. Now, these numbers serve more to impress you and everyone else than to really measure anything, but I think they do make my point. 🙂
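
For anyone who wants to check the back-of-the-envelope figure, here it is spelled out as a few lines of Python. The inputs are the rough estimates from the paragraph above (50 million users, 1 minute each, a 40-hour week). Counting 52 working weeks per year is an extra assumption made here to arrive at the minutes per work year.

```python
users = 50_000_000                 # roughly 10% of broadband users (estimate)
minutes_saved_per_user = 1         # per day
minutes_per_work_year = 40 * 60 * 52   # 40-hour weeks, 52 weeks (assumption)

minutes_saved_per_day = users * minutes_saved_per_user
work_years_saved_per_day = minutes_saved_per_day / minutes_per_work_year

print(minutes_per_work_year)            # 124800, i.e. about 125,000
print(round(work_years_saved_per_day, 1))  # 400.6
```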

computer applications, potential, spiel


Thoughts on Programming

Posted in Thoughts On Programming on August 14, 2010

It has recently (i.e. a few months ago) been pointed out to me that my blog is kind of empty. I was pleasantly surprised to hear someone cared. 😀

Well, I have been reflecting a lot on programming lately, looking at various other languages, after I had given up efforts to create my own, as well as frameworks, not having given up those efforts yet. I have looked at different paradigms, best practices, common practices, philosophies and concepts. What I found out, is that there’s a general lack of clear definitions on the web and there are many approaches that are mistakes. As a consequence, I have decided to fill the void, which I am hopefully not the only one to perceive. 😉

I am proud to announce, that from this day on, I intend to pester the world with my dilettantish thoughts on programming. This will include concrete principles and concepts, as well as some “deep” spiel (much like this post), intended to explain, why I deem this and that approach of high importance. I think, it is important to see a purpose in what you do, and should you be looking for one, maybe I can help you at least a little.

In the scope of this announced series of posts, programming and software development shall be used interchangeably. That’s because I

will only focus on programming for the purpose of software development
intend to postulate principles, which transcend all layers of software development, right from design down to implementation,
think programming cannot be thought of as just coding (i.e. writing code). Any time it is, the results are usually useless crap.

Apart from stealing your time with this announcement, I’ll try to maintain some sort of table of contents in this very post, to provide some sort of structured overview to an otherwise chaotic stream of vaguely related posts.

I hope you enjoy reading what I have to say, and pick up a few helpful things.

announcement, spiel


Everybody likes XML. Everybody, but me.

Posted in Holy Wars on August 6, 2010

Nowadays, it seems the vast majority thinks XML is the bestestestest format ever on this and any other planet, and they really use it to serialize anything, no matter how perverted it actually may be.

It appears to be one of the first standards for human-readable representation of complex data that gained popularity. Its strength does not come from its design, but from the fact that it is a standard and that some thought has been put into it, unlike many home-brew serialization formats you and I come across every now and then. But really, that’s it. Being reasonably good in a field at a time when there were no alternatives doesn’t mean it should still be considered good.

Personally, I do not like XML

It is verbose, redundant and huge in size. The XML closing tag is the most stupid invention ever. At any point where a closing tag may occur, it is completely determined. It doesn’t carry any information the string <//> wouldn’t carry. But no, you have to type it in.
It is error-prone. The above problem (missing/misspelled closing tags) is the one I run into most of the time, as soon as I let people edit the XML files (which is the purpose of human-readable formats). In proper markup languages, this is a pure syntax error.
It has no built-in support for numerical and boolean values. These values can only be included using string representations, which means you need a contract on top of the XML standard, stating how to represent them. Is a bool true | false? TRUE | FALSE? 1|0? How about 1.12+10? Is that a Float? 1.12.2010 is not (in German and other languages, this denotes the date 2010/12/1), although you realize that only halfway through. And you can’t possibly try parsing all possible data types to see which one fits best.
Its semantics differ A LOT from the object model of about any decent language. At data level, objects have properties. Each property has a value that’s either primitive, complex or a collection. An XML node has attributes and children. These concepts are completely different. Sometimes, properties are represented as attributes, but that doesn’t work for complex values. It is hard to say whether a child node represents a property, or whether it is possibly the only entry of a list, which is the actual property of the represented object.

XML is of use, but by far not the universal tool everybody believes it to be. In order for XML to be usable in as many contexts as possible, it is completely misused. SVG paths are the best example. XML does not capture the information, that the path represented is not just a string. It is not a flat attribute, such as hairColor=”black”, but XML itself provides no way to tell that.

Widespread alternatives are JSON and YAML, the latter still being quite exotic, while at the same time being very expressive and containing the former as a subset. JSON could represent the SVG path information that is tucked into a single string in XML as what it is: an array (i.e. actually a list, but fair enough). JSON literally means JavaScript Object Notation and focuses on representing objects, while the eXtensible Markup Language focuses on extensibility, blatantly failing at the most obvious tasks.
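
To illustrate the difference, here is a small Python sketch using only the standard library. The two documents are made up for this example. The JSON layout of the path as a list of commands is just one possible encoding, not something SVG or JSON prescribes.

```python
import json
import xml.etree.ElementTree as ET

# Invented sample data: the same record once as XML, once as JSON.
xml_doc = (
    '<person hairColor="black" age="42" married="true">'
    '<path d="M 10 10 L 20 20"/></person>'
)
json_doc = (
    '{"hairColor": "black", "age": 42, "married": true,'
    ' "path": [["M", 10, 10], ["L", 20, 20]]}'
)

person_xml = ET.fromstring(xml_doc)
person_json = json.loads(json_doc)

# Every XML attribute arrives as a string; the reader must know the contract.
xml_age = person_xml.get("age")
xml_married = person_xml.get("married")
xml_path = person_xml.find("path").get("d")   # one opaque string

# JSON carries numbers, booleans and lists in the format itself.
json_age = person_json["age"]
json_married = person_json["married"]
json_path = person_json["path"]               # a nested list of commands

print(type(xml_age).__name__, type(json_age).__name__)  # str int
```

With the XML version, turning "42" into a number or "M 10 10 L 20 20" into a list of commands is left to a second parser that lives outside the format.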

XML done right, using schemas and within certain contexts can resolve a lot of ambiguities, but then again this makes XML even more complex and more verbose.

Actually, there is nothing XML can do that you cannot do better in a number of other established human-readable or binary serialization formats. At the end of the day, the only reason to use XML is that many services and tools you will encounter and want to integrate use XML. Other than that, XML just sucks.

common mistakes, data formats, data representation, serialization, xml


Howto: Endless Universe …

Posted in Uncategorized on September 5, 2009

Recently, there was a question on stackoverflow.com about how to optimize a 2D game in Flash, where there are many stars in the background, which move relative to a ship that is in the center of the screen. The naive approach is to put all objects into one container and move it around. The thing is, Flash is really not very good at clipping, so the whole thing stops working properly very quickly.

I myself started to write a little engine (spawn stars (s), unpause (SPACE) and navigate with your mouse) for the same reason a while ago. It is just a proof of concept, and the API is terrifying, and so on, and so forth, which is why I am not willing to release it yet, but when I have time for a rewrite, I will give it a shot again. I tried to explain the basics of the idea behind it, but it seems my explanation was too superficial, which is why I decided to make a post about it.

Let’s look at the whole thing in 1 dimension. The approach is the same, but it’s a little easier to explain and to imagine.
So we have an awful amount of stars, and we want to do something with them.

To clarify our problem:
We have an unlimited set of objects randomly distributed over an unlimited interval.
We need a good method of rendering all objects within a chosen interval (“the part of the universe we currently can see”).

universe

The idea is to somehow put them into a tree, so you can quickly find out which are to be displayed and which aren’t.

Update: please note, that if the size of the visible region is both constant and known beforehand, it is much easier to subdivide “space” into a grid, where cell size is equal to the visible region’s size.
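
A minimal 1D sketch of that grid variant, in Python rather than ActionScript. The class and method names are invented, and a dictionary stands in for whatever sparse structure you prefer. Because the cell size equals the viewport size, a viewport can overlap at most two neighbouring cells.

```python
import math
from collections import defaultdict


class Grid1D:
    """Space cut into cells whose size equals the (constant) viewport size."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # sparse: only occupied cells exist

    def insert(self, x):
        self.cells[math.floor(x / self.cell_size)].append(x)

    def visible(self, lower):
        """Objects in [lower, lower + cell_size) with their screen positions."""
        upper = lower + self.cell_size
        first = math.floor(lower / self.cell_size)
        found = []
        for index in (first, first + 1):  # at most two cells can overlap
            for x in self.cells.get(index, ()):
                if lower <= x < upper:
                    found.append((x, x - lower))
        return found


grid = Grid1D(100)
for star in (5, 150, 199.5, 260, -30):
    grid.insert(star)
print(grid.visible(120))  # [(150, 30), (199.5, 79.5)]
```

In 2D the same idea would check up to four cells instead of two.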

The following should explain how the tree approach works.

Tree Structure

This tree consists of nodes, leaves and a root. For lack of a better term, I will call both nodes and leaves containers.
A container always represents a given region, which in this case means it covers an interval.

A leaf contains a maximum of m objects of our universe. m should be reasonably big (somewhere between 50 and 2000).
A node contains n adjoining containers (which all cover equal but disjoint intervals, the union of which is the overall interval represented by the node). n=2 actually worked best for me.
The root is a very special node, which can have an unlimited number of containers. It is the representation of space in our universe. In the beginning it is absolutely empty. All children of the root cover intervals of the same size s and do not intersect.
Please note that a node always contains a constant number of adjoining containers, whereas the root may contain completely randomly and sparsely distributed containers. For that reason, an array can be used to reference the children of a node. For the root, you should either use vectors, or int-keyed hashes in the case of sparse universes.

In the end, it’ll look something like this:

tree

Tree Insertion

When a new element is inserted into the tree, you look in the root, trying to find a container covering the interval corresponding to that element. If there is none, a leaf is created for that interval, and then insertion is carried out.

When inserting an object into a container, there are two possibilities:
1. it’s a node. In that case, the element is inserted into the child container covering the according interval.
2. it’s a leaf. In that case, the element is inserted. If the maximum threshold m is exceeded, the leaf is split into a new node with n leaves, redistributing its children accordingly.

Rendering

Our input is a given viewport interval, and our output should be a set of elements, that are to be rendered, and the according screen positions they should be rendered to.

So everything starts with finding the containers intersecting our viewport interval in the root.
To find the visible elements in a node, we look for visible elements in those child containers that intersect the interval we are searching in. To find the visible elements in a leaf, we simply check for each element whether it is in the interval or not. This costs O(m), which in the end is O(1) since m is constant.
The rendering position is the element’s coordinate minus the lower bound of the viewport. With p being the overall size of our universe, we will get an average cost of about O(log(p)) for the whole thing.
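
As a rough sketch of that lookup (TypeScript, hypothetical names, intervals assumed to be half-open, S being the assumed size of the root-level intervals):

```typescript
interface Star { x: number }
interface Leaf { kind: "leaf"; lo: number; size: number; items: Star[] }
interface TreeNode { kind: "node"; lo: number; size: number; children: Container[] }
type Container = Leaf | TreeNode;
interface Hit { star: Star; screenX: number }

const S = 1024; // size of every root-level interval (assumed value)

// Collects all elements of container c inside the viewport [lo, hi).
function collect(c: Container, lo: number, hi: number, out: Hit[]): void {
  // Skip containers that do not intersect the viewport at all.
  if (!(c.lo < hi && lo < c.lo + c.size)) return;
  if (c.kind === "node") {
    for (const child of c.children) collect(child, lo, hi, out);
  } else {
    // Leaf: check every element, at most m of them.
    for (const star of c.items) {
      if (star.x >= lo && star.x < hi) {
        // Screen position = coordinate minus lower bound of the viewport.
        out.push({ star: star, screenX: star.x - lo });
      }
    }
  }
}

function visibleElements(root: Record<number, Container>, lo: number, hi: number): Hit[] {
  const out: Hit[] = [];
  // Only visit the root intervals that overlap the viewport.
  for (let key = Math.floor(lo / S); key * S < hi; key++) {
    const c = root[key];
    if (c) collect(c, lo, hi, out);
  }
  return out;
}
```

The root loop walks interval indices rather than stored containers, which is cheap as long as the viewport spans only a few intervals of size s.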

[Image: render]

Since we need to render again and again, we need to do a little more work. We need to keep track of any visible elements and containers, which can be easily done by flagging.
At node level, we use this information as follows. We look at all children. If they are flagged visible, but no longer on screen, we hide them and, recursively, any of their children flagged visible. For leaves, this works very similarly with their elements. If an element is flagged visible but off screen, we hide it; if it is flagged invisible and on screen, we show it; and if it is flagged visible and on screen, we update it.
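
A possible shape for this bookkeeping, as a TypeScript sketch with hypothetical names. The shown flag and the log of actions are assumptions made for illustration; a real renderer would add, remove or move display objects instead of logging strings.

```typescript
interface Star { x: number; shown: boolean }
interface Leaf { kind: "leaf"; lo: number; size: number; shown: boolean; items: Star[] }
interface TreeNode { kind: "node"; lo: number; size: number; shown: boolean; children: Container[] }
type Container = Leaf | TreeNode;

// Hides a container and, recursively, everything below it that is flagged visible.
function hideAll(c: Container, log: string[]): void {
  if (!c.shown) return; // nothing below an unflagged container is visible
  c.shown = false;
  if (c.kind === "node") {
    for (const child of c.children) hideAll(child, log);
  } else {
    for (const s of c.items) {
      if (s.shown) { s.shown = false; log.push("hide " + s.x); }
    }
  }
}

// Brings the flags below c in line with the viewport [lo, hi).
function refresh(c: Container, lo: number, hi: number, log: string[]): void {
  const onScreen = c.lo < hi && lo < c.lo + c.size;
  if (!onScreen) { hideAll(c, log); return; }
  c.shown = true;
  if (c.kind === "node") {
    for (const child of c.children) refresh(child, lo, hi, log);
    return;
  }
  for (const s of c.items) {
    const inView = s.x >= lo && s.x < hi;
    if (s.shown && !inView) { s.shown = false; log.push("hide " + s.x); }
    else if (!s.shown && inView) { s.shown = true; log.push("show " + s.x + " at " + (s.x - lo)); }
    else if (inView) log.push("update " + s.x + " at " + (s.x - lo));
  }
}
```

Because hideAll returns immediately for unflagged containers, scrolling only ever touches the parts of the tree that were visible in the previous frame or are visible now.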

In 2D, the solution is exactly the same, except that intervals become squares, objects have 2 coordinates and subdivision of nodes is done with n*n child containers.
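
For instance, picking the child of a 2D node could look like this. It is only a sketch: the names and the row-major layout of the n*n children are assumptions.

```typescript
const N = 2; // subdivision per axis, giving N * N children per node

interface Square { x: number; y: number; size: number } // lower corner and edge length

// Index of the child square containing the point (px, py), with the
// N * N children stored row by row in a flat array.
function childIndex2D(sq: Square, px: number, py: number): number {
  const step = sq.size / N;
  const col = Math.min(N - 1, Math.max(0, Math.floor((px - sq.x) / step)));
  const row = Math.min(N - 1, Math.max(0, Math.floor((py - sq.y) / step)));
  return row * N + col;
}
```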

Well, one day I may have the time to implement this myself. Until then, I wish you all good luck, and keep me up to date with further optimizations … 😉

performance, space, stars, universe


ActionScript 3 Iteration

Posted in ActionScript 3 on June 18, 2009

Recently I stumbled upon a question on stackoverflow.com, where someone wanted to know the quickest way to iterate over an Array. In response, someone else pulled out a benchmark for .NET, which showed that for loops are faster. But I was quite sure this is not at all the case for AVM2. So I did a little research and expanded my quest to examine iteration as such, on all native AVM2 objects that are suitable as collections. Apart from Array, Object and Dictionary, this includes Vector, which is only available from Flash Player 10 on. Both iteration methods provide key and value, although it does not really make sense to use Object and Dictionary if they contain dense numerical keys. In the end, it looks like this:

//for loop
for (key = 0; key < size; key++) value = iterable[key];
//for each loop
key = 0;
for each (value in iterable) key++;

You can grab the whole source >here<

I found that for each loops are more than twice as fast, sometimes considerably more. I also found that speed depends on the type of the variable the collection is stored in. Here are some numbers (tested on Debug Flash Player 10.0 r22 for Windows XP, on a Core 2 Duo at 2 GHz):

testing Vector as Vector.<int>
200 repetitions with collections of size 500000
	> for loops needed 48.595 msecs
	> for each loops needed 19.11 msecs
	> factor: 2.5429094714809

testing Vector as *
200 repetitions with collections of size 500000
	> for loops needed 54.65 msecs
	> for each loops needed 16.125 msecs
	> factor: 3.3891472868217054

testing Vector as Object
200 repetitions with collections of size 500000
	> for loops needed 54.44 msecs
	> for each loops needed 16.335 msecs
	> factor: 3.332721150902969

testing Array as Array
200 repetitions with collections of size 500000
	> for loops needed 50.335 msecs
	> for each loops needed 15.46 msecs
	> factor: 3.2558214747736094

testing Array as *
200 repetitions with collections of size 500000
	> for loops needed 54.19 msecs
	> for each loops needed 15.455 msecs
	> factor: 3.506308637981236

testing Array as Object
200 repetitions with collections of size 500000
	> for loops needed 54.315 msecs
	> for each loops needed 15.335 msecs
	> factor: 3.5418976198239323

testing Dictionary as Dictionary
200 repetitions with collections of size 500000
	> for loops needed 61.17 msecs
	> for each loops needed 24.16 msecs
	> factor: 2.5318708609271523

testing Dictionary as *
200 repetitions with collections of size 500000
	> for loops needed 62.395 msecs
	> for each loops needed 24.205 msecs
	> factor: 2.577773187357984

testing Dictionary as Object
200 repetitions with collections of size 500000
	> for loops needed 62.155 msecs
	> for each loops needed 23.91 msecs
	> factor: 2.599539941447093

testing Object as *
200 repetitions with collections of size 500000
	> for loops needed 64.125 msecs
	> for each loops needed 26.35 msecs
	> factor: 2.433586337760911

testing Object as Object
200 repetitions with collections of size 500000
	> for loops needed 64.09 msecs
	> for each loops needed 26.245 msecs
	> factor: 2.4419889502762433

Now some explanations of where this comes from:

To people not from the ECMA world: Objects, i.e. instances of the class Object, are simply hashes, if you will. someObject.someProperty and someObject["someProperty"] are equivalent, thus array access and property access are the same. There is not a lot of difference between Arrays and Objects, except that Arrays handle numerical keys a little differently: they maintain an order and expose a length, as well as Array manipulation functions and, new in AS3, iteration functions. Arrays do have a sweet spot, performance-wise, when they are dense and numerical. Then they are faster than Objects when it comes to array access.
Consider a for each loop. This is some runtime-internal magic, written in C or C++, which runs considerably fast and retrieves the value, while you calculate the key with a simple increment. For the for loop in turn, you need the increment, which is not costly, and you need to evaluate the condition, which is AVM2 bytecode, but still OK. To retrieve the value, however, you need an array access, which consists of executing the opcodes plus the implementation of the array access itself. The Array in ActionScript is not just a block of references in memory. It’s some weird multipurpose collection, with complicated access routines that are all encapsulated in the array access. Now, for each iteration does not necessarily preserve order (it does only for Vectors and Arrays), but it does not rely on the ambiguous array access, since it comes from the runtime.

It seemed a little surprising to me that there is a performance difference between * and Object. This seemed really strange, especially since there is no clear rule to it. Accessing through the exact type is faster, as you can see. This is because there are 5 different array accesses in Flash: for Object, Dictionary, Array, Vector and Proxy. If the variable is typed, the compiler probably uses this information to hardwire the right array access, instead of looking it up at runtime. Just a guess, though. One last note: if you create collections by subclassing Proxy, then simple for loops are much faster, since they require only one call to the proxy per step.

actionscript3, array, as3, for, for each, iteration
