Today’s post will address two particular areas:
- optimization of pathfinding and raycasting
- area awareness system, automatic waypoint placing system
As work on the RaveAI continues I can see things in the existing code that could be rewritten for a huge speed gain. Following notes are not some big discovery but you must remember them when you are using managed runtime environment for doing some “realtime”. Thanks to the access to the Monocle prerelease program I had a chance to use this great profiling tool to search for RaveAI performance bottlenecks. It turned out that there are (were) quite a few of them. Almost all are connected to the fact that Flash Player and AIR runtime use Garbage Collection as memory management paradigm. This fact infulences not just the memory release issues but also the final speed of your application. If you are creating huge numbers of object per frame, it is only matter of time when garbage collector will stop your application to release unused objects from memory. You can see this in the Monocle as sawtooths in the Memory consumption. When memory consumption goes above some number (based on your OS, hw, available memory, swap size etc.), GC will start to do its work and frees the memory. This looks like a tooth on memory consumption graph in the Monocle and there is strong correlation with FPS (frames per second) your game / app is running. When GC starts, it has to stop your app, so FPS go down because there are no updates to framebuffer while GC runs. GC has to traverse whole object graph of your application to search for objects that have connection to the GC root. These objects are candidates to release from the memory pool. But object graph traversing of the running application is a problem, so GC has to stop your app (well, there are GCs which handle parallel run to your app, but I don’t think it’s FP’s case). Thanks to the Monocle you can see where GC run takes place and you can see objects that are released too. So you can see where are your GC connected bottlenecks located.
This isn’t only about releasing memory – object creation and allocation of the memory for new object are time consuming operations. Performing too much “new calls” in each frame can kill your project’s performance as well. But there are solutions for this – for example objects pools and reusing existing objects.
RaveAI uses a lot of Vector and Matrix math. I used Vector3D as a base of my vector operations and I built simple vector math library on top of it. But objects in my library were all immutable so all operations produced new objects and it turned out as a big problem. New object is created to store your result everytime you use some vector operation. So, there are calls for “new” operator, memory allocation, new object creation, filling it with nulled variables, calling constructor hierarchy, etc. and as time goes, GC has to kill and free a lot of objects. So I rewrote my Vector library to MutableVector library. Every vector operation now stores the result to one of its operands. So you are reusing them and you do not have to create new objects at all (at least for the vector math). You can use this principle through your application. But, there is a problem, a big one.
As Donald Knuth wrote, “Premature optimization is root of all evil”. Not just premature optimization but every optimization that is used without reason can cause problems and bugs. The thing is, the MutableVector library can’t be used in a same way as ImmutableVector since it doesn’t behave the same way. Parameters given to the methods are rewritten and so you have to store and backup parameters before calls to ensure they won’t be overwritten. Same applies to the rest of object reuse paradigm. Unit tests are ideal for dealing with these problems. Test even the cases that shouldn’t make any problems on the first look. There always can be one.
So, back to RaveAI. An optimization concerning GC and memory allocation stuff, object allocation and so resulted in a huge speed gain. I had 10 NPCs running with pathfinding and full raycasting sight barely at 30 FPS beforethe optimization. After optimization work – 60NPCs running almost constant 60 FPS. And I see more possibilites for another optimizations. These are not related to the GC and/or FP/AIR runtime behaviour, but related to the RaveAI’s base algorithms. I am also going to implement AS3 workers, so stay tuned!
Area awareness system
I want RaveAI to be easy to use even to people that have no AI or hardcore programming skills. That is the reason that you can use JSON to script and control your NPCs in Rave, that is the reason that you can configure almost everything in Rave through JSON configuration and so. But when it comes to level design, you must let know the AI how it should behave in your levels. Where it can and where it cannot go. You can set it up in your own editors and export it, you can lay waypoint grid by yourself, but RaveAI can do this for you. Rave features automatic waypoint placing system, that can place waypoints anywhere you want. On terrain, on meshes, everywhere. It can do this even in realtime during the application start up so you can use it with procedurally generated levels or meshes. You can specify which meshes will be used for the waypoints placement and which not. You can define parts of the meshes that cannot be populated by waypoints, you can define distance of waypoints etc. Or you can let RaveAI decide. Following screenshot shows area awareness system test on a simple mesh exported from Blender. The waypoints are automatically placed on the places you expect AI should be walking and they are connected in a way that AI can use navigation there. Of couse, you can set up rules for the waypoint connection yourself in the JSON file in the form of predicates. Final note: Of course you can precompute this data and let Rave load them on startup so it doesn’t have to compute tham again.