Have you ever had one of those moments where someone told you that your work sucked, and it inspired you to create something better? About a month ago,Alex Simkin sent me a message on the CodeProject forums for one of my articles saying that LinFu ranked second to last in performance among all the other containers, and that he was willing to show me the benchmark code that produced those numbers.
Eating the Humble Pie
Unfortunately, Alex was correct. Despite all of its features, LinFu landed a spot near the bottom of the pack, and needless to say, there had to be a better way to design a container so that it wouldn't have these bottlenecks.
"But...but...my container is dynamic and it's flexible!"
As an IOC container author myself, I've probably given that same excuse a dozen times over whenever someone complained that my framework was too slow. I never realized that the flexibility that I so touted in all my IOC articles was the same cause for all my performance headaches. Indeed, there had to be some way to improve these numbers, and at first, I thought adding more dynamic IL generation would solve the problem. After all, Lightweight Code Generation with DynamicMethod seems to be the trend nowadays among the other frameworks like Ninject, and that makes their code run faster, right?
Once again, I was wrong. DynamicMethods didn't make much of a performance impact because Ninject (which reportedly uses a lot of LCG its code) was actually the slowest among all of the IOC containers tested in the benchmark (Sorry Nate). Of course, this doesn't mean that the DynamicMethod approach is the cause of the slowdown; what it does suggest, however, is that piling more and more reflection onto the speed problem is not the solution. In addition, there were other frameworks in that benchmark (such as Funq) that didn't use any reflection at all, and yet, they still were taking significant performance hits on that benchmark. In fact, even the fastest among all the other containers--StructureMap--was still running forty-four times slower than the Plain/No Dependency Injection use case!
So the principle question is this: "Where is this bottleneck coming from, and how do I eliminate it?"
The Real Problem
As it turns out, the answer was staring me in the face all along: "It's the configuration, stupid", I thought to myself. The problem is that every major IOC container at the time of this post (such as Ninject, StructureMap, Unity, AutoFac, Castle, LinFu, etc) essentially has to trawl through each one of its dependencies just to instantiate a single service instance on every call, and practically no amount of optimization will ever compensate for the fact that they still have to "rediscover" any given part of an application's configuration in order to instantiate that one service instance. Needless to say, this rediscovery process wastes a huge amount of resources because these containers are actually "rediscovering" a configuration that (for all practical purposes) will rarely change between two successive method calls.
In layman's terms, this is akin to stopping and asking for directions at every intersection every time you want to leave your home to go to some other destination. There has to be some way to see the "whole map" and plan the trip ahead of time without having to stop for directions at every intersection. If you could plan all the possible routes on that trip ahead of time, then all the time you would have wasted asking for directions immediately vanishes.
In essence, that is what I did with Hiro. Hiro is an IOC container framework that reads the dependencies in your application ahead of time and actually compiles a custom IOC container that knows how to create those dependencies from your application itself. It uses absolutely no runtime reflection or runtime code generation, and since all your dependencies are discovered at compile time (that is, when the Hiro compiler runs), Hiro suffers zero performance penalties at runtime when instantiating your types.
Yes, you read that right: Hiro runs at 1:1 speed with a Plain/No DI configuration. Here's the results of the IOC container benchmark:
As you can see from the results above, the current crop of IOC containers (including LinFu) can only reach 2% of the speed of an application that does not use an IOC container. Now, let's take a look at Hiro's results:
If you don't believe it, then you can download and run the benchmarks yourself.
Like LinFu, Hiro is licensed under the terms of the LGPL, and you can preview the source code at this site. I'll also be starting a Hiro-contrib project, so if you want to add your own extensions, just email me at firstname.lastname@example.org and I'll be more than happy to anyone who is interested. Thanks! :)