Thursday, April 16, 2009

Introducing Hiro, the World's Fastest IOC Container, Part II: The Little Feature Set That Could

The Art of Lean

Now that everyone knows just how fast Hiro can be, the next question you might be asking yourself is, "What features will it support?"

The simplest answer that I can give is that Hiro will implement as few features as possible--in fact, Hiro will implement so few features that it might turn away your traditional "dynamic" IOC user. Here are the features that Hiro will implement:

-Named/Anonymous Static Component Registration. This means that you'll be able to register your services using a service name, service type, and the implementing type.
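For example, static registration might look something like the following sketch. Note that the DependencyMap type and the AddService overloads shown here are assumptions about the final API (and the repository types are made up for this example), so the exact names may differ:

```csharp
// Hypothetical sketch of Hiro's static registration API; the method
// names and overloads here are assumptions, not the confirmed API.
var map = new DependencyMap();

// Anonymous registration: service type + implementing type
map.AddService(typeof(IPersonRepository), typeof(PersonRepository));

// Named registration: service name + service type + implementing type
map.AddService("CachedRepository", typeof(IPersonRepository),
               typeof(CachedPersonRepository));
```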

-Convention Over Configuration for Registration and Injection. In addition to its support for programmatic registration, Hiro will be able to scan your assemblies and automatically infer:
  1. The list of available services
  2. The list of concrete types that implement those services
  3. The list of properties that can be injected
  4. The constructors that will be used for constructor injection
  5. The services that will be injected into each parameter during a constructor injection call.
The most interesting part about this is the amount of work that it will take to do this sort of registration. Take a look at this example:


// Scan the target assembly and build a dependency map from its types
var loader = new DependencyMapLoader();
var map = loader.LoadFrom(AppDomain.CurrentDomain.BaseDirectory, "Yourlibrary.dll");

// Compile the map into a container ahead of time
IContainerCompiler compiler = new ContainerCompiler();
var microContainer = compiler.Compile(map);

// Do something with the container here

Believe it or not, this is the only code that you need to configure the dependencies for an assembly of any given size. What makes this even more interesting is that all of this is done statically at (container) compile time, not at runtime. This means that unlike the current generation of IOC containers, Hiro will not waste any time rediscovering your configuration, and based on the benchmark results from my previous post, the performance numbers are dramatically one-sided.

-Transient/Singleton instances. Aside from performance, Hiro has to be able to create both transient instances (that is, plain old object instances created with the new operator) and singleton instances.
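To see the difference between the two lifetimes, here's a self-contained toy sketch--not Hiro's actual code, since Hiro emits the equivalent logic as compiled IL rather than using a dictionary of delegates:

```csharp
using System;
using System.Collections.Generic;

// A toy factory table contrasting transient and singleton lifetimes.
class ToyContainer
{
    private readonly Dictionary<Type, Func<object>> _factories =
        new Dictionary<Type, Func<object>>();

    public void AddTransient<T>() where T : new()
    {
        // A brand-new instance on every request
        _factories[typeof(T)] = () => new T();
    }

    public void AddSingleton<T>() where T : new()
    {
        // One instance, created lazily and then reused
        var instance = new Lazy<T>(() => new T());
        _factories[typeof(T)] = () => instance.Value;
    }

    public T GetInstance<T>() => (T)_factories[typeof(T)]();
}

class Widget { }

class Program
{
    static void Main()
    {
        var container = new ToyContainer();

        container.AddTransient<Widget>();
        Console.WriteLine(ReferenceEquals(container.GetInstance<Widget>(),
                                          container.GetInstance<Widget>())); // False

        container.AddSingleton<Widget>();
        Console.WriteLine(ReferenceEquals(container.GetInstance<Widget>(),
                                          container.GetInstance<Widget>())); // True
    }
}
```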

-Property/Constructor Injection. Hiro will implement both property and constructor injection by default. What makes this even more interesting is that, like everything else in Hiro, all property and constructor injection calls will be precompiled into IL, meaning that there will be none of the usual reflection overhead when using Hiro.
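For illustration, here's a plain C# sketch--the OrderService and ILogger types are made up for this example--showing roughly what precompiled injection boils down to: ordinary constructor calls and property assignments, with no reflection at the call site:

```csharp
using System;

public interface ILogger { void Log(string message); }

public class ConsoleLogger : ILogger
{
    public void Log(string message) => Console.WriteLine(message);
}

public class OrderService
{
    private readonly ILogger _logger;

    // Constructor injection: the dependency arrives through the constructor
    public OrderService(ILogger logger) => _logger = logger;

    // Property injection: a collaborator assigned after construction
    public ILogger AuditLog { get; set; }

    public void PlaceOrder() => _logger.Log("order placed");
}

class Program
{
    static void Main()
    {
        // Roughly what a compiled container's generated code reduces to:
        var service = new OrderService(new ConsoleLogger())
        {
            AuditLog = new ConsoleLogger()
        };
        service.PlaceOrder(); // prints "order placed"
    }
}
```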

-Stateless. Each Hiro-compiled container instance will not hold any private local data (or even shared data) that distinguishes it from another compiled container of the same type. This means that you can scale Hiro's compiled containers across multiple cores AND multiple threads without using a single semaphore, mutex, or C# lock statement.
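Here's a self-contained sketch (again, not Hiro's actual code--the types are invented for this example) of why statelessness matters: a resolver with no instance fields can be hammered from many threads at once without any synchronization at all:

```csharp
using System;
using System.Linq;

public interface IGreeter { string Greet(); }
public class Greeter : IGreeter { public string Greet() => "hello"; }

// A stateless resolver: no instance fields, so every call is a pure
// function of its input and can safely run on any thread, lock-free.
public class StatelessContainer
{
    public IGreeter GetGreeter() => new Greeter();
}

class Program
{
    static void Main()
    {
        var container = new StatelessContainer();

        // Resolve from many threads at once; no semaphore, mutex, or lock.
        var results = Enumerable.Range(0, 1000)
            .AsParallel()
            .Select(_ => container.GetGreeter().Greet())
            .ToArray();

        Console.WriteLine(results.All(r => r == "hello")); // True
    }
}
```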

"But wait, your container isn't a REAL container until you implement feature X!"

Some would argue that Hiro has to implement a minimum "baseline" feature set in order to be considered a "commercial"-quality IOC container. My counterargument is that it's this same "fat feature" mindset that got these IOC containers (including LinFu) into this speed problem in the first place. Secondly, if you're experienced enough with IOC to understand the significance of what Hiro does, then it's safe to assume that you fall into at least one of the following categories:

  1. You've probably written at least one IOC container framework, or
  2. You're already comfortable with an existing IOC container framework (such as Castle, Ninject, AutoFac, Unity, StructureMap, LinFu, etc.) and you know enough to customize it to your needs.


Assuming that you're an IOC container author, it would be pointless for me to implement something in Hiro that you've probably already rolled into your own framework. Given that you're skilled enough to write your own framework, it would be practically trivial for you to plug Hiro into it and reap the performance benefits--and there's clearly no reason for me to reinvent the wheel if you've somehow done a better job than I did in implementing "feature X".

Now, if you think you fall into the second category, there's a good chance that you've already decided to stick with the framework of your choice, and like the other IOC container authors, there's really nothing that I can do for you unless you decide to plug Hiro into your favorite container.

So between these two types of users, who do you think Hiro is written for?

Here's my answer: neither one of them. Hiro is written for the average developer who wants to get started with an IOC container and doesn't have time to "geek out" over the latest and greatest features of an IOC container framework. Given that there are probably far better IOC container developers out there than me, I've decided to skip the "my container is better than your container" religious wars and focus on what really matters: the end users.

The Pareto Pleasure Principle

Hiro doesn't need to implement 80% of the expected features of an IOC container in order to be useful--instead, it only has to implement the other 20% that (in my opinion) people will actually use. In the end, if I can help those people get their jobs done in the simplest possible way without forcing them to wade through the "awesomeness" of my framework, then I'll call Hiro a success--and at the end of the day, that's really all that matters.

Wednesday, April 8, 2009

Introducing Hiro, the World's Fastest IOC Container, Part I: Design Diary

Introduction

Have you ever had one of those moments where someone told you that your work sucked, and it inspired you to create something better? About a month ago, Alex Simkin sent me a message on the CodeProject forums for one of my articles, saying that LinFu ranked second to last in performance among all the other containers and that he was willing to show me the benchmark code that produced those numbers.

Eating the Humble Pie

Unfortunately, Alex was correct. Despite all of its features, LinFu landed a spot near the bottom of the pack, and needless to say, there had to be a better way to design a container so that it wouldn't have these bottlenecks.

"But...but...my container is dynamic and it's flexible!"

As an IOC container author myself, I've probably given that same excuse a dozen times whenever someone complained that my framework was too slow. I never realized that the flexibility I so touted in all my IOC articles was the very cause of all my performance headaches. There had to be some way to improve these numbers, and at first I thought that adding more dynamic IL generation would solve the problem. After all, Lightweight Code Generation with DynamicMethod seems to be the trend nowadays among frameworks like Ninject, and that makes their code run faster, right?

Once again, I was wrong. DynamicMethods didn't make much of a performance impact, because Ninject (which reportedly uses a lot of LCG in its code) was actually the slowest among all of the IOC containers tested in the benchmark (sorry, Nate). Of course, this doesn't mean that the DynamicMethod approach is the cause of the slowdown; what it does suggest, however, is that piling more and more reflection onto the speed problem is not the solution. In addition, there were other frameworks in that benchmark (such as Funq) that didn't use any reflection at all, and yet they still took significant performance hits. In fact, even the fastest of all the other containers--StructureMap--was still running forty-four times slower than the Plain/No Dependency Injection use case!

So the principal question is this: "Where is this bottleneck coming from, and how do I eliminate it?"


The Real Problem


As it turns out, the answer was staring me in the face all along: "It's the configuration, stupid," I thought to myself. The problem is that every major IOC container at the time of this post (such as Ninject, StructureMap, Unity, AutoFac, Castle, LinFu, etc.) essentially has to trawl through each one of its dependencies just to instantiate a single service instance on every call, and practically no amount of optimization will ever compensate for the fact that it still has to "rediscover" part of the application's configuration to instantiate that one service instance. Needless to say, this rediscovery process wastes a huge amount of resources, because these containers are "rediscovering" a configuration that (for all practical purposes) will rarely change between two successive method calls.

In layman's terms, this is akin to stopping and asking for directions at every intersection every time you want to leave your home to go to some other destination. There has to be some way to see the "whole map" and plan the trip ahead of time without having to stop for directions at every intersection. If you could plan all the possible routes on that trip ahead of time, then all the time you would have wasted asking for directions immediately vanishes.

In essence, that is what I did with Hiro. Hiro is an IOC container framework that reads the dependencies in your application ahead of time and actually compiles a custom IOC container that knows how to create those dependencies from your application itself. It uses absolutely no runtime reflection or runtime code generation, and since all your dependencies are discovered at compile time (that is, when the Hiro compiler runs), Hiro suffers zero performance penalties at runtime when instantiating your types.
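To caricature the difference in plain C# (the IService and Service types are invented for this example): a "dynamic" container resolves through reflection on every call, while a precompiled container's generated method boils down to a direct constructor call, indistinguishable from hand-written code:

```csharp
using System;

public interface IService { }
public class Service : IService { }

class Program
{
    // What a "dynamic" container effectively does on every call:
    // look up the type and build it through reflection.
    static IService ResolveWithReflection() =>
        (IService)Activator.CreateInstance(typeof(Service));

    // What a precompiled container's generated method reduces to:
    // a direct constructor call, with zero runtime reflection.
    static IService ResolvePrecompiled() => new Service();

    static void Main()
    {
        Console.WriteLine(ResolveWithReflection() is Service); // True
        Console.WriteLine(ResolvePrecompiled() is Service);    // True
    }
}
```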

Yes, you read that right: Hiro runs at 1:1 speed with a Plain/No DI configuration. Here are the results of the IOC container benchmark:




As you can see from the results above, the current crop of IOC containers (including LinFu) can only reach 2% of the speed of an application that does not use an IOC container. Now, let's take a look at Hiro's results:



If you don't believe it, then you can download and run the benchmarks yourself.

Like LinFu, Hiro is licensed under the terms of the LGPL, and you can preview the source code at this site. I'll also be starting a Hiro-contrib project, so if you want to add your own extensions, just email me at marttub@hotmail.com and I'll be more than happy to add anyone who is interested. Thanks! :)
