Browsing Posts in Performance

In this final (for now) instalment, let me ask a rhetorical question: Is managed code the right choice for every application? Absolutely not! For example, .NET and Windows itself are not designed for use in real-time systems. There are no guarantees on worst-case latency during processing. If you’re writing software for a pacemaker or nuclear reactor – both hard real-time systems, since failure results in loss of life – you have deadlines by which you have to complete computations, and if you don’t make those deadlines, your system may fail. You might think it trivial to meet a deadline (e.g. worst-case length of computation / instructions per second), but consider the fact that any device connected to the system can raise an interrupt, which can result in your code being preempted so that kernel-level driver code can run. So figuring out worst-case latency involves considering the impact of all connected peripherals (and how they interact) in addition to other factors. It’s not an easy problem, and it’s the reason why there exist many operating systems specifically designed for real-time applications. How much software truly has real-time constraints? Very little, to be honest.

But I digress. I think you get the point that .NET isn’t appropriate for all software, but then again, neither is Java or many other commonly used languages and frameworks. However, .NET is applicable to a broader class of software than you might imagine. What surprises many people is that .NET isn’t slower than unmanaged code in many cases. There are a lot of areas, such as raw numerical calculations, where the JITed MSIL code is essentially equivalent to what an optimising C++ compiler would produce. Games, which traditionally try to squeeze every last ounce of performance from your hardware, are starting to be written in managed code, and we’re not talking Space Invaders and Pacman. Arena Wars, a real-time strategy game, is built using the .NET Framework 1.1. (I’ve honestly never played this game, but it does go to show you that it is possible to write a real game using managed code.) Games no longer require hand-optimized assembly language for critical loops. (For example, Quake by id Software was written in C with parts in hand-optimized assembly. Quake 2 was also written in C, but contained no assembly language. The slight performance gain of using assembly language was not deemed necessary.)

Look at the performance characteristics of your code. If the bulk of your CPU time is spent in third-party frameworks (like DirectX, a physics engine, or an AI engine), it’s rather irrelevant what language you write the driving code in, assuming the overhead of calling the framework isn’t too high. Imagine a situation where you have a sort and you naively use a bubble sort, which is known to be slow. Should this concern you? That depends on how much time is spent in the sort. If the CPU spends 1% of the overall application time in the bubble sort, speeding it up will result in at most a 1% performance gain. Not worth your time. If, however, the application spends 50% of its time in the sort, then even a factor of 2 speed-up of the sort would cut total execution time by 25%. The same is true with the choice between managed and unmanaged code. If the time spent in your code is a small fraction of the overall execution time, it’s rather irrelevant what that code is written in. Pick a language/framework that allows you to develop quickly and with the fewest errors. But as I pointed out earlier, managed code doesn’t equate to slow code.
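To make the arithmetic concrete, here is a minimal sketch of the calculation, which is the relationship usually called Amdahl’s law. The class and method names (Amdahl, RemainingTime) are made up for illustration and don’t come from any library.

```csharp
using System;

public static class Amdahl
{
    // Returns the fraction of the ORIGINAL run time that remains after
    // speeding up one part of the program.
    //   fraction: share of total time spent in the part being optimized (0..1)
    //   speedup:  how many times faster that part becomes (must be > 0)
    public static double RemainingTime(double fraction, double speedup)
    {
        if (fraction < 0.0 || fraction > 1.0)
            throw new ArgumentOutOfRangeException(nameof(fraction));
        if (speedup <= 0.0)
            throw new ArgumentOutOfRangeException(nameof(speedup));

        // The untouched part takes as long as before; only the optimized
        // part shrinks by the speed-up factor.
        return (1.0 - fraction) + fraction / speedup;
    }
}
```

With these definitions, RemainingTime(0.50, 2.0) works out to 0.75 of the original run time, while RemainingTime(0.01, speedup) can never drop below 0.99 no matter how large the speed-up is.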

In summary, there are trade-offs in using managed code as there are with any runtime environment. You’re not going to get any faster than hand-optimized assembly (if you have infinite time to optimize), but who is going to write Windows applications in assembly language today? (I’m ignoring Steve Gibson for the moment. Amazing what one can do in assembly language these days, but not somewhere I want to live my developer life personally.) The key is to know your tools and know which ones are right for which job. With that in mind, I leave you with some links comparing managed to unmanaged code performance. I hope that they prove enlightening.

Back to talking about performance and .NET. Let me give you an example from my own experience regarding skepticism about whether you can write highly performant applications using managed code. The thought is that if it ain’t C++, it can’t be fast. (The same skepticism applies to Java, too.) I can attest first-hand that it is possible to write highly performant applications using .NET. For example, I was brought in to assist with the development of a server application that was written in C++ and was having perf problems. The performance goal was 2000 concurrent clients on reasonable hardware. The alpha version in C++ was able to handle 200 concurrent clients and fell over (read: the server crashed with out-of-memory errors) at 500. The team had spent approximately 3 months writing the C++ codebase and trying to bring it up to par. They had never written server software before and had made a few honest mistakes, such as firing up a thread per client (which resulted in running out of memory as stack space was allocated for each thread) and writing to the database while a remote connection was held open. (The problem with holding the connection open was that the clients were just dropping off information and didn’t require an ACK. So you’re holding open a TCP socket for an extra 50-100 ms during the synchronous database write and then closing it rather than freeing it up immediately.)

The team believed that you could only write a highly performant server in C++. So they weren’t too keen when I suggested using C# and .NET. Working with another developer, I created in two days a skunkworks prototype in C# that serviced clients using a thread pool and dropped the results into a queue (implemented with MSMQ) so that we didn’t have to roundtrip to the database on every connection. (The results were written to the database later by a queue listener.) This allowed us to free up sockets quickly. The results… On the same hardware that was falling over at 500 clients with the C++ version, we did approximately 10,000 clients with little to no perf tuning. (We used the same client load generator in both cases for a fair comparison.)
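For the curious, the shape of that architecture can be sketched in a few lines. This is not the original prototype. It is a simplified illustration that assumes an in-process BlockingCollection as a stand-in for MSMQ, plain strings instead of sockets, and a List instead of a database. All the names (DropOffServer, Run) are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class DropOffServer
{
    // Simulates clientCount clients that each drop off one message.
    // Returns the messages that the background listener "persisted".
    public static List<string> Run(int clientCount)
    {
        var persisted = new List<string>();

        // In-process stand-in for the durable queue (assumption: the real
        // system used MSMQ, which also survives process restarts).
        using (var queue = new BlockingCollection<string>())
        {
            // Single queue listener: drains the queue and does the slow write.
            Task listener = Task.Run(() =>
            {
                foreach (string message in queue.GetConsumingEnumerable())
                {
                    persisted.Add(message); // placeholder for the database write
                }
            });

            // Each "connection" is serviced on a pool thread instead of
            // getting a dedicated thread (and a dedicated stack) of its own.
            using (var allHandled = new CountdownEvent(clientCount))
            {
                for (int i = 0; i < clientCount; i++)
                {
                    int clientId = i;
                    ThreadPool.QueueUserWorkItem(_ =>
                    {
                        // Enqueue and return immediately; a real server would
                        // close the socket here without waiting on the database.
                        queue.Add("payload from client " + clientId);
                        allHandled.Signal();
                    });
                }
                allHandled.Wait();
            }

            queue.CompleteAdding(); // lets the listener's loop finish
            listener.Wait();
        }
        return persisted;
    }
}
```

The point of the design is that the per-client work is reduced to one cheap enqueue, so the slow database write no longer determines how long a connection stays open.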

Why did we use C#? It wasn’t our intent to subvert their development standards. We just wanted to show that this alternate architecture could scale and the most expedient way to implement it was using .NET. When we showed them the managed version, their jaws dropped. When I suggested that we could spend a few weeks (literally!) writing the same thing in C++, the immediate question was “Why bother?” They had spent more time tracking down memory leaks and bad pointer arithmetic in their C++ codebase than we had spent implementing ours. We more than exceeded their performance goals and could implement new features faster. They didn’t look back and implemented a highly scalable server written ground-up in C# and .NET.

Next time, we’ll look at the performance characteristics of different aspects of managed code…

Continuing the series on life in a managed world, I promised you Developer Deathmatch. It was honestly more of a friendly challenge between Rico Mariani and Raymond Chen. This is only one application and your mileage will vary from application to application, but the main point here is to show that managed doesn’t necessarily mean slow. Now onto the main event…

Raymond wrote a series of articles about perf tuning an unmanaged application, a Chinese/English dictionary reader. Rico ported the same application to managed code and also perf tuned it. (Links to all of Rico’s and Raymond’s blog posts on the topic can be found here on Rico’s blog.) The results are staggering. Remember, both developers are at the top of their game. So this isn’t a naive unmanaged application versus a highly optimized managed application. Let’s look at the execution times:

Version                            Execution Time (seconds)
Unmanaged v1                       1.328
Unmanaged v2                       0.828
Unmanaged v3                       0.343
Unmanaged v4                       0.187
Unmanaged v5 With Bug              0.296
Unmanaged v5 Corrected             0.124
Unoptimized Managed port of v1     0.124
Optimized Managed port of v1       0.093
Unmanaged v6                       0.062

It was only after Raymond pulled out all the stops with unmanaged v6 and wrote a custom allocator to replace the new operator that the unmanaged version surpassed Rico’s first try. This doesn’t suggest that Raymond is a poor coder. Far from it: he’s probably one of the best.

In the end, Raymond won, but at what cost? It took six versions of the unmanaged code and who knows how much effort analyzing and tuning to beat the managed version. As for the managed version, at 0.093 seconds a large portion of the time is the startup overhead of the CLR. So you’re hitting a performance limit of managed code, but for most reasonable, and long-running, applications, ~50 milliseconds firing up your runtime at the beginning of application execution isn’t going to make a dent in your overall perf.

Why do people think that managed code is slow? One reason is that .NET makes it really easy to do colossally stupid things. Things like sucking a 1 GB XML file off your hard disk and parsing it into an in-memory tree.

XmlDocument dom = new XmlDocument();
dom.Load("someHonkingHugeFile.xml");

Just because you can do this in two lines of code doesn’t absolve you of the fact that you need to understand what you’re doing. You need to understand your toolset. It is not the fault of the toolset designer if you use it improperly. Let me tell you that XmlDocument.Load isn’t going to be returning control to you anytime soon if someHonkingHugeFile.xml is 1 GB. I haven’t tried it, but you’ll probably throw an OutOfMemoryException on 32-bit Windows if you try something dumb like that.
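If all you need is to pull a few facts out of a file that size, one option is to stream through it with XmlReader rather than build the whole tree with XmlDocument.Load. Here is a minimal sketch. The class and method names (XmlStreaming, CountElements) are hypothetical, and counting elements is just a placeholder for whatever per-node work you actually need.

```csharp
using System.IO;
using System.Xml;

public static class XmlStreaming
{
    // Counts the elements with the given local name without building a DOM.
    // Only the current node is held in memory while reading.
    public static long CountElements(TextReader input, string elementName)
    {
        long count = 0;
        using (XmlReader reader = XmlReader.Create(input))
        {
            // Read() advances one node at a time and returns false at the end.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element &&
                    reader.LocalName == elementName)
                {
                    count++;
                }
            }
        }
        return count;
    }
}
```

For a file on disk you would pass something like File.OpenText("someHonkingHugeFile.xml") as the input. The trade-off is that you only get forward-only access, so this suits one-pass processing and not random navigation of the document.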

Next time I’ll talk about some of my own personal experiences in developing performant managed applications.

This is the first in a series on life in the managed world. Originally it started as a single post and has been growing tentacles ever since. To tame the beast, I’ll only write about a few tentacles at a time…

The performance characteristics of managed code have been an interest of mine for a long time. I love writing C# code because it’s so much easier to write, maintain, and debug compared to unmanaged C++ code. But let’s be honest, if managed code doesn’t truly perform compared to C++, then you’re left with a nice prototyping language. Any serious work will still have to be done in C++.

So let’s start by looking at Microsoft’s use of managed code in its own products. This should be suggestive of whether it’s possible to write performant managed applications. (If anyone can do it, it should be the creators of the managed environment!) Dan Fernandez, Lead Product Manager of Visual Studio Express, released some interesting stats:

Product                            Estimated Managed Code
Visual Studio 2005                 7.50 million lines
SQL Server 2005                    3.00 million lines
BizTalk Server                     2.00 million lines
Visual Studio Team System          1.70 million lines
Windows Presentation Foundation    0.90 million lines
Windows Sharepoint Services        0.75 million lines
Expression Interactive Designer    0.25 million lines
Sharepoint Portal Server           0.20 million lines
Content Management Server          0.10 million lines

Take a closer look. Let’s look at the not-so-interesting ones first. WSS and SPS, though great products that both use .NET, are still very much unmanaged code bases at the core. (Read big C++ ISAPI filter that takes over IIS.) From what I’ve heard, this will be changing with WSS/SPS v3.0. (Bil, correct me if I’m wrong here.) Same goes for CMS, SQL Server, and VS 2005. Basically unmanaged applications/servers with .NET bolted on the side. Great for extensibility, but they really don’t tell us anything deep about the performance characteristics of managed code.

The interesting data points are BizTalk Server 2004, WPF, and Expression (aka Sparkle). These products are .NET from front-to-back.

BizTalk is a fantastic EAI tool and pivotal to Microsoft’s strategy in that space. Do you think Microsoft would “bet the farm” if .NET didn’t perform? They’ve got smart engineers. If managed code couldn’t perform up to its unmanaged counterpart, they’d write the core in unmanaged C++ and provide extensibility points by hosting the CLR. But they didn’t. They wrote the whole thing in .NET. That’s rather suggestive (though not conclusive) that managed code can perform well compared to C++.

Windows Presentation Foundation (WPF) is the new windowing framework that is set to replace Win32-style windows. We’re going from a bitmap/HWND-based windowing environment to a composable, vector-based windowing environment. At its core, it uses DirectX technology for drawing. Vista is getting a full UI facelift in the form of Aero, which is built on top of WPF. As developers, we love to hate our end users who buy apps based solely on visual appearance. The actual code can be a dog’s breakfast, but as long as it looks pretty, users will love it. So this is another huge bet for Microsoft. If WPF doesn’t perform, it will be immediately noticeable by end users.

Expression (codename Sparkle) is a new design tool for graphic designers to develop WPF applications. Developers will still work in Visual Studio, but graphic designers will author XAML using Expression. (You didn’t expect Microsoft to force VS onto the artist community, did you?) The nice thing is that the XAML will roundtrip between VS and Expression. Like it or hate it, what will drive most users to upgrade to Vista is a prettier interface. If applications are going to have nice Vista-style visuals, Expression better be fast, easy, and intuitive for graphic designers. (Interesting tidbit. Expression only makes one P/Invoke call. Everything else is pure .NET. The one call? To show Help. The managed wrapper to the Help subsystem is in System.Windows.Forms and it was deemed silly to take a dependency on Windows Forms solely for this reason.)

Microsoft is making some pretty serious bets on managed code and is baking it more and more deeply into Windows and its server products. From experience, managed code is just plain easier. Forget the machismo about whether you’re man (or woman) enough to handle pointers and pointer arithmetic. Pointers have a time and a place. Sure, break out the C, C++, and even assembly language when it is warranted, but only when warranted. Most of the time you should be living in a managed world.

Next time I’ll look at Developer Deathmatch: Rico “Managed” Mariani vs. Raymond “Unmanaged” Chen.

Rico Mariani has been providing excellent .NET performance advice for quite a while on his blog. If you don’t know Rico, he’s part of the CLR performance team and I would highly recommend reading his posts. Rico has been so kind as to set up a pair of wikis so that his past blog postings are easier to find (Rico Mariani’s Articles and Recommendations) and so that you can share your own performance tidbits (Classes with Comments). Highly recommended.