While some teams solve their performance problems by throwing bigger hardware at them, many of us really do care about creating fast code. Visual Studio 2008 Team Developer Edition and above has had a great profiler that I’ve used to solve many performance problems, but Visual Studio 2010 moves it into the realm of excellent for both managed and native code. This is especially true for developers who haven’t done a lot of performance tuning and don’t know where to start. Profilers, even more than debuggers, have always scared off some people because in the past they normally showed table upon table of raw numbers that don’t make any sense to a mere mortal unless you’re armed with a PhD in Computer Science or have spent 20+ years in the industry. Having written profilers (the now-dead TrueTime, may it RIP) and done a ton of performance tuning over the years, I’ve learned one key lesson: knocking off the first one or two obvious performance problems is often enough to reach your performance goals. The trouble is finding that first low-hanging fruit. The Visual Studio 2010 sampling and instrumentation profilers now make that fruit picking easier than ever.
Before I jump into the new ease of use features, I have to mention that my personal favorite feature in Visual Studio 2010 is full virtual machine support for both x86 and x64. A little bit ago I was working with a client who virtualized every server they had and needed to do sampling profiling. The only solution was to set up a physical box, which was a huge pain because the IT people in that company were, well, jerks about letting anyone set up a machine on the network without their official and time consuming blessing.
If you need to do profiling on Hyper-V, VMware, or Virtual PC, or want the improved ease of use today, go for it! The profiling in Visual Studio 2010 works great for .NET 2.0 applications. There’s no requirement in the profiler that you need to use a solution. Profiling any type of .NET binaries works great and I would highly recommend you start using Visual Studio 2010 this instant. If you find any bugs, please report those and any feature requests to the Diagnostic team at Microsoft so we can all get better tools.
To look at the profiling for Visual Studio 2010 Beta 2, I took a version of a program that had a performance problem in it that I had solved. On my first sampling run, I immediately wished I had Visual Studio 2010 years ago because the new Summary view is amazing. Solving many of your problems can be done right there without any further digging on your part.
The most important new addition is the CPU usage graph in the upper left corner. You’re seeing the CPU utilization across the wall (or stopwatch) time. When you’re looking for CPU-bound problems, seeing where the spikes are is very important. Even better, the CPU usage view is interactive. If you want to see what was going on during a spike, select the spike in the graph and click the Filter by selection link, and the summary view resets to just what was happening during that time.
Keep in mind, CPU flat lines are also important for filtering. If your server application normally uses 10% of the CPU, but there are drop-offs you didn’t expect, you’ll want to filter on those so you can see what you’re blocking on. Those flat lines get very important in stress scenarios because you could be blocking on the database because it’s overwhelmed. By the way, Visual Studio 2010’s profiler now has a feature called Tier Interaction Profiling (TIP) that will report the number of times a managed ADO.NET SQL query executed and how long it took. I won’t go more into Tier Interaction Profiling here because it’s a big enough subject that it needs its own article.
Another Filter by selection trick is that when you save the Analyzed Report, it saves only the filtered data. Visual Studio 2008 supported analyzed report saving with manual filters, but getting the data for a CPU spike as you can in Visual Studio 2010 was a bit of work. Additionally, the filtering allows you to grab a portion of the run, which makes it easier to compare two runs for before and after scenarios.
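To make the idea behind Filter by selection concrete, here is a minimal Python sketch of what filtering a sampling run to a time range amounts to. It is not how the profiler is implemented; the sample data, the function names, and the helpers `filter_by_selection` and `top_functions` are all invented for illustration.

```python
from collections import Counter

def filter_by_selection(samples, start, end):
    """Return only the samples whose timestamp falls in [start, end]."""
    return [(t, fn) for (t, fn) in samples if start <= t <= end]

def top_functions(samples, count=3):
    """Tally samples per function, most frequently sampled first."""
    return Counter(fn for _, fn in samples).most_common(count)

# Hypothetical samples: (seconds since start, function on top of the stack).
samples = [
    (0.1, "Idle"), (0.2, "Idle"),
    (1.0, "ParseOrder"), (1.1, "ComputeTotals"), (1.2, "ComputeTotals"),
    (1.3, "ComputeTotals"), (2.0, "Idle"),
]

# Select the spike between 1.0s and 1.5s and summarize only that window.
spike = filter_by_selection(samples, start=1.0, end=1.5)
print(top_functions(spike))
```

Summarizing only the selected window is what keeps the idle time on either side of the spike from drowning out the functions you care about.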
The middle portion of the new Summary page is the Hot Path showing you the most expensive call path based on the type of profiling you were doing (sampling, instrumentation, and/or memory). In Visual Studio 2008, it took several mouse clicks in the Call Tree view to get what I consider the most important piece of profiling information, the hot path. I love those flames!
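If you haven’t worked with a hot path before, the following Python sketch shows one simple way to think about it: start at the root of the call tree and keep following the child with the most inclusive samples. This is an assumption about the general technique, not a description of how Visual Studio computes its Hot Path, and the `CallNode` type and the call tree data are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class CallNode:
    """One function in a call tree, with its inclusive sample count."""
    name: str
    inclusive_samples: int
    children: list["CallNode"] = field(default_factory=list)

def hot_path(root: CallNode) -> list[str]:
    """Follow the most expensive child at each level, starting at the root."""
    path = [root.name]
    node = root
    while node.children:
        node = max(node.children, key=lambda child: child.inclusive_samples)
        path.append(node.name)
    return path

# Hypothetical sampling data, invented for illustration.
tree = CallNode("Main", 1000, [
    CallNode("LoadConfig", 50),
    CallNode("ProcessOrders", 900, [
        CallNode("ParseOrder", 120),
        CallNode("ComputeTotals", 700, [CallNode("RoundCurrency", 650)]),
    ]),
])

print(" -> ".join(hot_path(tree)))
```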
One of the issues with sampling profiling in particular is that you’re almost never executing your code. No matter if you are sampling managed or native code, you’re always going to have far more samples in runtimes or the operating system than your code. In fact, if you’re doing CPU sampling and see 10%-20% of the samples in your code, that’s generally a sign of a serious performance problem. (Disclaimer: it all depends on the application of course).
What the profiler team has done in Visual Studio 2010 to make our sampling lives easier is default to “Just My Code” so you’ll see your code instead of the call stacks and hot paths from the bowels of the operating system or your runtime. This is a surprisingly useful addition to the profiler. If you’ve followed my writings on debugging, you know I have a real problem with Just My Code, but that’s when debugging. For sampling performance tuning, Just My Code is what we need to narrow down what in our code is causing the problems. The profiler team gets extra credit because there are some people out there who want to see everything every time. If that’s you, all you need to do is go into the Performance General page in the Options dialog and disable Just My Code. You can also toggle this on and off in the Notifications area of the Summary page.
One of the hard problems when designing a profiler is showing call relationships so you can determine what contributed to the performance problem. While it’s easy to see the overall hot path in the Call Tree view, it’s when diving into the details that it becomes a hard UI problem. Visual Studio’s profiling tools have always had a Caller/Callee view, which other products have called the Butterfly View.
The Caller/Callee view is useful, but what you really want to see is this data in conjunction with your source code. That’s exactly the problem the new Function Details view solves with aplomb. Here’s the same data from above in the Function Details view.
As you’re drilling through your code, the Function Details view makes it much more obvious exactly which lines are contributing to your performance problem so you can fix it faster. My only wish is that there was an option to allow word wrapping in the graphical portion because long names are sometimes hard to read.
I have to admit that I saved one of my favorite new features in the profiler to the end of the article: the performance rules. These rules work for both .NET and native development, and the easiest way to see all the default rules is by going to the Options dialog, Performance Tools, Rules property page.
If your performance run triggers one of these rules, it shows up in the Error List, just like your compiler errors, when you open the .VSP file.
As soon as I saw this feature, I knew I was going to be adding my own rules. Right now it looks like extending the rule system is undocumented, so anything I show here can (and probably will) break by the time the RTM rolls around. However, since we are at Beta 2, we may get lucky as Microsoft traditionally doesn’t do major rewrites between Beta 2 and RTM.
As with anything undocumented, I need to warn you that you can do extremely bad things to Visual Studio. In experimenting with adding my own rules, I got Visual Studio into a wicked just-crash-or-hang-every-time-you-open-a-performance-log loop. We are completely on our own with this one, so don’t complain to Microsoft if it causes your goldfish to hate you.
It turned out to be very easy to find where the rules hang out. I knew the profiler lived in <Visual Studio Directory>\Team Tools\Performance Tools. In that directory is a file, VSPerf_Rule_Definitions.xml. Who said reverse engineering had to be hard?
I thought it would be great to have a .NET rule that would warn you that you called GC.Collect. In nearly every case, calling GC.Collect is a horrible idea because none of your work gets done while all threads are suspended, and you screw up the self-tuning in the garbage collector.
After a bit of trial and error, here’s the rule I designed.
<Title>Do not call GC.Collect.</Title>
<GuidanceMessage>Calling GC.Collect causes performance problems in your application.</GuidanceMessage>
Obviously, the Rule element signifies what follows is a rule. The ID, which must not conflict with any other ID in the file, the Title, and Category elements are self-explanatory. The ContextView element is the view in the profiler you want the user to go to when they double click on your rule. In my rule, I wanted to jump to the Function Details view. Other values you’ll see in VSPerf_Rule_Definitions.xml are Summary and Marks (the performance counters). The GuidanceMessage is the message that will describe the error in the Error List window.
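Since a conflicting ID is an easy mistake to make when hand-editing an undocumented file, a quick sanity check before letting Visual Studio load it may save some pain. The Python sketch below looks for duplicate ID values among Rule elements. The sample XML is a hypothetical stand-in: the root element name and the ID values are invented, and the only things assumed about the real file are the Rule, ID, and Title elements described above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]

def rule_ids(xml_text):
    """Collect the text of every ID element directly under a Rule element."""
    root = ET.fromstring(xml_text)
    ids = []
    for element in root.iter():
        if local_name(element.tag) != "Rule":
            continue
        for child in element:
            if local_name(child.tag) == "ID" and child.text:
                ids.append(child.text.strip())
    return ids

def duplicate_ids(xml_text):
    """Return the IDs that appear more than once, sorted."""
    counts = Counter(rule_ids(xml_text))
    return sorted(rule_id for rule_id, n in counts.items() if n > 1)

# Hypothetical stand-in for a rule definitions file (root name and IDs invented).
SAMPLE = """
<Rules>
  <Rule><ID>1</ID><Title>An existing rule</Title></Rule>
  <Rule><ID>2</ID><Title>Another existing rule</Title></Rule>
  <Rule><ID>2</ID><Title>Do not call GC.Collect.</Title></Rule>
</Rules>
"""

print(duplicate_ids(SAMPLE))
```

To try it against a real file, read the file’s text and pass it to `duplicate_ids`; an empty list means no two rules share an ID.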
The Condition element is where things start to get interesting. As you look through VSPerf_Rule_Definitions.xml you’ll see examples of AND as well as OR conditions where some of the rules look at multiple values. The part that was a little confusing was the appropriate values for the xsi:type attribute. While there are a few different examples in the rule definitions, I wanted to see if I could find the complete list. Poking around the Team Tools\Performance Tools directory, I saw a file, Microsoft.VisualStudio.PerformanceTools.RulesEngine.DLL, which looked like it might have some interesting stuff in it. Cracking it open with Reflector and a little exploring got me to a namespace called Microsoft.VisualStudio.PerformanceTools.RulesEngine.CodeModel. All the classes in that namespace map directly to the xsi:type attributes I had seen. From the names of the types, you can guess what the condition does. In my case, I wanted the FunctionThresholdPercentCondition because I want to do a comparison on a counter.
Inside the Condition element is where you specify the pieces of the condition. The Threshold is the numeric value that you want to compare against. In my case, any percentage of time other than zero means you spent time inside the function defined in the FunctionName element. Obviously, I’ve specified System.GC.Collect, and since there are multiple overloads I use the regular expression value of .* to look for zero or more characters between the parentheses. The CounterName element is the column of data I want to look at for the Threshold. The rest of the elements are self-explanatory.
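To see why a .* between the parentheses catches every overload without also catching similarly named methods, here is a small Python sketch. It uses Python’s `re` module, not the rules engine, and it makes two assumptions: that the parentheses are escaped so they match literally (the engine may treat them differently), and that function signatures are formatted the way the hypothetical `candidates` list shows.

```python
import re

# Assumption: the parentheses are escaped so they match literally.
# The undocumented rules engine may interpret the pattern differently.
PATTERN = re.compile(r"System\.GC\.Collect\(.*\)")

# Hypothetical signature strings; the profiler's real format may differ.
candidates = [
    "System.GC.Collect()",
    "System.GC.Collect(int32)",
    "System.GC.Collect(int32,System.GCCollectionMode)",
    "System.GC.CollectionCount(int32)",
    "System.GC.KeepAlive(object)",
]

# fullmatch requires the whole signature to fit the pattern.
matches = [name for name in candidates if PATTERN.fullmatch(name)]
for name in matches:
    print(name)
```

Only the three Collect overloads are printed; CollectionCount is rejected because the pattern requires an opening parenthesis immediately after Collect.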
Figuring out the different column names was a bit of an effort. It turns out that the coded names in VSPerf_Rule_Definitions.xml don’t always match what you would expect as it looks like the column names have undocumented abbreviations. After poking around the performance tools directory, I found that in the resources for VSPerfPresentation.DLL is an XML file that appears to list all the columns and their abbreviated names. I’m not positive that’s the whole list, but all the values that are in VSPerf_Rule_Definitions.xml appear in the ShortName element.
There is one problem with my example for which I haven’t been able to figure out a workaround. No matter what I tried, I could not get my rule to look at the Number of Calls column when analyzing an instrumented performance run. My hope is that the Diagnostics team will document how to create our own rules, so keep reading my blog as I’ll update my rule to account for instrumentation. Even though it’s all undocumented now, it’s always worth going through the process so you can start getting ideas on what you can do to extend a tool.
The new features and emphasis on ease of use make the Visual Studio 2010 profiler a must-have tool on every developer’s desktop. I hope I’ve been able to show you what’s new and, as I mentioned earlier, why you need to start using it today. Most companies only worry about performance on the day before shipping or after a user complains. Performance is a feature no matter what type of development you’re doing. As David Cheriton is quoted in Raj Jain’s The Art of Computer Systems Performance Analysis: “If it is fast and ugly, they will use it and curse you; if it is slow, they will not use it.” Those are truly words to live by!