Original article: https://michaelscodingspot.com/dotnet-trace/
Performance issues never seem to disappear from the world, no matter how fast new computers become. To deal with those issues we need proper tools. In the world of .NET development, we can consider ourselves lucky in this regard. We have some of the best tools available, at least on Windows. On .NET Core with Linux, things are not so great. But they’re getting better.
With .NET Core 3, Microsoft introduced a bunch of new much-needed diagnostic tools: dotnet-counters, dotnet-dump, and dotnet-trace.
In this article, we’re going to do some performance analysis with dotnet-trace and PerfView. But before that, let’s talk a bit about how we did things up to now, in Windows. And why we can’t do that with .NET Core in Linux.
There are 3 major performance profilers in the .NET market: the Visual Studio profiling tools, dotTrace, and ANTS Performance Profiler.
A profiler allows you to record a runtime snapshot and analyze it for performance issues. You’ll be able to break execution time into methods and find out how much time each method took. You’ll see the time spent on network requests and I/O requests, system code vs user code, and a myriad of other metrics to help you figure out the problem.
Unfortunately, these profilers don’t run on Linux (although they do support .NET Core apps on Windows).
Performance counters show a bunch of metrics to get an overall sense of your app’s performance. There are hundreds of them.
Just looking at performance counters can indicate if you have a CPU problem, an I/O problem, a contention problem or a problem with too many exceptions.
Linux has a sort of equivalent tool called Perf, but we don’t have an easy way to consume these counters like with PerfMon on Windows. The new dotnet-counters solves this problem and lets you easily consume performance counters from .NET Core on Linux.
Another way to profile performance problems is with PerfView’s CPU Stack sampling. PerfView is a great tool to analyze ETW events, GC and CPU usage. It’s less intuitive than the commercial performance profilers but in some ways more powerful.
Here’s the general idea of how we do performance profiling with PerfView: First, collect system-wide CPU Stack samples every millisecond (or every X milliseconds). When done collecting, we filter down to a single process and analyze the data much like in a classic performance profiler. You’ll be able to see which methods took the most execution time, method call trees, thread times, and even method call distribution histogram (more on that later).
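To make the sampling idea concrete, here is a minimal Python sketch of it. This is not how PerfView is implemented; the sample data, the process IDs and the helper name `exclusive_time_ms` are all made up for illustration. It assumes each sample is a (process ID, call stack) pair and that every sample stands for one sampling interval of CPU time.

```python
from collections import Counter

# Hypothetical system-wide samples: (process id, call stack from root to leaf).
# Each sample stands for one sampling interval of CPU time.
SAMPLE_INTERVAL_MS = 1
samples = [
    (2871, ["Main", "Sorter.Start", "Bubble.Sort"]),
    (2871, ["Main", "Sorter.Start", "Bubble.Sort"]),
    (2871, ["Main", "Sorter.Start", "QuickSort.Sort"]),
    (4242, ["OtherApp.Main", "OtherApp.Work"]),
]


def exclusive_time_ms(samples, pid, interval_ms=SAMPLE_INTERVAL_MS):
    """Estimate the time each method spent in its own code for one process.

    Only the leaf frame (the method that was running when the sample
    was taken) gets credit for the sample.
    """
    leaves = Counter(
        stack[-1] for sample_pid, stack in samples if sample_pid == pid and stack
    )
    return {method: count * interval_ms for method, count in leaves.items()}


# Filter down to a single process, then list the methods by estimated time.
times = exclusive_time_ms(samples, pid=2871)
for method, ms in sorted(times.items(), key=lambda kv: -kv[1]):
    print(f"{method}: ~{ms} ms")
```

The more samples land in a method, the more time it is estimated to have taken, which is why a short or infrequent sampling session gives noisier numbers.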
Analyzing with PerfView adds less performance overhead than attaching a classic performance profiler, which allows you to use it in production environments. It’s also free.
Up to now, you could have used PerfCollect to collect a performance trace on Linux (.NET Core 2.1+) and analyze it in PerfView. Now you can do it with dotnet-trace as well (.NET Core 3+). There are many differences between dotnet-trace and PerfCollect which are described here. I think dotnet-trace is easier to use, and its data is easier to analyze. It also uses a cross-platform mechanism called EventPipe that’s built into .NET Core 3.0. This means your analysis is going to be similar on all platforms. In this article, we’re going to use the newer dotnet-trace.
Here are the steps I followed to collect a performance trace on Linux:
Installed the .NET Core SDK on the Linux machine.
Installed dotnet-trace with dotnet tool install --global dotnet-trace.
Published my app with dotnet publish --configuration Release. This created the folder bin\Release\netcoreapp3.0\publish.
Copied the publish folder to the Linux machine and ran the app with the dotnet command.
Found the process ID with the list-processes command. For some reason, there are always two different processes listed when running the application (2871 & 2955). In any case, the first one is the relevant one.
Unfortunately, dotnet-trace can only attach to a running process, not start it (at least not yet). So profiling startup issues is still impossible.
Next, I ran this command to do the actual trace collection: dotnet trace collect -p 2871. While the trace was being recorded, I ran the console app scenario to completion.
Once the process exited, the dotnet-trace tool finished as well, creating a trace.nettrace file.
Copied the trace.nettrace output file to Windows.
Now that the collection part is over, it’s time to analyze the results. Let’s open the trace.nettrace file in PerfView. We get the following options:
We’re interested in Thread Time here, which will give CPU stack samples of all threads.
Once opened, you can go straight to CallTree tab to get the following:
So that’s a whole bunch of information in one screen, but the Name column is pretty straightforward. It shows the calling hierarchy in the process. The thread called the method Main, which then called Sorter.Start, which called ReadLine(), Bubble.Sort(), QuickSort.Sort(), MergeSort.Sort(), and GetRandomNumbers().
We can look at the column Inc % that shows the execution percentage this method took including the time taken by the methods it called (callees). You can see that the Bubble sort function took 30.9% of all execution time, Quicksort function took 3% and Merge sort took just 0.1%. Bubble sort took the longest, which makes sense. But if you know a thing or two about sorting algorithms, then you might suspect there’s a problem here. Merge sort and Quicksort are both O(n*log(n)) (well, Quicksort is that on average) and it doesn’t make sense that merge sort is so much quicker. We’ll analyze that a bit later on.
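Here is a small Python sketch of how an inclusive percentage like Inc % can be computed from stack samples. The stacks are made up, so the percentages are not the ones from the trace above, and the function name `inclusive_percent` is hypothetical. Counting a method once per sample, even when it appears several times on a recursive stack, is an assumption made here to avoid double counting.

```python
def inclusive_percent(stacks):
    """Return {method: percent of samples with the method anywhere on the stack}."""
    total = len(stacks)
    if total == 0:
        return {}
    counts = {}
    for stack in stacks:
        # set() so a recursive method is credited once per sample,
        # not once per frame.
        for method in set(stack):
            counts[method] = counts.get(method, 0) + 1
    return {method: 100.0 * count / total for method, count in counts.items()}


# Hypothetical samples, each a call stack from root to leaf.
stacks = [
    ["Main", "Sorter.Start", "Bubble.Sort"],
    ["Main", "Sorter.Start", "Bubble.Sort"],
    ["Main", "Sorter.Start", "QuickSort.Sort", "QuickSort.Sort"],
    ["Main", "Sorter.Start", "ReadLine"],
]

for method, pct in sorted(inclusive_percent(stacks).items(), key=lambda kv: -kv[1]):
    print(f"{method}: {pct:.1f}%")
```

Main and Sorter.Start sit on every stack, so they come out at 100%, while each callee only gets the samples in which it (or something it called) was running.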
Let’s talk more about what we see in this window, starting from the columns
If you look at the GroupPats input (top-right corner), you’ll see it’s set to [no grouping]. Unlike in Windows traces, the default of a dotnet-trace is [group module entries] which groups all methods from a single module into one line until another module is called. Without going into too many details on grouping and folding, the [no grouping] option that I used, expands all lines. A better tactic would be to start from module grouping, find the SortingStrategies module of this application and ungroup it. An even better one is the default setting on Windows called [Just my app] which ungroups your own application’s methods and groups everything else. Unfortunately, this option is still missing when importing from Linux. For more detail on grouping and folding watch this tutorial by Vance Morrison.
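The module grouping described above can be sketched in a few lines of Python. This is a simplification of the idea, not PerfView's grouping pattern syntax. It assumes frames are written as module!method, and the module names other than SortingStrategies, as well as the function name `group_module_entries`, are made up.

```python
def group_module_entries(stack):
    """Collapse each run of consecutive frames from one module into a single entry.

    Each entry is (module, method through which the module was entered).
    """
    grouped = []
    for frame in stack:
        module, _, method = frame.partition("!")
        # Start a new entry only when the call crosses into another module.
        if not grouped or grouped[-1][0] != module:
            grouped.append((module, method))
    return grouped


# Hypothetical stack, from root to leaf, in module!method form.
stack = [
    "SortingStrategies!Main",
    "SortingStrategies!Sorter.Start",
    "SortingStrategies!QuickSort.Sort",
    "SomeLibrary!Trace.WriteLine",
    "SomeLibrary!Writer.Flush",
    "OtherLibrary!Stream.Write",
]

for module, entry in group_module_entries(stack):
    print(f"module {module} (entered via {entry})")
```

Six frames shrink to three lines here, which is why module grouping is a good starting point before ungrouping the one module you care about.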
Let’s investigate the strange difference between Quick Sort and Merge sort. Specifically, I’d like to analyze Quicksort and see why it’s so slow.
Expanding QuickSort.Sort() simply shows a lot of recursive calls that don’t help much.
To help the investigation, we can minimize the analysis to include just Quick Sort time with the Drill Into functionality. This will remove all other CPU stacks.
Now, let’s move to the By Name tab, which shows just a flat list of methods:
I usually sort this table by Inc % which includes callees for each method (includes the time of the methods it called). You can see that Trace.WriteLine took 90.7% of the entire execution time. So for some reason, the QuickSort.Sort() method calls Trace.WriteLine() which destroys its performance. With this new information, it’s going to be easy to find the problem. Check out the source code of QuickSort and try to find it yourself.
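To see why a trace call inside a recursive sort hurts so much, here is a hypothetical Python stand-in. It is not the QuickSort source from the sample app. The `log` callback plays the role of a Trace.WriteLine-style call that runs on every invocation, and the input data is random.

```python
import random


def quick_sort(items, log=None):
    """Plain recursive quicksort; `log` is an optional callback run on every call."""
    if log is not None:
        # Stands in for a Trace.WriteLine-style call on the hot path.
        log(f"sorting {len(items)} items")
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quick_sort(smaller, log) + equal + quick_sort(larger, log)


random.seed(0)
data = [random.randint(0, 10_000) for _ in range(1_000)]

messages = []
result = quick_sort(data, log=messages.append)
print(f"{len(data)} items sorted, {len(messages)} log calls")
```

Every recursive call pays for one log call, so the logging cost scales with the number of calls, not with the single top-level sort. When each log call is far more expensive than a comparison, it ends up dominating the profile.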
In this post, we saw how to do performance profiling on a .NET Core 3 process on Linux. This involved installing the dotnet SDK, then installing the dotnet-trace tool, collecting a trace of your .NET Core process, copying the trace file to a Windows machine and analyzing with PerfView.
This is probably the best way to do performance analysis today on Linux. Having said that, you can also analyze with PerfCollect and other tools as shown in this very interesting talk by Sasha Goldshtein: Debugging and Profiling .NET Core Apps on Linux.
I’d love to see my favorite tools (dotTrace, dotMemory, ANTS profilers) run on Linux. I don’t think there’s any hope for ANTS profiler to support it, but there’s some hope with JetBrains. They seem to support dotTrace for the Mono runtime, but not .NET Core. This is nice for their Rider customers that use Mono runtime with Unity 3D on Mac, but doesn’t help us mainstream .NET developers. Fingers crossed.
Here’s the official blog post by Microsoft announcing the new diagnostic tools in .NET Core 3 (dotnet-trace, dotnet-dump, and dotnet-counters): Introducing diagnostics improvements in .NET Core 3.0. Interesting read.
To learn more on PerfView, I really recommend the video tutorials by Vance Morrison on Channel 9.
https://blog.jetbrains.com/dotnet/2019/10/25/resharper-ultimate-2019-3-eap/
The EAP for dotTrace command-line tools 2019.3 now supports profiling Mono and .NET Core applications on Linux and macOS.
https://blog.jetbrains.com/dotnet/2019/10/25/rider-2019-3-eap/
Now it’s possible to profile .NET Core applications on Linux and macOS using the embedded dotTrace plugin.
Performance Profiling of .NET Core 3 applications on Linux with dotnet-trace and PerfView (repost)
Original URL: https://www.cnblogs.com/panpanwelcome/p/14283409.html