Which has better performance: static or object?

Posted 2019-05-14 17:51

Question:

I have designed a C# console application to merge and split huge files (about 4 GB in size) using an OOP design. It involves reading and writing XML, flat files, and images. I have classes for the readers and writers.

The merging took about 00:12, while the splitting took more than 04:30 hours. I then improved the splitting time to 00:50 by distributing the output files into subdirectories rather than using a single directory.

My boss is asking me to convert everything to static procedural code instead of objects. He says 00:12 for merging compared to 00:50 for splitting is not balanced. He wants the splitting done in 00:30 minutes by converting to static.

Now I know static calls are faster according to this. However, I disagree that making everything static will be better, since I will have to use "ref" and "out" parameters in methods.

My questions are:

  1. Why is splitting the output files into subdirectories so much faster than using a single output directory (i.e. for a huge number of files, >200,000)?
  2. Is there a better way than converting my code from object to static, in order to achieve higher performance?

Answer 1:

Did you profile your program?

You should profile your code. Objects are fast; unoptimized code is dead slow.

After you optimize it, this task will be I/O bound anyway (which means it spends most of its time waiting for the disks to fetch the next part of the data).

And yeah, your boss is better off doing bossy things like playing golf or dilberting around, not telling you bullshit about software design. 'Cause you're not trying to play golf for him, are you?



Answer 2:

The difference between an instance call and a static call is so minuscule that I would happily wager that it has nothing to do with your performance issue. At all. Yes, a static call is technically faster (by a tiny, tiny amount), but that is nothing compared to all the file IO you are doing. As has already been stated - profile your code, and stop worrying about things like this (premature optimisation). Most likely, the bottleneck is poor collection performance, perhaps fixable with a dictionary, etc.

Timings:

static: 154ms
instance: 156ms

So 2ms difference over 50M calls! Forget about it...

Based on:

using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

class Program
{
    static void Main()
    {
        // Call each method once up front so JIT compilation is not timed.
        StaticMethod(); // JIT
        Program p = new Program();
        p.InstanceMethod(); // JIT

        const int LOOP = 50000000; // 50M

        // Time 50M static calls.
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < LOOP; i++) StaticMethod();
        watch.Stop();
        Console.WriteLine("static: " + watch.ElapsedMilliseconds + "ms");

        // Time 50M instance calls on the same object.
        watch = Stopwatch.StartNew();
        for (int i = 0; i < LOOP; i++) p.InstanceMethod();
        watch.Stop();
        Console.WriteLine("instance: " + watch.ElapsedMilliseconds + "ms");
    }

    // NoInlining/NoOptimization keep the JIT from removing the empty calls.
    [MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.NoOptimization)]
    void InstanceMethod() { }
    [MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.NoOptimization)]
    static void StaticMethod() { }
}

edit:

If we assume (for example) that we create a new object every 20 calls (if (i % 20 == 0) p = new Program();), then the metrics change to:

static: 174ms
instance: 873ms

Again - nowhere near enough to indicate a bottleneck, when that is over 50M calls, and we're still under a second!



Answer 3:

Your task sounds like it should definitely be IO-bound, not CPU-bound. Micro-optimising by removing proper OO design would be madness. The difference between static methods and instance methods is usually unmeasurably small (if it's even present) anyway.

As alamar says, you should profile your app before going any further. There's a free profiler available from Microsoft or you could use JetBrains dotTrace profiler. There are others, of course - those are the two I've used.

Just as an indication of whether it's IO-bound or CPU-bound, if you run task manager while the app is running, how much CPU is the process taking? And is the disk thrashing the whole time?
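
The same check can be roughly approximated from inside the program by comparing CPU time with wall-clock time. This is only a minimal sketch: the class and method names (BoundCheck, CpuFraction, Measure) are made up for illustration, and the work being measured is whatever delegate you pass in.

```csharp
using System;
using System.Diagnostics;

static class BoundCheck
{
    // Ratio of CPU time to wall-clock time. CPU time is summed over all
    // cores, so multi-threaded work can give a value above 1.0.
    public static double CpuFraction(TimeSpan cpu, TimeSpan wall)
    {
        return wall.Ticks == 0 ? 0.0 : (double)cpu.Ticks / wall.Ticks;
    }

    // Runs the given work and returns the CPU fraction observed for it.
    public static double Measure(Action work)
    {
        Process self = Process.GetCurrentProcess();
        TimeSpan cpuBefore = self.TotalProcessorTime;
        Stopwatch watch = Stopwatch.StartNew();

        work();

        watch.Stop();
        self.Refresh(); // discard any cached process info before reading again
        TimeSpan cpuUsed = self.TotalProcessorTime - cpuBefore;
        return CpuFraction(cpuUsed, watch.Elapsed);
    }
}
```

A value close to 0 suggests the process mostly waits (typically on the disk), while a value close to 1 or above suggests it is CPU-bound. It is a coarse hint, not a substitute for a profiler.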

Putting a vast number of files in a directory will slow down access to that directory, but only when you actually create or open a file, or list the files in the directory. I'm surprised it makes quite that much difference, admittedly. However, having 200,000 files in a directory sounds pretty unmanageable anyway. Using a hierarchical approach is likely to be better in terms of using these files afterwards.
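
One way a hierarchical layout could look, as a sketch only: output files are assigned to numbered bucket directories so that no directory grows beyond a chosen size. The names OutputLayout, BucketedPath and PrepareOutputFile, and the idea of bucketing by a running file index, are assumptions for illustration and not the asker's actual scheme.

```csharp
using System;
using System.IO;

static class OutputLayout
{
    // Maps the n-th output file to a bucket directory that holds at most
    // filesPerDirectory entries. Bucket names are zero-padded to 5 digits.
    public static string BucketedPath(string root, int fileIndex, string fileName, int filesPerDirectory)
    {
        if (fileIndex < 0) throw new ArgumentOutOfRangeException("fileIndex");
        if (filesPerDirectory <= 0) throw new ArgumentOutOfRangeException("filesPerDirectory");

        string bucket = (fileIndex / filesPerDirectory).ToString("D5");
        return Path.Combine(Path.Combine(root, bucket), fileName);
    }

    // Creates the bucket directory if needed and returns the full file path.
    public static string PrepareOutputFile(string root, int fileIndex, string fileName, int filesPerDirectory)
    {
        string path = BucketedPath(root, fileIndex, fileName, filesPerDirectory);
        Directory.CreateDirectory(Path.GetDirectoryName(path)); // no-op if it already exists
        return path;
    }
}
```

With 1,000 files per directory, 200,000 output files would end up in 200 directories. The right bucket size depends on the file system and is something to measure rather than assume.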

Why does your boss think that the merge and split should take the same amount of time in the first place?



Answer 4:

I can answer number 1: having many files in a single directory gives you poor performance. It doesn't have anything to do with your code - it's a Windows thing (or an NTFS thing, I don't know). Splitting things up under different subdirectories indeed improves performance a lot.

As for number 2, I highly doubt that using static methods will make a huge difference. Using static methods is faster but only marginally so. We're talking microseconds here. There's probably something else going on. There's only one way to find out, and that is like alamar says, to profile your code.

You can use a tool like ANTS to profile your code and see which operations are the bottleneck. It can list the time spent in all methods in your program, so you can see what takes the most time, which could really be anything. But then at least you know what to optimize.



Answer 5:

My answers are:

  1. Depending on your OS and file system, performance starts to degrade after 20-30k files/subfolders. It's a fact of life. See: NTFS Performance and Large Volumes of Files and Directories

  2. A statement that non-OO code is faster than OO code is ridiculous. You cannot know what your performance bottleneck is until you profile the code. See the answers to this question for good information: Performance anti-patterns



Answer 6:

Many file systems have performance problems when the number of entries in a directory increases beyond a certain limit. Which one are you using?

If you add a logging function in the debug version of your program you may get an indication of the places where the most time is spent. That's where the optimization should take place.
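
A logging function of that kind could be as simple as a small timer that accumulates elapsed time per named phase. The sketch below is illustrative only: PhaseTimer and the phase names in the usage note are invented here, and a real profiler will give far more detail.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Accumulates elapsed time per named phase (e.g. "read", "parse", "write").
sealed class PhaseTimer
{
    private readonly Dictionary<string, long> ticksByPhase = new Dictionary<string, long>();

    // Runs the work and adds its elapsed time to the given phase,
    // even if the work throws.
    public void Time(string phase, Action work)
    {
        Stopwatch watch = Stopwatch.StartNew();
        try
        {
            work();
        }
        finally
        {
            watch.Stop();
            long soFar;
            ticksByPhase.TryGetValue(phase, out soFar); // soFar stays 0 if the phase is new
            ticksByPhase[phase] = soFar + watch.Elapsed.Ticks;
        }
    }

    // Total time recorded for a phase, or zero if it was never timed.
    public TimeSpan Total(string phase)
    {
        long ticks;
        return ticksByPhase.TryGetValue(phase, out ticks) ? TimeSpan.FromTicks(ticks) : TimeSpan.Zero;
    }

    // Writes one line per phase to the console.
    public void Report()
    {
        foreach (KeyValuePair<string, long> entry in ticksByPhase)
            Console.WriteLine(entry.Key + ": " + TimeSpan.FromTicks(entry.Value).TotalMilliseconds + "ms");
    }
}
```

Wrapping each stage of the split, for example timer.Time("write", () => WriteChunk(chunk)) with a hypothetical WriteChunk method, and calling Report() at the end shows which stage dominates the run.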



Answer 7:

  1. It is impossible to answer this without knowing your FS. But as others have noted, FSes generally are not optimized for massive collapsed directory trees.
  2. I think rejecting OOP due to a possible (you haven't profiled) ~10% speed increase is ridiculous, particularly when the page says, "please do not take this data too literally".

Finally, though you haven't given much information, I see no reason to think this "unbalance" is odd. Writing is slower, sometimes significantly so.