VB.NET/COM Server code way slower than Excel VBA c

2019-04-16 13:18发布

Background

I have a client who needs Excel VBA code that produces formula values moved to VB.NET. He is in the business of providing financial analytics, in this case delivered as an Excel add-in. I have translated the VBA into VB.NET code that runs in a separate DLL. The DLL is compiled as a COM Server because, well, Excel-callable .NET UDFs have to be. So far, so good: Excel cells have "=foo(Range1, Range2, ...)", the VB.NET Com Server's UDF is called, and the cell obtains a value that matches the VBA code's value.

The problem

The VB.NET code is way slower. I can stretch a range of VBA-based formulas and get instantaneous calculation. I can stretch a comparable range of VB.NET-based formulas and the calculation takes 5-10 seconds. It is visibly slower and unacceptable to the client.

There are a few possibilities that occur to me:

  1. native compilation of VBA is faster because of the absence of a switch
  2. the DLL may be loaded and unloaded for each UDF call
  3. the DLL calls Excel WorksheetFunction methods and requires an Application object, and creating the Application object is expensive
  4. calling an Excel WorksheetFunction method from the DLL is expensive

I don't think that (2) is true because I put calls to append to a file in the Shared New, the Public New, and Finalize functions, and all I get are:

Shared Sub New
Public Sub New
Finalize

when I open the spreadsheet, repeatedly stretch a formula range, and close the spreadsheet.

I don't think (3) is true because the file writing shows that the Application object is created only once.

The question

How do I figure out what is taking the time? How to profile in this environment? Are there obvious enhancements?

In the last category, I have tried to reduce the number of creations of an Application object (used for WorkSheetFunction calls) by making it Shared:

<Guid("1ECB17BB-444F-4a26-BC3B-B1D6F07D670E")> _
<ClassInterface(ClassInterfaceType.AutoDual)> _
<ComVisible(True)> _
<ProgId("Library.Class")> _
Public Class MyClass
    Private Shared Appp As Application ' Very annoying

Approaches taken

I've tried to reduce the dependence on Excel mathematical functions by rewriting my own. I've replaced Min, Max, Average, Stdev, Small, Percentile, Skew, Kurtosis, and a few more. My UDF code calls out to Excel much less. The unavoidable call seems to be taking a Range as an argument and converting that to a .NET Array for internal use.

8条回答
Lonely孤独者°
2楼-- · 2019-04-16 13:38

I recently benchmarked moving data from Excel to .NET using various products/methods. All the .NET methods I tried were slower than VBA and VB6 but the best ones were able to use the XLL interface which gave better results than the Automation interface. the benchmark was reasonably optimised (transferring ranges to arrays etc) results were (millisecs for my benchmark)

  • VB6 COM addin 63

    C XLL 37

    Addin Express Automation VB.net 170

    Addin Express XLL VB.net 100

    ExcelDNA XLL CVB.Net 81

Managed XLL gave comparable times but also enables cusom marshallers which can be fast.

查看更多
仙女界的扛把子
3楼-- · 2019-04-16 13:39

There is some more performance stuff for ExcelDna on CodePlex: http://exceldna.codeplex.com/Wiki/View.aspx?title=ExcelDna%20Performance.

For really simple functions, the overhead of calling a managed function through ExcelDna is very small, allowing you to make several hundred thousand UDF calls per second.

查看更多
Viruses.
4楼-- · 2019-04-16 13:42

I have the same experience as Joe. It is mostly the interop that is slow.

In most cases this can be solved by working with entire ranges instead if individual cells. You do this by using .Net arrays and the pass them to/from excel in one call.

e.g.

Dim values(10,10) As object

Dim r As Excel.Range = Me.Range("A1")
r = r.Resize(UBound(values, 1), UBound(values,2))
values = r.Value

For ii = 0 To UBound(values,1)
    For jj = 0 To UBound(values,2)
        values(ii,jj) = CType(values(ii,jj), Double)*2
    Next
Next

r.Value = values

This has solved all performance problems I have seen

查看更多
Viruses.
5楼-- · 2019-04-16 13:43

Really late to this question (7 years) but for what it is worth, I have worked on 5/6 separate Excel systems in Investment Banks and have seen a similar design pattern in all their Excel systems which I'll describe.

Yes, they have blocks of cells which contain related data such as a list of government bond prices but they do not always pass this block of cells around. Instead they will create an object that resides in memory which is globally accessible and is labelled with a handle. The object contains a copy of the cell's content and so is thus more easily accessed in analytic code.

So an example handle would be

'USTreasuries(103450|2016-07-25T15:33)' 

where it can be seen that '103450' is an object number, unique enough to acquire the object from a globally scoped dictionary (say), the timestamp represents when the object is created and USTreasuries is a user friendly description. One would create such as object with a formula function something like this

=CreateHandledObject("USTreasuries",A1:D30)

The one would write an analytic which accepts this handle and acquires the data internally. It requires the CreateHandledObject() to be marked volatile and you have to turn calculation to manual and execute recalculation by code or by user.

You problems stem from endless marshalling data from the sheet. I think this approach will help you reduce this cumbersome element to a minimum.

查看更多
你好瞎i
6楼-- · 2019-04-16 13:44

I seriously suppose that interop from VB.NET to the COM server is done via marshalling. In VBA the methods were called directly - the control was passed into them at a cost of couple of processor instructions and that looked really fast. Now with marshalling a whole set of extra work is done and each call encounters a serious overhead. You need to either seriously reduce the number of calls (make each call do more work) or disable marshalling and work as if was with VBA. See this question for details on how to possibly accomplish the latter.

查看更多
▲ chillily
7楼-- · 2019-04-16 13:47

One thought. Instead of passing the Range object (it could be that every call onto the Ranbe object could be marshalled from .Net to Excel), collate all your parameters into basic types, doubles, strings, typed arrays and if necessary un-typed variant arrays, and pass them into the .Net DLL. That way you only have to marshall a variant.

-- DM

查看更多
登录 后发表回答