Design approach: Overloading vs Switch?

2020-06-01 07:58发布

问题:

In regard to performance and scalability in package design, is it best to:

  1. … ‘overload’ function names (letting Mathematica sort out which version to use based on patterns/conditions/tests and the way the system orders definitions)?
  2. … or to construct a single function with a Switch[] (or similar command) to direct evaluation?

Mathematica’s expressiveness frequently confuses me with silly (?) issues like this.

回答1:

This is a broad question, but I will take this opportunity to give a broad answer...

I advocate that one should embrace the principal paradigm of a programming language, rather than trying to fight it or to write code that follows the idioms of another language. Mathematica is built around the notion of pattern-matching so, IMHO, we should always consider pattern-matching first when trying to express ourselves. Following that principle, I would favour definitions over Switch.

On the question of performance, I am becoming increasingly vexed by the growing emphasis on microbenchmarks when comparing Mathematica constructs. While it is valuable to know the costs associated with constructs, we should heed Knuth (or was it Hoare?): "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil". The "evil" being the loss of readability in a program that, in the interests of efficiency, uses some obscure or indirect approach to achieve an effect. Here is my performance checklist:

  1. Is performance a problem? If not, then skip the rest of the checklist.

  2. Where is the performance bottleneck? A profiler helps here, but often the bottleneck can be found easily by inspection or a few print statements. Then...

  3. Is the algorithm inefficient? Very frequently: is there a doubly-nested loop that could be linearized or helped out by an indexing scheme?

  4. Okay, the algorithm is good so I guess it is time to microbenchmark.

I don't know if my use of Mathematica is not ambitious enough, but most of the time I don't get past step #1. And then #3 catches most of the rest. In Mathematica, I find I'm usually just overjoyed that I can perform some ambitious task with a small amount of code -- overall performance usually doesn't enter into the picture.

Uh-oh, I'd better put the soapbox away. Sorry 'bout that.



回答2:

Except for very simple cases, I prefer to use a function with multiple definitions instead of Switch. The reasons for this are threefold:

  1. I find it easier to read the code as long as the function is well named.
  2. It's easier for the fall-through/error case to set an appropriate default and call the function again.
  3. If you use a function you can use the named patterns when calculating the result.

Edit

Here's an example created as an example of #2 for Sjoerd:

createNColors[fn_, Automatic, n_] := Table[Hue[i/n], {i, n}]

createNColors[fn_, colors_List, n_] := PadRight[colors, n, colors]

createNColors[fn_, color:(Hue | RGBColor | CMYKColor | GrayLevel)[__], n_] := 
    Table[color, {n}]

createNColors[fn_, color_, n_] := (
    Message[fn::"color", HoldForm[color]]; 
    createNColors[fn, Automatic, n]
    )  

It could be used to generate a set of n colors for some option.



回答3:

To answer the performance part of your question, consider the following two examples of overloading and using Switch[]

switchFunc[a_] :=  Switch[a, _String, 5, _Integer, var, _Symbol, "string"] 

overloadFunc[a_String]  := 5;
overloadFunc[a_Integer] := var;
overloadFunc[a_Symbol]  := "string";

This is extremely simplified, but suffices to demonstrate the difference in performance

In[1]  := Timing@Nest[switchFunc, x, 1000000]
Out[1] := {3.435, "string"}

In[2]  := Timing@Nest[overloadFunc, x, 1000000]
Out[2] := {0.754, "string"}

However, if you intend to overload your function based on conditional tests, the performance is worse than Switch[]:

switchFunc2[a_] := Switch[a < 5, True, 6, False, 4];

overloadFunc2[a_ /; a < 5] := 6;
overloadFunc2[a_ /; a > 5] := 4;
overloadFunc2[a_] := a;

In[3]  := Timing@Nest[switchFunc2, 4, 1000000]
Out[3] := {2.63146, 4}

In[4]  := Timing@Nest[overloadFunc2, 6, 1000000]
Out[4] := {4.349, 6}

EDIT: The timings in this answer were made using Mathematica 8.0.1 on OS X 10.7.2. See Mr.Wizard's answer for some additional results where the above order is reversed. Nonetheless, I think it is a bad idea performance wise to do logical pattern checks on function arguments.

From a design point of view, my personal experience is that Switch[] and it's ilk are terrible because they are hard to read and slow. However, I also think that having the same function perform differently based on the type of argument is generally bad design and makes following your code much harder (even though it may be easier to read).



回答4:

Your question is quite vague as written, and there are different interpretations of "overloading" that would change my answer. However, if you are talking about overloading your own functions with regard to different types (heads) and argument patterns, then by all means, take advantage of Mathematica's tightly integrated pattern matching.


To provide a practical example, I shall use this solution of mine. For reference:

f[k_, {}, c__] := If[Plus[c] == k, {{c}}, {}]

f[k_, {x_, r___}, c___] := Join @@ (f[k, {r}, c, #] & /@ Range[0, Min[x, k - Plus[c]]])

If I rewrite f without pattern matching and call it g:

g = Function[{k, L, c},
      If[L === {},
         If[Tr@c == k, {c}, {}],
         Join @@ (g[k, Rest@L, Append[c, #]] & /@ Range[0, Min[First@L, k - Tr@c]])
      ]
    ];

I feel that this is less clear, and it is certainly less convenient to write. I had to use explicit Rest and First functions, and I had to introduce Append as I cannot accommodate a variable number of arguments. This also necessitates a dummy third argument in use: {}.

Timings show that the original form is also considerably faster:

f[12, {1, 5, 8, 10, 9, 9, 4, 10, 8}]; // Timing
g[12, {1, 5, 8, 10, 9, 9, 4, 10, 8}, {}]; // Timing
{0.951, Null}
{1.576, Null}

In response to Timo's answer, I feel it is of value to share my timing results, as they differ from his. (I am using Mathematica 7 on Windows 7.) Further, I believe he complicated the DownValues version beyond the function of the Switch version.

First, my timings of his functions as written, but using a range of values:

Array[switchFunc2, 1*^6]; // Timing
Array[overloadFunc2, 1*^6]; // Timing
{1.014, Null}
{0.749, Null}

So even as written, the DownValues function is faster for me. But the second condition is not needed:

ClearAll[overloadFunc2]

overloadFunc2[a_ /; a < 5] := 6;
overloadFunc2[a_] := 4;

Array[overloadFunc2, 1*^6]; // Timing
{0.546, Null}

Of course, in the case of such a simple function one could also use If:

ifFunc[a_] := If[a < 5, 6, 4]

Array[ifFunc, 1*^6]; // Timing
{0.593, Null}

And if this is written as a pure function which Mathematica compiles inside Array:

ClearAll[ifFunc]
ifFunc = If[# < 5, 6, 4] &;

Array[ifFunc, 1*^6]; // Timing
{0.031, Null}