Can Scala call by reference?

2019-01-23 04:58发布

问题:

I know that Scala supports call-by-name from ALGOL, and I think I understand what that means, but can Scala do call-by-reference like C#, VB.NET, and C++ can? I know that Java cannot do call-by-reference, but I'm unsure if this limitation is solely due to the language or also the JVM.

This would be useful when you want to pass an enormous data structure to a method, but you don't want to make a copy of it. Call-by-reference seems perfect in this case.

回答1:

Java and Scala both use call by value exclusively, except that the value is either a primitive or a pointer to an object. If your object contains mutable fields, then there is very little substantive difference between this and call by reference.

Since you are always passing pointers to objects not the objects themselves, you don't have the problem of having to repeatedly copy a giant object.

Incidentally, Scala's call by name is implemented using call by value, with the value being a (pointer to a) function object that returns the result of the expression.



回答2:

For a language where "everything is an object" and the object reference can not be accessed, e.g. Java and Scala, then every function parameter is a reference passed-by-value at some level of abstraction below the language. However, from the perspective of the semantics of language abstraction, there is either a call-by-reference or call-by-value, depending on whether the function is supplied a copy of the referenced object. In this case, the term call-by-sharing encompasses both call-by-reference and call-by-value at the language level of abstraction. Thus it is correct to say that Java is call-by-value at the level of abstraction below the language semantics (i.e. comparing to how it would be hypothetically translated to C or in bytecode for the virtual machine), while also saying that Java and Scala are (except for builtin types) call-by-reference at the semantics of its "everything is an object" abstraction.

In Java and Scala, certain builtin (a/k/a primitive) types get passed-by-value (e.g. int or Int) automatically, and every user defined type is passed-by-reference (i.e. must manually copy them to pass only their value).

Note I updated Wikipedia's Call-by-sharing section to make this more clear.

Perhaps Wikipedia is confused about the distinction between pass-by-value and call-by-value? I thought pass-by-value is the more general term, as it applies to assignment expressions, as well as function application. I didn't bother to try to make that correction at Wikipedia, leave it for others to hash out.

There is no difference at level of semantics where "everything is an object" between call-by-reference and call-by-value, when the object is immutable. Thus a language which allows declaration of call-by-value versus call-by-reference (such as the Scala-like language I am developing), can be optimized by delaying the copy-by-value until the object is modified.


The people who voted this down apparently do not understand what "call-by-sharing" is.

Below I will add the write up I did for my Copute language (which targets JVM), where I discuss evaluation strategy.


Even with purity, no Turing complete language (i.e. that allows recursion) is perfectly declarative, because it must choose an evaluation strategy. Evaluation strategy is the relative runtime evaluation order between functions and their arguments. The evaluation strategy of functions can be strict or non-strict, which is the same as eager or lazy respectively, because all expressions are functions. Eager means argument expressions are evaluated before their function is; whereas, lazy means argument expressions are only evaluated (once) at the runtime moment of their first use in the function. The evaluation strategy determines a performance, determinism, debugging, and operational semantics tradeoff. For pure programs, it does not alter the denotational semantics result, because with purity, the imperative side-effects of evaluation order only cause indeterminism in (i.e are categorically bounded to) the memory consumption, execution time, latency, and non-termination domains.

Fundamentally all expressions are (composition of) functions, i.e. constants are pure functions without inputs, unary operators are pure functions with one input, binary operators are pure functions with two inputs, constructors are functions, and even control statements (e.g. if, for, while) can be modeled with functions. The order that we evaluate these functions is not defined by the syntax, e.g. f( g() ) could eagerly evaluate g then f on g's result or it could evaluate f and only lazily evaluate g when its result is needed within f.

The former (eager) is call-by-value (CBV) and the latter (lazy) is call-by-name (CBN). CBV has a variant call-by-sharing, which is prevalent in modern OOP languages such as Java, Python, Ruby, etc., where impure functions implicitly input some mutable objects by-reference. CBN has a variant call-by-need (also CBN), where function arguments are only evaluated once (which is not the same as memoizing functions). Call-by-need is nearly always used instead of call-by-name, because it is exponentially faster. Typically both variants of CBN only appear with purity, because of the dissonance between the declared function hierarchy and the runtime order-of-evaluation.

Languages typically have a default evaluation strategy, and some have a syntax to optionally force a function to be evaluated in the non-default. Languages which are eager by default, usually evaluate the boolean conjunction (a/k/a "and", &&) and disjunction (a/k/a "or", ||) operators lazily, because the second operand isn't needed in half the cases, i.e. true || anything == true and false && anything == false.



回答3:

Here is how to emulate reference parameters in Scala.

def someFunc( var_par_x : Function[Int,Unit] ) {
    var_par_x( 42 ) // set the reference parameter to 42
}

var y = 0 
someFunc( (x => y=x) )
println(y)

Well, okay, not exactly what Pascal or C++ programmers are used to; but then, very little in Scala is. The upside is that this gives the caller more flexibility with what they can do with the value sent to the parameter. E.g.

someFunc( (x => println(x) ) )