How does implicit typing make code clearer?

Posted 2019-02-16 17:49

做个烂人

Question:

A book I'm reading states that implicit typing makes the following code clearer than if you didn't use the var keyword:

var words = new[] { "a", "b", null, "d" };

foreach (var item in words)
{
    Console.WriteLine(item);
}

It seems to me that the opposite is true: if you used string instead, then readers of the code would immediately know it was a string in the foreach loop, instead of having to look up in the code where the variable is defined.

How does implicit typing make the above code clearer?

Addendum

The book is C# 3.0 - Die Neuerungen. schnell + kompakt, which is in German. The actual text is:

Das Schluesselwort var kann auch beim Durchlaufen von foreach-Schleifen verwendet werden, um somit den Code uebersichtlicher und einfacher zu gestalten. Besonders bei komplexen Typen kann man auf diese Art und Weise Programmierfehler verhindern.

here's my translation:

The var keyword can also be used when iterating through foreach loops, thus making the code easier and simpler to create. Especially when using complex types, this can prevent programming errors.

Ok, reading it more closely now he actually states that var in a foreach loop makes the code easier to create but not necessarily easier to read.

Answer 1:

Personally, I'd agree with you. I'm not sure clearer is the word I would use, but in certain situations the var keyword can certainly make it cleaner, e.g.:

var myClass = new ExtremelyLongClassNameIWouldntWantToTypeTwice();

Answer 2:

I think it's odd that only C# and Java programmers seem to suffer from an affliction that prevents them from extracting information from the context of code, while developers of Python, JavaScript, Ruby, F#, Haskell and others seem to be immune to this. Why is it that they appear to be doing fine, but us C# programmers need to have this discussion?

If foregoing explicit type declarations is sloppy or lazy, does that mean there's no high quality, readable Python code? In fact, don't many people praise Python for being readable? And there are many things that irk me about dynamic typing in JavaScript, but lack of explicit type declarations isn't one of them.

Type inference in statically typed languages should be the norm, not the exception; it reduces visual clutter and redundancy, while making your intention clearer when you do specify a type explicitly because you want a less derived type (IList<int> list = new List<int>();).

Some might argue a case against var like this:

var c = SomeMethod();

Well, to that I'd say you should give your variables more sensible names.

Improved:

var customer = SomeMethod();

Better:

var customer = GetCustomer();

Let's try explicit typing:

Customer customer = GetCustomer();

What information do you now have that you did not have before? You now know for certain it's of type Customer, but you already knew that, right? If you're already familiar with the code, you know what methods and properties you can expect on customer, just by the name of the variable. If you're not familiar with the code yet, you don't know what methods Customer has anyway. The explicit type here added nothing of value.

Perhaps some opponents of var might concede that in the above example, var does no harm. But what if a method doesn't return a simple and well-known type like Customer, or Order, but some processed value, like some sort of Dictionary? Something like:

var ordersPerCustomer = GetOrdersPerCustomer();

I don't know what that returns; it could be a dictionary, a list, an array, anything really. But does it matter? From the code, I can infer that I'll have an iterable collection of customers, where each Customer in turn contains an iterable collection of Order. I really don't care about the type here. I know what I need to know. If it turns out I'm wrong, it's the fault of the method for misleading me with its name, which cannot be fixed by an explicit type declaration.

Let's look at the explicit version:

IEnumerable<IGrouping<Customer,Order>> ordersPerCustomer = GetOrdersPerCustomer();

I don't know about you, but I find it much harder to extract the information I need from this. Not least because the bit that contains the actual information (the variable name) is further to the right, where it will take my eyes longer to find it, visually obscured by all those crazy < and >. The actual type is worthless; it's gobbledygook, especially because to make sense of it you need to know what those generic types do.

If at any point you're not sure what a method does, or what a variable contains, just from the name, you should give it a better name. That's way more valuable than seeing what type it is.

Explicit typing should not be needed; if it is, there's something wrong with your code, not with type inference. It can't be needed, as other languages apparently don't need it either.

That said, I do tend to use explicit typing for 'primitives', like int and string. But honestly, that's more a matter of habit than a conscious decision. With numbers, though, type inference can trip you up if you forget to add the m suffix to a literal that you want typed as decimal. That's easy enough to do, but the compiler won't let you accidentally lose precision, so it's not an actual problem. In fact, had I used var everywhere, it would have made changing quantities from integers to decimal numbers in a large application I work on a lot easier.
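
To make the point about literal suffixes concrete, here is a minimal sketch. The class name LiteralInferenceDemo and the helper StaticTypeOf are made up for illustration; the helper only reports the compile-time type that var picked.

```csharp
using System;

public static class LiteralInferenceDemo
{
    // Hypothetical helper: T is bound to the compile-time type of the argument.
    public static Type StaticTypeOf<T>(T value)
    {
        return typeof(T);
    }

    public static void Main()
    {
        var count = 3;     // inferred as int
        var ratio = 1.5;   // inferred as double: a real literal without a suffix
        var price = 1.5m;  // inferred as decimal: the m suffix is what makes it decimal

        Console.WriteLine(StaticTypeOf(count)); // System.Int32
        Console.WriteLine(StaticTypeOf(ratio)); // System.Double
        Console.WriteLine(StaticTypeOf(price)); // System.Decimal

        // decimal total = ratio; // would not compile: no implicit double-to-decimal conversion
    }
}
```

Forgetting the m silently gives you a double, but as soon as that value meets a decimal without a cast, the compiler should stop you.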

Which is another advantage of var: it allows rapid experimentation, without forcing you to update the types everywhere to reflect your change. If I want to change the above example to Dictionary<Customer,Order>[], I can simply change my implementation and all code that called it with var will continue to work (well, the variable declarations at least).
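
A small sketch of that, using a simpler change than the dictionary one (array to list). GetWordsBefore and GetWordsAfter are hypothetical stand-ins for the same method before and after its return type changes; the two callers are textually identical apart from the method name.

```csharp
using System;
using System.Collections.Generic;

public static class ReturnTypeChangeDemo
{
    // Hypothetical "before" version: returns an array.
    public static string[] GetWordsBefore()
    {
        return new[] { "a", "b", "d" };
    }

    // Hypothetical "after" version: the return type changed to a list.
    public static List<string> GetWordsAfter()
    {
        return new List<string> { "a", "b", "d" };
    }

    public static string JoinBefore()
    {
        var words = GetWordsBefore(); // inferred as string[]
        var result = "";
        foreach (var word in words)
        {
            result += word;
        }
        return result;
    }

    public static string JoinAfter()
    {
        var words = GetWordsAfter(); // inferred as List<string>, no declaration to update
        var result = "";
        foreach (var word in words)
        {
            result += word;
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(JoinBefore()); // abd
        Console.WriteLine(JoinAfter());  // abd
    }
}
```

This only holds while the caller sticks to members both types share (here, enumeration); anything specific to the old type would still need touching.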

Answer 3:

In this case it doesn't, which is just my subjective opinion of course. I only use it when the type is elsewhere on the same line like:

var words = new List<String>();

I like var, but I wouldn't use it like in your example as it is not clear what the type is.

Answer 4:

It's clearer, in the sense of less noise/redundancy. The type of words can be easily deduced from the new[] { ... } expression, both by the compiler and the developer. So var is used instead of string[], as the latter can visually clutter the code.

It's clearer, in the sense of transparency. You can swap the actual value with an instance of any other type, as long as it's an enumerable type. If you didn't use var, you'd have to change both of the declaration statements in the example.

It's clearer, as it forces you to use good variable names. By using var, you cannot use the type declaration to indicate the contents of the variable, so you'll have to use a descriptive name. You only declare a variable once, but you may use it many times, so it's better to be able to figure out the contents of the variable by its name. From this perspective, word would have been a better choice for the loop variable name.

Please note that the above reasoning is done from the author's perspective. It doesn't necessarily reflect my personal opinion :)

Edit regarding your addendum:

As I mentioned before, you can swap the underlying collection type, without having to update all your foreach loops. This does make it easier to create and change your code, but doesn't necessarily prevent programming errors. Let's look at both cases after we introduce a Word class as a replacement of the plain strings:

If we don't use the var keyword, the compiler will catch the error:

var words = new[] { new Word("a"), new Word("b"), null, new Word("d") };

// The compiler will complain about the conversion from Word to string,
// unless Word defines a conversion to string.
foreach (string word in words)
{
    Console.WriteLine(word);
}

If we do use var, the code will compile without errors, but the output of the program will be completely different, if the Word class hasn't (properly) implemented ToString().

var words = new[] { new Word("a"), new Word("b"), null, new Word("d") };

foreach (var word in words)
{
    Console.WriteLine(word); // Output will be different.
}

So, in certain cases subtle bugs can be introduced when you use var, which would have been caught by the compiler otherwise.
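
For anyone who wants to try the var case, here is a minimal runnable sketch. The Word class isn't shown above, so both classes below are assumptions: Word has no ToString() override, and PrintableWord is a hypothetical variant that does.

```csharp
using System;

// Assumed shape of Word: no ToString() override.
public class Word
{
    private readonly string text;

    public Word(string text)
    {
        this.text = text;
    }
}

// Hypothetical variant that overrides ToString().
public class PrintableWord
{
    private readonly string text;

    public PrintableWord(string text)
    {
        this.text = text;
    }

    public override string ToString()
    {
        return text;
    }
}

public static class WordDemo
{
    public static void Main()
    {
        var words = new[] { new Word("a"), new Word("b") };
        foreach (var word in words)
        {
            // Compiles fine, but falls back to object.ToString(),
            // which returns the type name rather than the text.
            Console.WriteLine(word);
        }

        var printable = new[] { new PrintableWord("a"), new PrintableWord("b") };
        foreach (var word in printable)
        {
            Console.WriteLine(word); // uses the override, so the text is printed
        }
    }
}
```

Changing the first loop to foreach (string word in words) should turn the silent change in output into a compile error, which is the trade-off described above.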

Answer 5:

Clean here means less redundant. Since it is trivial for the compiler to infer that the type of the object is string[], it is considered verbose to specify it explicitly. As you point out, however, it may not be so obvious to the human reading the code.

Answer 6:

The example is poor, as many examples demonstrating syntactic sugar tend to be - syntactic sugar helps where things are complicated, but nobody likes complicated examples.

There are two cases where you might want to use var, and one where you must:

Where you might want it:

It can be useful in experimental code, when you are switching the types involved quickly while exploring your problem-space.
It can be useful with complicated generic-based types such as IGrouping<int, IEnumerable<IGrouping<Uri, IGrouping<int, string>>>>, which can especially happen with intermediary states within complex queries and enumeration operations.

Personally, I prefer to use even the complicated form over var, as it doesn't cost someone reading it who doesn't care about the exact type (they can just skip it thinking "complicated grouping type"), but is clear to someone who does care without their having to work it out themselves.

Where you need it:

In dealing with anonymous types, in code like:

var res = from item in src select new {item.ID, item.Name};    
foreach(var i in res)    
    doSomething(i.ID, i.Name);

Here res is an IEnumerable or IQueryable of an anonymous type, and i is of that anonymous type. Since the type has no name it's impossible to explicitly declare it.

In this last case, it is not syntactic sugar, but actually vital.
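
A self-contained version of the anonymous-type case, as a sketch. The Item class, its Price property, and the Describe method are invented here so the snippet compiles; only the query shape comes from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical source element type.
public class Item
{
    public int ID { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public static class AnonymousTypeDemo
{
    public static List<string> Describe(IEnumerable<Item> src)
    {
        // The projected type has no name you can write down,
        // so var is the only way to declare res.
        var res = from item in src select new { item.ID, item.Name };

        var lines = new List<string>();
        foreach (var i in res) // i is of that anonymous type
        {
            lines.Add(i.ID + ": " + i.Name);
        }
        return lines;
    }

    public static void Main()
    {
        var src = new List<Item>
        {
            new Item { ID = 1, Name = "apple", Price = 0.5m },
            new Item { ID = 2, Name = "pear", Price = 0.7m }
        };

        foreach (var line in Describe(src))
        {
            Console.WriteLine(line); // "1: apple", then "2: pear"
        }
    }
}
```

Note that the anonymous type can't leave Describe as such, which is why the sketch returns plain strings.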

A related gripe is that SharpDevelop used to have its own while-editing form of var; one could type:

? words = new string[] { "a", "b", null, "d" };

And at the semi-colon, the editor would produce:

string[] words = new string[] { "a", "b", null, "d" };

Which gave (especially in more complicated cases) the advantage of typing speed along with producing explicit code. They seem to have dropped this now that var does conceptually the same thing, but it's a pity to have lost the typing shortcut to the explicit form.

Answer 7:

Using implicit typing is a guideline, not a law.

What you brought up is an extreme example where implicit is certainly not ideal.

Answer 8:

It makes the code clearer when

You have a class with a really long name.
You have linq queries.

Answer 9:

I don't think I really understood the good aspects of implicit typing in C# until I started using F#. F# has a type inference mechanism that is in some ways similar to implicit typing in C#. In F#, the use of type inference is a very important aspect of code design that can really make the code more readable even in simple examples. Learning to use type inference in F# helped me understand those situations when implicit typing could make C# more readable, as opposed to more confusing. (The Linq case is, I think, fairly obvious, but many cases are not.)

I realize this sounds more like a plug for F# than an answer to the question about C#, but it's not something that I can pin down to a simple set of rules. It's about learning to look at the code through new eyes when thinking about things like readability and maintenance. (That, and maybe it is a little bit of a plug for F#, lol.)

Answer 10:

If you wanted to make this code more explicit, I would suggest expanding the new instead of removing var:

var words = new string[] { "a", "b", null, "d" };

ReSharper will give you hints to remove redundant code with that example, but you can turn those hints off if you like.

The argument for using var is that in many cases the type identifiers for local variables are redundant, and redundant code should be removed. By removing redundant code, you can make the cases where you do actually care clearer, for example if you want to enforce a specific interface type for a local variable:

ISpecificInterface foo = new YourImplementation();
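
As a rough illustration of what the explicit interface type buys you, the sketch below uses IList<int> and List<int> in place of ISpecificInterface and YourImplementation. StaticTypeDemo and StaticTypeOf are hypothetical names; the helper just reports the compile-time type of its argument.

```csharp
using System;
using System.Collections.Generic;

public static class StaticTypeDemo
{
    // Hypothetical helper: T is bound to the compile-time type of the argument.
    public static Type StaticTypeOf<T>(T value)
    {
        return typeof(T);
    }

    public static void Main()
    {
        var inferred = new List<int>();        // compile-time type: List<int>
        IList<int> declared = new List<int>(); // compile-time type: IList<int>

        Console.WriteLine(StaticTypeOf(inferred) == typeof(List<int>));  // True
        Console.WriteLine(StaticTypeOf(declared) == typeof(IList<int>)); // True

        inferred.Sort();    // fine: Sort is a member of List<int>
        // declared.Sort(); // would not compile: IList<int> has no Sort method
    }
}
```

With var everywhere else, the one explicitly typed local stands out as a deliberate restriction to the interface.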

Answer 11:

When you need to change your code, you have to do it in fewer places. No more Dictionary or longer type declarations, no more Some.Very.Long.Class.With.Very.Long.Path<string> declaration = functionParameter[index], etc.

Although I do agree that when it is used in other situations than small methods, it might get very confusing.

Answer 12:

The primary benefit of explicit typing is, in my view, that you can tell what type a variable has just by looking at the code. So readability is increased.

And the primary benefits of implicit typing are:

Reduces program text, in particular for long type names (classes, interfaces), thus improving readability
Causes fewer changes when a return type of a method changes.

It looks as if both options improve readability.

So I guess it depends on your preferences and maybe also on the programming guidelines in your team. With current (refactoring) tool support in IDEs it has become much easier to change type names (a no-brainer), so the argument that implicit typing reduces changes has virtually disappeared from an effort perspective.

I'd suggest: Do what works best for you. There is no right or wrong. Try each approach for a while (e.g. by configuring the options of your favorite refactoring tool), then use what makes your life as a developer easier.

Answer 13:

Well, you have picked up on an important idea: overuse of var can indeed be detrimental, and in cases where the actual type is pretty simple it should be stated as such.

var shines, however, when dealing with larger inheritance hierarchies and templates (generics). You can also read it as "I don't care, just give me the data". While templates in C# don't have the expressive power of their C++ counterparts, they do have more expressive power than Java's generics, and that means people can build constructs that are not just awkward but also hard to locate if you have to nail down the exact type explicitly.

For example, imagine a template wrapper around several kinds of DataReaders for talking to SQL, so that you can still be efficient (call a sproc, get results, done) but without the burden of housekeeping (closing the reader and connection, retrying on errors, etc.). The code that uses it will just call one function, made to have as short a syntax as possible, and it will return a wrapper that acts like a smart pointer in C++: it behaves like a DataReader for your code but also handles all the side work. So it looks as simple as:

using (var r = S.ExecuteReader("[MySproc]", parameters))
{
    while ((+r).Read())
    {
        // get my data and be done
    }
} // at this point the wrapper cleans up everything

In a case like this, not only could you not care less how the wrapper is named and declared, you don't even care to know its name; for your code it's irrelevant. You just want your darn data and to go on :-) without dealing with anyone's long declarations.

It literally allows you to choose when you care about the full type declaration and when not. It's not all or nothing thing and you'll find yourself using both styles.

Another place where you'll be happy to have it is lambda expressions if you start using them. If you use lambdas in C# it will almost always be because you want some short, compact code that will run once and it's either not worth the trouble to turn into a regular method or it depends on local variables from the host method.

Oh, and even the VS editor will infer the full type for you, offer auto-completion, and complain if you try to use something it can't do, so var doesn't break type safety at all (the new C++ got its var equivalent as well, long overdue).