How to use java.util.Scanner to correctly read use

2018-12-30 23:02发布

This is meant to be a canonical question/answer that can be used as a duplicate target. These requirements are based on the most common questions posted every day and may be added to as needed. They all require the same basic code structure to get to each of the scenarios and they are generally dependent on one another.


Scanner seems like a "simple" class to use, and that is where the first mistake is made. It is not simple, it has all kinds of non-obvious side effect and aberrant behaviors that break the Principle of Least Astonishment in very subtle ways.

So this might seem to be overkill for this class, but the peeling the onions errors and problems are all simple, but taken together they are very complex because of their interactions and side effects. This is why there are so many questions about it on Stack Overflow every day.

Common Scanner questions:

Most Scanner questions include failed attempts at more than one of these things.

  1. I want to be able to have my program automatically wait for the next input after each previous input as well.

  2. I want to know how to detect an exit command and end my program when that command is entered.

  3. I want to know how to match multiple commands for the exit command in a case-insensitive way.

  4. I want to be able to match regular expression patterns as well as the built-in primitives. For example, how to match what appears to be a date ( 2014/10/18 )?

  5. I want to know how to match things that might not easily be implemented with regular expression matching - for example, an URL ( http://google.com ).

Motivation:

In the Java world, Scanner is a special case, it is an extremely finicky class that teachers should not give new students instructions to use. In most cases the instructors do not even know how to use it correctly. It is hardly if ever used in professional production code so its value to students is extremely questionable.

Using Scanner implies all the other things this question and answer mentions. It is never just about Scanner it is about how to solve these common problems with Scanner that are always co morbid problems in almost all the question that get Scanner wrong. It is never just about next() vs nextLine(), that is just a symptom of the finickiness of the implementation of the class, there are always other issues in the code posting in questions asking about Scanner.

The answer shows a complete, idiomatic implementation of 99% of cases where Scanner is used and asked about on StackOverflow.

Especially in beginner code. If you think this answer is too complex then complain to the instructors that tell new students to use Scanner before explaining the intricacies, quirks, non-obvious side effects and peculiarities of its behavior.

Scanner is the a great teaching moment about how important the Principle of least astonishment is and why consistent behavior and semantics are important in naming methods and method arguments.

Note to students:

You will probably never actually see Scanner used in professional/commercial line of business apps because everything it does is done better by something else. Real world software has to be more resilient and maintainable than Scanner allows you to write code. Real world software uses standardized file format parsers and documented file formats, not the adhoc input formats that you are given in stand alone assignments.

1条回答
唯独是你
2楼-- · 2018-12-30 23:57

Idiomatic Example:

The following is how to properly use the java.util.Scanner class to interactively read user input from System.in correctly( sometimes referred to as stdin, especially in C, C++ and other languages as well as in Unix and Linux). It idiomatically demonstrates the most common things that are requested to be done.

package com.stackoverflow.scanner;

import javax.annotation.Nonnull;
import java.math.BigInteger;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.*;
import java.util.regex.Pattern;

import static java.lang.String.format;

public class ScannerExample
{
    private static final Set<String> EXIT_COMMANDS;
    private static final Set<String> HELP_COMMANDS;
    private static final Pattern DATE_PATTERN;
    private static final String HELP_MESSAGE;

    static
    {
        final SortedSet<String> ecmds = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
        ecmds.addAll(Arrays.asList("exit", "done", "quit", "end", "fino"));
        EXIT_COMMANDS = Collections.unmodifiableSortedSet(ecmds);
        final SortedSet<String> hcmds = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
        hcmds.addAll(Arrays.asList("help", "helpi", "?"));
        HELP_COMMANDS = Collections.unmodifiableSet(hcmds);
        DATE_PATTERN = Pattern.compile("\\d{4}([-\\/])\\d{2}\\1\\d{2}"); // http://regex101.com/r/xB8dR3/1
        HELP_MESSAGE = format("Please enter some data or enter one of the following commands to exit %s", EXIT_COMMANDS);
    }

    /**
     * Using exceptions to control execution flow is always bad.
     * That is why this is encapsulated in a method, this is done this
     * way specifically so as not to introduce any external libraries
     * so that this is a completely self contained example.
     * @param s possible url
     * @return true if s represents a valid url, false otherwise
     */
    private static boolean isValidURL(@Nonnull final String s)
    {
        try { new URL(s); return true; }
        catch (final MalformedURLException e) { return false; }
    }

    private static void output(@Nonnull final String format, @Nonnull final Object... args)
    {
        System.out.println(format(format, args));
    }

    public static void main(final String[] args)
    {
        final Scanner sis = new Scanner(System.in);
        output(HELP_MESSAGE);
        while (sis.hasNext())
        {
            if (sis.hasNextInt())
            {
                final int next = sis.nextInt();
                output("You entered an Integer = %d", next);
            }
            else if (sis.hasNextLong())
            {
                final long next = sis.nextLong();
                output("You entered a Long = %d", next);
            }
            else if (sis.hasNextDouble())
            {
                final double next = sis.nextDouble();
                output("You entered a Double = %f", next);
            }
            else if (sis.hasNext("\\d+"))
            {
                final BigInteger next = sis.nextBigInteger();
                output("You entered a BigInteger = %s", next);
            }
            else if (sis.hasNextBoolean())
            {
                final boolean next = sis.nextBoolean();
                output("You entered a Boolean representation = %s", next);
            }
            else if (sis.hasNext(DATE_PATTERN))
            {
                final String next = sis.next(DATE_PATTERN);
                output("You entered a Date representation = %s", next);
            }
            else // unclassified
            {
                final String next = sis.next();
                if (isValidURL(next))
                {
                    output("You entered a valid URL = %s", next);
                }
                else
                {
                    if (EXIT_COMMANDS.contains(next))
                    {
                        output("Exit command %s issued, exiting!", next);
                        break;
                    }
                    else if (HELP_COMMANDS.contains(next)) { output(HELP_MESSAGE); }
                    else { output("You entered an unclassified String = %s", next); }
                }
            }
        }
        /*
           This will close the underlying InputStream, in this case System.in, and free those resources.
           WARNING: You will not be able to read from System.in anymore after you call .close().
           If you wanted to use System.in for something else, then don't close the Scanner.
        */
        sis.close();
        System.exit(0);
    }
}

Notes:

This may look like a lot of code, but it illustrates the minimum effort needed to use the Scanner class correctly and not have to deal with subtle bugs and side effects that plague those new to programming and this terribly implemented class called java.util.Scanner. It tries to illustrate what idiomatic Java code should look like and behave like.

Below are some of the things I was thinking about when I wrote this example:

JDK Version:

I purposely kept this example compatible with JDK 6. If some scenario really demands a feature of JDK 7/8 I or someone else will post a new answer with specifics about how to modify this for that version JDK.

The majority of questions about this class come from students and they usually have restrictions on what they can use to solve a problem so I restricted this as much as I could to show how to do the common things without any other dependencies. In the 22+ years I have been working with Java and consulting the majority of that time I have never encountered professional use of this class in the 10's of millions of lines source code I have seen.

Processing commands:

This shows exactly how to idiomatically read commands from the user interactively and dispatch those commands. The majority of questions about java.util.Scanner are of the how can I get my program to quit when I enter some specific input category. This shows that clearly.

Naive Dispatcher

The dispatch logic is intentionally naive so as to not complicate the solution for new readers. A dispatcher based on a Strategy Pattern or Chain Of Responsibility pattern would be more appropriate for real world problems that would be much more complex.

Error Handling

The code was deliberately structured as to require no Exception handling because there is no scenario where some data might not be correct.

.hasNext() and .hasNextXxx()

I rarely see anyone using the .hasNext() properly, by testing for the generic .hasNext() to control the event loop, and then using the if(.hasNextXxx()) idiom lets you decide how and what to proceed with your code without having to worry about asking for an int when none is available, thus no exception handling code.

.nextXXX() vs .nextLine()

This is something that breaks everyone's code. It is a finicky detail that should not have to be dealt with and has a very obfusated bug that is hard to reason about because of it breaks the Principal of Least Astonishment

The .nextXXX() methods do not consume the line ending. .nextLine() does.

That means that calling .nextLine() immediately after .nextXXX() will just return the line ending. You have to call it again to actually get the next line.

This is why many people advocate either use nothing but the .nextXXX() methods or only .nextLine() but not both at the same time so that this finicky behavior does not trip you up. Personally I think the type safe methods are much better than having to then test and parse and catch errors manually.

Immutablity:

Notice that there are no mutable variables used in the code, this is important to learn how to do, it eliminates four of the most major sources of runtime errors and subtle bugs.

  1. No nulls means no possibility of a NullPointerExceptions!

  2. No mutability means that you don't have to worry about method arguments changing or anything else changing. When you step debug through you never have to use watch to see what variables are change to what values, if they are changing. This makes the logic 100% deterministic when you read it.

  3. No mutability means your code is automatically thread-safe.

  4. No side effects. If nothing can change, the you don't have to worry about some subtle side effect of some edge case changing something unexpectedly!

Read this if you don't understand how to apply the final keyword in your own code.

Using a Set instead of massive switch or if/elseif blocks:

Notice how I use a Set<String> and use .contains() to classify the commands instead of a massive switch or if/elseif monstrosity that would bloat your code and more importantly make maintenance a nightmare! Adding a new overloaded command is as simple as adding a new String to the array in the constructor.

This also would work very well with i18n and i10n and the proper ResourceBundles. A Map<Locale,Set<String>> would let you have multiple language support with very little overhead!

@Nonnull

I have decided that all my code should explicitly declare if something is @Nonnull or @Nullable. It lets your IDE help warn you about potential NullPointerException hazards and when you do not have to check.

Most importantly it documents the expectation for future readers that none of these method parameters should be null.

Calling .close()

Really think about this one before you do it.

What do you think will happen System.in if you were to call sis.close()? See the comments in the listing above.

Please fork and send pull requests and I will update this question and answer for other basic usage scenarios.

查看更多
登录 后发表回答