Objects of type Scanner are useful for breaking down formatted input into tokens and translating individual tokens according to their data type.
Breaking Input into Tokens
By default, a scanner uses white space (blanks, tabs, and line separators) as token separators. To see how this works, let's look at ScanFar, a program that reads the individual words in farrago.txt and prints them out, one per line.

    import java.io.*;
    import java.util.*;

    public class ScanFar {
        public static void main(String[] args) throws IOException {
            // Wrap the file in a buffered reader and hand it to the scanner.
            Scanner s = new Scanner(new BufferedReader(new FileReader("farrago.txt")));
            // Print each white-space-delimited token on its own line.
            while (s.hasNext()) {
                System.out.println(s.next());
            }
            // Closing the scanner also closes the underlying reader.
            s.close();
        }
    }

Notice that ScanFar calls Scanner's close method when it is done with the scanner object. Even though a scanner is not a stream, you need to close it to indicate that you're done with its underlying stream.

The output of ScanFar looks like this:

    So
    she
    went
    into
    the
    garden
    to
    cut
    a
    cabbage-leaf,
    to
    make
    ...

To use a different token separator, call the useDelimiter() method, specifying a regular expression. For example, suppose you wanted the token separator to be a comma, optionally followed by white space. You would call:

    s.useDelimiter(",\\s*");

Translating Individual Tokens
The ScanFar example treats all input tokens as simple String values. Scanner also supports tokens for all of the Java language's primitive types (except for char), as well as BigInteger and BigDecimal. Also, numeric values can use thousands separators. Thus, in a US locale, Scanner correctly reads the string "32,767" as representing an integer value.

We have to mention the locale, because thousands separators and decimal symbols are locale-specific. So the following example would not work correctly in all locales if we didn't specify that the scanner should use the US locale. That's not something you usually have to worry about, because your input data usually comes from sources that use the same locale as you do. But this example is part of the Java Tutorial, and gets distributed all over the world.

The ScanSum example reads a list of double values and adds them up. Here's the source:

    import java.io.*;
    import java.util.*;

    public class ScanSum {
        public static void main(String[] args) throws IOException {
            Scanner s = new Scanner(new BufferedReader(new FileReader("usnumbers.txt")));
            // Parse numbers with US conventions: "," groups thousands, "." marks decimals.
            s.useLocale(Locale.US);
            double sum = 0;
            while (s.hasNext()) {
                sum += s.nextDouble();
            }
            s.close();
            System.out.println(sum);
        }
    }

And here's the sample input file, usnumbers.txt:
    8.5
    32,767
    3.14159
    1,000,000.1

The output string is "1032778.74159". The period appears regardless of the default locale, because System.out.println converts a double with Double.toString, which is not locale-sensitive. To print the sum with locale-specific separators, use formatting, as described in the next topic, Formatting.
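To make the locale dependence of scanning concrete, here is a minimal sketch that reads the same value written with US and with German conventions. The class name LocaleScanDemo and the helper parseFirst are hypothetical names chosen for this illustration, and the input comes from a string rather than a file so the example is self-contained.

```java
import java.util.Locale;
import java.util.Scanner;

public class LocaleScanDemo {
    // Hypothetical helper: reads the first token of text as a double,
    // interpreting separators according to the given locale.
    static double parseFirst(String text, Locale locale) {
        Scanner s = new Scanner(text);
        s.useLocale(locale);
        try {
            return s.nextDouble();
        } finally {
            s.close();
        }
    }

    public static void main(String[] args) {
        // US: "," groups thousands and "." marks the decimal part.
        System.out.println(parseFirst("1,000,000.1", Locale.US));
        // Germany: the two separators swap roles.
        System.out.println(parseFirst("1.000.000,1", Locale.GERMANY));
    }
}
```

Both calls should yield the same double value. Swapping the locales would make nextDouble reject the token, which is why ScanSum sets its locale explicitly.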
ScanSum has a serious shortcoming: if the input file contains any tokens that aren't valid double values, it throws an exception (an InputMismatchException) and dies. To fix that problem, we need to add some kind of error recovery. Here's how we can rewrite the scanning loop so that it skips over any token that isn't a double:

    while (s.hasNext()) {
        if (s.hasNextDouble()) {
            sum += s.nextDouble();
        } else {
            s.next();   // discard the token that isn't a double
        }
    }

Each next method (nextDouble, nextInt, and so on) has a corresponding hasNext method (hasNextDouble, hasNextInt, and so on), which provides for this kind of error recovery.
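The skipping loop can be tried without an input file. The following is a sketch only: the class name ScanSumSkip, the method sumDoubles, and the sample input string are assumptions made for this illustration and are not part of the tutorial's ScanSum program.

```java
import java.util.Locale;
import java.util.Scanner;

public class ScanSumSkip {
    // Hypothetical helper: sums every token that parses as a double
    // in the US locale and silently skips every other token.
    static double sumDoubles(String input) {
        Scanner s = new Scanner(input);
        s.useLocale(Locale.US);
        double sum = 0;
        while (s.hasNext()) {
            if (s.hasNextDouble()) {
                sum += s.nextDouble();
            } else {
                s.next(); // consume and discard the invalid token
            }
        }
        s.close();
        return sum;
    }

    public static void main(String[] args) {
        // "oops" and "n/a" are skipped; 8.5 + 32767 + 1.5 is summed.
        System.out.println(sumDoubles("8.5 oops 32,767 n/a 1.5"));
    }
}
```

Note that the else branch must consume the bad token. If it did not, hasNext would keep returning true for the same token and the loop would never end.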
Copyright 1995-2005 Sun Microsystems, Inc. All rights reserved.