Ch. 10. Using I/O
Session 10
If there is time remaining, we can look at extras.
http://www.write-technical.com/126581/session10/
When we first learned how to read, we started with our 'A', 'B', 'C's, the names of individual letters of the alphabet (bytes, chars). Soon, we read individual words in isolation, then sentences containing multiple words, then paragraphs, then entire articles in the Sports section. To read an entire article, we still have to deal with its paragraphs. Consider a newspaper article to be a file, and a paragraph to be the chunk of text we absorb at one time from a page (or screen). Our minds can hold a paragraph of information in short-term memory, which we might consider to be a buffer. Just as we process a two-page article paragraph by paragraph, so Java processes a file by using a buffer.
http://docs.oracle.com/javase/7/docs/api/java/util/Scanner.html
Java 5 introduced the Scanner class, which can be more convenient than previous IO APIs for processing input. Here, we create a scanner object to work with integers.
This scanner gets lines of text, and can parse for a boolean value.
This scanner works with regular expressions.
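As a minimal sketch of these three uses of a scanner, the class below scans a String instead of the console so that it runs without typed input. The class name ScannerSketch and its method names are illustrative and do not come from the course files.

```java
import java.util.Scanner;

public class ScannerSketch {
    // Sums every whitespace-separated integer at the start of the text.
    public static int sumInts(String text) {
        int sum = 0;
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNextInt()) {
                sum += sc.nextInt();
            }
        }
        return sum;
    }

    // Returns the first token as a boolean, or false if it is not "true"/"false".
    public static boolean firstBoolean(String text) {
        try (Scanner sc = new Scanner(text)) {
            return sc.hasNextBoolean() && sc.nextBoolean();
        }
    }

    // Uses a regular expression to find a pattern such as 555-1234 in the current line.
    public static String findPhone(String text) {
        try (Scanner sc = new Scanner(text)) {
            return sc.findInLine("\\d{3}-\\d{4}"); // null when nothing matches
        }
    }

    public static void main(String[] args) {
        System.out.println(sumInts("3 4 5"));
        System.out.println(firstBoolean("true"));
        System.out.println(findPhone("call 555-1234 now"));
    }
}
```

To scan the console instead, pass System.in to the Scanner constructor.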
A scanner can read from a file if it has an input stream from that file, which requires the use of the java.io package - https://docs.oracle.com/javase/8/docs/api/index.html?overview-summary.html
However, a scanner cannot write to a file. A scanner can read from the console and, if the class also uses the java.io package, the class can write the input the scanner receives to a file. See ScannerWithFileWriter.java.
Scanner cannot create a new file or a new directory, but java.io.FileWriter can create a file (and java.io.File can create a directory).
However, you could use Scanner to get from the user the name that FileWriter uses to create the file.
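The division of labor between Scanner (input) and FileWriter (output) might be sketched as follows. This is not the course's ScannerWithFileWriter.java. The class name ScannerToFile and the method copyLines are assumptions made for illustration.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.util.Scanner;

public class ScannerToFile {
    // Writes every line the scanner supplies to the named file; returns the line count.
    public static int copyLines(Scanner in, String fileName) throws IOException {
        int count = 0;
        try (FileWriter out = new FileWriter(fileName)) { // creates or truncates the file
            while (in.hasNextLine()) {
                out.write(in.nextLine());
                out.write(System.lineSeparator());
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Scanner console = new Scanner(System.in);
        System.out.print("File name: ");
        String name = console.nextLine(); // Scanner supplies the name FileWriter uses
        System.out.println("Type lines; end input with Ctrl-D (Ctrl-Z on Windows).");
        System.out.println(copyLines(console, name) + " lines written to " + name);
    }
}
```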
Input and output is about movement or transportation. Just as intermodal containers are fixed blocks for transporting with ships, trains, and trucks,
so, for computers, a disk "block" or "sector" might have a capacity of 512, 1024, 2048, or 4096 bytes, and a block of data might be moved from one network or device location to another.
How do we fill a container? One way is atomically, each individual piece, one-by-one. Another way is with a buffer. What is a buffer? A buffer is a mechanism for TEMPORARY absorption that provides efficiency.
A funnel is a buffer that prevents spilling and waste:
In fluid dynamics, a buffer also provides efficiency and supports steady processing. This input buffer must be filled from below before it will output from above:
Video streams are buffered so that the flow of data is steady, even if the internet connection is a bit less steady.
A balloon buffers the input stream of air. During this buffer's output or purge operation, the output stream acts like a jet, and can carry the balloon upward.
Here, the buffer, although big, is filled in an efficient manner, but the
output might not be as efficient as the original input operation.
Is this buffer similar to a queue (first-in, first-out "FIFO") or a stack
(last-in, first-out "LIFO")?
The mouth is a fairly small buffer insofar as it cannot contain an entire pie in one operation, but can only process in "bits" and "bytes".
It is more efficient to reduce input/output operations by grouping the individual elements together for processing. Another analogy might be using a big spoon to eat a set of corn flakes, instead of individual flake by flake, which would be a lot of input operations. Note that many people prefer eating potato chips one-by-one (the fun of "finger-food"), instead of crushing them into a big spoon. Potato chips are IO-intensive because sometimes we favor the savor and flavor of inefficiency.
Cattle are efficient in their input and productive in their output, as a hike on many trails makes evident. Cows prefer to chomp on entire large mouthfuls of grass rather than pick at individual blades of grass (or individual potato chips). Similarly, dogs are IO-efficient, filling the entire buffer of their maw at once, unlike cats, who nibble in a more dainty, IO-intensive manner.
Another way to process food is with a stream. IO-efficient folks slurp up long spaghetti instead of cutting each individual spaghetto. Some call it uncouth, others call it IO-efficient.
A straw is an IO-efficient way to stream the contents of a beverage into a mouthful (a buffer).
Each big gulp absorbs in a batch operation what has been streamed gradually into the mouth, and thus we avoid having to make constant swallowing operations for each micro-liter of fluid.
Similarly, in Java, we can use a buffer for the input stream to get (or "consume") a large quantity in a single operation.
The Java reader for an input stream converts the bytes associated with keyboard events to characters of the current charset. http://download.oracle.com/javase/8/docs/api/java/io/InputStreamReader.html
InputStreamReader isr = new InputStreamReader(System.in);
The input stream reader is "wrapped" in a buffered reader for efficiency, that is, the reduction of input/output operations:
BufferedReader buffer = new BufferedReader(isr);
or, the shorter version that wraps a constructor call within a constructor call:
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
In this case, the newly created input stream reader does not need an identifier because it is the runtime argument to the constructor of a buffered reader.
(In constructor wrapping, we use the inner constructor one time, and do not need a variable (or handle) to the object it initializes.)
Think of the straw as System.in, the fluid moving through the straw as an input stream, and the throat receiving a buffered mouthful/gulp.
An input stream is like an input straw, and a buffered reader is like a big gulp that only occurs from time to time.
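Here is a small runnable sketch of the straw-and-gulp idea. The class name GulpReader and the method firstLine are illustrative. The method accepts any InputStream so that the same wrapping works for System.in or for a file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class GulpReader {
    // Wraps a byte stream in a reader (bytes to chars) and a buffer (fewer reads),
    // then returns the first line of text.
    public static String firstLine(InputStream in) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(in));
        return br.readLine(); // null if the stream is already at its end
    }

    public static void main(String[] args) throws IOException {
        System.out.print("Type something: ");
        System.out.println("You typed: " + firstLine(System.in));
    }
}
```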
Similarly, if you were going to send an email message, you would not do it as
many emails, one letter per email. Instead, you collect a bunch or batch of letters,
words, sentences, paragraphs into a single message and send the whole collection
at once.
import java.io.*;
FileOutputStream fout = new FileOutputStream(args[1]);
BufferedOutputStream bos = new BufferedOutputStream(fout);
The Panama canal has "locks" that empty to flush out water, thus lowering the level of a ship so it can continue its journey (or output).
Flushing the buffer means to empty the buffer immediately, even if it is not
full.
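To watch a flush happen, this sketch writes to an in-memory ByteArrayOutputStream that stands in for a file, so we can measure how many bytes have arrived before and after the flush. The class name FlushSketch and the 1024-byte buffer size are assumptions chosen for the demonstration.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class FlushSketch {
    // Returns {bytes delivered before flush, bytes delivered after flush}.
    // Assumes data is shorter than the 1024-byte buffer.
    public static int[] sizesAroundFlush(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for a file
        BufferedOutputStream bos = new BufferedOutputStream(sink, 1024);
        bos.write(data);           // the bytes wait in the buffer
        int before = sink.size();
        bos.flush();               // empty the buffer now, even though it is not full
        int after = sink.size();
        bos.close();
        return new int[] { before, after };
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sizesAroundFlush("hello".getBytes());
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```

Closing a buffered stream also flushes it, which is why forgetting to close a file can leave the last part of the output missing.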
Use cases:
The input and output facilities of Java include:
From the very first release of Java, java.io.InputStream and java.io.OutputStream were abstract classes to support streams of bytes from sources such as keyboard input or a file on disk.
http://docs.oracle.com/javase/8/docs/api/java/io/InputStream.html
http://docs.oracle.com/javase/8/docs/api/java/io/OutputStream.html
For character streams, the second release of Java added java.io.Reader and java.io.Writer, which are abstract classes for code reuse by their subclasses.
http://docs.oracle.com/javase/8/docs/api/java/io/Reader.html
http://docs.oracle.com/javase/8/docs/api/java/io/Writer.html
The java.io package has two sides: one for bytes, and one for characters. For example, PrintStream is for bytes http://docs.oracle.com/javase/8/docs/api/java/io/PrintStream.html and PrintWriter is for characters http://docs.oracle.com/javase/8/docs/api/java/io/PrintWriter.html.
The abstract class, java.io.InputStream http://java.sun.com/javase/6/docs/api/java/io/InputStream.html, provides implemented methods for managing the bytes in a stream, such as read(byte[] b), mark(int readlimit), reset(), skip(long n), and close(). Therefore, subclasses, such as AudioInputStream and FileInputStream, have the choice to reuse the implementation or override it. The advantage of having an abstract class is that it provides default functionality (unlike an interface before Java 8) but also allows the flexibility of customization. In this case, using abstract classes makes the workflow more efficient within the developer teams of Sun Microsystems.
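To illustrate that reuse, here is a hypothetical subclass (CountdownStream is an invented name, not a JDK class) that implements only the one abstract method, read(), and inherits read(byte[] b) from InputStream.

```java
import java.io.IOException;
import java.io.InputStream;

public class CountdownStream extends InputStream {
    private int next;

    // start should be between 0 and 255 so each value fits in one byte.
    public CountdownStream(int start) {
        this.next = start;
    }

    // The only abstract method of InputStream: one byte per call, -1 at the end.
    @Override
    public int read() {
        return next > 0 ? next-- : -1;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new CountdownStream(3);
        byte[] buffer = new byte[8];
        int n = in.read(buffer); // inherited; it calls our read() repeatedly
        System.out.println(n + " bytes: " + buffer[0] + " " + buffer[1] + " " + buffer[2]);
        in.close();              // inherited; does nothing by default
    }
}
```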
Somewhat like System.out.println(),
the outfielder in baseball whose long-range throwing strength outputs something visible.
The java.io package ( http://java.sun.com/javase/6/docs/api/java/io/package-summary.html ) "Provides for system input and output through data streams, serialization, and the file system". The java.lang package, which every class imports automatically, provides access to the basic I/O functionality.
For example, the System class ( http://java.sun.com/javase/6/docs/api/java/lang/System.html ) has a static final field, out, the "standard output stream" - http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#out - that enables us to call System.out.println(). So, the java.lang package provides the out field, while the println() method itself belongs to the type of that field, java.io.PrintStream.
In this case, for convenience, the language breaks the general rule of encapsulating a specific type of functionality, such as input/output, in the package dedicated to that functionality.
The home base for the output stream is the io package, which includes the PrintStream class ( http://docs.oracle.com/javase/7/docs/api/java/io/PrintStream.html ), the type of the out field. This class has the println() method that supports sending a stream of bytes to a device (such as the console) or a file. Java follows UNIX insofar as standard error, the static err field, is a separate stream that by default goes to the same device as standard out. Because err is also a PrintStream, error reporting automatically takes advantage of the print stream.
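One way to convince ourselves that System.out is just a PrintStream object is to swap it temporarily for a PrintStream that writes to memory. This is a sketch. The class name StandardStreams and the method capture are invented for the illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class StandardStreams {
    // Points System.out at an in-memory stream, prints, and returns what was printed.
    public static String capture(String message) {
        PrintStream original = System.out;        // the field's type is java.io.PrintStream
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        System.setOut(new PrintStream(captured, true));
        try {
            System.out.println(message);          // println() is a PrintStream method
        } finally {
            System.setOut(original);              // always restore the console
        }
        return captured.toString();
    }

    public static void main(String[] args) {
        String text = capture("hello");
        System.out.println("Captured: " + text.trim());
        System.err.println("err and out are different objects: " + (System.err != System.out));
    }
}
```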
Similarly, we get System.in.read() to read from the console from java.io.InputStream - http://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html. So the standard input, output, and error streams of java.io are made available through java.lang.System. Standard input is given by System.in - http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#in. We can think of a baseball pitcher as an in-fielder who reads signals from the catcher, one little bit at a time because the pitching/catching cycle is "io-intensive".
To work with low-level binary data, use a byte stream.
At a higher level, to work with Unicode characters, use a character stream. (This is similar to the file-transfer-protocol (ftp) toggle, binary, which sets the transfer operation to work with bytes instead of characters.)
A char is a 16-bit Unicode (UTF-16) code unit, whereas a byte is 8 bits. For plain ASCII text the two hold the same numeric values, and that's why we can cast char to byte and byte to char for such text. Similarly, a character stream is a convenience built on top of a byte stream: it uses a charset to translate between bytes and chars.
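The difference between counting bytes and counting characters shows up as soon as the text leaves plain ASCII. In this sketch (BytesVersusChars is an illustrative name), the word "café" is encoded as UTF-8 and then read back through a character stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class BytesVersusChars {
    // Counts how many chars a character stream produces from UTF-8 encoded bytes.
    public static int countChars(byte[] utf8) throws IOException {
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        int count = 0;
        while (reader.read() != -1) {
            count++;
        }
        reader.close();
        return count;
    }

    public static void main(String[] args) throws IOException {
        String text = "caf\u00e9"; // 4 chars; the accented letter needs 2 bytes in UTF-8
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(bytes.length + " bytes, " + countChars(bytes) + " chars");
        // prints "5 bytes, 4 chars"
    }
}
```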
The class java.io.Writer - http://docs.oracle.com/javase/7/docs/api/java/io/Writer.html - is abstract to provide both some functionality and some customizability. For example, within the Java APIs, a BufferedWriter, a StringWriter, and a PrintWriter might use different offsets and different ways to flush and close.
If you want to handle ALL the possible I/O exceptions, catch java.io.IOException ( http://java.sun.com/javase/6/docs/api/java/io/IOException.html ), a direct subclass of java.lang.Exception. This subclass is the superclass of many specialized exceptions, such as java.io.FileNotFoundException.
Because IOException is not a subclass of RuntimeException, the compiler checks to see that IO exceptions are handled or the method declares it throws the exception. The architects of Java thereby encourage us to be proactive in handling possible IO problems.
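A short sketch of this hierarchy at work: the specialized FileNotFoundException is caught first, and the general IOException catches everything else. The class name CheckedIo and the negative return codes are assumptions made for the example.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class CheckedIo {
    // Returns the first byte of the file (0-255), -1 if the file is empty,
    // -2 if the file cannot be found, or -3 for any other I/O problem.
    public static int firstByte(String path) {
        try (FileInputStream in = new FileInputStream(path)) {
            return in.read();
        } catch (FileNotFoundException e) { // the subclass must be caught first
            return -2;
        } catch (IOException e) {           // the superclass catches the rest
            return -3;
        }
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "no-such-file.txt";
        System.out.println(path + " -> " + firstByte(path));
    }
}
```

If the two catch blocks were swapped, the compiler would reject the code because the FileNotFoundException block could never be reached.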
StringBuilder - http://docs.oracle.com/javase/7/docs/api/java/lang/StringBuilder.html - provides a high performance way to have a vector-like resizable and mutable structure for a String. However, if your application is multi-threaded, and it is possible that two threads might corrupt the structure, you should use a StringBuffer instead because the methods of the StringBuffer class are synchronized. http://docs.oracle.com/javase/7/docs/api/java/lang/StringBuffer.html
To read bytes from the console, use a byte array. When we want to see the characters in the byte array, we cast to char (line 19). Otherwise, a set of chars like "hello" would look like
1041011081081111310000.
The compiler "checks" for IOException, so we must do one of the following: catch the exception in a try / catch block, or declare that the method throws IOException.
Note that the byte array is of a fixed size, but this is fine for many use cases, such as reading in a credit card number, which can later be parsed as integers using the parseInt method - http://docs.oracle.com/javase/8/docs/api/java/lang/Integer.html#parseInt-java.lang.String-
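The effect of the cast can be sketched as follows. The class name ByteArrayConsole and the 10-byte array size are assumptions. The cast from byte to char is only meaningful here for plain ASCII input.

```java
import java.io.IOException;
import java.io.InputStream;

public class ByteArrayConsole {
    // Reads up to 10 bytes and returns them two ways: {as characters, as numbers}.
    public static String[] readBothWays(InputStream in) throws IOException {
        byte[] data = new byte[10];          // fixed size: extra input is left unread
        int count = in.read(data);           // bytes actually read, or -1 at the end
        StringBuilder asChars = new StringBuilder();
        StringBuilder asNumbers = new StringBuilder();
        for (int i = 0; i < count; i++) {
            asChars.append((char) data[i]);  // cast byte to char to see the letter
            asNumbers.append(data[i]);       // without the cast we get the numeric code
        }
        return new String[] { asChars.toString(), asNumbers.toString() };
    }

    public static void main(String[] args) throws IOException {
        System.out.print("Type up to 10 characters: ");
        String[] result = readBothWays(System.in);
        System.out.println("As chars:   " + result[0].trim());
        System.out.println("As numbers: " + result[1]);
    }
}
```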
A common way to work with user input is with a buffered reader. This program parses the user input to extract a numeric value. The user can choose to enter an arbitrary number of integers.
If the caller has a try / catch mechanism, the caller does not need to declare it might throw a checked exception:
The compiler prevents you from writing unreachable code. Main's catch block would be unreachable if the method that main calls already catches the only possible exception:
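A sketch of this pattern, with illustrative names (SumIntegers, sum): the helper method declares that it throws IOException, and main catches it, so main itself declares nothing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class SumIntegers {
    // Adds one integer per line until a blank line or the end of input.
    // Lines that are not integers are reported and skipped.
    public static int sum(BufferedReader in) throws IOException {
        int total = 0;
        String line;
        while ((line = in.readLine()) != null && !line.trim().isEmpty()) {
            try {
                total += Integer.parseInt(line.trim());
            } catch (NumberFormatException e) {
                System.err.println("Not an integer, skipped: " + line);
            }
        }
        return total;
    }

    // main has a try / catch, so it does not declare a checked exception.
    public static void main(String[] args) {
        BufferedReader console = new BufferedReader(new InputStreamReader(System.in));
        System.out.println("Enter integers, one per line; blank line to finish.");
        try {
            System.out.println("Sum: " + sum(console));
        } catch (IOException e) {
            System.err.println("Could not read input: " + e.getMessage());
        }
    }
}
```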
When we work with arrays, we have the limitation of needing to know the size of the array necessary to store the data. To read in an entire file (and not have to know its size beforehand), read in until the end-of-file (EOF) marker, negative 1 (-1). The read method of java.io.InputStream returns a single byte (as an int from 0 to 255) as it progresses until it returns -1 because there is nothing more to read in the file. That negative one signal might be analogous to that loud sound from a straw you are sipping on when there is no more liquid in the glass.
http://docs.oracle.com/javase/7/docs/api/java/io/BufferedReader.html#read%28%29
Note that this marker, -1, is not a character, but rather an int.
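The read-until-minus-one loop might look like this sketch (CountBytes is an illustrative name). The variable that receives the result of read() is an int, which is what lets -1 be told apart from a real byte such as 255.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CountBytes {
    // Reads one byte at a time until the end-of-file marker and returns the count.
    public static long count(InputStream in) throws IOException {
        long total = 0;
        int b;                          // int, not byte or char, so -1 stays distinct
        while ((b = in.read()) != -1) {
            total++;
        }
        return total;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java CountBytes fileName");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(count(in) + " bytes");
        } catch (IOException e) {
            System.err.println("Could not read " + args[0] + ": " + e.getMessage());
        }
    }
}
```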
Because the read method is on an instance of type FileInputStream, line 14 calls the constructor for a file input stream. This class is a subclass of the more general InputStream, which also has a read method. Lines 21-16 define a catch block that provides more guidance to the user than the default exception handling of the JVM. In this case, the catch block indicates the proper usage of the application.
Use the Console class so end-users can keep
their password hidden from people near their computer screen.
http://download.oracle.com/javase/6/docs/api/java/io/Console.html
We do not have to construct the Console. Instead, we call the static console method of the System class, which is final.
http://download.oracle.com/javase/6/docs/api/java/lang/System.html#console%28%29
For security, the readPassword method hides the password from display - http://docs.oracle.com/javase/7/docs/api/java/io/Console.html#readPassword%28%29
The program compares the two passwords on Line 28. Each password is stored in an array of chars (readPassword returns char[]), and a static method of java.util.Arrays performs the comparison. This method is overloaded, so it can deal with an array of chars or an array of objects such as Strings (other overloads too) - http://download.oracle.com/javase/6/docs/api/java/util/Arrays.html#equals%28char[],%20char[]%29 or more likely http://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#equals%28java.lang.Object[],%20java.lang.Object[]%29
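A minimal sketch of the password comparison follows. The class name PasswordCheck and the method sameAndWipe are invented for illustration. Overwriting the arrays after use is an extra precaution that the Console documentation recommends.

```java
import java.io.Console;
import java.util.Arrays;

public class PasswordCheck {
    // Compares two passwords, then overwrites both arrays.
    public static boolean sameAndWipe(char[] first, char[] second) {
        boolean same = Arrays.equals(first, second);
        Arrays.fill(first, ' ');   // so the password does not linger in memory
        Arrays.fill(second, ' ');
        return same;
    }

    public static void main(String[] args) {
        Console console = System.console(); // null when no interactive console exists
        if (console == null) {
            System.err.println("No console available (for example, inside some IDEs).");
            return;
        }
        char[] first = console.readPassword("Password: ");
        char[] second = console.readPassword("Again: ");
        if (first == null || second == null) {
            System.err.println("Input ended before both passwords were entered.");
            return;
        }
        System.out.println(sameAndWipe(first, second) ? "Match" : "No match");
    }
}
```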
A common use for file IO is to update text, such as a name. Here, we use a file input stream to demonstrate substitution. Lines 64-65 replace each blank space with a hyphen and write this substitution to the destination file. To run this program, type java Hypen mySourceFile myDestinationFile
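The substitution step might be sketched like this. HyphenSketch is an illustrative name, not the course's program. The core method works on any pair of streams so that it can be tried without files.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class HyphenSketch {
    // Copies in to out, replacing each blank space with a hyphen.
    public static void hyphenate(InputStream in, OutputStream out) throws IOException {
        int b;
        while ((b = in.read()) != -1) {
            out.write(b == ' ' ? '-' : b);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Usage: java HyphenSketch sourceFile destinationFile");
            return;
        }
        try (InputStream in = new FileInputStream(args[0]);
             OutputStream out = new FileOutputStream(args[1])) {
            hyphenate(in, out);
        } catch (IOException e) {
            System.err.println("I/O problem: " + e.getMessage());
        }
    }
}
```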
We can compare two files, byte-by-byte, using the read method on a file input stream.
http://download.oracle.com/javase/6/docs/api/java/io/FileInputStream.html#read%28%29
Lines 48-49 call this method for the content of each file, and line 50 performs the comparison.
If I run: java FileComparison myLetters.txt myLetters2.txt
The output is: original says: g but second file says: z
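A byte-by-byte comparison can be sketched as follows. This is not the course's FileComparison program. CompareSketch and firstDifference are hypothetical names, and the method reports a position instead of printing the differing letters.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CompareSketch {
    // Returns the position of the first differing byte, or -1 if the streams match.
    public static long firstDifference(InputStream a, InputStream b) throws IOException {
        long position = 0;
        while (true) {
            int x = a.read();
            int y = b.read();
            if (x != y) {
                return position;   // also true when one stream ends early
            }
            if (x == -1) {
                return -1;         // both streams ended together
            }
            position++;
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Usage: java CompareSketch firstFile secondFile");
            return;
        }
        try (InputStream a = new FileInputStream(args[0]);
             InputStream b = new FileInputStream(args[1])) {
            long where = firstDifference(a, b);
            System.out.println(where == -1 ? "Files match" : "Files differ at byte " + where);
        } catch (IOException e) {
            System.err.println("I/O problem: " + e.getMessage());
        }
    }
}
```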
Here the input stream is well matched to the input buffer.
Here, the buffer is too small for efficient input:
A small buffer can still work, but it takes almost bit-by-bit patience and persistence:
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
Think of the straw as System.in, the fluid moving through the straw as an input stream, and the throat receiving a buffered mouthful/gulp. (An input stream is almost an input straw.)
If you know that the file to input or output is not binary but contains characters, use an instance of FileReader, which works with Unicode and supports all the major human languages, including Chinese, Japanese, and Arabic. A FileReader is a subclass of InputStreamReader ( http://java.sun.com/javase/6/docs/api/java/io/InputStreamReader.html ), which works with characters instead of bytes. The FileReader can be constructed from a file name (see Line 7 in the example below).
In the example below, line 8 wraps the FileReader inside a BufferedReader, which holds a significant number of characters in a single buffer. The Javadoc says only that the default size "is large enough for most purposes"; in the Oracle/OpenJDK implementation it is 8192 chars. The size of the buffer matters in some circumstances. For example, a small device, such as a cell phone, might need a smaller buffer than, say, a powerful server running servlets that processes large amounts of data rapidly. Java provides two constructor signatures for a buffered reader, one of which allows us to set a custom size:
BufferedReader(Reader in) - Creates a buffering character-input stream that uses a default-sized input buffer. http://download.oracle.com/javase/6/docs/api/java/io/BufferedReader.html#BufferedReader%28java.io.Reader%29
BufferedReader(Reader in, int sz) - Creates a buffering character-input stream that uses an input buffer of the specified size.
Typically, a server running servlets has a buffer of 8 kilobytes, but, in performance tuning, some might use a 16 kilobyte buffer - http://java.sun.com/developer/technicalArticles/Servlets/servletapi/ - but note that the Servlet API is part of the Enterprise Edition, not the Standard Edition.
The new IO package, java.nio, added in version 1.4 of Java, provides new classes for buffering - http://download.oracle.com/javase/6/docs/api/java/nio/package-summary.html. For example, java.nio provides a class for each primitive type (except boolean), and these are subclasses of Buffer - http://download.oracle.com/javase/6/docs/api/java/nio/Buffer.html
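As a small sketch of one of these primitive buffers, the class below fills an IntBuffer and then drains it. BufferSketch and sumViaBuffer are illustrative names. The call to flip() is what switches the buffer from being filled to being emptied.

```java
import java.nio.IntBuffer;

public class BufferSketch {
    // Puts the values into an IntBuffer, flips it, and sums what it reads back.
    public static int sumViaBuffer(int... values) {
        IntBuffer buffer = IntBuffer.allocate(values.length);
        buffer.put(values);   // filling: the position advances past each value
        buffer.flip();        // limit = position, position = 0: ready for reading
        int sum = 0;
        while (buffer.hasRemaining()) {
            sum += buffer.get();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumViaBuffer(1, 2, 3)); // prints 6
    }
}
```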
Buffering provides efficiency because we can read in, say, 4000 Unicode characters (2 bytes each in memory) in one operation instead of 80 characters. Without buffering, each invocation of read() or readLine() could cause bytes to be read from the underlying source and converted into characters, which is costly. Assuming you want more than one or two characters, a buffered reader is useful for reading characters from the console or a file. Efficiency: reducing network roundtrips is useful because network latency can easily exceed JVM processing time. The throughput might increase by as much as 5000%.
This version includes try, catch, and finally blocks. The typical use case for finally is to clean up resources, as we do here by closing the file reader.
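The try / catch / finally shape can be sketched as follows. FinallySketch is an illustrative name, and the 16 kilobyte buffer size is an arbitrary choice to show the two-argument BufferedReader constructor.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class FinallySketch {
    // Returns the first line of the file, or null if the file cannot be read.
    public static String firstLine(String fileName) {
        FileReader fr = null;
        try {
            fr = new FileReader(fileName);
            BufferedReader br = new BufferedReader(fr, 16 * 1024); // custom buffer size
            return br.readLine();
        } catch (IOException e) {
            System.err.println("Could not read " + fileName + ": " + e.getMessage());
            return null;
        } finally {
            // Runs whether or not an exception occurred.
            if (fr != null) {
                try {
                    fr.close();
                } catch (IOException ignored) {
                    // nothing useful can be done if closing fails
                }
            }
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java FinallySketch fileName");
            return;
        }
        System.out.println(firstLine(args[0]));
    }
}
```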
This version has more robust exception handling and also indicates which classes it uses from the java.io package instead of importing java.io.*
The java.io.Reader class provides a ready() method,
which can be used to ensure that time is not wasted trying to read an empty
buffer.
http://download.oracle.com/javase/6/docs/api/java/io/Reader.html#ready%28%29
Note that the InputStream class does not have a ready() method and is not a superclass of InputStreamReader:
http://download.oracle.com/javase/6/docs/api/java/io/InputStream.html
Use input and output to echo input as output to screen:
Allow the input-output cycle to shut itself off when the user says to stop.
This example uses the constructor of java.lang.Character in Line 12 to get a single character object, then a string, and then tests whether the string representation of the character is itself a representation of an integer. The parseInt method accepts a String as input, and outputs the corresponding integer value. The Character class is an object wrapper for a primitive char - http://download.oracle.com/javase/6/docs/api/java/lang/Character.html
Note: Chapter 12 explains autoboxing, which allows you to avoid explicit use of an object wrapper for a primitive.
A more sophisticated use of a buffered reader with integer parsing uses the try-with-resources statement, a feature discussed at http://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html
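A sketch of try-with-resources with integer parsing, assuming each line of input holds exactly one integer (ResourceSketch and sum are illustrative names):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

public class ResourceSketch {
    // Sums one integer per line. Assumes every line is a valid integer.
    public static int sum(Reader source) throws IOException {
        try (BufferedReader br = new BufferedReader(source)) {
            int total = 0;
            String line;
            while ((line = br.readLine()) != null) {
                total += Integer.parseInt(line.trim());
            }
            return total;
        } // br is closed here automatically, even if parseInt throws
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java ResourceSketch fileOfIntegers");
            return;
        }
        try {
            System.out.println("Sum: " + sum(new FileReader(args[0])));
        } catch (IOException | NumberFormatException e) {
            System.err.println("Problem: " + e.getMessage());
        }
    }
}
```

No finally block is needed because the resource declared in the parentheses after try is closed for us.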
This program is a useful utility: it writes a text file that contains the names of the files in a zip archive.
You can also write a Unicode file from the lines the user types at the console by creating an instance of java.io.FileWriter (Line 17) and calling the signature of the write method that takes a String as input (Line 33) - http://download.oracle.com/javase/6/docs/api/java/io/Writer.html#write%28java.lang.String%29
This program uses the compareTo() method of the String class - http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#compareTo%28java.lang.String%29
Two ways to copy a file are using streams and using channels. Streams are part of the standard io package. This program uses the java.io.FileOutputStream.write method (http://download.oracle.com/javase/6/docs/api/java/io/FileOutputStream.html#write%28byte[],%20int,%20int%29) to output a buffer of 1024 bytes to the FileOutputStream until there are no more bytes to write. The program knows there are no more bytes when the read method of the file input stream returns -1.
What user error might we expect to see in relation to line 27? What is the usage requirement to run the program successfully?
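The stream-based copy loop can be sketched like this. StreamCopy is an illustrative name and is not the program whose line numbers are referenced above.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopy {
    // Copies in to out through a 1024-byte buffer and returns the number of bytes copied.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[1024];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);   // write only the bytes that were actually read
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Usage: java StreamCopy sourceFile destinationFile");
            return;
        }
        try (InputStream in = new FileInputStream(args[0]);
             OutputStream out = new FileOutputStream(args[1])) {
            System.out.println(copy(in, out) + " bytes copied");
        } catch (IOException e) {
            System.err.println("I/O problem: " + e.getMessage());
        }
    }
}
```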
Channels are part of the java.nio (new I/O) package - http://download.oracle.com/javase/6/docs/api/. Channels were added for:
If you work with databases, this advanced topic might be valuable with large tables. Classes in this package can also take advantage of a memory mapped file outside the JVM, which means the operating system directly uses hard disk space for paging a large file as if it were loaded into RAM.
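For comparison with streams, a channel-based file copy might be sketched as follows. ChannelCopy is an illustrative name. The transferTo method asks the operating system to move the bytes, and the loop is there because a single call is not guaranteed to transfer everything.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

public class ChannelCopy {
    // Copies the source file to the target file and returns the number of bytes copied.
    public static long copy(String source, String target) throws IOException {
        try (FileChannel in = new FileInputStream(source).getChannel();
             FileChannel out = new FileOutputStream(target).getChannel()) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                // transferTo may move fewer bytes than requested, so keep going
                position += in.transferTo(position, size - position, out);
            }
            return position;
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Usage: java ChannelCopy sourceFile destinationFile");
            return;
        }
        try {
            System.out.println(copy(args[0], args[1]) + " bytes copied");
        } catch (IOException e) {
            System.err.println("I/O problem: " + e.getMessage());
        }
    }
}
```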
Lines 33 to 47 are a do while loop that does not do anything except allow us to read past the newline (line feed) character and the carriage return character. This way, we get the character that represents the letter key the user typed before hitting the Enter key. Different operating systems represent the user typing Enter or Return in different ways:
CR+LF: Windows, DOS
CR: Mac OS up to OS 9
LF: UNIX
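A sketch of reading one key and then discarding the rest of the line, including the line terminator. KeyThenEnter and readChoice are illustrative names, and the sketch assumes lines end with LF or CR+LF, so it would not suit the CR-only convention.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class KeyThenEnter {
    // Returns the first character of the line and discards everything up to
    // and including the line feed that ends the line.
    public static char readChoice(InputStream in) throws IOException {
        int first = in.read();
        if (first == -1) {
            throw new EOFException("no input");
        }
        int ignore = first;
        while (ignore != '\n' && ignore != -1) {
            ignore = in.read();   // skips extra keys, '\r' on Windows, and the final '\n'
        }
        return (char) first;
    }

    public static void main(String[] args) throws IOException {
        System.out.print("Continue? (y/n): ");
        char choice = readChoice(System.in);
        System.out.println("You chose: " + choice);
    }
}
```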
-1 is returned. This value does not represent a character, so it should not be cast to a char.