There is a java.io.BufferedOutputStream whose purpose is well documented, basically as a good thing to wrap around an unbuffered OutputStream (at least if you want buffering). However, and surprisingly to me, a number of the other OutputStreams in java.io do not document whether they are buffered, and thus it's not clear to me whether I should wrap them or not.
Take FileOutputStream as an example: the docs say only that it's "... an output stream for writing data to a File ...". Can we safely infer that a stream is buffered iff it implements Flushable? Otherwise, what's the way to know when wrapping in a BufferedOutputStream is a good idea and when it would lead to redundant buffering?
Rex Mottram wrote: > There is a java.io.BufferedOutputStream whose purpose is well > documented, basically as a good thing to wrap around an unbuffered > OutputStream (at least if you want buffering). However, and surprisingly > to me, a number of the other OutputStreams in java.io do not document > whether they are buffered, and thus it's not clear to me whether I > should wrap them or not.
> Take FileOutputStream as an example: the docs say only that it's "... an > output stream for writing data to a File ...". Can we safely infer that > a stream is buffered iff it implements Flushable? Otherwise, what's the > way to know when wrapping in a BufferedOutputStream is a good idea and > when it would lead to redundant buffering?
> Thanks, > RM
If it doesn't say buffered it isn't.
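On the Flushable part of the question: since Java 5, OutputStream itself implements Flushable, so every output stream is Flushable and the interface cannot tell you anything about buffering. A minimal sketch to check this (the class and method names are made up for illustration):

```java
import java.io.BufferedOutputStream;
import java.io.Flushable;
import java.io.OutputStream;

class FlushableCheck {
    // Returns true for any OutputStream, buffered or not,
    // because the OutputStream base class implements Flushable.
    static boolean isFlushable(Object stream) {
        return stream instanceof Flushable;
    }

    public static void main(String[] args) {
        // Hypothetical unbuffered stream that simply discards its input.
        OutputStream plain = new OutputStream() {
            @Override
            public void write(int b) {
                // discard
            }
        };
        OutputStream buffered = new BufferedOutputStream(plain);

        System.out.println(isFlushable(plain));    // true
        System.out.println(isFlushable(buffered)); // true
    }
}
```

Both lines print true, so "implements Flushable" does not distinguish buffered from unbuffered streams.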
--
Knute Johnson email s/nospam/linux/
On Fri, 16 May 2008 10:14:04 -0400, Rex Mottram <r...@not.here> wrote, quoted or indirectly quoted someone who said :
>There is a java.io.BufferedOutputStream whose purpose is well >documented, basically as a good thing to wrap around an unbuffered >OutputStream (at least if you want buffering). However, and surprisingly >to me, a number of the other OutputStreams in java.io do not document >whether they are buffered, and thus it's not clear to me whether I >should wrap them or not.
Tick buffered if you want buffered. You will see by default you don't get any buffering.
Keep in mind that when you have lots of RAM you can read a file in a single UNBUFFERED I/O or a number of big chunks faster than you can read it buffered.
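A rough sketch of the "single unbuffered I/O" idea, assuming the file fits comfortably in memory. The class and method names are hypothetical, and the loop is there because a single read() call is allowed to return fewer bytes than requested:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

class BulkRead {
    // Reads the whole file into one array, with no BufferedInputStream in between.
    static byte[] readWhole(File file) throws IOException {
        long length = file.length();
        if (length > Integer.MAX_VALUE - 8) {
            throw new IOException("File too large for a single array: " + file);
        }
        byte[] data = new byte[(int) length];
        try (FileInputStream in = new FileInputStream(file)) {
            int offset = 0;
            while (offset < data.length) {
                // Ask for everything that is still missing in one call.
                int n = in.read(data, offset, data.length - offset);
                if (n < 0) {
                    throw new IOException("File shrank while reading: " + file);
                }
                offset += n;
            }
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        File file = new File(args[0]);
        System.out.println(readWhole(file).length + " bytes read");
    }
}
```

Whether this actually beats a buffered read depends on the file size and the platform, so it is worth measuring rather than assuming.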
On Fri, 16 May 2008, Rex Mottram wrote: > There is a java.io.BufferedOutputStream whose purpose is well > documented, basically as a good thing to wrap around an unbuffered > OutputStream (at least if you want buffering). However, and surprisingly > to me, a number of the other OutputStreams in java.io do not document > whether they are buffered, and thus it's not clear to me whether I > should wrap them or not.
I believe that BufferedOutputStream is the only one that does buffering *in java* (more or less ...), but others may involve buffers out in native code or the OS. FileOutputStream, for instance - i believe every write turns into a call to the OS or C library's write routine, but that may not immediately put bytes onto a platter. The stream you get from a Socket is another - all writes go to the TCP implementation, but that won't necessarily send them immediately.
The point of buffering on the java side is that it saves you native calls - you make one call when you have a kilobyte of data to send, rather than one every time you have a morsel of data to write. This can be a big performance win. Basically, always wrap.
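To make the saving visible, here is a small sketch that counts how many write calls reach the underlying stream with and without a BufferedOutputStream. CountingOutputStream is a hypothetical stand-in for something like FileOutputStream, where each of those calls would be a native call; the 8192-byte buffer size is just an assumption for the demo.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class WrapSketch {
    // Hypothetical stand-in for an unbuffered stream: counts the calls that reach it.
    static final class CountingOutputStream extends OutputStream {
        int calls;
        long bytes;

        @Override
        public void write(int b) {
            calls++;
            bytes++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
            bytes += len;
        }
    }

    // Writes byteCount single bytes and reports how many calls hit the sink.
    static int callsWhenWriting(int byteCount, boolean buffered) throws IOException {
        CountingOutputStream sink = new CountingOutputStream();
        OutputStream out = buffered ? new BufferedOutputStream(sink, 8192) : sink;
        for (int i = 0; i < byteCount; i++) {
            out.write(i); // one "morsel" at a time
        }
        out.flush(); // push whatever is still sitting in the Java-side buffer
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unbuffered calls: " + callsWhenWriting(10_000, false));
        System.out.println("buffered calls:   " + callsWhenWriting(10_000, true));
    }
}
```

Unbuffered, every single-byte write reaches the sink; buffered, only a handful of large writes do.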
You still have to worry about the native buffering for correctness, though - you can't rely on data being written to a file until you've flushed the FileOutputStream.
Now, that "more or less" above is about the various streams which do transformations on data passing through them, and which have to do some buffering to do that. That means GZIPOutputStream, DeflaterOutputStream, CipherOutputStream, and possibly others. These require special attention to wring all their bytes out of them. However, i think this is pretty well documented in each case.
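As an illustration of the "special attention" for GZIPOutputStream: the compressed data is only complete once finish() or close() has been called, since that is when the remaining deflater output and the GZIP trailer get written. A minimal round-trip sketch (class and method names are invented for the example):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipFinish {
    static byte[] compress(byte[] input) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink)) {
            gzip.write(input);
        } // close() calls finish(), which writes the pending data and the trailer
        return sink.toByteArray();
    }

    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello, hello, hello".getBytes("UTF-8");
        byte[] restored = decompress(compress(original));
        System.out.println(new String(restored, "UTF-8"));
    }
}
```

If you need to keep the underlying stream open afterwards, calling finish() instead of close() is the usual approach.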
tom
-- It's the 21st century, man - we rue _minutes_. -- Benjamin Rosenbaum
Tom Anderson wrote: > I believe that BufferedOutputStream is the only one that does buffering > *in java* (more or less ...), but others may involve buffers out in > native code or the OS.
As far as your Java application is concerned, I think you should generally treat "secret buffering at the OS level" as "no buffering" and should wrap in a BufferedInput/OutputStream-- otherwise you have the overhead of the native call on every single read/write.
I believe you don't need extra buffering in the streams given to you by some Servlet implementations (they do their own Java-side buffering to handle the HTTP protocol), though I'd be interested if anyone has further insight on this.
> Now, that "more or less" above is about the various streams which do > transformations on data passing through them, and which have to do some > buffering to do that. That means GZIPOutputStream, DeflaterOutputStream, > CipherOutputStream, and possibly others.
For similar reasons to above, it's generally best to add a Java buffer unless you have strong grounds for not doing so. These compression stream classes may "naturally" work on a buffer, but if the buffer is held natively, then it's a native call to fetch each individual byte unless you buffer in Java.
If memory serves correctly, it was the flavour of InputStream you get from ZipFile.getInputStream() whose single-byte read() method creates a new one-element byte array on each call and then calls the multi-byte version...
On Sat, 17 May 2008, Neil Coffey wrote: > Tom Anderson wrote:
>> I believe that BufferedOutputStream is the only one that does buffering *in >> java* (more or less ...), but others may involve buffers out in native code >> or the OS.
> As far as your Java application is concerned, I think you should > generally treat "secret buffering at the OS level" as "no buffering" and > should wrap in a BufferedInput/OutputStream-- otherwise you have the > overhead of the native call on every single read/write.
Yes, that's exactly what i said in my post:
"The point of buffering on the java side is that it saves you native calls - you make one call when you have a kilobyte of data to send, rather than one every time you have a morsel of data to write. This can be a big performance win. Basically, always wrap."
Except that you *do* need to be aware of the secret buffering for correctness reasons:
"you can't rely on data being written to a file until you've flushed the FileOutputStream."
So, for things like FileOutputStream, you have to treat them as both unbuffered (by wrapping them in a buffered stream) and buffered (by remembering to flush) at the same time!
> I believe you don't need extra buffering in the streams given to you by > some Servlet implementations (they do their own Java-side buffering to > handle the HTTP protocol), though I'd be interested if anyone has > further insight on this.
Good point.
>> Now, that "more or less" above is about the various streams which do >> transformations on data passing through them, and which have to do some >> buffering to do that. That means GZIPOutputStream, DeflaterOutputStream, >> CipherOutputStream, and possibly others.
> For similar reasons to above, it's generally best to add a Java buffer > unless you have strong grounds for not doing so. These compression > stream classes may "naturally" work on a buffer, but if the buffer is > held natively, then it's a native call to fetch each individual byte > unless you buffer in Java.
True.
> If memory serves correctly, it was the flavour of InputStream you get > from ZipFile.getInputStream() whose single-byte read() method creates a > new one-element byte array on each call and then calls the multi-byte > version...
Tom Anderson wrote: > On Sat, 17 May 2008, Neil Coffey wrote: > Except that you *do* need to be aware of the secret buffering for > correctness reasons:
> "you can't rely on data being written to a file until you've flushed the > FileOutputStream."
> So, for things like FileOutputStream, you have to treat them as both > unbuffered (by wrapping them in a buffered stream) and buffered (by > remembering to flush) at the same time!
Actually, you still can't be sure that the data has been written to the file, only that it has been passed from the Java side to the OS side. To ensure the data has been committed to disk, you should call sync() on the FileDescriptor. (This is not necessary for most applications, though.)
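A sketch of what that looks like in practice, assuming you really do need the data on disk before continuing. The class and method names are hypothetical; getFD() and sync() are the standard FileOutputStream and FileDescriptor calls.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

class DurableWrite {
    static void writeDurably(File file, byte[] data) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(file);
             BufferedOutputStream out = new BufferedOutputStream(fos)) {
            out.write(data);
            out.flush();        // Java-side buffer -> operating system
            fos.getFD().sync(); // operating system buffers -> storage device
        }
    }

    public static void main(String[] args) throws IOException {
        writeDurably(new File(args[0]), "important data\n".getBytes("UTF-8"));
    }
}
```

The flush() has to come before the sync(); otherwise bytes still held in the BufferedOutputStream never reach the OS in time to be synced.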
On Tue, 20 May 2008, Roger Lindsjö wrote: > Tom Anderson wrote: >> On Sat, 17 May 2008, Neil Coffey wrote:
>> Except that you *do* need to be aware of the secret buffering for >> correctness reasons:
>> "you can't rely on data being written to a file until you've flushed the >> FileOutputStream."
>> So, for things like FileOutputStream, you have to treat them as both >> unbuffered (by wrapping them in a buffered stream) and buffered (by >> remembering to flush) at the same time!
> Actually, you still can't be sure that the data has been written to the > file, only that it has been passed from the Java side to the OS > side. To ensure the data has been committed to disk, you should call sync() > on the FileDescriptor. (This is not necessary for most > applications, though.)
My impression was that this was not the case - that FileOutputStream.flush() operates the C library or OS's flush mechanism.
Ah, no, OutputStream.flush:
"If the intended destination of this stream is an abstraction provided by the underlying operating system, for example a file, then flushing the stream guarantees only that bytes previously written to the stream are passed to the operating system for writing; it does not guarantee that they are actually written to a physical device such as a disk drive."
How unhelpful.
tom
-- there is never a wrong time to have your bullets passing further into someone's face -- D
>> I believe that BufferedOutputStream is the only one that does buffering >> *in java* (more or less ...), but others may involve buffers out in >> native code or the OS.
> As far as your Java application is concerned, I think you > should generally treat "secret buffering at the OS level" as > "no buffering" and should wrap in a BufferedInput/OutputStream-- > otherwise you have the overhead of the native call on every > single read/write.
> I believe you don't need extra buffering in the streams given to you > by some Servlet implementations (they do their own Java-side > buffering to handle the HTTP protocol), though I'd be interested > if anyone has further insight on this.
You ought not to need extra buffering, since response buffering has been part of the API since Servlet 2.2. This doesn't so much "handle" the HTTP protocol as just make it easier to work with, especially as regards error handling.
Working in cooperation with that would be buffering to support chunked encoding, which would be directly "handling" the HTTP protocol. Wikipedia (http://en.wikipedia.org/wiki/HTTP#Persistent_connections) rather confusingly says that chunking allows data on persistent connections to be streamed rather than buffered, but of course the mechanism is still going to be doing buffering.