Strings, text, numbers, I/O: Part II
Due to the file's size, I have to jump directly to a given line number so that I can
call readLine(). Is there a fast way to do so without traversing all the previous lines?
Note that I've tried the LineNumberReader class, but it only keeps track of line
numbers and does not allow me to go to a specific position in the stream.
Answer, Part 1:
No, there is no easy way because the exact beginning of a line isn't stored
anywhere. It's determined by the content of the file, and nobody is keeping track of
the line feeds, carriage returns, etc.
If the file doesn't change, you could find the offset of each line relative to the
beginning of the file once, in a separate pass over the original file, and store these
offset values in a binary file that has one long value for each line.
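The indexing pass described above could look roughly like the following sketch. The class and method names (LineIndexBuilder, buildIndex) are made up for illustration, and the code assumes that lines end in "\n" or "\r\n"; a file that uses a bare "\r" as line terminator would need extra handling.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of the JDK.
public class LineIndexBuilder {

    /**
     * Scans dataFile once and writes the byte offset of the start of every
     * line to indexFile, one 8-byte long per line (big-endian, as written
     * by DataOutputStream.writeLong). Returns the number of lines indexed.
     * Assumption: lines are terminated by "\n" or "\r\n".
     */
    public static long buildIndex(String dataFile, String indexFile) throws IOException {
        long lines = 0;
        try (InputStream in = new BufferedInputStream(new FileInputStream(dataFile));
             DataOutputStream out = new DataOutputStream(
                     new BufferedOutputStream(new FileOutputStream(indexFile)))) {
            long pos = 0;               // offset of the byte read in this iteration
            boolean atLineStart = true; // true if the next byte begins a new line
            int b;
            while ((b = in.read()) != -1) {
                if (atLineStart) {
                    out.writeLong(pos); // record where this line begins
                    lines++;
                    atLineStart = false;
                }
                pos++;
                if (b == '\n') {
                    atLineStart = true;
                }
            }
        }
        return lines;
    }
}
```

Since the index stores byte offsets rather than character positions, it has to be rebuilt whenever the data file changes.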
You can then load these long values into an array and access it whenever you need
a line. You then seek to that position and read the line directly. If there are too many
long offsets to be kept in memory, you still have a speed advantage: if you want the
offset of line i (i is zero-based, so the first line is 0, the second 1 and so on), you seek
to i * 8 (that's the size of a long in bytes) with a RandomAccessFile, load the offset value,
go to that position in the data file and load the line.
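As a minimal sketch of the lookup just described, assuming an index file that already holds one 8-byte big-endian offset per line (the layout DataOutputStream.writeLong produces). The names IndexedLineReader and readLine(dataFile, indexFile, i) are invented for this example. Note that RandomAccessFile.readLine() turns each byte into one char, so text that is not plain ASCII or Latin-1 would need to be read as bytes and decoded separately.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper, not part of the JDK.
public class IndexedLineReader {

    /**
     * Returns line i (zero-based) of dataFile, or null if i is out of range.
     * Assumption: indexFile contains one 8-byte offset per line of dataFile.
     */
    public static String readLine(String dataFile, String indexFile, long i) throws IOException {
        try (RandomAccessFile index = new RandomAccessFile(indexFile, "r");
             RandomAccessFile data = new RandomAccessFile(dataFile, "r")) {
            long slot = i * 8L;                 // each index entry is one long
            if (i < 0 || slot + 8 > index.length()) {
                return null;                    // no such line in the index
            }
            index.seek(slot);
            long offset = index.readLong();     // where line i starts in the data file
            data.seek(offset);
            return data.readLine();             // reads up to the next line terminator
        }
    }
}
```

A program that reads many lines would probably keep both files open between calls instead of reopening them every time, as this sketch does for brevity.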
With intelligent caching, you could avoid a good deal of disk access.
Marco Schmidt
Answer, Part 2:
In addition to the answers already given, consider restructuring the file.
For example, if it is created once and accessed many times, you could do the
following:
1. Put a line number field at the start of each line, and use binary search.
or
2. Pre-read the file and construct an index.
The two approaches can even be combined, with an index being built as you do
binary searches, so that you start a search in the right area, and learn more about
which lines are where as you go along.
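To illustrate option 1, here is one possible sketch of a binary search over byte positions. It rests on assumptions that are not part of the suggestion above: every line has the form "number, tab, text", the numbers are in ascending order, and lines end in "\n" or "\r\n". The class name NumberedLineSearch and the method find are made up for the example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper, not part of the JDK.
public class NumberedLineSearch {

    /**
     * Returns the text of the line whose number field equals target,
     * or null if there is no such line.
     * Assumption: each line looks like "<number>\t<text>", sorted by number.
     */
    public static String find(String dataFile, long target) throws IOException {
        try (RandomAccessFile data = new RandomAccessFile(dataFile, "r")) {
            long lo = 0;              // always the start of a line
            long hi = data.length();  // the wanted line starts in [lo, hi)
            while (lo < hi) {
                long mid = lo + (hi - lo) / 2;
                long lineStart;
                if (mid == lo) {
                    lineStart = lo;
                } else {
                    // Skip the rest of the line containing byte mid - 1, which
                    // leaves the file pointer at the first line start >= mid.
                    data.seek(mid - 1);
                    data.readLine();
                    lineStart = data.getFilePointer();
                }
                if (lineStart >= hi) {
                    hi = mid;         // no line starts in [mid, hi)
                    continue;
                }
                data.seek(lineStart);
                String line = data.readLine();
                int tab = line.indexOf('\t');
                long number = Long.parseLong(line.substring(0, tab));
                if (number == target) {
                    return line.substring(tab + 1);
                }
                if (number < target) {
                    lo = data.getFilePointer(); // continue after this line
                } else {
                    hi = lineStart;             // continue before this line
                }
            }
            return null;
        }
    }
}
```

Each probe costs one seek plus the reading of at most two lines, so the number of disk accesses grows with the logarithm of the file size rather than with the line number.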
Alternatively, you could keep an index at the start or end of the file.
Note that in any indexing technique, including the one Marco Schmidt suggested
using a separate file, you don't necessarily need to keep a complete index.
For example, if you know the file offset of the start of every 100th record (records 0,
100, 200 and so on), you can go straight to a record near the one you need and scan
from there.
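A partial index of this kind might be sketched as follows. The class name SparseLineIndex and the step parameter are assumptions made for the example, the index is kept in memory rather than in the file, and lines are assumed to end in "\n" or "\r\n".

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the JDK.
public class SparseLineIndex {
    private final String dataFile;
    private final int step;
    private final long[] offsets; // offsets[k] is the start of line k * step

    /** Scans the file once and remembers the offset of every step-th line. */
    public SparseLineIndex(String dataFile, int step) throws IOException {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        this.dataFile = dataFile;
        this.step = step;
        List<Long> found = new ArrayList<>();
        try (InputStream in = new BufferedInputStream(new FileInputStream(dataFile))) {
            long pos = 0;
            long line = 0;
            boolean atLineStart = true;
            int b;
            while ((b = in.read()) != -1) {
                if (atLineStart) {
                    if (line % step == 0) {
                        found.add(pos);
                    }
                    line++;
                    atLineStart = false;
                }
                pos++;
                if (b == '\n') {
                    atLineStart = true;
                }
            }
        }
        offsets = new long[found.size()];
        for (int k = 0; k < offsets.length; k++) {
            offsets[k] = found.get(k);
        }
    }

    /** Returns line i (zero-based), or null if the file has no such line. */
    public String readLine(long i) throws IOException {
        if (i < 0 || i / step >= offsets.length) {
            return null;
        }
        try (RandomAccessFile data = new RandomAccessFile(dataFile, "r")) {
            data.seek(offsets[(int) (i / step)]);   // jump to the nearest indexed line
            String line = data.readLine();
            for (long skipped = 0; skipped < i % step && line != null; skipped++) {
                line = data.readLine();             // scan forward to line i
            }
            return line;
        }
    }
}
```

With a step of 100 the index needs only one hundredth of the memory of a complete one, at the price of reading up to 99 extra lines per lookup.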
Patricia
(c)1999, 2000, 2001. JavaFAQ.nu. All rights reserved worldwide.
This document is free for distribution, you can send it to everybody who is interested in Java.
This document can not be changed, either in whole or in part
without the express written permission of the publisher.
All questions please
mailto:info@javafaq.nu