Faster implementation of multiple inputs on one line (Java)

This might be a silly question.

I just need a faster implementation of the following problem.

I want to take three integer inputs in one line, for example:

10 34 54

      

One way is to create a BufferedReader and call readLine(), which reads the entire line as a String; a StringTokenizer can then split out the three integers. (Slow implementation)

Another way is to use a Scanner and read the input with the nextInt() method. (Slower than the previous method)

I want a fast implementation that accepts this kind of input, since I have to read over 2,000,000 lines and these implementations are very slow.

My implementation:

BufferedReader br=new BufferedReader(new InputStreamReader(System.in));

for(i=0;i<n;i++) {
    str=br.readLine();
    st = new StringTokenizer(str);
    t1=Integer.parseInt(st.nextElement().toString());
    t2=Integer.parseInt(st.nextElement().toString());
    z=Long.parseLong(st.nextElement().toString());
}

      

This loop runs n times (n is the number of lines). Since I know each line will contain exactly three integers, there is no need to check for hasMoreElements().



3 answers


"I just need a faster implementation of the following problem."

Chances are that you DO NOT need a faster implementation. Seriously. Even with a 2 million line input file.

It is likely that



  • more time is spent processing the file than reading it, and
  • most of the "read time" is spent working at the operating system level or simply waiting for the next disk block read.

My advice is not to bother optimizing this unless the application as a whole takes too long to run. And if you find that it does, profile your application and use the profile statistics to tell you where to spend your optimization effort.

(My feeling is that there isn't much to optimize for this part of the application, but don't rely on that. Profile!)
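Short of running a real profiler, a rough timing split can already hint at where the time goes. The sketch below is only an illustration: the class and method names are made up for this example, and calling System.nanoTime() on every line adds overhead of its own, so treat the numbers as a coarse indication rather than a measurement.

```java
import java.io.BufferedReader;
import java.io.IOException;

class ReadTimingSketch {
    // Hypothetical helper: returns {nanoseconds spent in readLine, nanoseconds spent parsing}.
    static long[] timeReadVersusParse(BufferedReader br) throws IOException {
        long readNanos = 0;
        long parseNanos = 0;
        while (true) {
            long t0 = System.nanoTime();
            String line = br.readLine();
            long t1 = System.nanoTime();
            readNanos += t1 - t0;
            if (line == null) {
                break; // end of input
            }
            // Stand-in for the real per-line work: split and parse each number.
            for (String s : line.split(" ")) {
                Long.parseLong(s);
            }
            parseNanos += System.nanoTime() - t1;
        }
        return new long[] {readNanos, parseNanos};
    }
}
```

If the first number dominates, the time is going into I/O; if the second dominates, the parsing or processing is the better place to look. A proper profiler remains the more reliable tool.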



Here's a basic example that will be pretty quick:

public static void main(String[] args) throws IOException {
    BufferedReader reader = new BufferedReader(new FileReader("myfile.txt"));
    String line;
    while ((line = reader.readLine()) != null) {
        for (String s : line.split(" ")) {
            final int i = Integer.parseInt(s);
            // do something with i...
        }
    }
    reader.close();
}

      



However, your task is going to take some time no matter what.

If you are doing this on a website and you are hitting a timeout, then you should consider doing the work on a background thread and sending a response to the user indicating that the data is being processed. You will probably need to add a way for the user to check progress.



This is what I mean when I say "specialized scanner". Depending on the efficiency of the parser (or splitting), this might be slightly faster (it probably isn't):

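BufferedReader br=new BufferedReader(...);
for(int i=0;i<n;i++)
{
    String str=br.readLine();
    long[] resultLongs = {-1,-1,-1};
    int startPos=0;
    int nextLongIndex=0;
    for (int p=0;p<str.length();p++)
    {
        if (str.charAt(p)== ' ')
        {
            // substring's end index is exclusive, so p (the space) is the correct end
            String longAsStr=str.substring(startPos, p);
            resultLongs[nextLongIndex++]=Long.parseLong(longAsStr);
            startPos=p+1;
        }
    }
    // the last number has no trailing space, so parse it after the loop
    resultLongs[nextLongIndex]=Long.parseLong(str.substring(startPos));
    // t1, t2 and z are in resultLongs[0] through resultLongs[2]
}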

      

HTHS.

And of course this fails if the input file contains garbage, i.e. anything other than longs separated by single spaces.
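Taking the "specialized scanner" idea one step further, the digits can be accumulated directly so that no substring is created at all. This is a sketch of that variation, not the code above: the class and method names are invented for the example, it assumes exactly three decimal numbers separated by single spaces, and it does no overflow checking.

```java
class LineParserSketch {
    // Hypothetical helper: parses "a b c" into three longs without creating substrings.
    static long[] parseThree(String line) {
        long[] out = new long[3];
        int idx = 0;
        long value = 0;
        boolean negative = false;
        for (int p = 0; p < line.length(); p++) {
            char c = line.charAt(p);
            if (c == ' ') {
                // A space ends the current number; store it and reset the state.
                out[idx++] = negative ? -value : value;
                value = 0;
                negative = false;
            } else if (c == '-') {
                negative = true;
            } else if (c >= '0' && c <= '9') {
                value = value * 10 + (c - '0');
            } else {
                throw new NumberFormatException("unexpected character '" + c + "' in: " + line);
            }
        }
        // The last number is not followed by a space.
        out[idx] = negative ? -value : value;
        return out;
    }
}
```

For example, parseThree("10 34 54") yields the values 10, 34 and 54. A line with more than three numbers would fail with an ArrayIndexOutOfBoundsException, which fits the stated assumption that the input is clean.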

Besides that, to minimize the number of OS calls, it's a good idea to give the BufferedReader a non-default (larger than default) buffer size.
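The buffer size is set through the two-argument BufferedReader constructor. A minimal sketch, assuming a 64K-character buffer (the default is 8192 characters); the class name, the method name and the chosen size are illustrative only, and the best size depends on the machine and the input:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.StringTokenizer;

class LargeBufferSketch {
    // Assumed value for illustration; measure before settling on a size.
    static final int BUFFER_CHARS = 1 << 16;

    // Sums the third value of every line, just to have something to compute.
    static long sumThirdColumn(Reader source) throws IOException {
        long sum = 0;
        try (BufferedReader br = new BufferedReader(source, BUFFER_CHARS)) {
            String line;
            while ((line = br.readLine()) != null) {
                StringTokenizer st = new StringTokenizer(line);
                int t1 = Integer.parseInt(st.nextToken());
                int t2 = Integer.parseInt(st.nextToken());
                long z = Long.parseLong(st.nextToken());
                sum += z;
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for new InputStreamReader(System.in) or a FileReader.
        System.out.println(sumThirdColumn(new StringReader("10 34 54\n1 2 3\n")));
    }
}
```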

To clarify another hint I gave in a comment: if you need to read such a huge text file more than once, i.e. more than once after it was last updated, you can read all the longs into a data structure (perhaps a List of elements that each hold three longs) and stream that to a "cache" file. Next time, compare the timestamp of the text file with that of the cache file. If the text file is older, read the cache file. Since stream I/O does not convert longs to and from their string representation, you will see much better read times.

EDIT: Added the missing startPos reassignment. EDIT2: Added a description of the cache idea.
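One way the cache idea could look, using DataOutputStream and DataInputStream to store each long as 8 raw bytes. This is a sketch under assumptions: the class and method names are invented, the file layout (a row count followed by three longs per row) is just one possible choice, and each row is assumed to hold exactly three values.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class BinaryCacheSketch {
    // Writes the row count, then every row as three raw 8-byte longs.
    static void writeCache(Path cache, List<long[]> rows) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(cache)))) {
            out.writeInt(rows.size());
            for (long[] row : rows) {
                out.writeLong(row[0]);
                out.writeLong(row[1]);
                out.writeLong(row[2]);
            }
        }
    }

    // Reads the rows back; no string parsing is involved.
    static List<long[]> readCache(Path cache) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(cache)))) {
            int count = in.readInt();
            List<long[]> rows = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                rows.add(new long[] {in.readLong(), in.readLong(), in.readLong()});
            }
            return rows;
        }
    }

    // The cache is usable only if it exists and is at least as new as the text file.
    static boolean cacheIsFresh(Path text, Path cache) throws IOException {
        return Files.exists(cache)
                && Files.getLastModifiedTime(cache).compareTo(Files.getLastModifiedTime(text)) >= 0;
    }
}
```

A caller would check cacheIsFresh first, fall back to parsing the text file when it returns false, and then call writeCache so the next run can skip the parsing.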
