Speed up a Java application

How to speed up a Java application?

I am working on a Java application that processes Cobol files one at a time, extracts the required data from them, and loads it into a DB2 database.

When there are many files to parse, the application takes more than 24 hours to complete, which is unacceptable.

So I am populating some tables in a separate thread to speed things up, e.g.:

ArrayList list = (ArrayList)vList.clone();
ThreadPopulator populator = new ThreadPopulator(connection, list, srcMbr);
Thread thread = new Thread(populator);
thread.run();
return;


The ThreadPopulator class implements the Runnable interface, and its run method is:

public void run()
{
    try
    {
        synchronized (this)
        {
            int len = Utils.length(list);
            for (int i = 0; i < len; i++)
            {
                .....
                stmt.addBatch();
                if ((i + 1) % 5000 == 0)
                    stmt.executeBatch(); // Execute every 5000 items.
            }
        }
    }
    catch (Throwable e)
    {
        e.printStackTrace();
    }
    finally
    {
        if (list != null)
            list.clear();
    }
}

      

Note: the list is cloned so that the records cannot disappear from under the thread while the original list is reused.

Am I on the right track?

Please suggest how I can speed up my application when it has to process thousands of Cobol files.



2 answers


You first need to determine where the program spends most of its time. That means measuring CPU and possibly memory usage. Is it the parsing, which uses CPU, or the database, which uses IO?

Without measuring where your performance bottleneck is, you cannot make an informed decision about what needs to be improved.
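
As a rough illustration of that kind of measurement, here is a minimal sketch that adds up wall-clock time per phase. The `PhaseTimer` class and the phase names are hypothetical, and the `sleep` calls are only stand-ins for the real parsing and `executeBatch()` work. Wall-clock time does not separate CPU from IO wait, so a profiler or OS-level CPU monitoring would still be needed for that part.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: accumulates elapsed wall-clock time per named phase.
class PhaseTimer {
    private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();

    // Runs the work and adds its elapsed time to the given phase.
    void time(String phase, Runnable work) {
        long start = System.nanoTime();
        try {
            work.run();
        } finally {
            nanosByPhase.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    // Total milliseconds recorded for a phase, or 0 if it never ran.
    long millis(String phase) {
        return nanosByPhase.getOrDefault(phase, 0L) / 1_000_000;
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();
        for (int file = 0; file < 3; file++) {
            timer.time("parse", () -> sleep(5));  // stand-in for parsing one file
            timer.time("db", () -> sleep(20));    // stand-in for the batch insert
        }
        System.out.println("parse ms: " + timer.millis("parse"));
        System.out.println("db ms:    " + timer.millis("db"));
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

If one phase clearly dominates the totals, that is the part worth optimizing first.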

In my experience, I would first suspect the database. You have a batch size of 5000, which should be enough. How much CPU does the program use while it is running, e.g. is one processor always busy?

Note: a simple text parser can read around 40-100 MB/s. To keep it busy for 24 hours you would need to be loading several TB of data, so parsing is unlikely to be the cause.
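
To make that arithmetic concrete, here is a small sketch that converts a steady read rate into the volume read over a day. The class and method names are made up for illustration, and decimal units (1 TB = 1,000,000 MB) are assumed.

```java
// Hypothetical helper for the back-of-the-envelope estimate above.
class ParserThroughput {
    // Terabytes read in the given number of hours at a steady rate in MB/s.
    static double terabytesRead(double megabytesPerSecond, double hours) {
        double seconds = hours * 3600;
        return megabytesPerSecond * seconds / 1_000_000; // 1 TB = 1,000,000 MB
    }

    public static void main(String[] args) {
        System.out.println("at  40 MB/s for 24 h: " + terabytesRead(40, 24) + " TB");
        System.out.println("at 100 MB/s for 24 h: " + terabytesRead(100, 24) + " TB");
    }
}
```

At these rates a full day of pure parsing corresponds to roughly 3.5 to 8.6 TB of input.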



"Actually, I first need to rewrite each file into a proper format, then read those lines and extract the data I need; the source lines are even read 2-3 times per file (this is part of the logic). When I run the application on 4000K files, it runs for 24 hours."

4 million files will be a performance problem. Even a trivial file open takes about 8 ms on a fast hard drive, and opening each file 2-3 times can take about 30 hours on its own. (I guess your disk cache will save you a few hours.) The only ways to make it faster are:

  • Use fewer files. 4 million is a crazy number of files to be opening multiple times. Just opening each of them once takes about 10 hours (regardless of what you do with them).
  • Use a faster disk. An SSD can do this in about 1/100th of the time: a hard drive can do up to 120 IOPS, while a cheap SSD can do 40,000 IOPS and a good one 23,000 IOPS. The latter could open 4 million files in 12 seconds, which is rather faster than 10 hours. ;)
  • Pass over each file only once. It will still be slow, but it will be 2-3 times faster.
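
One way to pass over each file only once is to read it into memory and run every later pass on the in-memory lines. This is a minimal sketch under assumptions: the `SinglePassReader` class, the trailing-whitespace cleanup, and the keyword filter are hypothetical stand-ins for the real reformatting and extraction logic, and ISO-8859-1 is chosen only because it accepts any byte sequence.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: one disk read per file, several passes in memory.
class SinglePassReader {
    static List<String> extract(Path file, String keyword) throws IOException {
        // The only disk access: one open and one sequential read.
        List<String> lines = Files.readAllLines(file, StandardCharsets.ISO_8859_1);

        // Pass 1 (in memory): stand-in for rewriting the file into a proper format.
        List<String> normalized = new ArrayList<>(lines.size());
        for (String line : lines) {
            normalized.add(line.replaceAll("\\s+$", "")); // drop trailing whitespace
        }

        // Pass 2 (in memory): stand-in for extracting the required data.
        List<String> matches = new ArrayList<>();
        for (String line : normalized) {
            if (line.contains(keyword)) {
                matches.add(line);
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("member", ".cbl");
        Files.write(tmp, Arrays.asList("       MOVE A TO B.   ", "       DISPLAY 'X'."),
                StandardCharsets.ISO_8859_1);
        System.out.println(extract(tmp, "MOVE"));
        Files.delete(tmp);
    }
}
```

This trades memory for disk operations, which should be fine as long as individual source files are small.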

Note: using more threads will not make your hard drive any faster.
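
The open-cost estimates above follow from simple division. Here is a small sketch of that calculation, with hypothetical names, assuming each file open costs one IO operation and the disk sustains the stated rate (120 IOPS for the hard drive, 40,000 IOPS for the cheap SSD).

```java
// Hypothetical helper: time spent just opening files at a given IOPS rate.
class OpenCostEstimate {
    // Hours needed to open every file the given number of times.
    static double hoursToOpen(long files, int opensPerFile, double opensPerSecond) {
        double totalOpens = (double) files * opensPerFile;
        return totalOpens / opensPerSecond / 3600.0;
    }

    public static void main(String[] args) {
        long files = 4_000_000L;
        System.out.println("HDD, 1 open each:  " + hoursToOpen(files, 1, 120) + " h");
        System.out.println("HDD, 3 opens each: " + hoursToOpen(files, 3, 120) + " h");
        System.out.println("SSD, 1 open each:  " + hoursToOpen(files, 1, 40_000) + " h");
    }
}
```

On the hard drive this gives a little over 9 hours for a single open per file and close to 28 hours for three, against about 100 seconds on the cheap SSD.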



You call

thread.run();

      

instead of



thread.start();

      

which means your code never actually runs on a separate thread: run() simply executes on the calling thread.
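
The difference is easy to observe by recording which thread executes the task. This is a minimal sketch; the `RunVersusStart` class and the thread name "populator" are made up for the demonstration.

```java
// Hypothetical demo: shows which thread executes a task under run() and start().
class RunVersusStart {
    // Returns the name of the thread that actually executed the task.
    static String executedOn(boolean useStart) {
        String[] seen = new String[1];
        Thread worker = new Thread(
                () -> seen[0] = Thread.currentThread().getName(), "populator");
        if (useStart) {
            worker.start();      // the task runs on the new "populator" thread
            try {
                worker.join();   // wait so the recorded name is visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        } else {
            worker.run();        // plain method call: runs on the caller's thread
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("run():   " + executedOn(false));
        System.out.println("start(): " + executedOn(true));
    }
}
```

With run() the reported name is that of the calling thread, and only start() reports "populator".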

Also, I second @Peter's answer.
