Java String.split alternative for better performance
While importing data from a comma- or tab-separated file, my code takes a long time to load the data. Is there a faster alternative? This is the code I use to split each line into an array of fields.
// Here - lineString = fileReader.readLine()
public static String[] splitAndGetFieldNames(String lineString, String fileType)
{
    if (lineString == null || lineString.trim().equals("")) {
        return null;
    }
    System.out.print("LINEEEE " + lineString);
    // Lookahead: only split on a delimiter followed by an even number of quotes
    String pattern = "(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))";
    if (fileType.equals("tab"))
        pattern = "\t" + pattern;
    else
        pattern = "," + pattern;
    String[] fieldNames = lineString.split(pattern);
    for (int i = 0; i < fieldNames.length; i++) {
        //logger.info("Split Fields::"+fieldNames[i]);
        if (fieldNames[i].startsWith("\""))
            fieldNames[i] = fieldNames[i].substring(1);
        if (fieldNames[i].endsWith("\""))
            fieldNames[i] = fieldNames[i].substring(0, fieldNames[i].length() - 1);
        fieldNames[i] = fieldNames[i].replaceAll("\"\"", "\"").trim();
        //logger.info("Split Fields after manipulation::"+fieldNames[i]);
    }
    return fieldNames;
}
2 answers
Use a CSV parser such as super-csv.
Univocity maintains a comparison of CSV parsers. It reports that univocity-parsers is fast, which isn't surprising given who publishes it. It may be worth a try.
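To illustrate why a dedicated parser tends to beat the lookahead regex, here is a minimal single-pass sketch in plain Java. It is not the API of super-csv or univocity-parsers; the class `QuotedLineSplitter` and method `splitLine` are hypothetical names made up for this example. It assumes fields never span multiple lines and that a doubled quote inside a quoted field stands for one literal quote, as in the question's code.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedLineSplitter {

    // Hypothetical helper (not from the question or any library):
    // walks the line once, tracking whether we are inside quotes.
    public static String[] splitLine(String line, char delimiter) {
        if (line == null || line.trim().isEmpty()) {
            return null; // same contract as the question's method
        }
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote -> one literal quote
                        i++;                 // skip the second quote
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // delimiters inside quotes are data
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote, not part of the value
            } else if (c == delimiter) {
                fields.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString().trim()); // last field
        return fields.toArray(new String[0]);
    }
}
```

Call it as `splitLine(lineString, fileType.equals("tab") ? '\t' : ',')`. Unlike `String.split`, this sketch keeps trailing empty fields, so check that difference against your data. Removing the `System.out.print` call per line is also likely to help, since console output is slow. For real-world files, a maintained parser is still the safer choice.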