Can I change the default line separator for Hive?
I can change Hive's default field separator, but I cannot find a way to change the default line separator. I have many files whose line separator is '\u0002', so I tried writing a custom InputFormat for Hive and overriding next() in its record reader:
@Override
public boolean next(LongWritable key, Text value) throws IOException {
    while (reader.next(key, text)) {
        String strReplace = text.toString().toLowerCase()
                .replaceAll("\u0002", "\n");
        Text txtReplace = new Text();
        txtReplace.set(strReplace);
        value.set(txtReplace.getBytes(), 0, txtReplace.getLength());
        return true;
    }
    return false;
}
But it didn't work. Is there another way to do this?
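A note on why the attempt above may fail, assuming the wrapped reader is a standard line-based reader: by the time next() returns, the input has already been split on "\n", and each call yields exactly one row. Replacing '\u0002' with "\n" inside the value only puts newlines inside that single row; it does not create additional rows. The split on '\u0002' would have to happen where records are read from the stream. The sketch below shows that idea in plain Java, without any Hadoop or Hive classes; DelimitedRecordSketch and readRecords are hypothetical names used only for illustration. Separately, Hadoop's TextInputFormat reads a textinputformat.record.delimiter setting in some versions; whether a given Hive version honors it is something to verify rather than assume.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class DelimitedRecordSketch {

    /**
     * Splits the character stream into records at each occurrence of
     * `delimiter`. Hypothetical helper, not part of Hive or Hadoop.
     */
    public static List<String> readRecords(Reader in, char delimiter) throws IOException {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == delimiter) {
                // End of a record: emit it and start collecting the next one.
                records.add(current.toString());
                current.setLength(0);
            } else {
                current.append((char) c);
            }
        }
        // Emit the trailing record if the input does not end with the delimiter.
        if (current.length() > 0) {
            records.add(current.toString());
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Three rows separated by U+0002 instead of a newline.
        String data = "a,1\u0002b,2\u0002c,3";
        for (String record : readRecords(new StringReader(data), '\u0002')) {
            System.out.println(record);
        }
    }
}
```

Running main prints the three rows a,1, b,2 and c,3 on separate lines. In a real record reader the same loop would run over the file split's byte stream and return one record per next() call, and it would also need to handle records that cross split boundaries, which this sketch leaves out.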