Can I change the default line separator for Hive?

I can change the default field separator in Hive, but I cannot find a way to change the default line separator. I have many files whose lines are separated by '\u0002'. I tried writing a custom InputFormat for Hive and overriding next() in its record reader:

        @Override
        public boolean next(LongWritable key, Text value) throws IOException {
            // Read the next record from the wrapped reader into the 'text' field.
            if (reader.next(key, text)) {
                // Lowercase the record and replace every U+0002 with a newline.
                String strReplace = text.toString().toLowerCase()
                        .replaceAll("\u0002", "\n");
                // Hand the rewritten record back to the caller.
                value.set(strReplace);
                return true;
            }
            // The wrapped reader has no more records.
            return false;
        }

      

But it didn't work. Is there another way to do this?
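
A note on why the override above may not help: the wrapped reader has already decided where each record ends (on '\n') before next() sees it, so replacing '\u0002' inside a record only changes that record's text and cannot create new record boundaries. The split has to happen while the bytes are read. Below is a minimal, Hadoop-free sketch of that idea in plain Java. The class name DelimiterSplitSketch and the method readRecords are hypothetical, made up for illustration, and this is not Hive's or Hadoop's actual reader. Some Hadoop versions also expose a textinputformat.record.delimiter property for TextInputFormat, which may be worth checking before writing a custom reader; whether it applies depends on the Hadoop and Hive versions in use.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: splits a character stream into records
// on a custom delimiter instead of on '\n'.
public class DelimiterSplitSketch {

    // Reads all characters from 'in' and starts a new record each time
    // the delimiter character is seen. The delimiter itself is dropped.
    public static List<String> readRecords(Reader in, char delimiter) throws IOException {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == delimiter) {
                // Delimiter reached: the current record is complete.
                records.add(current.toString());
                current.setLength(0);
            } else {
                current.append((char) c);
            }
        }
        // Keep a trailing record that is not followed by a delimiter.
        if (current.length() > 0) {
            records.add(current.toString());
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Three records separated by U+0002, with no newlines at all.
        String data = "a,1\u0002b,2\u0002c,3";
        for (String record : readRecords(new StringReader(data), '\u0002')) {
            System.out.println(record);
        }
    }
}
```

In a real record reader the same loop would work on bytes from the input split and would also need to handle records that straddle split boundaries, which this sketch ignores.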
