Validating Sqoop imports that use QUERY and WHERE clauses
I am streamlining a data import process that takes data from an existing database and loads it into HDFS. By default, Sqoop splits the job into four map tasks, and I now have the job scheduled to run daily via Apache Oozie.
Since Oozie is DAG-oriented, is it possible to create a validationStep in the Oozie workflow that does the following?
- Runs a Hive query on the newly imported data to return the row count
- Runs a SQL query to return the row count in the original data source
- Compares the two values
- If they don't match, returns FAIL and kills the job; if they match, returns TRUE and OK
I understand that Sqoop has its own validation process, but as far as I can tell it does not apply here, since I am not importing a single whole table (each of my Sqoop imports is split on a specific date).
Is it possible?
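To make the comparison step above concrete, here is a minimal sketch in Python of what I have in mind. It assumes the two counts have already been fetched (for example by `hive -e` and by a SQL client) and are passed in as text. The function names `parse_count`, `validate_counts`, and `main` are hypothetical, not part of Sqoop or Oozie. The idea is that the script signals failure through a non-zero exit code, which an Oozie shell action would route to its error transition (and from there to a kill node).

```python
import sys


def parse_count(output: str) -> int:
    """Return the row count found on the last non-empty line of CLI output.

    Assumption: the count query prints the number as its final line.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no count found in output")
    return int(lines[-1])


def validate_counts(source_count: int, hive_count: int) -> int:
    """Return exit code 0 when the counts match, 1 otherwise."""
    if source_count == hive_count:
        print(f"OK: both sides report {source_count} rows")
        return 0
    print(f"FAIL: source={source_count}, hive={hive_count}", file=sys.stderr)
    return 1


def main(argv: list) -> int:
    """Hypothetical entry point: argv holds [source_output, hive_output]."""
    if len(argv) != 2:
        print("usage: validate_counts.py <source_output> <hive_output>",
              file=sys.stderr)
        return 2
    return validate_counts(parse_count(argv[0]), parse_count(argv[1]))
```

A wrapper script would end with `sys.exit(main(sys.argv[1:]))`, so that a mismatch makes the action fail and the workflow stop instead of continuing to the next node.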