Query for dates listed in another table

I want to select the rows of a table whose date falls between two dates stored in a separate table. The details of my tables and query can be found in the previous question here (I am now interested in how to do this in Hive / HiveQL). As it stands, my query runs for a long time and then seems to hang indefinitely, whereas it finishes quickly when I hard-code the dates. Tables and query for reference:

VISIT_INFO, with these columns:

pers_key - unique identifier for each person
pers_name - name of person
visit_date - date on which they visited a business

      

VALID_DATES, with these columns:

condition - string
start_date - date
end_date - date 

      

And the query itself:

select pers_key, pers_name from VISIT_INFO a
CROSS JOIN
(select start_date, end_date from VALID_DATES where condition = 'condition1') b
WHERE (a.visit_date >= b.start_date and a.visit_date <= b.end_date)
GROUP BY a.pers_key

      

It's worth noting that I'm using Hive 0.12, so getting rid of the join and putting a subquery in the WHERE clause is out of the question. I am wondering what exactly is wrong with this query or what could cause it to fail. Any suggestions on how to improve it would be appreciated.



1 answer


Try a plain join, and list every selected column in the GROUP BY:

select pers_key, pers_name 
from VISIT_INFO a 
join 
valid_dates b
WHERE a.visit_date BETWEEN b.start_date AND b.end_date
GROUP BY pers_key, pers_name;
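
Note that the query above joins against every row of valid_dates. If only the date range for 'condition1' from the question is wanted, the filter can be kept in a subquery, as in the original attempt. A minimal sketch, assuming the table and column names from the question and that this filter is still required:

```sql
-- Sketch only: same range join, restricted to the 'condition1' rows.
-- Filtering valid_dates in the subquery keeps the joined side small
-- before the join pairs it with every row of visit_info.
SELECT a.pers_key, a.pers_name
FROM visit_info a
JOIN (
  SELECT start_date, end_date
  FROM valid_dates
  WHERE condition = 'condition1'
) b
WHERE a.visit_date BETWEEN b.start_date AND b.end_date
-- Both selected columns appear in GROUP BY, so each person is returned once.
GROUP BY a.pers_key, a.pers_name;
```

BETWEEN is inclusive on both ends, so it matches the >= and <= comparisons in the question.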

      



As of Hive 0.13, the implicit (comma) join syntax also works:

select pers_key, pers_name 
from VISIT_INFO a , valid_dates b
WHERE a.visit_date BETWEEN b.start_date AND b.end_date
GROUP BY pers_key, pers_name;

      
