Hadoop Hive - How to query a subset of rows

If I have a table,

table name : mytable
columns : id, name, sex, age, score
row1 : 1,Albert,M,30,70
row2 : 2,Scott,M,34,60
row3 : 3,Amilie,F,29,75
...
row100 : 100,Jim,M,35,80

      

I want to select them in five iterations, 20 rows at a time.

1st iteration : row1 ~ row20
2nd iteration : row21 ~ row40
...
5th iteration : row81 ~ row100

      

How can I write such a query in Hive? Is there a known way to do this? The query below returns all 100 rows.

SELECT * FROM mytable;

      

But I really only want to see 20 rows each time.



2 answers


In MySQL this is easily done with LIMIT and OFFSET. Hive supports LIMIT but, as far as I know, not OFFSET (I am not 100% sure). You can still limit your output with:

SELECT * FROM mytable
LIMIT 20;

      

This returns only 20 rows; it cannot return rows 21 to 40.
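As a hedged aside: as far as I know, Hive 2.0.0 and later accept an optional offset as the first argument to LIMIT. Assuming such a version is available, a minimal sketch for the second iteration (rows 21 to 40) could look like this. The ORDER BY is added here so that the row order, and therefore the page contents, is deterministic.

```sql
-- Assumes Hive 2.0.0 or later, where LIMIT takes "offset, rows".
-- Skips the first 20 rows (ordered by id) and returns the next 20.
SELECT * FROM mytable
ORDER BY id
LIMIT 20, 20;
```

On older versions this syntax is rejected, and a ROW_NUMBER-based query is the usual fallback.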

You can use ROW_NUMBER in Hive:



SELECT *,ROW_NUMBER over (Order by id)  as rowid FROM mytable
where rowid > 0 and rowid <=20;

      

For the next iteration, change the condition in the WHERE clause:

SELECT *,ROW_NUMBER over (Order by id)  as rowid FROM mytable
    where rowid > 20 and rowid <=40;

      

You can also pass the rowid bounds in as variables, for example by reading them from a text file, or by running an OS command and assigning its output to a Hive variable.
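One way to pass the bounds in is Hive's variable substitution. The sketch below assumes the query is saved in a script file (the name page.hql is hypothetical) and that the variable names lo and hi are chosen freely by the caller. ROW_NUMBER() is computed in a subquery so that the rowid alias can be filtered on.

```sql
-- page.hql (hypothetical file name)
-- Run for the second iteration with:
--   hive --hivevar lo=20 --hivevar hi=40 -f page.hql
-- lo and hi are illustrative variable names, not Hive built-ins.
SELECT * FROM (
    SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rowid FROM mytable
) t1
WHERE rowid > ${hivevar:lo} AND rowid <= ${hivevar:hi};
```

Inside an interactive session, the same variables can be set with SET hivevar:lo=20; and SET hivevar:hi=40; before running the query.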



Updating this. Just in case someone else is trying this solution.

For me, it only worked with parentheses after ROW_NUMBER and with a new SELECT statement wrapped around the query to hold the WHERE clause, since the "rowid" alias is not available to the WHERE clause of the SELECT that defines it. I wasted some time figuring that out.



SELECT * FROM (
    SELECT *, ROW_NUMBER() OVER(Order by id) as rowid FROM mytable
) t1
WHERE rowid > 0 and rowid <= 20;
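If the goal is simply five equal batches, a different technique from the one above is the NTILE window function, which Hive also provides. This is only a sketch under the assumption that the table holds exactly the 100 rows from the question; with other row counts the bucket sizes can differ by one. The alias batch is illustrative.

```sql
-- NTILE(5) assigns each row, ordered by id, to one of 5 buckets.
-- With 100 rows, bucket 2 holds the rows ranked 21 to 40.
SELECT * FROM (
    SELECT *, NTILE(5) OVER (ORDER BY id) AS batch FROM mytable
) t1
WHERE batch = 2;
```

Changing the value compared against batch (1 through 5) selects each iteration in turn, without recomputing row-number bounds.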

      
