Hadoop HIVE - How to query part of rows

Question

If I have a table,

table name : mytable
columns : id, name, sex, age, score
row1 : 1,Albert,M,30,70
row2 : 2,Scott,M,34,60
row3 : 3,Amilie,F,29,75
...
row100 : 100,Jim,M,35,80

I want to select them in five batches of 20 rows each.

1st iteration : row1 ~ row20
2nd iteration : row21 ~ row40
...
5th iteration : row81 ~ row100

How can I write this query in Hive? Is there a known way to do it? The query below returns all 100 rows.

SELECT * FROM mytable;

But I really only want to see 20 rows each time.

+3

hive

Dorr Dec 10. 14 at 23:28

2 answers

Updating this in case someone else tries this solution.

For me, it only worked with parentheses after ROW_NUMBER and with an outer SELECT wrapped around the query to carry the WHERE clause, since the "rowid" alias is not available in the WHERE clause of the SELECT that defines it. I wasted some time figuring that out.

SELECT * FROM (
    SELECT *, ROW_NUMBER() OVER(Order by id) as rowid FROM mytable
) t1
WHERE rowid > 0 and rowid <= 20;

+1

Luke P 10 oct. 17 at 12:40

Kishore kumar suthar · Accepted Answer · 2014-12-11T05:26:36+0000

In MySQL this is easily done with LIMIT and OFFSET. Hive supports LIMIT but not OFFSET (I am not 100% sure about that). You can still limit your output with:

SELECT * FROM mytable
LIMIT 20;

This returns only 20 rows, but it cannot give you rows 21 to 40.
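
For what it's worth, newer Hive releases (2.0.0 and later, as far as I know) accept a two-argument LIMIT in the form "offset, row count". A minimal sketch, assuming such a version is available and that id gives a stable ordering:

```sql
-- Sketch, assuming Hive 2.0.0 or later, where LIMIT takes "offset, row_count".
-- ORDER BY keeps the page boundaries deterministic between queries.
SELECT * FROM mytable ORDER BY id LIMIT 0, 20;   -- rows 1 to 20
SELECT * FROM mytable ORDER BY id LIMIT 20, 20;  -- rows 21 to 40
SELECT * FROM mytable ORDER BY id LIMIT 80, 20;  -- rows 81 to 100
```

On older versions this syntax is not accepted, and the ROW_NUMBER approach is the one to use.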

You can use ROW_NUMBER in Hive:

SELECT * FROM (
    SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rowid FROM mytable
) t
WHERE rowid > 0 AND rowid <= 20;

For the next batch, change the condition in the WHERE clause:

SELECT * FROM (
    SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rowid FROM mytable
) t
WHERE rowid > 20 AND rowid <= 40;

You can also pass the rowid bounds in as variables, either by reading them from a text file or by running an OS command and assigning its output to a Hive variable.
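
One way to do this is Hive's variable substitution with --hivevar. This is only a sketch: the file name page.hql and the variable names start_row and end_row are made up for illustration.

```sql
-- page.hql (hypothetical file name); start_row and end_row are hypothetical variables.
-- Run it from the shell with the bounds for the batch you want, for example:
--   hive --hivevar start_row=20 --hivevar end_row=40 -f page.hql
SELECT * FROM (
    SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rowid FROM mytable
) t
WHERE rowid > ${hivevar:start_row} AND rowid <= ${hivevar:end_row};
```

The query text stays the same for every batch; only the two values passed on the command line change.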
