Limiting the number of rows in subqueries with Teradata

I'm new to Teradata and I've run into a problem that I didn't have with the database I used previously. Basically, I'm trying to limit the number of rows returned by subqueries inside a WHERE clause. In my previous database I did this with ROWNUM and never had a problem.

My previous query was something like this:

WHERE field1 = 'foo' AND field2 IN (
    SELECT field2 FROM anotherTable
    WHERE field3 = 'bar' AND ROWNUM < 100);


Since I cannot use ROWNUM in Teradata, I looked for equivalent functions, or at least functions that could get me where I wanted even if they were not exactly equivalent. I found and tried ROW_NUMBER, TOP and SAMPLE.

I tried ROW_NUMBER(), but Teradata does not allow analytic functions in WHERE clauses. I tried TOP N, but this option is not supported in a subquery. I tried SAMPLE N, but it is also not supported in subqueries.

So I must admit that I'm a bit stuck right now. Is there any way to limit the number of rows returned by a subquery in Teradata that is close to what I did before? And if there isn't, how could the query be restructured to work with Teradata?




1 answer

The restriction on SAMPLE and TOP in a subquery probably exists because the subquery might be correlated.

But there are two workarounds.

Put SAMPLE or TOP in a derived table inside the subquery (so that it can no longer be correlated):

WHERE field1 = 'foo'
AND field2 IN (
   SELECT * FROM
       ( SELECT field2 FROM anotherTable -- or SELECT TOP 100 field2 instead of SAMPLE 100
         WHERE field3 = 'bar'  SAMPLE 100
       ) AS dt
   );


Or rewrite it as a join to a derived table:

JOIN ( SELECT DISTINCT field2 FROM anotherTable -- can't use TOP if you need DISTINCT
         WHERE field3 = 'bar' SAMPLE 100
       ) AS dt
ON myTable.field2 = dt.field2
WHERE field1 = 'foo';


TOP without ORDER BY is very similar to ROWNUM. It is not random at all, but running the query a second time may still return a different set of rows.

SAMPLE is really random, returning a different result each time.

ROW_NUMBER can also be used if you filter on it with QUALIFY instead of WHERE, but OLAP functions always require some ORDER BY, so this adds a lot more overhead.
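
As a rough sketch of the QUALIFY variant, reusing the table and column names from the question: the choice of ORDER BY field2 is only an assumption for illustration (any column will do if you don't care which rows are kept), and it assumes your Teradata release accepts QUALIFY inside the subquery.

```sql
-- Sketch only: limit the subquery with ROW_NUMBER + QUALIFY.
-- ORDER BY field2 is an arbitrary, assumed ordering; the sort is
-- the extra overhead compared with TOP or SAMPLE.
SELECT *
FROM myTable
WHERE field1 = 'foo'
AND field2 IN (
    SELECT field2
    FROM anotherTable
    WHERE field3 = 'bar'
    -- keep the first 99 rows, matching ROWNUM < 100 in the question
    QUALIFY ROW_NUMBER() OVER (ORDER BY field2) < 100
);
```

Unlike TOP without ORDER BY, this returns the same rows on every run as long as the ordering column has no ties among the rows that are kept.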


