How can I shorten this SQL query without a million AND / OR?

Question

How can I shorten this SQL query without a million AND / OR?

I have a table with 13 columns (one for the ID # row and another 12 - one for each month of the year). Each column will only contain numbers and I want to write multiple queries to check each row for certain conditions and return the id of any rows that match.

So far, I've been good at writing all basic SELECT queries, but now I've gotten a bit stuck in writing SELECT queries to check multiple conditions at once without writing millions of lines of code every time.

The current query I'm writing needs to check every couple of consecutive months (like Jan / Feb, Feb / Mar, etc.) to see how big the difference between them is and needs to return any rows where 2 lots consecutive months have a 20% difference between them, and the remaining pairs have a 10% difference.

For example, if there were 1000 in January and 1300 in February, then the difference is 30%, so 1 lot. Then, if we say that April was 1500, and May was 2100, then the difference is 40%, so the second batch. Then while every other pair (Feb / Mar, Mar / Apr, ..., Nov / Dec) has a 10% difference each, then this row should be returned.

Unfortunately, the only way I can get this to work is to manually check every possible possibility (which works), but not very good for recording similar queries.

Here's a shorthand version of what I have so far:

SELECT pkID
FROM dbo.tblMonthData
WHERE

((colFeb > colJan * 1.2 AND colMar > colFeb * 1.2) AND (colApr < colMar * 1.1 AND colMay < colApr * 1.1 AND colJun < colMay * 1.1 AND colJul < colJun * 1.1 AND colAug < colJul * 1.1 AND colSep < colAug * 1.1 AND colOct < colSep * 1.1 AND colNov < colOct * 1.1 AND colDec < colNov * 1.1))

OR ((colFeb > colJan * 1.2 AND colApr > colMar * 1.2) AND (colMar < colFeb * 1.1 AND colMay < colApr * 1.1 AND colJun < colMay * 1.1 AND colJul < colJun * 1.1 AND colAug < colJul * 1.1 AND colSep < colAug * 1.1 AND colOct < colSep * 1.1 AND colNov < colOct * 1.1 AND colDec < colNov * 1.1))

OR ((colFeb > colJan * 1.2 AND colMay > colApr * 1.2) AND (colMar < colFeb * 1.1 AND colApr < colMar * 1.1 AND colJun < colMay * 1.1 AND colJul < colJun * 1.1 AND colAug < colJul * 1.1 AND colSep < colAug * 1.1 AND colOct < colSep * 1.1 AND colNov < colOct * 1.1 AND colDec < colNov * 1.1))

...

OR ((colNov > colOct * 1.2 AND colDec > colNov * 1.2) AND (colFeb < colJan * 1.1 AND colMar < colFeb * 1.1 AND colApr < colMar * 1.1 AND colMay < colApr * 1.1 AND colJun < colMay * 1.1 AND colJul < colJun * 1.1 AND colAug < colJul * 1.1 AND colSep < colAug * 1.1 AND colOct < colSep * 1.1))

There are about 55 lines of OR statements in total to test all possible combinations of this. If I then went to request something like this (for example, return all Row ID #s where 2 pairs are greater than 50% and 4 pairs are less than 10%), that would involve writing another long query from scratch, checking every possible combination.

So I was wondering how can I rewrite this in a shorter version that might also be more reusable for similar queries?

+3

sql sql-server

Calcolat 17 Aug 14 at 21:13

source to share

3 answers

Do not use the table for comparison, and then revert to the original format if necessary .:

with 
unpvt as (
    select    
         YearNo     
        ,case MonthNo when  1 then colJan
                      when  2 then colFeb
                      when  3 then colMar
                      when  4 then colApr
                      when  5 then colMay
                      when  6 then colJun
                      when  7 then colJul
                      when  8 then colAug
                      when  9 then colSep
                      when 10 then colOct
                      when 11 then colNov
                      when 12 then colDec 
                              else 0
         end as Value
        ,YearNo * 12 + MonthNo - 1 as PeriodNo
        /* other columns */
--    from dbo.tblMonthData
    from tblMonthData
    cross join ( values
        (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)
    )months(MonthNo)
)
select 
    this.*,
    this.Value - isnull(prev.Value,0) as Delta
from unpvt          this
left join unpvt     prev
   on prev.PeriodNo = this.PeriodNo - 1
;

etc .. Depending on your version of SQL Server, you may have access to the UNPIVOT article to compress the source even further.

If you have a static NUMBERS table , then the first twelve rows from it can be used instead of initializing VALUE in a CROSS JOIN.

+4

Pieter Geerkens 17 Aug 14 at 21:42

source to share

You can turn off your data row by row in cross-use by using the table-valued constructor that creates a derived table with two consecutive month values. In addition, you can perform calculations and count rows that match certain criteria using case and sum .

Your request might look something like this.

select MD.pkID
from dbo.tblMonthData as MD
  cross apply (
              select sum(case when P.Mon1 * 1.2 < P.Mon2 then 1 end),
                     sum(case when P.Mon1 * 1.1 > P.Mon2 then 1 end)
              from (values(MD.colJan, MD.colFeb),
                          (MD.colFeb, MD.colMar),
                          (MD.colMar, MD.colApr),
                          (MD.colApr, MD.colMay),
                          (MD.colMay, MD.colJun),
                          (MD.colJun, MD.colJul),
                          (MD.colJul, MD.colAug),
                          (MD.colAug, MD.colSep),
                          (MD.colSep, MD.colOct),
                          (MD.colOct, MD.colNov),
                          (MD.colNov, MD.colDec)) as P(Mon1, Mon2)
              ) as C(MoreThan20, LessThan10)
where C.MoreThan20 = 2 and
      C.LessThan10 = 9

+1

Mikael Eriksson 19 Aug 14 at 6:43

source to share

Turophile · Accepted Answer · 2014-08-19T02:41:06+0000

So, as an alternative answer to @ Pieter's question and to illustrate how a simplified data structure can make your task easier, I suggest this:

create view tblEasy as (
select pkID, 1 as colMonth, colJan as colValue from tblMonthData
UNION
select pkID, 2 as colMonth, colFeb as colValue from tblMonthData
UNION
select pkID, 3 as colMonth, colMar as colValue from tblMonthData
UNION
select pkID, 4 as colMonth, colApr as colValue from tblMonthData
UNION
select pkID, 5 as colMonth, colMay as colValue from tblMonthData
UNION
select pkID, 6 as colMonth, colJun as colValue from tblMonthData
UNION
select pkID, 7 as colMonth, colJul as colValue from tblMonthData
UNION
select pkID, 8 as colMonth, colAug as colValue from tblMonthData
UNION
select pkID, 9 as colMonth, colSep as colValue from tblMonthData
UNION
select pkID, 10 as colMonth, colOct as colValue from tblMonthData
UNION
select pkID, 11 as colMonth, colNov as colValue from tblMonthData
UNION
select pkID, 12 as colMonth, colDec as colValue from tblMonthData
);

This makes it look similar to how I would first structure the table. Then it is easy to create pairs by comparing the value on colMonth

with that on colMonth + 1

.

I've made a fiddle to illustrate how the comparison can also be done in the view, and then the query itself, if that's obvious enough.

http://sqlfiddle.com/#!3/600f6/4

Please note that performance is poor due to the original table structure.

Update Since this is accepted as an answer, I'll add more details from the sqlfiddle.

Additional view for a preliminary calculation of the difference between consecutive months:

create view tblPairs as (
  select t1.pkId , t1.colMonth as colStart, (t2.colValue * 100 / t1.colValue) as colPercentage
  from tblEasy as t1
  inner join tblEasy as t2 
  on t1.pkId = t2.pkId and t1.colMonth = t2.colMonth - 1);

Query to find where 2 months have more than 20% increase and the other 9 have less than 10%:

select distinct pkid
from tblPairs as t1
where 2 = (
    select count(*)
    from tblPairs as t2
    where t2.pkid = t1.pkid
    and colPercentage >= 120)
and 9 = (
    select count(*)
    from tblPairs as t2
    where t2.pkid = t1.pkid
    and colPercentage <= 110)
;

How can I shorten this SQL query without a million AND / OR?

More articles: