DISTINCT inside SQL Server count
I am writing a product search module that uses faceted search. To simplify things a bit, there are two objects: product information and price information. Since products may be sold by multiple online stores, one product may have more pricing information.
product
- product_guid
- product_category
- product_brand
priceInformation
- priceInformation_guid
- priceInformation_price
- priceInformation_product_guid
I list all products in a specific category and the user can filter by price range. The price ranges are indicated with checkboxes, behind each price range you can see the number of products that match the search criteria. Price ranges that do not match are not displayed. For each price range, I make a query to determine the number of matches (number of edges).
select count(distinct(product.product_guid)) as count
from product
INNER JOIN priceInformation ON product.product_guid =
priceInformation.priceInformation_product_guid
WHERE category= 'beer'
AND product.priceInformation_price > 20
AND product.priceInformation_price <= 30
Everything works fine, but ... because the number of ranges is growing (this is just a simplified example). I run hundreds of queries for each search. This is a big killer. I tried to get all the facet counts in one query, but since there is a 1: n relationship between product and price information, the result is the number of prices that match, not the number of products.
select (Count(Case WHEN product.priceInformation_price > 20 AND
product.priceInformation_price <= 30 THEN 1
ELSE NULL END)) as Range2030,
(Count(Case WHEN product.priceInformation_price > 30 AND
product.priceInformation_price <= 40 THEN 1
ELSE NULL END)) as Range3040
from product
INNER JOIN priceInformation ON product.product_guid =
priceInformation.priceInformation_product_guid
WHERE category= 'beer'
The bottom line is that I missed DISTINCT here. I pull my hair out for a few days. Can anyone please help?
source to share
Use a derived table to get different price ranges, make GROUP BY
to count different. Something like:
select count(distinct(dt.product_guid)), dt.pricerange as count
from
(
select product.*, case when price < 20 then 'price < 20'
when price between 20 and 30 then 'price 20 - 30'
else 'price > 30' end as pricerange
from product
INNER JOIN priceInformation ON product.product_guid =
priceInformation.priceInformation_product_guid
WHERE category= 'beer'
) dt
group by dt.pricerange
Or have I now answered something to something else ...?
source to share
I think you should create a table with price ranges, for example:
create table PriceRanges(minPrice money,maxPrice money);
insert PriceRanges values (0,9),(10,19),(20,29),(100,149) ;
Then, using this table, the query:
SELECT minPrice,maxPrice,COUNT(DISTINCT p.product_guid)
as ProductCount
FROM PriceRanges pr
LEFT JOIN priceInformation pi ON pi.priceInformation_price
BETWEEN pr.minPrice AND pr.maxPrice
LEFT JOIN product p ON pi.priceInformation_product_guid= p.product_guid
WHERE p.product_category = 'beer'
GROUP BY minPrice,maxPrice
ORDER BY minPrice
Or use JOIN instead of LEFT JOIN if you don't need empty price ranges.
source to share