Minimum number of lines in a group

Below is my request:

SELECT *
FROM (
    SELECT f.max, f.min, p.user_id, p.id, p.title, p.rating,
    RANK() OVER (
        PARTITION BY p.user_id
        ORDER BY p.rating DESC, p.id DESC
    ) AS rnk
    FROM posts AS p
    INNER JOIN friends AS f ON (p.user_id = f.friend_id)
    WHERE f.user_id=1
) AS subq
WHERE (subq.rnk <= subq.max)
LIMIT 10

      

It looks for messages from my friends, sorted by rating and date. The window feature implemented in this query allows me to limit the number of rows returned for each friend according to the field MAX

in the table Friends

.

However, I also have a field MIN

that is used to specify the minimum number of messages I want to receive from a request for a given friend. How is this possible?

I also wonder if SQL is the best option for these types of queries? I've already tried the Neo4j Graph database and while it seemed like a good solution, I would rather avoid using two separate databases.

SQLFiddle

Scheme:

CREATE TABLE friends(
    user_id int,
    friend_id int,
    min int,
    max int
);

CREATE TABLE posts(
   id int,
   title varchar(255),
   rating int,
   date date,
   user_id int
);

      

Let's assume we have the following data:

INSERT INTO friends VALUES
  (1,2,1,3)
, (1,3,0,5)
, (1,4,2,10);

INSERT INTO posts VALUES
  (1,  'posts1',  2,  now(), 2)
, (2,  'posts2',  1,  now(), 2)
, (3,  'posts3',  5,  now(), 2)
, (4,  'posts4',  2,  now(), 2)
, (5,  'posts5',  11, now(), 2)
, (6,  'posts6',  7,  now(), 2)
, (7,  'posts7',  3,  now(), 2)
, (8,  'posts8',  4,  now(), 3)
, (9,  'posts9',  1,  now(), 3)
, (10, 'posts10', 0,  now(), 3)
, (11, 'posts11', 7,  now(), 3)
, (12, 'posts12', 3,  now(), 3)
, (13, 'posts13', 2,  now(), 3)
, (14, 'posts14', 4,  now(), 4)
, (15, 'posts15', 9,  now(), 4)
, (16, 'posts16', 0,  now(), 4)
, (17, 'posts17', 3,  now(), 4)
, (18, 'posts18', 2,  now(), 4)
, (19, 'posts19', 1,  now(), 4)
, (20, 'posts20', 2,  now(), 4);

      

Hence, I would like to see combinations (post_id, title, rating, date, friend_id)

with the following conditions, if possible:

  • between 1 and 3 messages from a friend with id

    = 2
  • between 0 and 5 messages from a friend with id

    = 3
  • from 2 to 10 messages from a friend with id

    = 4

So, basically, if a friend of mine has friend_id=2

published 1 or more articles, I need at least 2 of them. If he has published more than 3 articles, I want no more than 3.

+3


source to share


3 answers


Let's say I want you to have 2-5 messages from you every day if you post this a lot. If you only post one, everything is fine and I will only have one post.

Your explanation in the comment still doesn't add up. Your number min

will sound without effect according to this explanation.

This is not what you wrote, but it would make sense:

Given the max impressions for posts (external LIMIT

), I want to get posts min

from each friend first (if any). If there are free slots after that, fill up max

messages for a friend.

In the example, which will be 1 ( min

) message from friend 2 with the highest priority and 2 more ( max - min

) messages if the number of slots is still available.

It would be arbitrary for messages to make the cut if there are not enough slots for each priority. I went ahead and assumed that the first post from each should be selected first, etc.



The rest is still arbitrary, but can be easily resolved if you can manage to formulate the requirement.

SELECT *
FROM   friends f
,  LATERAL (
   SELECT *
        , row_number() OVER (ORDER  BY rating DESC NULLS LAST, id DESC) AS rn
   FROM   posts p
   WHERE  user_id = f.friend_id  -- LATERAL reference
   ORDER  BY rating DESC NULLS LAST, date DESC NULLS LAST
   LIMIT  f.max  -- LATERAL reference
   ) p
WHERE  f.user_id = 1
ORDER  BY (p.rn > f.min)  -- minimum posts from each first
        , p.rn
LIMIT  10;  -- arbitrary total maximum

      

SQL Fiddle.

Notes

  • Assuming friends.user_id

    u posts.id

    are the primary keys. Your table definition is not there.

  • All other columns must be defined NOT NULL

    to make sense.

  • Use LATERAL

    join to select only max

    transactions for a friend in a subquery:

  • Use , not in a subquery. It is a common mistake to confuse both. row_number()

    rank()

  • You mentioned date

    , but it was not shown in your request. Maybe you really want:

    , row_number() OVER (ORDER  BY rating DESC NULLS LAST
                                 , date   DESC NULLS LAST) AS rn
    
          

  • DESC NULLS LAST

    just because rating

    u date

    can be NULL:

  • In Postgres, you can use a simple boolean expression in ORDER BY

    :

    ORDER  BY (p.rn > f.min), p.rn
    
          

    Which puts min

    messages for a friend first. The second element ( p.rn

    ) gives each friend an equal chance (first post first, etc.).

  • Do not use as an identifier. It is a reserved word in standard SQL and the base type name in Postgres. date

+5


source


Am I missing something? Can you add and subq.rnk >= subq.min

to the proposal where

?



SELECT *
FROM (SELECT f.max, f.min, p.friend_id, p.id, p.title, p.rating
             row_number() over (partition by p.user_id
                                order by p.rating DESC, p.id DESC
                               ) as rnk
      FROM posts p INNER JOIN
           friends f
           ON p.friend_id = f.user_id
      WHERE f.user_id = 1
     )  subq
WHERE subq.rnk <= subq.max and subq.rnk >= subq.min

      

0


source


I think from a neo4j / cypher perspective, this is really what you want to do ...

match (u:User {id: 1})-[r:FOLLOWS]->(p:Publisher)
with u, p, r
match p-[:PUBLISHED]-(i:Item)
with u, p, r, i
order by i.name
return u.name, p.name, i.name
skip 5
limit 2

      

You will just need to parameterize min and max and link them at runtime, which will involve two requests, not one, but I think this is still an elegant solution. I've tried including properties in skip

and limit

before, but cypher obviously doesn't support this (yet). He wants a parameter or unsigned integer.

match (u:User {id: 1})-[r:FOLLOWS]->(p:Publisher)
with u, p, r
match p-[:PUBLISHED]-(i:Item)
with u, p, r, i
order by i.name
return u.name, p.name, i.name
skip {min}
limit {max}

      

0


source







All Articles