Saving links for contests in Postgres
I am trying to determine if there is a "low cost" for the next request. We have implemented a system in which "tickets" earn "points" and therefore can be ranked. To support the analytical type of queries, we save the rank of each ticket and save the ticket along with the ticket.
I found that at scale, storing the is_tied field is very slow. I am trying to run the below script on a set of "tickets" which is around 20-75k tickets.
I hope someone can help determine the cause and offer some help.
We are on postgres 9.3.6
Here's a simplified ticket schema:
ogs_1=> \d api_ticket
Table "public.api_ticket"
Column | Type | Modifiers
------------------------------+--------------------------+---------------------------------------------------------
id | integer | not null default nextval('api_ticket_id_seq'::regclass)
status | character varying(3) | not null
points_earned | integer | not null
rank | integer | not null
event_id | integer | not null
user_id | integer | not null
is_tied | boolean | not null
Indexes:
"api_ticket_pkey" PRIMARY KEY, btree (id)
"api_ticket_4437cfac" btree (event_id)
"api_ticket_e8701ad4" btree (user_id)
"api_ticket_points_earned_idx" btree (points_earned)
"api_ticket_rank_idx" btree ("rank")
Foreign-key constraints:
"api_ticket_event_id_598c97289edc0e3e_fk_api_event_id" FOREIGN KEY (event_id) REFERENCES api_event(id) DEFERRABLE INITIALLY DEFERRED
(user_id) REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED
Here is the query I'm running:
UPDATE api_ticket t SET is_tied = False
WHERE t.event_id IN (SELECT id FROM api_event WHERE status = 'c');
UPDATE api_ticket t SET is_tied = True
FROM (
SELECT event_id, rank
FROM api_ticket tt
WHERE event_id in (SELECT id FROM api_event WHERE status = 'c')
AND tt.status <> 'x'
GROUP BY rank, event_id
HAVING count(*) > 1
) AS tied_tickets
WHERE t.rank = tied_tickets.rank AND
tied_tickets.event_id = t.event_id;
A set of roughly 35k rows is explained here:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Update on api_ticket t (cost=3590.01..603570.21 rows=157 width=128)
-> Nested Loop (cost=3590.01..603570.21 rows=157 width=128)
-> Subquery Scan on tied_tickets (cost=2543.31..2556.18 rows=572 width=40)
-> HashAggregate (cost=2543.31..2550.46 rows=572 width=8)
Filter: (count(*) > 1)
-> Nested Loop (cost=0.84..2539.02 rows=572 width=8)
-> Index Scan using api_event_status_idx1 on api_event (cost=0.29..8.31 rows=1 width=4)
Index Cond: ((status)::text = 'c'::text)
-> Index Scan using api_ticket_4437cfac on api_ticket tt (cost=0.55..2524.99 rows=572 width=8)
Index Cond: (event_id = api_event.id)
Filter: ((status)::text <> 'x'::text)
-> Bitmap Heap Scan on api_ticket t (cost=1046.70..1050.71 rows=1 width=92)
Recheck Cond: (("rank" = tied_tickets."rank") AND (event_id = tied_tickets.event_id))
-> BitmapAnd (cost=1046.70..1046.70 rows=1 width=0)
-> Bitmap Index Scan on api_ticket_rank_idx (cost=0.00..26.65 rows=708 width=0)
Index Cond: ("rank" = tied_tickets."rank")
-> Bitmap Index Scan on api_ticket_4437cfac (cost=0.00..1019.79 rows=573 width=0)
Index Cond: (event_id = tied_tickets.event_id)
+3
source to share
No one has answered this question yet
Check out similar questions: