Oracle: significantly improved query performance
I have the following query for which I am trying to improve performance:
select atx.journal_id
,ab.c_date
from acct_batch ab
join acct_tx atx on ab.acct_id = atx.acct_id
and ab.batch_id = atx.batch_id
join journal j on j.journal_id = atx.journal_id
and j.journal_type_id = 6
join acct a on a.acct_id = atx.acct_id
and a.acct_type_id = 32
join payments p on p.payment_id = j.payment_id
join routing r on r.route_id = p.route_id
and r.acq_code = 'RZ_NS'
join acq_acct aa on aa.acq_code = r.acq_code
and aa.acq_acct_code = r.acq_acct_code
and aa.slc = 'MXM'
where ab.c_date between to_date(to_char('01-JUL-2015')) and last_day(sysdate);
I ran and reviewed the explain plan and the total cost is 7388. Of this, the most expensive part is the table join journal
, which costs 6319.
The table contains about 1.6 million rows with 87 sections, of which only two contain rows (section 6 with 1.4 million and section 12 with an approximate remaining 200,000 rows).
The first thing I tried was to rewrite the query to avoid a full scan when matching the actual file__discs__and_and up to 6, but I think my understanding was wrong because the cost remained at 7388.
select atx.journal_id
,ab.c_date
from acct_batch ab
join acct_tx atx on ab.acct_id = atx.acct_id
and ab.batch_id = atx.batch_id
join (select
journal_id
, payment_id
from journal
where journal_type_id = 6) j on j.journal_id = atx.journal_id
join acct a on a.acct_id = atx.acct_id
and a.acct_type_id = 32
join payments p on p.payment_id = j.payment_id
join routing r on r.route_id = p.route_id
and r.acq_code = 'RZ_NS'
join acq_acct aa on aa.acq_code = r.acq_code
and aa.acq_acct_code = r.acq_acct_code
and aa.slc = 'MXM'
where ab.c_date between to_date(to_char('01-JUL-2015')) and last_day(sysdate);
I searched a lot of resources and one of the reasons I decided to rewrite the request was this video .
I'm still actively looking for ways to improve performance, but I thought I would post the question here to get some hints.
I think the person from the video is talking about the first thing to do is determine which one is the "driving table" (the one that determines which rows are fetched - based on the key) m is currently looking for a way to rewrite the query to determine and using this driving table and its indexes as much as possible.
I don't know if I'm still on the right track, but please stop me if you think I should act differently. Also, please note that I am new to performance tuning, this is actually my first.
Any help is appreciated.
Update:
Some indexes containing columns used in queries are as follows:
╔════════════╦═══════════════╦════════════╦═══════════╦═════════════╦═══════════════════════════════════╗
║ Table ║ IndexName ║ Uniqueness ║ IndexType ║ Partitioned ║ Columns ║
╠════════════╬═══════════════╬════════════╬═══════════╬═════════════╬═══════════════════════════════════╣
║ Acct_Batch ║ Acct_Batch_PK ║ UNIQUE ║ NORMAL ║ NO ║ Acct_ID, Batch_ID ║
║ Acct_TX ║ Acct_TX_IDX ║ NONUNIQUE ║ NORMAL ║ YES ║ Acct_ID, Batch_ID ║
║ Acct_TX ║ Acct_TX_BIDX ║ NONUNIQUE ║ NORMAL ║ YES ║ Journal_ID, Acct_ID ║
║ Journal ║ Journal_PK ║ UNIQUE ║ NORMAL ║ YES ║ Journal_ID ║
║ Journal ║ JType_BIDX ║ NONUNIQUE ║ NORMAL ║ YES ║ Journal_Type_ID, Book_Date ║
║ Journal ║ JType_BIDX_2 ║ NONUNIQUE ║ NORMAL ║ YES ║ MCODE, Journal_Type_ID, Book_Date ║
║ Journal ║ JPay_BIDX ║ NONUNIQUE ║ NORMAL ║ YES ║ Payment_ID, Journal_ID ║
╚════════════╩═══════════════╩════════════╩═══════════╩═════════════╩═══════════════════════════════════╝
Let me know if you need to see more pointers or details regarding other tables.
Example of an explanation plan:
-------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 160 | 7388 (1)| 00:01:29 | | |
|* 1 | FILTER | | | | | | | |
| 2 | NESTED LOOPS | | | | | | | |
| 3 | NESTED LOOPS | | 1 | 160 | 7388 (1)| 00:01:29 | | |
|* 4 | HASH JOIN | | 4 | 604 | 7380 (1)| 00:01:29 | | |
| 5 | NESTED LOOPS | | | | | | | |
| 6 | NESTED LOOPS | | 107 | 14338 | 7372 (1)| 00:01:29 | | |
|* 7 | HASH JOIN | | 27 | 3186 | 7298 (1)| 00:01:28 | | |
| 8 | NESTED LOOPS | | | | | | | |
| 9 | NESTED LOOPS | | 102 | 10302 | 978 (0)| 00:00:12 | | |
| 10 | NESTED LOOPS | | 11 | 638 | 37 (0)| 00:00:01 | | |
|* 11 | TABLE ACCESS BY INDEX ROWID | ACQ_ACCT | 11 | 253 | 4 (0)| 00:00:01 | | |
|* 12 | INDEX RANGE SCAN | AA_PK | 16 | | 2 (0)| 00:00:01 | | |
| 13 | TABLE ACCESS BY INDEX ROWID | ROUTES | 1 | 35 | 3 (0)| 00:00:01 | | |
|* 14 | INDEX RANGE SCAN | R_A_BIDX | 1 | | 2 (0)| 00:00:01 | | |
| 15 | PARTITION RANGE ALL | | 95 | | 84 (0)| 00:00:02 | 1 | 84 |
|* 16 | INDEX RANGE SCAN | P_R_ID_BIDX | 95 | | 84 (0)| 00:00:02 | 1 | 84 |
| 17 | TABLE ACCESS BY LOCAL INDEX ROWID| PAYMENTS | 9 | 387 | 100 (0)| 00:00:02 | 1 | 1 |
| 18 | PARTITION RANGE ALL | | 107K| 1782K| 6319 (1)| 00:01:16 | 1 | 87 |
|* 19 | TABLE ACCESS FULL | JOURNAL | 107K| 1782K| 6319 (1)| 00:01:16 | 1 | 87 |
| 20 | PARTITION RANGE ITERATOR | | 4 | | 2 (0)| 00:00:01 | KEY | KEY |
|* 21 | INDEX RANGE SCAN | ATX_A_IDX | 4 | | 2 (0)| 00:00:01 | KEY | KEY |
| 22 | TABLE ACCESS BY LOCAL INDEX ROWID | ACCT_TX | 4 | 64 | 3 (0)| 00:00:01 | 1 | 1 |
|* 23 | INDEX RANGE SCAN | AB_B_A_IDX | 5006 | 85102 | 8 (0)| 00:00:01 | | |
|* 24 | INDEX UNIQUE SCAN | ACC_PK | 1 | | 1 (0)| 00:00:01 | | |
|* 25 | TABLE ACCESS BY INDEX ROWID | ACCT | 1 | 9 | 2 (0)| 00:00:01 | | |
-------------------------------------------------------------------------------------------------------------------------------
source to share
So, after a closer look at the code of the data displayed based on the columns listed in the query part SELECT
, I noticed that the last merged table does not bring any input (does not require any data to be displayed on it) on the output.
join acq_acct aa on aa.acq_code = r.acq_code
and aa.acq_acct_code = r.acq_acct_code
and aa.slc = 'MXM'
Hence, I moved this request to offer EXISTS
and re-executed the request. My modified request looks like this:
select atx.journal_id
,ab.c_date
from acct_batch ab
join acct_tx atx on ab.acct_id = atx.acct_id
and ab.batch_id = atx.batch_id
join journal j on j.journal_id = atx.journal_id
and j.journal_type_id = 6
join acct a on a.acct_id = atx.acct_id
and a.acct_type_id = 32
join payments p on p.payment_id = j.payment_id
join routing r on r.route_id = p.route_id
and r.acq_code = 'RZ_NS'
where ab.c_date between to_date(to_char('01-JUL-2015')) and last_day(sysdate)
and exists (select 1
from acq_acct aa
where aa.acq_code = r.acq_code
and aa.acq_acct_code = r.acq_acct_code
and aa.slc = 'MXM');
This helped to improve the cost of my request from 7388 to 292, which is a colossal difference.
Hopefully I understood the correct understanding of this and my explanation also made sense.
If anyone thinks my conclusions were disabled or the "logical reasoning" was wrong, please leave a comment (now my conclusions / explanations initially make sense to me).
source to share
First, check that your statistics are up to date: the optimizer relies heavily on statistics! Second, you have to say something about the number of rows you are getting with this query: depending on the number of rows each condition fetches, a full scan may be better than an index search.
source to share