Thursday, June 18, 2009

You don't really need that index (or that SQL)

I was given this simple looking piece of code to review.


select audit_rec_updt_dts, mod(rownm,4) as rownm, pol_seq_nbr
from
(select a.AUDIT_REC_UPDT_DTS, a.POL_SEQ_NBR, rownum as rownm
from TR_POLICY_REGISTER a, TB_TRANSACTION b
where a.POL_STAT_CD = 'REPORTED' and a.JACKET_SOURCE_CD = 'LAT'and
a.AUDIT_REC_UPDT_DTS > b.EXTRACT_START_DTS and
a.AUDIT_REC_UPDT_DTS < b.EXTRACT_END_DTS and
b.TRANS_SYS_ID in (select max(trans_sys_id) from tb_transaction)) dt
where mod(rownm,4) = 0;


I was also given 3 other almost identical statements except that the where clause changed the 0 to 1, 2, and 3. Danger Will Robinson! (Yes, that reference dates me. How about you? Remember "Lost in Space"?) It looks like somebody is thinking the result set is so big they need to break it up into chunks. More on this later...

Anyway, in order to get the query to complete in a reasonable time (my test without the index didn't come back within 30 minutes so I killed it), a "temporary" index was being created just before this statement was executed. By "temporary", I mean that they created the index just prior to running the SQL and then dropped it immediately after the SQL completed. Creating the index (parallel nologging) took several minutes to complete.

Here's execution plan for the query with the index in place:


-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------------------
|* 1 | VIEW | | 1 | 995 | 2627 |00:00:00.11 | 9815 |
| 2 | COUNT | | 1 | | 10508 |00:00:00.08 | 9815 |
| 3 | NESTED LOOPS | | 1 | 995 | 10508 |00:00:00.07 | 9815 |
| 4 | TABLE ACCESS BY INDEX ROWID | TB_TRANSACTION | 1 | 1 | 1 |00:00:00.01 | 3 |
|* 5 | INDEX UNIQUE SCAN | TB_TRANSACTION_PK | 1 | 1 | 1 |00:00:00.01 | 2 |
| 6 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 1 |
| 7 | INDEX FULL SCAN (MIN/MAX)| TB_TRANSACTION_PK | 1 | 289 | 1 |00:00:00.01 | 1 |
| 8 | TABLE ACCESS BY INDEX ROWID | TR_POLICY_REGISTER | 1 | 995 | 10508 |00:00:00.07 | 9812 |
|* 9 | INDEX RANGE SCAN | TR_POLICY_REGISTER_DMIDX1 | 1 | 2 | 10508 |00:00:00.01 | 73 |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(MOD("ROWNM",4)=0)
5 - access("B"."TRANS_SYS_ID"=)
9 - access("A"."JACKET_SOURCE_CD"='LAT' AND "A"."POL_STAT_CD"='REPORTED' AND
"A"."SYS_NC00029$">SYS_OP_DESCEND("B"."EXTRACT_END_DTS") AND
"A"."SYS_NC00029$" filter((SYS_OP_UNDESCEND("A"."SYS_NC00029$")>"B"."EXTRACT_START_DTS" AND
SYS_OP_UNDESCEND("A"."SYS_NC00029$")<"B"."EXTRACT_END_DTS"))


Looks good, right? The query executed in less than 1 second (.11 secs) and returned 2627 rows. Remember the predicate? There were 4 queries. Each one retrieved 1/4 of the data (2627 x 4 = 10508). That's a pretty small result set. Why run the query 4 times when once will do? And, it's not doing what the developer thought. I think the developer thought it would only access 1/4 of the rows each time and that somehow that would be faster than accessing all of them at once. Nope. Even though running this query 4 times in a row is still a fairly insignificant amount of resources used, it's not necessary. Each execution is getting all the rows and then throwing out 75% of them. Why not just run it once?

The next question is why is an index created just before and dropped just after this? Does having it be permanent cause a problem somewhere? Plus, there's already a persistent index that would/should work although it does not include all the columns used in the temp index. But, obviously something went wrong with the query that it didn't use that index properly or they wouldn't have created the temp index and attempted to break up the query into 4 parts, right?

OK. So, how about just rewriting the query to see if the current index can be used efficiently enough to meet their needs? I tried this:


select a.AUDIT_REC_UPDT_DTS, a.POL_SEQ_NBR
from TR_POLICY_REGISTER a
where a.POL_STAT_CD = 'REPORTED'
and a.JACKET_SOURCE_CD = 'LAT'
and a.AUDIT_REC_UPDT_DTS > (select EXTRACT_START_DTS from tb_transaction where trans_sys_id = (select max(trans_sys_id) from tb_transaction))
and a.AUDIT_REC_UPDT_DTS < (select EXTRACT_END_DTS from tb_transaction where trans_sys_id = (select max(trans_sys_id) from tb_transaction))



And here's the execution plan:


------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------------------------------------
|* 1 | TABLE ACCESS BY INDEX ROWID | TR_POLICY_REGISTER | 1 | 995 | 10508 |00:00:00.07 | 14259 |
|* 2 | INDEX RANGE SCAN | TR_POLICY_REGISTER_N6 | 1 | 269K| 23576 |00:00:00.01 | 138 |
| 3 | TABLE ACCESS BY INDEX ROWID | TB_TRANSACTION | 1 | 1 | 1 |00:00:00.01 | 3 |
|* 4 | INDEX UNIQUE SCAN | TB_TRANSACTION_PK | 1 | 1 | 1 |00:00:00.01 | 2 |
| 5 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 1 |
| 6 | INDEX FULL SCAN (MIN/MAX)| TB_TRANSACTION_PK | 1 | 289 | 1 |00:00:00.01 | 1 |
| 7 | TABLE ACCESS BY INDEX ROWID | TB_TRANSACTION | 1 | 1 | 1 |00:00:00.01 | 3 |
|* 8 | INDEX UNIQUE SCAN | TB_TRANSACTION_PK | 1 | 1 | 1 |00:00:00.01 | 2 |
| 9 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 1 |
| 10 | INDEX FULL SCAN (MIN/MAX)| TB_TRANSACTION_PK | 1 | 289 | 1 |00:00:00.01 | 1 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(("A"."JACKET_SOURCE_CD"='LAT' AND "A"."POL_STAT_CD"='REPORTED'))
2 - access("A"."AUDIT_REC_UPDT_DTS"> AND "A"."AUDIT_REC_UPDT_DTS"<)
4 - access("TRANS_SYS_ID"=)
8 - access("TRANS_SYS_ID"=)


I can live with that.

It took me a lot longer to type this than to rewrite the SQL and test it! Now, the query doesn't need to be "chunked up" nor does the index need to be temporarily created. Maybe there's another way to do it but this was my first thought and it worked as desired, so all is well!

Good deal...

Monday, June 8, 2009

Photo of my favorite kid...mine!

She fell and busted her upper lip yesterday but seems to be none to worse for the wear.  Nothing like a ride in the car while watching a DVD and eating Teddy Grahams!  

Posted via email from Karen Morton's posterous