Saturday, October 28, 2006

Domain index not used after migrating from 8.1.7.4 to 10.2.0.2

If you have application running on Oracle 8.1.7.4 and you are massively using CONTEXT indexes, you better analyse the queries where context indexes are involved, before you decide to migrate to 10.2.0.x.
This is because execution paths might be way different in 10.2.0.x that may lead to performance degradation.
Here is one example where you should consider query modification in order to avoid bad performances.
Let assume that you have a query that has OR-condition where OR-predicates use CONTEXT operators.

Example:

select a.owner, a.object_name, b.subobject_name, c.object_id
from all_objects_a a, all_objects_b b, all_objects_c c
where
a.object_id=b.object_id
and b.object_id=c.object_id
and
(
CONTAINS(a.text,:a,0)>0
or
CONTAINS(b.text,:b,1)>0
or
CONTAINS(c.text,:c,2)>0
)

This query in 8.1.7.4 most likely will have execution plan similar to this one:

Plan Table
----------------------------------------------------------------
| Operation | Name | Rows | Bytes| Cost |
----------------------------------------------------------------
| SELECT STATEMENT | | 42 | 8K| 93 |
| CONCATENATION | | | | |
| NESTED LOOPS | | 14 | 1K| |
| NESTED LOOPS | | 14 | 2K| 17 |
| TABLE ACCESS BY INDEX |ALL_OBJEC | 2K| 135K| 1 |
| DOMAIN INDEX |ALL_OBJEC | 2K| | |
| TABLE ACCESS BY INDEX |ALL_OBJEC | 2K| 178K| 1 |
| INDEX UNIQUE SCAN |ALL_OBJ_B | 2K| | |
| TABLE ACCESS BY INDEX R|ALL_OBJEC | 14 | 1K| 2 |
| INDEX UNIQUE SCAN |ALL_OBJ_A | | | 2 |
| NESTED LOOPS | | 14 | 1K| |
| NESTED LOOPS | | 14 | 2K| 31 |
| TABLE ACCESS BY INDEX |ALL_OBJEC | 2K| 178K| 1 |
| DOMAIN INDEX |ALL_OBJEC | 2K| | |
| TABLE ACCESS BY INDEX |ALL_OBJEC | 2K| 135K| 1 |
| INDEX UNIQUE SCAN |ALL_OBJ_C | 2K| | |
| TABLE ACCESS BY INDEX R|ALL_OBJEC | 14 | 1K| 2 |
| INDEX UNIQUE SCAN |ALL_OBJ_A | | | 2 |
| NESTED LOOPS | | 14 | 2K| 31 |
| NESTED LOOPS | | 14 | 2K| 17 |
| TABLE ACCESS BY INDEX |ALL_OBJEC | 14 | 1K| 2 |
| DOMAIN INDEX |ALL_OBJEC | | | 2 |
| TABLE ACCESS BY INDEX |ALL_OBJEC | 2K| 178K| 1 |
| INDEX UNIQUE SCAN |ALL_OBJ_B | 2K| | |
| TABLE ACCESS BY INDEX R|ALL_OBJEC | 2K| 135K| 1 |
| INDEX UNIQUE SCAN |ALL_OBJ_C | 2K| | |
----------------------------------------------------------------

where domain indexes are used to support CONTAINS function.
Well, in 10.2.0.2, you better be prepared to have at least twice slower response time, since for this type of query no domain indexes can be used:
The execution plan for my 10.2.0.2 database looks like:

Plan Table
----------------------------------------------------------------
| Operation | Name | Rows | Bytes| Cost |
----------------------------------------------------------------
| SELECT STATEMENT | | 42 | 8K| 1650 |
| HASH JOIN | | 42 | 8K| 1650 |
| TABLE ACCESS FULL |ALL_OBJEC | 51K| 4M| 234 |
| HASH JOIN | | 51K| 6M| 835 |
| TABLE ACCESS FULL |ALL_OBJEC | 51K| 2M| 234 |
| TABLE ACCESS FULL |ALL_OBJEC | 51K| 3M| 234 |
----------------------------------------------------------------

All tables were identical in both databases.

I have opened a SR with Oracle Support on this one, and the answers that I got were very interesting and make sense.
The key reason for such a behaviuor is how Oracle optimizer handles OR-expansion condition where OR-predicates use CONTEXT operators.
Oracle optimizer usually probes OR-expansion to see if a cheaper index based plan can be produced. But, Or-expansion is not legitimate if OR-predicate has a CONTEXT operator, because some of the new "concatenated" query blocks will loose context of this operator.
So, generally, the plan generated in 8.1.7.4 is not legitimate, even though it is very good one, performance wise.
Anyway, if you want to use plan similar to the one you used to have, then you better transform your queries.
I modified the query shown above, to this one:


select * from
(
select a.owner, a.object_name, b.subobject_name, c.object_id
from all_objects_a a, all_objects_b b, all_objects_c c
where
a.object_id=b.object_id
and b.object_id=c.object_id
and
(
CONTAINS(a.text,:a,0)>0
)
)
UNION ALL
(
select a.owner, a.object_name, b.subobject_name, c.object_id
from all_objects_a a, all_objects_b b, all_objects_c c
where
a.object_id=b.object_id
and b.object_id=c.object_id
and
(
CONTAINS(a.text,:a,0)<=0 and CONTAINS(b.text,:b,1)>0
)
)
UNION ALL
(select a.owner, a.object_name, b.subobject_name, c.object_id
from all_objects_a a, all_objects_b b, all_objects_c c
where
a.object_id=b.object_id
and b.object_id=c.object_id
and
(
CONTAINS(a.text,:a,0)<=0 and CONTAINS(b.text,:b,1)<=0 and CONTAINS(c.text,:c,2)>0
)
)


Execution plan has been changed to this one:

Plan Table
----------------------------------------------------------------
| Operation | Name | Rows | Bytes| Cost |
----------------------------------------------------------------
| SELECT STATEMENT | | 42 | 6K| 31 |
| UNION-ALL | | | | |
| NESTED LOOPS | | 14 | 1K| 19 |
| NESTED LOOPS | | 14 | 1K| 19 |
| TABLE ACCESS BY INDEX |ALL_OBJEC | 14 | 1K| 5 |
| DOMAIN INDEX |ALL_OBJEC | | | 2 |
| TABLE ACCESS BY INDEX |ALL_OBJEC | 1 | 21 | 1 |
| INDEX UNIQUE SCAN |ALL_OBJ_B | 1 | | 0 |
| INDEX UNIQUE SCAN |ALL_OBJ_C | 1 | 4 | 0 |
| NESTED LOOPS | | 14 | 2K| 5 |
| HASH JOIN | | 14 | 2K| 5 |
| TABLE ACCESS BY INDEX |ALL_OBJEC | 14 | 1K| 2 |
| DOMAIN INDEX |ALL_OBJEC | | | 2 |
| TABLE ACCESS BY INDEX |ALL_OBJEC | 14 | 994 | 2 |
| DOMAIN INDEX |ALL_OBJEC | | | 2 |
| INDEX UNIQUE SCAN |ALL_OBJ_C | 1 | 4 | 0 |
| HASH JOIN | | 14 | 2K| 7 |
| HASH JOIN | | 14 | 2K| 5 |
| TABLE ACCESS BY INDEX |ALL_OBJEC | 14 | 1K| 2 |
| DOMAIN INDEX |ALL_OBJEC | | | 2 |
| TABLE ACCESS BY INDEX |ALL_OBJEC | 14 | 994 | 2 |
| DOMAIN INDEX |ALL_OBJEC | | | 2 |
| TABLE ACCESS BY INDEX R|ALL_OBJEC | 14 | 756 | 2 |
| DOMAIN INDEX |ALL_OBJEC | | | 2 |
----------------------------------------------------------------

end the response time was much better compared to both previous plans.

Now, the questions are:
How this can be implemented in the application?
How many queries similar to this one are being used in the application?
Is it worth the effort?
Can your application live with the execution paths based on full table scan?

You better answer to these questions before you decide what your next step would be.

Friday, October 27, 2006

On the road again

It’s been a long time since my last post on this blog.
Ok, one of the reasons for this “absenteeism” was my new job.
Honestly, I didn’t have too much time to write anything except reports, research papers, proof of concepts, etc.
Although I commute more than I used to, I don’t mind since I use every chance to read something interesting while riding on CTA trains.
In three months, only in train cars I’ve read three books, two for the first time and one for the second time (just because it is perfect).

1. Oracle High-performance SQL tuning - Don Burleson believe or not, I found it for $9.99 in Borders Outlet Store.

Oracle Replication: Snapshot, Multi-master & Materialized Views Scripts - John Garmany & Robert Freeman

3. Effective Oracle by Design by Tom Kyte I’ve read this book once, and I read it again, just because it is perfect.

Who said that riding in CTA trains is boring? Actually it is sometimes when train cars are so crowded.

Anyways, I am on the road again, hopefully I’ll be more active in the future writing on this blog.