Thursday, June 19, 2008

SQL Tuning and Analysis

The Problem Scenario...

At a client site, a concurrent program "Import Items" took 4.5 hours on average to complete the load of 1,000 items. This was not acceptable, as the client had to load 45,000 items into each of 11 organizations. At 4.5 hours per 1,000 items, one organization would take 45 x 4.5 = 202.5 hours, so 11 organizations would take 2,227.5 hours, which is around 93 days even if the program runs all through the day and night.

Environment Details

Apps Version : 11.5.10.2
Database Version : 10.2.0.3
OS version : RHEL AS 4 update 6
RAM size : 8 GB
No. of Processors : 1
SGA size : 2 GB

Preliminary investigation using trace in Oracle Applications 11i showed that the maximum time (about 3 hrs for 1,000 items imported) was taken up by the following SQL alone.

MERGE INTO eni_oltp_item_star STAR
USING (SELECT item.inventory_item_id inventory_item_id,
item.organization_id organization_id,
item.CONCATENATED_SEGMENTS || ' (' || mtp.organization_code || ')' value,
decode(item.organization_id,mtp.master_organization_id,null,
item.inventory_item_id || '-' || mtp.master_organization_id)
master_id,
nvl(mic.category_id,-1) inv_category_id,
nvl(mic.category_set_id, :l_inv_category_set) inv_category_set_id,
nvl(kfv.concatenated_segments,'Unassigned') inv_concat_seg,
nvl(mic1.category_id,-1) vbh_category_id,
nvl(mic1.category_set_id, :l_vbh_category_set) vbh_category_set_id,
nvl(kfv1.concatenated_segments,'Unassigned') vbh_concat_seg,
nvl(item.item_catalog_group_id,-1) item_catalog_group_id,
item.primary_uom_code,
item.creation_date,
item.last_update_date
FROM mtl_system_items_interface interface,
mtl_system_items_b_kfv item,
mtl_parameters mtp,
mtl_item_categories mic ,
mtl_categories_b_kfv kfv ,
mtl_item_categories mic1,
mtl_categories_b_kfv kfv1
WHERE item.inventory_item_id = interface.inventory_item_id
-- AND interface.set_process_id = p_set_process_id
AND interface.set_process_id in
(:p_set_process_id, :p_set_process_id +1000000000000)
AND interface.process_flag = 7
AND item.organization_id = interface.organization_id
AND item.organization_id= mtp.organization_id
AND mic.organization_id(+) = item.organization_id
AND mic.inventory_item_id(+) = item.inventory_item_id
AND mic.category_id = kfv.category_id (+)
and mic.category_set_id(+) = :l_inv_category_set
AND mic1.organization_id(+) = item.organization_id
AND mic1.inventory_item_id(+) = item.inventory_item_id
AND mic1.category_id = kfv1.category_id (+)
and mic1.category_set_id(+) = :l_vbh_category_set) mti
ON (STAR.inventory_item_id = mti.inventory_item_id
AND STAR.organization_id = mti.organization_id)
WHEN MATCHED THEN
UPDATE SET STAR.value = mti.value,
STAR.inv_category_id = mti.inv_category_id,
STAR.inv_category_set_id = mti.inv_category_set_id,
STAR.inv_concat_seg = mti.inv_concat_seg,
STAR.vbh_category_id = mti.vbh_category_id,
STAR.vbh_category_set_id = mti.vbh_category_set_id,
STAR.vbh_concat_seg = mti.vbh_concat_seg,
STAR.master_id = mti.master_id,
STAR.item_catalog_group_id = mti.item_catalog_group_id,
STAR.primary_uom_code = mti.primary_uom_code,
STAR.last_update_date = mti.last_update_date
WHEN NOT MATCHED THEN
INSERT (
id,
value,
inventory_item_id,
organization_id,
inv_category_id,
inv_category_set_id,
inv_concat_seg,
vbh_category_id,
vbh_category_set_id,
vbh_concat_seg,
master_id,
item_catalog_group_id,
primary_uom_code,
creation_date,
last_update_date)
VALUES(
mti.inventory_item_id || '-' || mti.organization_id,
mti.value,
mti.inventory_item_id,
mti.organization_id,
mti.inv_category_id,
mti.inv_category_set_id,
mti.inv_concat_seg,
mti.vbh_category_id,
mti.vbh_category_set_id,
mti.vbh_concat_seg,
mti.master_id,
mti.item_catalog_group_id,
mti.primary_uom_code,
mti.creation_date,
mti.last_update_date)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1    559.14     546.14          0  315621488          0           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2    559.14     546.14          0  315621488          0           0
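The statistics above come from the tkprof-formatted report of the raw trace file found in udump. For reference, a raw trace is formatted along these lines (the file names here are hypothetical; sort=exeela orders statements by elapsed execute time):

$tkprof mydb_ora_12345.trc import_items.prf sort=exeela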

Evaluating options
The first thing we did was to look at the optimizer: it was CHOOSE based, the plan used an index, and that looked fine. The second thing we checked was which tables are involved in this statement and when each of them was last analyzed.

For example

select last_analyzed from dba_tables where owner = 'ENI' and table_name = 'ENI_OLTP_ITEM_STAR';

The table had last been analyzed in 2005.
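The same check can be run for all tables involved in the MERGE in one go. The _KFV names in the FROM clause are views, so the sketch below checks their base tables instead, using the standard DBA_TABLES dictionary view:

select owner, table_name, last_analyzed
from dba_tables
where table_name in ('ENI_OLTP_ITEM_STAR', 'MTL_SYSTEM_ITEMS_INTERFACE',
'MTL_SYSTEM_ITEMS_B', 'MTL_PARAMETERS', 'MTL_ITEM_CATEGORIES', 'MTL_CATEGORIES_B')
order by owner, table_name;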

We then analyzed the table with:

ANALYZE TABLE eni.eni_oltp_item_star COMPUTE STATISTICS;
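As a side note, in an Oracle Applications environment the supported way to gather statistics is FND_STATS (the same code path the Gather Schema Statistics concurrent program uses) rather than ANALYZE. A minimal sketch of the equivalent call, assuming the standard FND_STATS API:

exec fnd_stats.gather_table_stats('ENI', 'ENI_OLTP_ITEM_STAR');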

The Improvement
===============

The same query now executed in under 1.4 hrs to update 1,000 records. Though it is faster, this would still mean a lot of time to import 45,000 items for 11 organizations, so we looked further into the issue using the trace file / tkprof output.
Using the tkprof output file we found out which SQLs were consuming the time. The quickest way to look through a large tkprof output file is to grep the lines that start with "total" and sort them on the 3rd or 4th column (cpu or elapsed), so that the expensive SQLs can be located easily instead of having to look at every page of the tkprof output.

The command used to do the above activity is

$grep "^total" tkprofoutputfile | sort -n +2
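(On newer versions of sort, where the obsolete "+2" field syntax is not accepted, the equivalent is $grep "^total" tkprofoutputfile | sort -n -k3.)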

The above command will give you an output as shown below

total 2001 143.68 140.33 0 58128000 0 1000
total 2001 145.98 142.54 0 58128000 0 1000
total 2001 147.57 144.11 0 58128000 0 1000
total 2001 158.26 154.56 0 58128000 0 1000
total 2001 160.53 156.76 0 58128000 0 1000
total 2001 163.98 160.14 0 58128000 0 1000
total 2001 183.42 179.16 0 58128000 0 1000
total 2001 185.27 180.95 0 58128000 0 1000
total 2001 187.72 183.39 0 58128000 0 1000
total 2001 190.38 185.93 0 58128000 0 1000
total 2001 191.76 187.31 0 58128000 0 1000
total 2001 192.59 188.12 0 58128000 0 1000
total 2001 193.99 189.47 0 58128000 0 1000
total 2001 195.28 190.75 0 58128000 0 1000
total 2001 197.38 192.77 0 58128000 0 1000
total 2001 203.64 198.90 0 58128000 0 1000
total 2001 206.08 201.31 0 58128000 0 1000
total 2001 208.57 203.69 0 58128000 0 1000
total 2001 219.98 214.88 0 58128000 0 1000
total 2001 222.24 217.06 0 58128000 0 1000
total 2001 275.63 269.23 0 106943000 0 1000
total 271846 6074.20 5933.56 18800 2191690345 232725 186675
total 2 6088.02 5947.58 18800 2191686839 232550 1
total 198 6088.49 5948.03 18800 2191687323 232572 132

Note : There is no need to look at the last 2 rows here; they are the grand totals of time spent in recursive and non-recursive SQLs. Our intention is to find the individual SQLs that take a long time to execute.

We then continued the analysis as follows:

We again started checking the tables involved in this scenario. While the concurrent request was running, we took the request id, found the corresponding session via the table FND_CONCURRENT_REQUESTS, obtained the SQL hash value from that session, and from V$SQLTEXT found the SQL involved.

The query is as follows:

select a.sql_text
from v$sqltext a, v$session b, apps.fnd_concurrent_requests c
where c.request_id = &request_id
and c.oracle_session_id = b.audsid
and c.os_process_id = b.process
and b.sql_hash_value = a.hash_value
and b.sql_address = a.address
order by a.piece;

Supply the request id you are investigating at the &request_id prompt, so that the query returns the SQL that ran for that particular request. It was the following:

SELECT INVENTORY_ITEM_ID FROM MTL_SYSTEM_ITEMS MSI
WHERE MSI.INVENTORY_ITEM_ID = :B19
AND MSI.ORGANIZATION_ID = :B18
AND DECODE(:B17 ,:B2 ,NVL(:B16 ,-1),NVL(MSI.RELEASE_TIME_FENCE_CODE,-1)) = NVL(MSI.RELEASE_TIME_FENCE_CODE,-1)
AND DECODE(:B15 ,:B2 ,NVL(:B14 ,-1),NVL(MSI.RELEASE_TIME_FENCE_DAYS,-1)) = NVL(MSI.RELEASE_TIME_FENCE_DAYS,-1)
AND DECODE(:B13 ,:B2 ,NVL(:B12 ,-1),NVL(MSI.CONTAINER_ITEM_FLAG,-1)) = NVL(MSI.CONTAINER_ITEM_FLAG,-1)
AND DECODE(:B11 ,:B2 ,NVL(:B10 ,-1),NVL(MSI.CONTAINER_TYPE_CODE,-1)) = NVL(MSI.CONTAINER_TYPE_CODE,-1)
AND DECODE(:B9 ,:B2 ,NVL(:B8 ,-1),NVL(MSI.INTERNAL_VOLUME,-1)) = NVL(MSI.INTERNAL_VOLUME,-1)
AND DECODE(:B7 ,:B2 ,NVL(:B6 ,-1),NVL(MSI.MAXIMUM_LOAD_WEIGHT,-1)) = NVL(MSI.MAXIMUM_LOAD_WEIGHT,-1)
AND DECODE(:B5 ,:B2 ,NVL(:B4 ,-1),NVL(MSI.MINIMUM_FILL_PERCENT,-1)) = NVL(MSI.MINIMUM_FILL_PERCENT,-1)
AND DECODE(:B3 ,:B2 ,NVL(:B1 ,-1),NVL(MSI.VEHICLE_ITEM_FLAG,-1)) = NVL(MSI.VEHICLE_ITEM_FLAG,-1)

We found that most of the time was being spent in the SQL above: filtering the tkprof file from udump on "total" and looking at where the time went pointed to this statement, which accesses the table inv.mtl_system_items_b.

Note: MTL_SYSTEM_ITEMS is a SYNONYM that points to the table inv.mtl_system_items_b. Looking at that table, we found that it too had not been analyzed, so it was analyzed using the following command:
ANALYZE TABLE inv.mtl_system_items_b COMPUTE STATISTICS;
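To confirm that the statistics were picked up, the dictionary can be checked again:

select owner, table_name, last_analyzed, num_rows
from dba_tables
where owner = 'INV' and table_name = 'MTL_SYSTEM_ITEMS_B';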

When the same "Import Items" program was executed again, it took barely 5 minutes to import 1,000 records.

Later we also tested this with 15,000 items; the import took only about 1 hour.

So this customer can now import 45,000 items for one organization in about 3 hours, and for 11 organizations they could finish in around 33 hours even running the loads sequentially. (In reality they will run them in parallel and finish within one day.)

Conclusion

It may be necessary to run Gather Schema Statistics for all schemas regularly, at least once every week, so that all objects stay analyzed. With the relevant tables analyzed, the optimizer can choose good plans and performance improves, as seen here.
The analysis also included monitoring hardware resources such as CPU, memory and I/O, but none of these were an issue in this scenario: the CPU was only 25% used, the SGA only 50% used, and there was no swapping.
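Gather Schema Statistics is normally submitted as a concurrent request, but the underlying FND_STATS call can also be issued directly from SQL*Plus; a minimal sketch, assuming the standard FND_STATS API, where 'ALL' covers every Applications schema:

exec fnd_stats.gather_schema_stats('ALL');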

Note : Trace should be enabled for the concurrent program via Concurrent -> Program -> Define, using the Enable Trace option.
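If you are reproducing the SQL manually rather than through a concurrent request, a session-level trace works as well; a minimal sketch using standard ALTER SESSION syntax:

ALTER SESSION SET sql_trace = TRUE;
-- run the statement under investigation here, then turn tracing off
ALTER SESSION SET sql_trace = FALSE;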
