Last week, I was working on a package, which is used to collect stats.
I realized we collect stats of all partioned tables,indexes at both partition and global level.
While working on that package, i was pondering when will we actually use partition/local level stats (over global stats) in real-world scenario as most of our "where" clause
values are passed using bind variables. How will optimizer will come to know which partition (stats) to use because value passed will checked run time except in the case when it
is run for the first time and Bind peek happens.
A bit confused with this I performed a test, where my test case was inspired by David Aldridge's blog post (however i conducted a different set of tests).
I first checked the difference in 10053 trace in sqls where we pass literals Vs bind variables - when bind peek is enabled (default - enabled, but not in our case).
Test case:
============
drop table test_par
/
create table test_par
(
col1 number not null,
col2 number not null
)
nologging compress pctfree 0
partition by range (col1)
(partition p1_to_4 values less than (5),
partition p5 values less than (6)
)
/
insert /*+ append */ into test_par
select mod(rownum,4)+1,rownum
from dual
connect by level <= 100000
/
insert into test_par values(5,10);
commit;
create index idx02_test_par
on test_par (col2)
local nologging
/
begin
DBMS_STATS.GATHER_TABLE_STATS (
ownname => user,
tabname => 'test_par',
partname => null,
estimate_percent => 100,
block_sample => false,
method_opt => 'for all columns size 1',
degree => null,
granularity => 'ALL',
cascade => true,
stattab => NULL,
statid => NULL,
statown => NULL,
no_invalidate => FALSE);
end;
/
So - i have 75000 rows in partition p1_to_4 and 1 row in p5
SQL> select count(*),col1 from test_par partition(p1_to_4) group by col1;
COUNT(*) COL1
---------- ----------
25000 1
25000 2
25000 4
25000 3
SQL> select count(*),col1 from test_par partition(p5) group by col1;
COUNT(*) COL1
---------- ----------
1 5
Now we have the set up ready, we have to enable the trace and check which stats are used when.
Case 1: - Bind peek enable. I use literal value to query table and see which stats are used.
SQL> alter session set "_optim_peek_user_binds"=true;
SQL> alter session set events '10053 trace name context forever, level 2';
SQL> select * from test_par where col1=2 and col2=5;
Now checking the trace file, we see:
BASE STATISTICAL INFORMATION
***********************
Table Stats::
Table: TEST_PAR Alias: TEST_PAR Partition [0] -- Do you see Partition stats getting used?
#Rows: 100000 #Blks: 149 AvgRowLen: 7.00
#Rows: 100000 #Blks: 149 AvgRowLen: 7.00
Index Stats::
Index: IDX02_TEST_PAR Col#: 2 PARTITION [0]
LVLS: 1 #LB: 222 #DK: 100000 LB/K: 1.00 DB/K: 1.00 CLUF: 149.00
LVLS: 1 #LB: 222 #DK: 100000 LB/K: 1.00 DB/K: 1.00 CLUF: 149.00
***************************************
So, with Bind peek enabled - we use partition stat (when we should) when literals are used
Case 2: Bind peek enable. I use Bind value to query table and see which stats are used.
SQL> alter session set "_optim_peek_user_binds"=true;
SQL> alter session set events '10053 trace name context forever, level 2';
SQL> variable lcol1 number;
SQL> variable lcol2 number;
SQL> execute :lcol1:=2;
SQL> execute :lcol2:=5;
SQL> select * from test_par where col1=:lcol1 and col2=:lcol2;
Trace file:
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
Table: TEST_PAR Alias: TEST_PAR Partition [0] -- Here again, partition stats getting used
#Rows: 100000 #Blks: 149 AvgRowLen: 7.00
#Rows: 100000 #Blks: 149 AvgRowLen: 7.00
Index Stats::
Index: IDX02_TEST_PAR Col#: 2 PARTITION [0]
LVLS: 1 #LB: 222 #DK: 100000 LB/K: 1.00 DB/K: 1.00 CLUF: 149.00
LVLS: 1 #LB: 222 #DK: 100000 LB/K: 1.00 DB/K: 1.00 CLUF: 149.00
***************************************
Since this was the first time, we used this sql, Bind peek had to happen.
This is confirmed by following section of trace file:
*******************************************
Peeked values of the binds in SQL statement
*******************************************
kkscoacd
Bind#0
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=03 fl2=1000000 frm=00 csi=00 siz=48 off=0
kxsbbbfp=2a973589d8 bln=22 avl=02 flg=05
value=2
Bind#1
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=03 fl2=1000000 frm=00 csi=00 siz=0 off=24
kxsbbbfp=2a973589f0 bln=22 avl=02 flg=01
value=5
CASE 3: Bind peek enabled - I use Bind value to query table and see which stats are used. -- This will be 2nd time - so this time i dont expect peeking to happen.
SQL> alter session set "_optim_peek_user_binds"=true;
SQL> alter session set events '10053 trace name context forever, level 2';
SQL> variable lcol1 number;
SQL> variable lcol2 number;
SQL> execute :lcol1:=5;
SQL> execute :lcol2:=10;
SQL> select * from test_par where col1=:lcol1 and col2=:lcol2;
Trace file:
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
Table: TEST_PAR Alias: TEST_PAR Partition [0] -- Part Stats .. again
#Rows: 100000 #Blks: 149 AvgRowLen: 7.00
#Rows: 100000 #Blks: 149 AvgRowLen: 7.00
Index Stats::
Index: IDX02_TEST_PAR Col#: 2 PARTITION [0]
LVLS: 1 #LB: 222 #DK: 100000 LB/K: 1.00 DB/K: 1.00 CLUF: 149.00
LVLS: 1 #LB: 222 #DK: 100000 LB/K: 1.00 DB/K: 1.00 CLUF: 149.00
***************************************
In this case - we still use old execution path and old stats - which is agony of 10g.
We had only one row for this partition and we ended up using INDEX access when direct access would be really fast.
I heard about Adaptive cursor sharing in 11g will avoid this, but still to run tests on it.
Now we will run the same tests by keeping the bind peeking disabled (as in my environment)
CASE 4: Bind peek disabled - we use literals
SQL> alter system flush shared_pool;
SQL> alter session set "_optim_peek_user_binds"=false;
SQL> alter session set events '10053 trace name context forever, level 2';
SQL> select * from test_par where col1=5 and col2=10;
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
Table: TEST_PAR Alias: TEST_PAR Partition [1] -- Picked the right stats of right partition
#Rows: 1 #Blks: 1 AvgRowLen: 6.00
#Rows: 1 #Blks: 1 AvgRowLen: 6.00
Index Stats::
Index: IDX02_TEST_PAR Col#: 2 PARTITION [1]
LVLS: 0 #LB: 1 #DK: 1 LB/K: 1.00 DB/K: 1.00 CLUF: 1.00
LVLS: 0 #LB: 1 #DK: 1 LB/K: 1.00 DB/K: 1.00 CLUF: 1.00
***************************************
Literals - as seen knows where to go.
CASE 5: Bind peek disabled - we use binds
SQL> variable lcol1 number;
SQL> variable lcol2 number;
SQL> execute :lcol1:=5;
SQL> execute :lcol2:=10;
SQL> alter session set "_optim_peek_user_binds"=false;
SQL> alter session set events '10053 trace name context forever, level 2';
SQL> select * from test_par where col1=:lcol1 and col2=:lcol2;
Trace file:
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
Table: TEST_PAR Alias: TEST_PAR (Using composite stats) -- These are global stats
#Rows: 100001 #Blks: 75 AvgRowLen: 7.00
Index Stats::
Index: IDX02_TEST_PAR Col#: 2
USING COMPOSITE STATS
LVLS: 1 #LB: 223 #DK: 100000 LB/K: 1.00 DB/K: 1.00 CLUF: 150.00
***************************************
So - the conclusion is :
When we have BIND peeeking disabled, and with usage of bind values - we are most likely (any one of you could come up where above doesnt hold good) NOT using partiton stats.
3 comments:
It's interesting that you're querying the table with bind variables. Normally that would mean lots of queries differing only by their bound variables, and you would not expect partition pruing (where partition statistics are significant) to be a factor. Can you explain more about your environment?
Thanks David for your comments.
Yeah - and my test was just to show that only.
We have few very huge tables - so for those, the practice is to collect stats for biggest partition and set the same stats for all. Although it may sound wierd, that is the current situation.
In addition to that, we have bind peeking disabled in our prod system - to avoid surprises!
To understand more on why the "set" partition level stats - i conducted this test.
One more point (although logical) - which i dont think I wrote is - when one uses partition name in "from" clause - i.e. one passes the part name to the query at run time - optimizer should use part stats.
Hi,
You said:
"We have few very huge tables - so for those, the practice is to collect stats for biggest partition and set the same stats for all. Although it may sound wierd, that is the current situation."
Question: Can you explain why you collect stats for biggest partition and set the same stats for all.
Regards,
CMT
Post a Comment