Saturday, March 3, 2007

Optimizer_mode – ALL_ROWS or FIRST_ROWS?

Out of all Oracle RDBMS modules, optimizer code is actually the most complicated code and different optimizer modes seem like jack while lifting your car in case of a puncture.

This paper focuses on how optimizer behaves differently when you have optimizer mode set to ALL_ROWS or FIRST_ROWS.

Possible values for optimizer_mode = choose/ all_rows/ first_rows/ first_rows[n]

By default, the value of optimizer_mode is CHOOSE which basically means ALL_ROWS (if statistics on underlying tables exist) else RULE (if there are no statistics on underlying tables). So it is very important to have statistics collected on your tables on regular intervals or else you are living in Stone Age.

FIRST_ROWS and ALL_ROWS are both cost based optimizer features. You may use them according to their requirement.

FIRST_ROWS/ FIRST_ROWS[n]

In simple terms it ensures best response time of first few rows (n rows).

This mode is good for interactive client-server environment where server serves first few rows and by the time user scroll down for more rows, it fetches other. So user feels that he has been served the data he requested, but in reality the request is still pending and query is still fetching the data in background.

Best example for this is toad, if you click on data tab, it instantaneously start showing you data and you feel toad is faster than sqlplus, but the fact is if you scroll down, you will see the query is still running.

Ok, let us simulate this on SQLPLUS

Create a table and index over it:


SQL> create table test as select * from all_objects;

Table created.

SQL> create index test_in on test(object_type);

Index created.

SQL> exec dbms_stats.gather_table_stats(‘SAC’,'TEST')

PL/SQL procedure successfully completed.

SQL> select count(*) from test;

COUNT(*)
----------
37944

SQL> select count(*) from test where object_type='JAVA CLASS';

COUNT(*)
----------
14927

You see out of almost 38k records, 15k are of JAVA class. And now if you select the rows having object_type=’JAVA_CLASS’, it should not use index as almost half of the rows are JAVA_CLASS. It will be foolish of optimizer to read the index first and then go to table.

Check out the Explain plans



SQL> set autotrace traceonly exp
SQL> select * from test where object_type='JAVA CLASS';

Execution Plan
----------------------------------------------------------
Plan hash value: 1357081020

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1001 | 94094 | 10 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST | 1001 | 94094 | 10 (0)| 00:00:01 |
--------------------------------------------------------------------------

As you see above, optimizer has not used Index we created on this table.

Now use FIRST_ROWS hint:


SQL> select /*+ FIRST_ROWS*/ * from test where object_type='JAVA CLASS';

Execution Plan
----------------------------------------------------------
Plan hash value: 3548301374

---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14662 | 1345K| 536 (1)| 00:00:07 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 14662 | 1345K| 536 (1)| 00:00:07 |
|* 2 | INDEX RANGE SCAN | TEST_IN | 14662 | | 43 (3)| 00:00:01 |
---------------------------------------------------------------------------------------

In this case, optimizer has used the index.

Q> Why?

Ans> Because you wanted to see first few rows quickly. So, following your instructions oracle delivered you first few rows quickly using index and later delivering the rest.

See the difference in cost, although the response time (partial) of second query was faster but resource consumption was high.

But that does not mean that this optimizer mode is bad. As I said this mode may be good for interactive client-server model. In most of OLTP systems, where users want to see data fast on their screen, this mode of optimizer is very handy.

Important facts about FIRST_ROWS

  1. It gives preference to Index scan Vs Full scan (even when index scan is not good).
  2. It prefers nested loop over hash joins because nested loop returns data as selected (& compared), but hash join hashes one first input in hash table which takes time.
  3. Cost of the query is not the only criteria for choosing the execution plan. It chooses plan which helps in fetching first rows fast.
  4. It may be a good option to use this in an OLTP environment where user wants to see data as early as possible.


ALL_ROWS

In simple terms, it means better throughput

While FIRST_ROWS may be good in returning first few rows, ALL_ROWS ensures the optimum resource consumption and throughput of the query. In other words, ALL_ROWS is better to retrieve the last row first.

In above example while explaining FIRST_ROWS, you have already seen how efficient ALL_ROWS is.

Important facts about ALL_ROWS

  1. ALL_ROWS considers both index scan and full scan and based on their contribution to the overall query, it uses them. If Selectivity of a column is low, optimizer may use index to fetch the data (for example ‘where employee_code=7712’), but if selectivity of column is quite high ('where deptno=10'), optimizer may consider doing Full table scan. With ALL_ROWS, optimizer has more freedom to its job at its best.
  2. Good for OLAP system, where work happens in batches/procedures. (While some of the report may still use FIRST_ROWS depending upon the anxiety level of report reviewers)
  3. Likes hash joins over nested loop for larger data sets.

Conclusion

Cost based optimizer gives you flexibility to choose response time or throughput. So use them based on your business requirement.

24 comments:

Anand Diwan said...

Superb note !

Anand Diwan said...

Can you pls explain difference b/w Nested loops and hash joins? Or recommend a good note.

Thankyou !

Sachin said...

Thanks Anand for your feedback. You may read http://oracle-online-help.blogspot.com/2007/03/nested-loops-hash-join-and-sort-merge.html for diff b/w nested loop and hash joins.

Prakash GR said...

superb article about optimizer_mode= first_rows/all_rows.

thanks for providing thid details.

cheers

prakash

Anonymous said...

Good explanation Sachin.
Keep Doing.

Anonymous said...

Excellent

Anonymous said...

Great explanation... -Wandi-

Unknown said...

Awesome Job....Simple things makes life easy

Unknown said...

Awesome Job....Simple things makes life easy

Anonymous said...

Hi sachin,
Very nice explanation regarding optimizers..nice differnce between all_rows and first_rows

Anonymous said...

Simply Great Article!!!

Anonymous said...

Excellent note......

vprasad said...

Very good explanation

Unknown said...

Nice explainations in simple manner.Keep it up Sachin... We are expecting more articles about the performance of database...
Thanks a lot.

Ashwin...

Harmandeep Singh said...

Nice articvle..

I feel first_rows is used for the report queries only anot in general for the OLTP systems.

In OLTP all_rows is used as the rows to be processed is usually very less.

PS said...

Great article ! It saved me a lot of reading ;)

Unknown said...

Great info. Thank you!

Vanitha said...

This blog is really awesome. Helped me to understand the basics concepts. Thanks a lot.

Anonymous said...

Awesome Blog. Helped me understand the basics.

Bhanu said...

Thank you nice explanation

Anonymous said...

Excellent Sachin. Thanks for your time to explain in easy way to understand

Anonymous said...

Excellent sachin... nice article on optimizer modes..

Anonymous said...

Job Well done

Anonymous said...

Superb!!