Friday, March 2, 2007

Nested loops, Hash join and Sort Merge joins – difference?

Nested loop (loop over loop)

In this algorithm, an outer loop is formed which consists of few entries and then for each entry, and inner loop is processed.

Ex:

Select tab1.*, tab2.* from tabl, tab2 where tabl.col1=tab2.col2;

It is processed like:

For i in (select * from tab1) loop
For j in (select * from tab2 where col2=i.col1) loop
Display results;
End loop;
End loop;

The Steps involved in doing nested loop are:

a) Identify outer (driving) table

b) Assign inner (driven) table to outer table.

c) For every row of outer table, access the rows of inner table.

In execution plan it is seen like this:

NESTED LOOPS
outer_loop
inner_loop

When optimizer uses nested loops?

Optimizer uses nested loop when we are joining tables containing small number of rows with an efficient driving condition. It is important to have an index on column of inner join table as this table is probed every time for a new value from outer table.

Optimizer may not use nested loop in case:

  1. No of rows of both the table is quite high
  2. Inner query always results in same set of records
  3. The access path of inner table is independent of data coming from outer table.

Note: You will see more use of nested loop when using FIRST_ROWS optimizer mode as it works on model of showing instantaneous results to user as they are fetched. There is no need for selecting caching any data before it is returned to user. In case of hash join it is needed and is explained below.

Hash join

Hash joins are used when the joining large tables. The optimizer uses smaller of the 2 tables to build a hash table in memory and the scans the large tables and compares the hash value (of rows from large table) with this hash table to find the joined rows.

The algorithm of hash join is divided in two parts

  1. Build a in-memory hash table on smaller of the two tables.
  2. Probe this hash table with hash value for each row second table

In simpler terms it works like

Build phase

For each row RW1 in small (left/build) table loop
Calculate hash value on RW1 join key
Insert RW1 in appropriate hash bucket.
End loop;

Probe Phase

For each row RW2 in big (right/probe) table loop
Calculate the hash value on RW2 join key
For each row RW1 in hash table loop
If RW1 joins with RW2
Return RW1, RW2
End loop;
End loop;

When optimizer uses hash join?

Optimizer uses has join while joining big tables or big fraction of small tables.

Unlike nested loop, the output of hash join result is not instantaneous as hash joining is blocked on building up hash table.

Note: You may see more hash joins used with ALL_ROWS optimizer mode, because it works on model of showing results after all the rows of at least one of the tables are hashed in hash table.

Sort merge join

Sort merge join is used to join two independent data sources. They perform better than nested loop when the volume of data is big in tables but not as good as hash joins in general.

They perform better than hash join when the join condition columns are already sorted or there is no sorting required.

The full operation is done in two parts:

  • Sort join operation

get first row RW1 from input1
get first row RW2 from input2.


  • Merge join operation

while not at the end of either input loop
if RW1 joins with RW2
get next row R2 from input 2
return (RW1, RW2)
else if RW1 < style=""> get next row RW1 from input 1
else
get next row RW2 from input 2
end loop

Note: If the data is already sorted, first step is avoided.

Important point to understand is, unlike nested loop where driven (inner) table is read as many number of times as the input from outer table, in sort merge join each of the tables involved are accessed at most once. So they prove to be better than nested loop when the data set is large.

When optimizer uses Sort merge join?

a) When the join condition is an inequality condition (like <, <=, >=). This is because hash join cannot be used for inequality conditions and if the data set is large, nested loop is definitely not an option.

b) If sorting is anyways required due to some other attribute (other than join) like “order by”, optimizer prefers sort merge join over hash join as it is cheaper.

Note: Sort merge join can be seen with both ALL_ROWS and FIRST_ROWS optimizer hint because it works on a model of first sorting both the data sources and then start returning the results. So if the data set is large and you have FIRST_ROWS as optimizer goal, optimizer may prefer sort merge join over nested loop because of large data. And if you have ALL_ROWS as optimizer goal and if any inequality condition is used the SQL, optimizer may use sort-merge join over hash join

42 comments:

Pratheej said...

Great Job Sachin,

Thank You

Pratheej

Sushma said...

Hi,

You have explained the concept so well.

Great work

Thanks.

Lauro said...

Well done!
Very clear!

Tks

Anonymous said...

Hi Sachin,

Thanks a Lot.

Arvind

Anonymous said...

Hi sachin,
This is very well explained article on join algoriths.

simply great!!!!!

Best wishes....

Amol

Anonymous said...

good job man!!!

Arek Masny said...

Great and Clear explanation. I've understood all just after first read.
Thank's Sachin

Anonymous said...

thats really a great explaination

Anonymous said...

Simply superb !!

Unknown said...

Fantabulous Explanation !!!

Anonymous said...

Tendulkaristic...Sachin

Unknown said...

Sachin very good thanx a lot

Anonymous said...

Awesome explanantion Sachin - Kudos to you!!!

Anonymous said...

Really great explanation in simple words..........thanks a lot

John said...

Cheers sachin, share more stuff with us.

Anonymous said...

With Oracle 10g its perfectly working as you mentioned like after increasing rownum ,plan changed from "Nested loop" to "Hash join" . But for Oracle 11g plan remains same, means its uses ""Nested loop" even after changing artificial stats .Itd not turn to "Hash Join"

scott>exec dbms_stats.set_table_stats(ownname => 'SCOTT', tabname => 'E', numrows => 1000000, numblks => 10000, avgrlen => 124);

PL/SQL procedure successfully completed.

scott> exec dbms_stats.set_index_stats(ownname => 'SCOTT', indname => 'E_DEPTNO', numrows => 1000000, numlblks => 1000);


PL/SQL procedure successfully completed.

scott>scott>exec dbms_stats.set_table_stats(ownname => 'SCOTT', tabname => 'D', numrows => 1000000,numblks => 10000 , avgrlen => 124);


PL/SQL procedure successfully completed.

scott>scott>select e.ename,d.dname from e, d where e.deptno=d.deptno;



Execution Plan
----------------------------------------------------------
Plan hash value: 3723262532

--------------------------------------------------------------------------------
---------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| T
ime |

--------------------------------------------------------------------------------
---------

| 0 | SELECT STATEMENT | | 250G| 7683G| 1013K (2)| 0
3:22:40 |

| 1 | NESTED LOOPS | | | | |
|

| 2 | NESTED LOOPS | | 250G| 7683G| 1013K (2)| 0
3:22:40 |

| 3 | TABLE ACCESS FULL | D | 1000K| 12M| 2816 (4)| 0
0:00:34 |

|* 4 | INDEX RANGE SCAN | E_DEPTNO | 32 | | 0 (0)| 0
0:00:01 |

| 5 | TABLE ACCESS BY INDEX ROWID| E | 250K| 4882K| 1 (0)| 0
0:00:01 |

--------------------------------------------------------------------------------
---------


Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("E"."DEPTNO"="D"."DEPTNO")

Anonymous said...

Sachin,
Honestly this is the first time ever i understood something so clearly in first reading! You are simply VERY GOOD..you should write a book :-)

Anonymous said...

in 11G...did you first delete the existing stats and then reset stats artificially on the tables/index and give it a try and please let us know if that makes it work..

Sachin,
You did a tremendously good job in explaining the differences in CBO behaviour based on stats info/clustering info...Thanks!

Majid Khan

Anonymous said...

Also one more thing, in 11G how is your PGA_TARGET setup, is it large enough to accommodate hash joins? Remember starting from 10G R2 the % usage of PGA per sql/session has changed, i believe the max is 20% of PGA_TARGET. Also check v$pga_stat and see if you are noticing any multipasses...Those were my two cents..:-)

Majid Khan

Anonymous said...

Sachin,

I got a question..in your test example for nested loops table e will be the driving table (outer loop) and d will be the driven table (inner loop) right? If yes, then you recommended an index on the inner table which is d and no index is created on d...please clarify.

Sachin said...

In nested loop example, its "d" (dept) which is driving table and "e" (emp) is driven table. That is why we have index on "e" and not on "d".

Anonymous said...

Sachin,

By using the approach of artificially populating stats we can simulate production volume of rows/blocks/cluster factoring in the test environment to generate execution plans without in fact having the data volume right? If yes, man this is gooooood...

Thanks

Sachin said...

of course we can manipulate the table/index stats to check change in optimizer plan.

In fact, many change the stats in production to "guarantee" a particular type of plan.

Amit Deshpande said...

Great post Sachin

RK Veluvali said...

Thanks a lot sachin, I'm breaking my head about these topics, and you really made it clear...
Thanks a lot again.

Unknown said...

Please i need someone to help me, to explain a join index method with example of execution plan of query using index join

Sachin said...

Sara, could you explain what do you mean when you refer this term: "join index method"

Pushpendu said...

Thanks a ton for explaining things so clearly.Your approach is awesome, remind me of Tom Kyte.

Jaydeep Vishwakarma said...

Thanks Sachin, This is really very nice post

Mohib Alvi said...

I have deleted the post but i did mention your post as the base article and i wrote on the blog for my own reference but now its according to your will.

Anonymous said...

great job man, thank you

Raj Kiran Mattewada said...
This comment has been removed by the author.
Anonymous said...

Great article!

shailesh said...

Good Explanation and easy ro understand

Venkatesh Thammichetti said...

This is really very clear. Great work! This post gave some basic idea how the joins will work.

Chandrakanth said...

Excellent sachin! keep posting us many more

Dexter said...

wonderful explanation, superb post, thanks a lot

Anonymous said...

Good Post.
As someone mentioned earlier..you should write a book

Anonymous said...

@Sachin : That is an awsome blog..thanks a lott

Rohit Gholap

Anonymous said...

Nice Post . it helps me a lot as a starter in this area

Chandra Kambham said...

Thanks Sachin for the very simple way you used to explain the join methods, it has really helped me a lot.

Anonymous said...

Great work