What is hash join in oracle with example?SQL PLSQL
INTRODUCTION:
It is the join strategy where the one table data is loaded into the hash table for an efficient lookup. Other table data is scanned to match with the hash table which is suitable for the large datasets. Database optimizer uses the two of the smaller tables to build the hash table into the memory. Its function is used to map join key values to the hash table locations. Larger table is also scanned for each row. It's fast and efficient for quick matching of rows which is based on the join condition. It is an effective when one or both of the tables are large and cannot fit in the memory. It is also effective when the join condition is an equally predicate i.e. table1.column=table2.column.
It's cost is limited to the single read pass which is over the two data sets. It is situated in the PGA and the rows are accessed without latching. It will reduce I/O by preventing the necessity of repeatedly latching and reading blocks in the database buffer cache. If the data sets do not fit into the memory, the database partitions the rows and join proceed with the partition. It uses the large sort of area memory and I/O to the temporary tablespace. It is cost effective in the case of parallel query servers are used by the database. It takes the set of inputs and applies deterministic function to generate the random value.
Input values are said to be the join keys and output values are said to be the indexes in the array. The term hash collision is the same hash value for the different functions. The term build table is used when the database performs the full scan on the smaller data sets. Its algorithm emphasizes on the weak spot of the nested loops join. It requires the different indexing approach than the nested loop join. There is no need to index the join columns.
It works by building and probing the hash table. Oracle uses the smaller table or the result of the subquery which stores into the memories by creating hash table which is based on the join key and Oracle scans the larger table and looks up the match table with the hash table. It is most effective where the indexed and nested joins are inefficient and used for the large datasets.
Let us consider the two tables
emp(empno,deptno)
dept(dname,deptno)
Two tables emp and dept are combined with the hash join by using the below query.
select e.empno,e.deptno,d.dname
FROM emp e JOIN dept d ON e.deptno=d.deptno;
EXPLAIN PLAN FOR
select e.empno,e.deptno,d.dname
FROM emp e JOIN dept d ON e.deptno=d.deptno;
The below query is to check the execution plan.
select * from table(dbms_xplan.display);
select e.empno,e.deptno,d.dname
FROM emp e JOIN dept d ON e.deptno=d.deptno where e.hiredate<trunc(sysdate)- INTERVAL '6' MONTH
When to use?
When the large tables are joined without using the indexes.
When the full table scan is cheaper than the index look ups.
When we are working with the parallel execution.
Comments
Post a Comment