How to Calculate Join Cardinality ~ Gurjas's Oracle Blog

How Calculate Join Cardinality


SQL> create table tab1 as select
 2                           trunc(dbms_random.value(0,100)) col1,
 3                           chr(trunc(dbms_random.value (97, 103))) col2
 4                       from dual connect by level <= 1000;  
Table created.  

Wrote file afiedt.buf    
1  create table tab2 as select   
2                       trunc(dbms_random.value(0,1000)) col1,   
3                       chr(trunc(dbms_random.value (100, 103))) col2   
4*                    from dual connect by level <= 10000 SQL> 
/

Table created.

SQL>

Gather stats for both the tables with "For all columns size 1" with no histograms.


SQL> exec dbms_stats.gather_table_stats (ownname=>'SCOTT', tabname=>'TAB1',
 method_opt=>'for all columns size 1');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats (ownname=>'SCOTT', tabname=>'TAB2', 
method_opt=>'for all columns size 1');

PL/SQL procedure successfully completed.

SQL>

SQL> ed
Wrote file afiedt.buf

 1  select count(*)
 2  from
 3     tab1,
 4     tab2
 5  where
 6     tab1.col1 = tab2.col1 and
 7*    tab1.col2 in (chr(97), chr(98), chr(99))
SQL> /

 COUNT(*)
----------
     5318


Execution Plan
----------------------------------------------------------
Plan hash value: 2043035240

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |     8 |    10  (10)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |     8 |            |          |
|*  2 |   HASH JOIN         |      |  5000 | 40000 |    10  (10)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| TAB1 |   500 |  2000 |     3   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| TAB2 | 10000 | 40000 |     6   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

  2 - access("TAB1"."COL1"="TAB2"."COL1")
  3 - filter("TAB1"."COL2"='a' OR "TAB1"."COL2"='b' OR
             "TAB1"."COL2"='c')

The forumula to calucate the Join cardinaliy as per reference from CBO Fundamentals and NoteId 68992.1 is..

Join Selectivity = 1 / max[(NDV(tab1.col1), NDV(tab2.col1)] *
((card tab1 - # tab1.col1 NULLS) / card tab1) *
((card tab2 - # tab2.col1 NULLS) / card tab2))

with the join cardinality then given as
Join Cardinality = card(tab1) * card(tab1) * sel(Pj)

Where
NDV: Number of Distinct Values
Cardinality: Number of rows
Selectivity: Proportion of a dataset returned by a particular predicate(or group of predicates)


Wrote file afiedt.buf

 1  select  TABLE_NAME, COLUMN_NAME, NUM_DISTINCT, NUM_NULLS, NUM_BUCKETS from  user_tab_col_statistics
 2* where column_name  = 'COL1'
SQL> /

TABLE_NAME                     COLUMN_NAME                    NUM_DISTINCT  NUM_NULLS NUM_BUCKETS
------------------------------ ------------------------------ ------------ ---------- -----------
TAB2                           COL1                                   1000          0           1
TAB1                           COL1                                    100          0           1

SQL>

So In our case
Join Selectivity = 1 / Max[1000,100] *
((500-0) / 500) *
((10000-0) / 10000) = 1/1000

Join Cardinality = 500 * 10000 * 1/1000 = 5000

If we match the calculated join cardinality with join cardinality in execution plan of the query it will be same.

References: Metalink Note ID: 68992.1 and Cost Based Oracle Fundamentals by Jonathan Lewis.

Gurjas's Oracle Blog

Wednesday, July 7, 2010

How to Calculate Join Cardinality

No comments: