Function-based index
April 11, 2011 3 Comments
There are several types of indices: B-tree, Bitmap, Function-based, IOT.
I will discuss what function-based index is and in what cases is it used.
Let’s start by creating a demo example.
–Create a test table
create table tesTable(
col1 number,
col2 VARCHAR2(50)
);
–Insert some values
insert into testable
values(1,‘Giorgi’);
commit;
insert into testable
values(2,‘Mariami’);
commit;
–Gather table statistics
begin
dbms_stats.gather_table_stats(null,‘TESTABLE’);
end;
–Check if the statistics is OK
select column_name,num_distinct,hidden_column
from dba_tab_cols
where table_name=‘TESTABLE’;
–Output
COLUMN_NAME | NUM_DISTINCT | HIDDEN_COLUMN |
---|---|---|
COL1 | 2 | NO |
COL2 | 2 | NO |
Let’s see the explain plan for the following sql statement:
explain plan for select *
from testable
where UPPER(col2)=‘MARIAMI’;
select * from table(dbms_xplan.display);
——————————————————————————
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
——————————————————————————
| 0 | SELECT STATEMENT | | 1 | 40 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TESTABLE | 1 | 40 | 2 (0)| 00:00:01 |
——————————————————————————
If we have a big amount of data in this table, for example 1000000 rows what will happen?
Oracle will convert each row value into UPPER case and then compare it to our string. And this will happen 1000000 times
To make faster our query, there exists function-based index.
–Create function-based index
create index testable_func_indx on testable(upper(col2));
–Gather table statistics
begin
dbms_stats.gather_table_stats(null,‘TESTABLE’);
end;
–Check if the statistics is OK
select column_name,num_distinct,hidden_column
from dba_tab_cols
where table_name=‘TESTABLE’;
–Output
COLUMN_NAME | NUM_DISTINCT | HIDDEN_COLUMN |
---|---|---|
COL1 | 2 | NO |
COL2 | 2 | NO |
SYS_NC00003$ | 2 | YES |
As you can see there appeared hidden column by system generated name. Oracle behind the scenes creates a hidden virtual column on the parent table in order to capture the data characteristics of the function so that the CBO can make an accurate determination of the selectivity associated with using the function.
–See explain plan
explain plan for select *
from testable
where upper(col2)=‘MARIAMI’;
select * from table(dbms_xplan.display);
————————————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | 40 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TESTABLE | 1 | 40 | 1 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TESTABLE_FUNC_INDX | 1 | | 1 (0)| 00:00:01 |
————————————————————————————————–
You see there is INDEX RANGE SCAN. It will be faster then FULL TABLE SCAN if there is a big amount of data.
If we want to use user-defined function instead of Oracle built-in function, such as UPPER, we should do the following:
create or replace function my_fun(x varchar2) RETURN VARCHAR2
deterministic
is
begin
return upper(x);
end;
You must indicate that the function is deterministic,which means that for same input values the output will be the same. So you must not use random function or something like that .
explain plan for select *
from testable
where upper(col2)=‘MARIAMI’;
select * from table(dbms_xplan.display);
————————————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | 40 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TESTABLE | 1 | 40 | 1 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TESTABLE_FUNC_INDX | 1 | | 1 (0)| 00:00:01 |
————————————————————————————————–
Hi! Someone may create another column, store computed values in it and use simple b-tree index for searching. Have you compared your implementation to that one in both performance and storage issues? Such kind of information would be useful. Thanks!
Hello,
Thank you for your good comment.
Using B-tree index instead of Function-Based index is faster. But you should considers several aspects:
1. Creating additional column requires additional space.
So if you have a lot of rows you should think about it.
2. You should create trigger for to reflect changes on the original column to newly-added column.
So if the changes to the column happens many times it is a bad practice.
(But if changes happens frequently on the column then index on that column becomes fragmented which makes it to work inappropriately and you may prefer not to create any index).
So if you do not have space problem and records are not changed frequently it is better to create additional column and B-tree index on that column than function-based index.
@Raj if you have additional info about this, please let me know 🙂