INDEXES IN ORACLE
Indexes
Indexes Indexes are optional structures associated with tables. Indexes can be created to increase the performance of data retrieval. Just as the index in a manual helps you quickly locate specific information, an Oracle index provides an access path to table data.
-
What is an Index?
An index is used to increase read access performance. A book, having an index, allows
rapid access to a particular subject area within that book. Indexing a database table
provides rapid location of specific rows within that table, where indexes are used to
optimize the speed of access to rows. When indexes are not used or are not matched by
SQL statements submitted to that database then a full table scan is executed. A full table
scan will read all the data in a table to find a specific row or set of rows, this is extremely
inefficient when there are many rows in the table.
It is often more efficient to full table scan small tables. The optimizer will often
assess full table scan on small tables as being more efficient than reading both index and
data space, particularly where a range scan rather than an exact match would be used
against the index.
An index of columns on a table contains a one-to-one ratio of rows between index and
indexed table, excluding binary key groupings, more on this later. An index is effectively
a separate table to that of the data table. Tables and indexes are often referred to as data
and index spaces. An index contains the indexed columns plus a ROWID value for each
of those column combination rows. When an index is searched through the indexed
columns rather than all the data in the row of a table is scanned. The index space ROWID
is then used to access the table row directly in the data space. An index row is generally
much smaller than a table row, thus more index rows are stored in the same physical
space, a block. As a result less of the database is accessed when using indexes as opposed
to tables to search for data. This is the reason why indexes enhance performance.
The Basic “How to” of Indexing
There are a number of important factors with respect to efficient and effective creation
and use of indexing.
The number of indexes per table.
The number of table columns to be indexed.
What datatypes are sensible for use in columns to be indexed?
Types of indexes from numerous forms of indexes available.
How does SQL behave with indexes?
What should be indexed?
What should not be indexed?
Number of Indexes per Table
Whenever a table is inserted into, updated or deleted from, all indexes plus the table must
be updated. Thus if one places ten indexes onto a single table then every change to that
table requires an effective change to a single table and ten indexes. The result is that
performance will be substantially degraded since one insert requires eleven inserts to
insert the new row into both data and index spaces. Be frugal with indexing and be
conscious of the potential ill as well as the good effects produced by indexing. The
general rule is that the more dynamic a table is the fewer indexes it should have.
A dynamic table is a table changes constantly, such as a transactions table. Catalog
tables on the other hand store information such as customer details; customers change a
lot less often than invoices. Customer details are thus static in nature and over-indexing
may be advantageous to performance.
Number of Columns to Index
Composite indexes are indexes made up of multiple columns. Minimize on the number of
columns in a composite key. Create indexes with single columns. Composite indexes are
often a requirement of traditional relational database table structures.
With the advent of object-oriented application programming languages such as Java,
sequence identifiers tend to be used to identify every row in every table uniquely. The
result is single column indexes for every table. The only exceptions are generally manyto-
many join resolution entities.
It may sometimes be better to exclude some of the lower-level or less relevant columns
from the index since at that level there may not be much data, if there are not many rows
to index it can be more efficient to read a group of rows from the data space. For
instance, a composite index comprised of five columns could be reduced to the first three
columns based on a limited number of rows traversed as a result of ignoring the last two
columns. Look at your data carefully when constructing indexes. The more columns you
add to a composite index the slower the search will be since there is a more complex
requirement for that search and the indexes get physically larger. The benefit of indexes is
that an index occupies less physical space than the data. If the index gets so large that it is
as large as the data then it will become less efficient to read both the index and data
spaces rather than just the data space.
Most database experts recommend a maximum of three columns for composite keys.
Datatypes of Index Columns
Integers make the most efficient indexes. Try to always create indexes on columns with
fixed length values. Avoid using VARCHAR2 and any object data types. Use integers if
possible or fixed length, short strings. Also try to avaoid indexing on dates and floatingpoint
values. If using dates be sure to use the internal representation or just the date, not
the date and the time. Use integer generating sequences wherever possible to create
consistently sequential values.
Types of Indexes
There are different types of indexes available in different databases. These different
indexes are applicable under specific circumstances, generally for specific search
patterns, for instance exact matches or range matches.
The simplest form of indexing is no index at all, a heap structure. A heap structure is
effectively a collection of data units, rows, which is completely unordered. The most
commonly used indexed structure is a Btree (Binary Tree). A Btree index is best used for
exact matches and range searches. Other methods of indexing exist.
1. Hashing algorithms produce a pre-calculated best guess on general row location and
are best used for exact matches.
2. ISAM or Indexed Sequential Access Method indexes are not used in Oracle.
3. Bitmaps contain maps of zero’s and 1’s and can be highly efficient access methods for
read-only data.
4. There are other types of indexing which involve clustering of data with indexes.
In general every index type other than a Btree involves overflow. When an index is
required to overflow it means that the index itself cannot be changed when rows are
added, changed or removed. The result is inefficiency because a search to find
overflowing data involves a search through originally indexed rows plus overflowing
rows. Overflow index space is normally not ordered. A Btree index can be altered by
changes to data. The only exception to a Btree index coping with data changes in Oracle
is deletion of rows. When rows are deleted from a table, physical space previously used
by the index for the deleted row is never reclaimed unless the index is rebuilt. Rebuilding
of Btree indexes is far less common than that for other types of indexes since non-Btree
indexes simply overflow when row changes are applied to them.
Oracle uses has the following types of indexing available.
Btree index. A Btree is a binary tree. General all-round index and common in OLTP
systems. An Oracle Btree index has three layers, the first two are branch node layers and
the third, the lowest, contains leaf nodes. The branch nodes contain pointers to the lower
level branch or leaf node. Leaf nodes contain index column values plus a ROWID pointer
to the table row. The branch and leaf nodes are optimally arranged in the tree such that
each branch will contain an equal number of branch or leaf nodes.
Bitmap index. Bitmap containing binary representations for each row. A zero
implies that a row does not have a specified value and a 1 denotes that row having that
value. Bitmaps are very susceptible to overflow in OLTP systems and should only be
used for read-only data such as in Data Warehouses.
Function-Based index. Contains the result of an expression pre-calculated on each
row in a table.
Index Organized Tables. Clusters index and data spaces together physically for a
single table and orders the merged physical space in the order of the index, usually the
primary key. An index organized table is a table as well as an index, the two are merged.
Clusters. Partial merge of index and data spaces, ordered by an index, not
necessarily the primary key. A cluster is similar to an index organized table except that it
can be built on a join (more than a single table). Clusters can be ordered using binary tree
structures or hashing algorithms. A cluster could also be viewed as a table as well as an
index since clustering partially merges index and data spaces.
Bitmap Join index. Creates a single bitmap for one table in a join.
Domain index. Specific to certain application types using contextual or spatial data,
amongst others.
Indexing Attributes
Various types of indexes can have specific attributes or behaviors applied to them. These
behaviors are listed below, some are Oracle specific and some are not.
Ascending or Descending. Indexes can be order in either way.
Uniqueness. Indexes can be unique or non-unique. Primary keys must be unique
since a primary key uniquely identifies a row in a table referentially. Other columns such
as names sometimes have unique constraints or indexes, or both, added to them.
Composites. A composite index is an index made up of more than one column in a
table.
Compression. Applies to Btree indexes where duplicated prefix values are removed.
Compression speeds up data retrieval but can slow down table changes.
Reverse keys. Bytes for all columns in the index are reversed, retaining the order of
the columns. Reverse keys can help performance in clustered server environments
(Oracle8i Parallel Server / RAC Oracle9i) by ensuring that changes to similar key values
will be better physically spread. Reverse key indexing can apply to rows inserted into
OLTP tables using sequence integer generators, where each number is very close to the
previous number. When searching for and updating rows with sequence identifiers, where
rows are searched for
Null values. Null values are generally not included in indexes.
Sorting (NOSORT). This option is Oracle specific and does not sort an index. This
assumes that data space is physically ordered in the desired manner.
What SQL does with Indexes
In general a SQL statement will attempt to match the structure of itself to an index, the
where clause ordering will attempt to match available indexes and use them if possible. If
no index is matched then a full table scan will be executed. A table scan is extremely
inefficient for anything but the smallest of tables. Obviously if a table is read
sequentially, in physical order then an index is not required. A table does not always need
an index.
What to Index
Use indexes where frequent queries are performed with where and order by clause
matching the ordering of columns in those indexes. Use indexing generally on larger
tables or multi-table, complex joins. Indexes are best created in the situations listed
below.
Columns used in joins.
Columns used in where clauses.
Columns used in order by clauses.
In most relational databases the order-by clause is generally executed on the subset
retrieved by the where clause, not the entire data space. This is not always unfortunately
the case for Oracle.
Traditionally the order-by clause should never include the columns contained in the
where cause. The only case where the order-by clause will include columns contained in
the where clause is the case of the where clause not matching any index in the database or
a requirement for the order by clause to override the sort order of the where, typically in
highly complex, multi-table joins.
The group-by clause can be enhanced by indexing when the range of values being
grouped is small in relation to the number of rows in the table selected.
What not to Index
Indexes will degrade performance of inserts, updates and deletes, sometimes
substantially.
Tables with a small number of rows.
Static tables.
Columns with a wide range of values.
Tables changed frequently and with a low amount of data retrieval.
Columns not used in data access query select statements.