2012年11月8日星期四

Oracle 11g Release 1 (11.1)比较 B

Oracle 11g Release 1 (11.1)比较 B

本文内容

  • 创建测试数据表
  • 测试 B-tree 索引
  • 测试 Bitmap 索引
  • Bitmap 索引与分组

本文演示数据列值的特点,对索引的影响。如数据列的取值是可以穷举的,Bitmap 索引比 B-tree 索引更合适。

创建测试数据表


代码段一:

CREATE TABLE TESTIDX_A AS
SELECT *
  FROM (SELECT ROWNUM as id, t.object_name, t.object_type FROM dba_objects t);


利用 dba_objects 系统表生成测试数据表 testidx_a,并以同样方式生成表 testidx_btestidx_c。用下面语句查看一下数据量。

代码段二:

SQL> select count(*) from TESTIDX_A;
 
  COUNT(*)
----------
     69448
 
SQL>


以这样方式生成的测试数据,即便是新创建的数据库实例,也有将近 7 万行。

另外,本测试数据有个特点。object_type 列值的取值范围是可以枚举、穷举的,比如 FUNCTION、INDEX、PACKAGE 等,而 object_name 是不能穷举的。这个特点对索引类型很重要。

测试 B-tree 索引


首先,TESTIDX_A 表,不创建任何索引。其次,为 TESTIDX_B 表的 object_name 和 object_type 列分别建立 idx_tb_on 和 idx_tb_ot 的 B-tree 索引。默认情况下,Oracle 创建的是 B-tree 索引。

代码段三:

SQL> create index idx_tb_on on TESTIDX_B(object_name);
 
索引已创建。
 
SQL> create index idx_tb_ot on TESTIDX_B(object_type);
 
索引已创建。
 
SQL>


检索 testidx_atestidx_bobject_name 字段值 EMP 表。如下所示,检索 testidx_a

代码段四:

SQL> select * from TESTIDX_A where object_name='EMP';
 
 
执行计划
----------------------------------------------------------
Plan hash value: 1043052094
 
-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |     5 |   450 |   119   (1)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| TESTIDX_A |     5 |   450 |   119   (1)| 00:00:02 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("OBJECT_NAME"='EMP')
 
Note
-----
   - dynamic sampling used for this statement
 
 
统计信息
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        418  consistent gets
          0  physical reads
          0  redo size
        565  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
SQL>


注意:

  • 执行计划的 Rows、Bytes、Cost 列。其中,Rows 和 Bytes 列表示选择多少行,以及它们的大小;Cost 是该 SQL 的代价,它仅仅是根据 CPU、IO 等代价计算出来一个值。
  • 但统计信息部分是可变的,也就是说,初次与再次执行 SQL 时,物理 IO 的 consistent gets 值可能不同。但无论怎么样,执行计划中表的值不会变化。

检索 testidx_b,如下所示:

代码段五:

SQL> select * from TESTIDX_B where object_name='EMP';
 
 
执行计划
----------------------------------------------------------
Plan hash value: 4045461513
 
-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |     1 |    90 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TESTIDX_B |     1 |    90 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IDX_TB_ON |     1 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_NAME"='EMP')
 
Note
-----
   - dynamic sampling used for this statement
 
 
统计信息
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        569  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 
SQL>


说明:

  • 从执行计划上的 Operations 列看,代码段四检索时,因为,没有建立任何索引,所以进行了全表扫描;而代码段五,使用了索引 idx_tb_on。
  • 这样,执行计划 Cost 列的值,差距很大。创建索引后,该 SQL 执行的明显快。

检索 testidx_atestidx_bobject_type 列为 TABLE 的记录。检索 testidx_a 如下所示:

代码段六:

SQL> select * from TESTIDX_A
  2  where object_type='TABLE'
  3  order by object_name asc;
 
已选择2750行。
 
 
执行计划
----------------------------------------------------------
Plan hash value: 3733299015
 
----------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |  3623 |   318K|       |   197   (2)| 00:00:03 |
|   1 |  SORT ORDER BY     |           |  3623 |   318K|   728K|   197   (2)| 00:00:03 |
|*  2 |   TABLE ACCESS FULL| TESTIDX_A |  3623 |   318K|       |   119   (1)| 00:00:02 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("OBJECT_TYPE"='TABLE')
 
Note
-----
   - dynamic sampling used for this statement
 
 
统计信息
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        417  consistent gets
          0  physical reads
          0  redo size
     104070  bytes sent via SQL*Net to client
       2429  bytes received via SQL*Net from client
        186  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
       2750  rows processed
 
SQL>


检索 testidx_b 表,如下所示:

代码段七:

SQL> select * from TESTIDX_B
  2  where object_type='TABLE'
  3  order by object_name asc;
 
已选择2751行。
 
 
执行计划
----------------------------------------------------------
Plan hash value: 1436749833
 
--------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes |TempSpc| Cost(%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |  3555 |   312K|       |   155   (1)| 00:00:02 |
|   1 |  SORT ORDER BY               |           |  3555 |   312K|   712K|   155   (1)| 00:00:02 |
|   2 |   TABLE ACCESS BY INDEX ROWID| TESTIDX_B |  3555 |   312K|       |    80   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | IDX_TB_OT |  3555 |       |       |    10   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("OBJECT_TYPE"='TABLE')
 
Note
-----
   - dynamic sampling used for this statement
 
 
统计信息
----------------------------------------------------------
          9  recursive calls
          0  db block gets
        178  consistent gets
          8  physical reads
          0  redo size
     104086  bytes sent via SQL*Net to client
       2429  bytes received via SQL*Net from client
        186  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
       2751  rows processed
 
SQL>


说明:同上。

测试 Bitmap 索引


接下来,为 testidx_c 表的 object_name 和 object_type 列都建立 Bitmap 索引。如下所示。

代码段八:

SQL> create bitmap index idx_tc_on on TESTIDX_C(object_name);
 
索引已创建。
 
SQL> create bitmap index idx_tc_ot on TESTIDX_C(object_type);
 
索引已创建。
 
SQL>


同代码段六和七执行同样的操作——查找 object_type 列为 TABLE 的记录,如下所示。

代码段九:

SQL> select * from TESTIDX_C
  2  where object_type='TABLE'
  3  order by object_name asc;
 
已选择2752行。
 
 
执行计划
----------------------------------------------------------
Plan hash value: 1820242233
 
-------------------------------------------------------------------------------------------
| Id  | Operation                     | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |           |  2500 |   219K|    78   (2)| 00:00:01 |
|   1 |  SORT ORDER BY                |           |  2500 |   219K|    78   (2)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID | TESTIDX_C |  2500 |   219K|    77   (0)| 00:00:01 |
|   3 |    BITMAP CONVERSION TO ROWIDS|           |       |       |            |          |
|*  4 |     BITMAP INDEX SINGLE VALUE | IDX_TC_OT |       |       |            |          |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("OBJECT_TYPE"='TABLE')
 
Note
-----
   - dynamic sampling used for this statement
 
统计信息
----------------------------------------------------------
        103  recursive calls
          0  db block gets
        170  consistent gets
          1  physical reads
          0  redo size
     104107  bytes sent via SQL*Net to client
       2429  bytes received via SQL*Net from client
        186  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
       2752  rows processed


说明:

  • 代码段六,没有创建任何索引;代码段七,创建的是 B-tree 索引;代码段九,创建的是 Bitmap 索引。
  • 当检索作用在 object_type 列时,Bitmap 索引比 B-tree 索引的效果更好。这是因为 object_type 的列决定的。该列的值是可以枚举,或是说穷举出来的。

因此,对某个列,它的值若有范围,可穷举,用 Bitmap 索引比较合适。否则,Bitmap 索引的效果,就不是很明显。如下所示,执行同代码段四和五同样的操作,查找 object_name 列为 EMP 的记录。

代码段十:

SQL> select * from testidx_c
  2  where object_name='EMP';
 
 
执行计划
----------------------------------------------------------
Plan hash value: 2697327847
 
------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)|Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |     1 |    90 |    77   (0)|00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | TESTIDX_C |     1 |    90 |    77   (0)|00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|           |       |       |            |         |
|*  3 |    BITMAP INDEX SINGLE VALUE | IDX_TC_ON |       |       |            |         |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("OBJECT_NAME"='EMP')
 
Note
-----
   - dynamic sampling used for this statement
 
 
统计信息
----------------------------------------------------------
          9  recursive calls
          0  db block gets
         76  consistent gets
          1  physical reads
          0  redo size
        569  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed


说明:看代码段四、五和十的执行计划,Cost 列的值,代码段十还是不如代码段五。因为,object_name 列的值不能穷举,不具有像 object_type 列那样的特点。所以,虽然 SQL 使用了 Bitmap 索引,但效果明显不如 B-tree 索引。

这样,我们会想到一个问题,是不是 Bitmap 索引对分组语句效果很好?

Bitmap 索引与分组


分别对 testidx_btestidx_c 表的 object_type 字段进行分组。如下所示,分组 testidx_b:

代码段十一:

SQL> select t.object_type,count(*) from testidx_b t
  2  group by t.object_type;
 
已选择41行。
 
 
执行计划
----------------------------------------------------------
Plan hash value: 141313140
 
--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 79998 |   859K|   122   (4)| 00:00:02 |
|   1 |  HASH GROUP BY     |           | 79998 |   859K|   122   (4)| 00:00:02 |
|   2 |   TABLE ACCESS FULL| TESTIDX_B | 79998 |   859K|   119   (1)| 00:00:02 |
--------------------------------------------------------------------------------
Note
-----
   - dynamic sampling used for this statement
 
 
统计信息
----------------------------------------------------------
          5  recursive calls
          0  db block gets
        492  consistent gets
          0  physical reads
          0  redo size
       1449  bytes sent via SQL*Net to client
        438  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         41  rows processed


分组 testidx_c,如下所示:

SQL> select t.object_type,count(*) from testidx_c t
  2  group by t.object_type;
 
已选择41行。
 
 
执行计划
----------------------------------------------------------
Plan hash value: 3862313015
 
--------------------------------------------------------------------------------------
| Id  | Operation                | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |           | 72610 |   779K|     9   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT    |           | 72610 |   779K|     9   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION COUNT|           | 72610 |   779K|     9   (0)| 00:00:01 |
|   3 |    BITMAP INDEX FULL SCAN| IDX_TC_OT |       |       |            |          |
--------------------------------------------------------------------------------------
Note
-----
   - dynamic sampling used for this statement
 
 
统计信息
----------------------------------------------------------
          5  recursive calls
          0  db block gets
         78  consistent gets
          3  physical reads
          0  redo size
       1449  bytes sent via SQL*Net to client
        438  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         41  rows processed


说明:从执行计划的 Cost 列看,效果果然很明显。但是,当执行分组的同时,还有过滤条件,SQL 就不会使用 Bitmap 索引,除非将过滤条件中的列与分组列同时建立 Bitmap 索引。




TAG: