Thursday, July 7, 2016

TO improve the Query performance when the script is going for FTS(full Table Scan) on a fragmented table? with Prctical


TO improve the Query performance when the script is going for FTS(full Table Scan) on a fragmented table?


Recently one of my application users complaint about a performance issue. They had a query that they were running in two different environments (Dev and QA) of the same application. The problem was that the query was taking less time where it had more number of rows to fetch and surprisingly taking more time where it had less number of rows to fetch.

To give you an idea, following are the execution plans from both of these environments.


   
---//
---// execution plan from Dev environment //---
---//
Plan hash value: 3586353056

-------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |               |      1 |        |      1 |00:00:02.93 |     153K|    153K|
|   1 |  SORT AGGREGATE    |               |      1 |      1 |      1 |00:00:02.93 |     153K|    153K|
|*  2 |   TABLE ACCESS FULL| EVENT_DETAILS |      1 |   8000K|   8000K|00:00:03.25 |     153K|    153K|
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("STATUS"='ACTIVE')





---//
---// execution plan from QA environment //---
---//
Plan hash value: 3586353056

-------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |               |      1 |        |      1 |00:00:07.64 |     338K|    338K|
|   1 |  SORT AGGREGATE    |               |      1 |      1 |      1 |00:00:07.64 |     338K|    338K|
|*  2 |   TABLE ACCESS FULL| EVENT_DETAILS |      1 |   1999K|   1999K|00:00:07.14 |     338K|    338K|
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("STATUS"='ACTIVE')

We have the same execution plan in both the environments. However, in Dev the query is completing in just 2.93 seconds processing 8 million rows. Whereas in QA the same query is taking 7.64 seconds processing less than quarter of rows (~ 2 million) when compared to Dev.

Here is the query, executed against the Dev and QA environments.




---//
---// query executed against Dev //---
---//
SQL> desc EVENT_DETAILS
 Name                                Null?    Type
 ----------------------------------- -------- ------------------------
 EVENT_ID                            NOT NULL NUMBER
 EVENT_MSG                           NOT NULL VARCHAR2(100)
 EVENT_TIME                          NOT NULL TIMESTAMP(6)
 STATUS                              NOT NULL VARCHAR2(10)

SQL> select /*+ gather_plan_statistics */ count(*) from EVENT_DETAILS where status='ACTIVE';

  COUNT(*)
----------
   8000000

Elapsed: 00:00:02.96

---//
---// Query executed against QA //---
---//
SQL> desc EVENT_DETAILS
 Name                                Null?    Type
 ----------------------------------- -------- ------------------------
 EVENT_ID                            NOT NULL NUMBER
 EVENT_MSG                           NOT NULL VARCHAR2(100)
 EVENT_TIME                          NOT NULL TIMESTAMP(6)
 STATUS                              NOT NULL VARCHAR2(10)

SQL> select /*+ gather_plan_statistics */ count(*) from EVENT_DETAILS where status='ACTIVE';

  COUNT(*)
----------
   1999533

Elapsed: 00:00:07.67

As we have seen earlier, both the queries are using the same execution plan. Now, the question is why it is taking more time in QA where it has to process less rows (~2 million) when compared to Dev. To find that answer, we need to have a closer looks at the execution plans from both the environments. If we analyse the execution plans, we could see the query has performed 153K block reads in Dev while processing 8 million rows. Whereas, in QA the same query performed 338K block reads while processing only ~2 million rows.

This is a clear indication that Oracle had to scan through more blocks in QA while processing the ~2 million rows. We can also confirm this fact by setting query autotrace as shown below.






---//
---// execution statistics from Dev //---
---//
SQL> set autotrace traceonly stat
SQL> select count(*) from EVENT_DETAILS where status='ACTIVE';


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     153854  consistent gets
     153847  physical reads
          0  redo size
        542  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed


---//
---// execution statistics from QA //---
---//
SQL> set autotrace traceonly stat
SQL> select count(*) from EVENT_DETAILS where status='ACTIVE';


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     338948  consistent gets
     338941  physical reads
          0  redo size
        545  bytes sent via SQL*Net to client
        551  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

As we can see, the query has performed more number of block (physical/logical) reads in QA when compared to Dev, even though it had processed less rows in QA. Let’s take a look at the size of the table which the query is accessing in both Dev and QA environments.





---//
---// record count and table size in Dev //---
---//
SQL> select count(*) from EVENT_DETAILS;

  COUNT(*)
----------
  10000000

SQL> select sum(bytes)/1024/1024 Size_MB from dba_segments where segment_name='EVENT_DETAILS';

   SIZE_MB
----------
      1216


---//
---// record count and table size in QA //---
---//
SQL> select count(*) from EVENT_DETAILS;

  COUNT(*)
----------
   4000000

SQL> select sum(bytes)/1024/1024 Size_MB from dba_segments where segment_name='EVENT_DETAILS';

   SIZE_MB
----------
      2688

Here the problem is!! In Dev the table is consuming ~ 1.2 GB of space for 10 million rows, whereas in QA it is consuming almost double of the space (~ 2.5 GB) for just 4 million rows. This is a clear case of fragmentation which has caused the table to grow in size by leaving free spaces in between the table segments.

Let’s check, how fragmented is the table in both of the environments. Query used from this OTN thread.




---//
---// table fragmentation in Dev //---
---//
SQL> select table_name,round((blocks*8/1024),2) "size (MB)" ,
  2  round((num_rows*avg_row_len/1024/1024),2) "actual_data (MB)",
  3  (round((blocks*8/1024),2) - round((num_rows*avg_row_len/1024/1024),2)) "wasted_space (MB)"
  4  from dba_tables
  5  where (round((blocks*8),2) > round((num_rows*avg_row_len/1024),2))
  6  and table_name = 'EVENT_DETAILS'
  7  ;

TABLE_NAME       size (MB) actual_data (MB) wasted_space (MB)
--------------- ---------- ---------------- -----------------
EVENT_DETAILS       1206.9          1001.36            205.54


---//
---// table fragmentation in QA //---
---//
SQL> select table_name,round((blocks*8/1024),2) "size (MB)" ,
  2  round((num_rows*avg_row_len/1024/1024),2) "actual_data (MB)",
  3  (round((blocks*8/1024),2) - round((num_rows*avg_row_len/1024/1024),2)) "wasted_space (MB)"
  4  from dba_tables
  5  where (round((blocks*8),2) > round((num_rows*avg_row_len/1024),2))
  6  and table_name = 'EVENT_DETAILS'
  7  ;

TABLE_NAME       size (MB) actual_data (MB) wasted_space (MB)
--------------- ---------- ---------------- -----------------
EVENT_DETAILS      2654.38           404.36           2250.02

As we can observe, the table EVENT_DETAILS is heavily fragmented in the QA environment and around 2 GB of free space is wasted there. Now, if we look back to our execution plan, we could see the query is doing a full table scan (FTS) against the EVENT_DETAILS table. As we know, in a full table scan Oracle has to scan trough all the block until it reaches the High Water Mark even though the most of the blocks are empty.

If we look into the high water mark, we could see the table EVENT_DETAILS in QA has much higher high water mark when compared to Dev as found below.





---//
---// EVENT_DETAILS high water mark from Dev //---
---//
SQL> select t.table_name, s.blocks  "Block Allocated",
  2  t.blocks "Block Used", (s.blocks-t.empty_blocks-1) "High Water Mark"
  3  from user_tables t, user_segments s
  4  where t.table_name=s.segment_name
  5  and t.blocks <> (s.blocks-t.empty_blocks-1) and t.table_name ='EVENT_DETAILS'
  6  ;

TABLE_NAME      Block Allocated Block Used High Water Mark
--------------- --------------- ---------- ---------------
EVENT_DETAILS            155648     154483          155647


---//
---// EVENT_DETAILS high water mark from QA //---
---//
SQL> select t.table_name, s.blocks  "Block Allocated",
  2  t.blocks "Block Used", (s.blocks-t.empty_blocks-1) "High Water Mark"
  3  from user_tables t, user_segments s
  4  where t.table_name=s.segment_name
  5  and t.blocks <> (s.blocks-t.empty_blocks-1) and t.table_name ='EVENT_DETAILS'
  6  ;

TABLE_NAME      Block Allocated Block Used High Water Mark
--------------- --------------- ---------- ---------------
EVENT_DETAILS            344064     339761          344063

Since the table is heavily fragmented, we could do reorganization of the table to address the fragmentation. The simplest way to do the reorganization is to use the ALTER TABLE MOVE command as shown below. However, we may opt for other options (ONLINE REDFINITION, export/import) depending on the requirement and resource availability.

Note: If we choose to do reorganization using the MOVE command, then we need to rebuild the indexes associated with the table once the move operation is completed (as those indexes would be marked as UNSUABLE)




   
---//
---// reorganize fragmented table in QA //---
---//
SQL> alter table EVENT_DETAILS move tablespace MYAPP_TS;

Table altered.

---//
---// gather table stats in QA //---
---//
SQL> exec dbms_stats.gather_table_stats('MYAPP','EVENT_DETAILS',degree=>2);

PL/SQL procedure successfully completed.


---//
---// recheck table fragmentation in QA //---
---//
SQL> select table_name,round((blocks*8/1024),2) "size (MB)" ,
  2  round((num_rows*avg_row_len/1024/1024),2) "actual_data (MB)",
  3  (round((blocks*8/1024),2) - round((num_rows*avg_row_len/1024/1024),2)) "wasted_space (MB)"
  4  from dba_tables
  5  where (round((blocks*8),2) > round((num_rows*avg_row_len/1024),2))
  6  and table_name = 'EVENT_DETAILS'
  7  ;

TABLE_NAME       size (MB) actual_data (MB) wasted_space (MB)
--------------- ---------- ---------------- -----------------
EVENT_DETAILS       487.24           404.36             82.88

---//
---// check high water mark after table reorganization //---
---//
SQL> select t.table_name, s.blocks  "Block Allocated",
  2  t.blocks "Block Used", (s.blocks-t.empty_blocks-1) "High Water Mark"
  3  from user_tables t, user_segments s
  4  where t.table_name=s.segment_name
  5  and t.blocks <> (s.blocks-t.empty_blocks-1) and t.table_name ='EVENT_DETAILS'
  6  ;

TABLE_NAME      Block Allocated Block Used High Water Mark
--------------- --------------- ---------- ---------------
EVENT_DETAILS             62464      62367           62463

Once we reorganized the table, the high water mark as well as the wasted space has come down drastically. This reorganization will cause the High Water Marker adjustment for the table and in turn will result in less block scan while performing a full table scan.

Let’s execute the query against the QA environment and see if we can notice any improvements in the query performance.



   
---//
---// execute query against QA //---
---//
SQL> select /*+ gather_plan_statistics */ count(*) from EVENT_DETAILS where status='ACTIVE';

  COUNT(*)
----------
   1999533

Elapsed: 00:00:00.19

Wow!! our query is now executing in less than 1 sec when compared to earlier execution which was taking 7.64 secs. We can see the query is still using the same execution plan. However, it is doing much less number of I/O when compared to earlier executions as found below.





---//
---// execution plan from QA //---
---//
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------
SQL_ID  1s7nh198rxpq4, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from EVENT_DETAILS where
status='ACTIVE'

Plan hash value: 3586353056

-------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |               |      1 |        |      1 |00:00:00.19 |   62025 |  62020 |
|   1 |  SORT AGGREGATE    |               |      1 |      1 |      1 |00:00:00.19 |   62025 |  62020 |
|*  2 |   TABLE ACCESS FULL| EVENT_DETAILS |      1 |   1999K|   1999K|00:00:00.15 |   62025 |  62020 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("STATUS"='ACTIVE')


20 rows selected.


---//
---// execution statistics from QA //---
---//
SQL> set autotrace traceonly stat
SQL> select count(*) from EVENT_DETAILS where status='ACTIVE';


Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
      62028  consistent gets
      62020  physical reads
          0  redo size
        545  bytes sent via SQL*Net to client
        551  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Our query is now doing only 62K block reads when compared to 338K block reads in the earlier case. We have improved the query performance by addressing the table fragmentation as the query was doing a full table scan (FTS) against the table. The other suitable option would be to create an appropriate index on the predicate columns to avoid a full table scan. Having an index will ensure that the query performance is stable even if the table is highly fragmented as Oracle doesn’t need to scan through blocks until the High Water Mark as shown below.





---//
---// creating index on query predicate //---
---//
SQL> create index event_details_stat_idx on event_details (status);

Index created.

---//
---// executing the query //---
---//
SQL> select /*+ gather_plan_statistics */ count(*) from EVENT_DETAILS where status='ACTIVE';

  COUNT(*)
----------
   1999533

Elapsed: 00:00:00.25

---//
---// check the execution plan //---
---//
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------
SQL_ID  1s7nh198rxpq4, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from EVENT_DETAILS where
status='ACTIVE'

Plan hash value: 931614286

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name                   | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                        |      1 |        |      1 |00:00:00.25 |   10647 |  10625 |
|   1 |  SORT AGGREGATE       |                        |      1 |      1 |      1 |00:00:00.25 |   10647 |  10625 |
|*  2 |   INDEX FAST FULL SCAN| EVENT_DETAILS_STAT_IDX |      1 |   1999K|   1999K|00:00:00.53 |   10647 |  10625 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("STATUS"='ACTIVE')


20 rows selected.

As we can observe, if we opt for index, the query execution improves even further as Oracle can directly access the table blocks through the index resulting into less I/O activity for the query.





No comments:

Post a Comment