Roads ? - where we're going, we don't need roads ...
FBI internal investigations...hidden virtual columns in 8i. | ORA600
ORA600 content Content RSS
Oracle ORA600 News RRSS Oracle News RSS
ORA600 blog Blog RSS
ORA600 blog Blog Atom

User login

FBI internal investigations...hidden virtual columns in 8i.

I was called in last week for a performance issue at a customer site.
Let's set scene a bit …

They had moved their main production OLTP database from AIX to Windows and had been struggling with performance for a couple of weeks. Hardware-wise, they went from expensive but old Bull hardware to a Dell box (with 8 cores). And they went from 4Gb to 16Gb RAM. So you would expect the database would fly on this new hardware.
Now - this database was/is pulling around a 1000 concurrent users…. The first thing they ran into was the process' addressable memory space on 32bit windows, being 2Gb. This of course led to a lot of ORA-04030 and angry users.  Actually, Anjo recently made a blogentry about it.

So actions were made to optimize Oracle for windows, which basically consists of putting the buffercache above the 2Gb limit - anyway metalink note 231159.1 explains all of this, but still, as this is an Oracle 8.1.7.4 database handling a thousand users, all other memory (remember sort_area_size, hash_area_size etc) related parameters were set to a minimum to avoid ORA-4030's.
The result of this memory balancing operation was, that performance was below average. And I was send in to have a look at it. I must admit - I hadn't tuned a whole environment for a while, but I had done a lot of 8.0 and 7.3 migrations to 8i back in the day, and knew pretty much what the problems of the 8i CBO were. Oh - why are the still running 8i … well the database application is to be decommissioned this year and so they didn't bother to move up to the latest Oracle version. The fact that they wanted to invest in getting performance up, proves power users were about to lynch the IT manager Tongue out
Anyway - 5minutes into the meeting with IT, a second consultant from a rivaling company walks in. That's how bad performance was. I don't mind competition - in fact I didn't see it as competition. The guy was more than reasonable and during the day we acted as each-others soundboard.

The first thing I noticed were the bad response times on the disks and the large amount of physical IO the database was doing. The database running on a Dell SAN…. I didn't even know Dell produced SANs !!! :-p But I knew one thing - this SAN was sh*te. It used 8 disks in RAID5 (coming from 4 disks) and I reckon the write cache must have been quite small. In fact at first I thought the SAN cache must have been disabled - something I had encountered multiple times when the battery of the cache dies, the cache on the SAN is disabled to avoid corruption in case of power-failure.
Clearly, this DB was doing too much IO for the SAN… way past it's critical point in the hockey stick chart see also Anjo's paper on this  - oh and we couldn't increase any memory settings within the DB of course.


So the only option left, was hunting down individual sql statements and tune them, so the overall IO would decrease.  The impact of sql tuning can be *huge* - especially with a 1000 oltp users. Just imaging a sql that does 5000 physical IO's  that is run by these 1000 users every minute (5x10^6 IO/min) - if we can tune it to use only 20 physical IO's (20x10^3 IO/min)  than that would have a dramatic effect on overall performance.


I noticed that there were a lot of full tablescans … that's not necessarily bad thing, but sometimes it just is ! From the old days, I knew that Oracle 8i's CBO (with default settings) is quite fond of full tablescans and parallel query. I had a lot of migrations from 8.0 to 8i were performance went bad just because of that. Now we're not interested in PQ in an OLTP, so we disabled it.
Then there's Oracle 8i favor for full tablescans. The CBO didn't have any CPU costing back then, so one of the things to do was to set optimizer_index_caching and optimizer_index_cost_adj. There have been a lot of discussing of these parameters in the past, but my experience (in 8i and on OLTP db) is that about 95% of all queries benefit and 5% of the queries will perform worse. The main problem with lowering optimizer_index_cost_adj are rounding errors in the CBO's calculation costs. This can lead to the CBO picking an index based on it's alphabetical order.
Simply explained - let's assume the cost of using index A is 9 and the cost of using index B is 5. If optimizer_index_cost_adj is set to  10, then the cost is divided by 10. So the cost of index A will be 0.9 and index B will be 0.5. Or when rounding kicks in, the cost of both A & B will be 1. And the optimizer will choose the index based on the alphabetical order of it's name. (disclaimer - simple explanation - read Jonathan Lewis' book on Oracle CBO )

<!--break-->

As expected - disabling PQ and setting the optimizer_index_x parameters - dramatically improved performance. So now I could focus on individual statements still going slow. And then I hit what I call bad database design.
Some tables were using the CHAR datatype … nothing wrong with that if you like wasting space. But the CHAR datatype was also being used for storing NUMBERs !!! The CHAR columns were indexed but that doesn't help much using where clauses like 'where <char column>=number'.

Let's setup a small testcase :

SQL> create table duh ( n number, c char(30) );
Table created.
SQL> create index i_duh_n on duh(n) ;
Index created.
SQL> create index i_duh_c on duh(c) ;
Index created.
SQL> insert into duh (n,c) select obj#,obj# from sys.obj$ ;
12201 rows created.
SQL> commit ;
Commit complete.

Querying the number column N will result in an index range scan :

SQL> select * from duh where n=2 ;

        N C
--------- ------------------------------
2 2

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=2 Bytes=76)
1    0   TABLE ACCESS (BY INDEX ROWID) OF 'DUH' (Cost=2 Card=2 Bytes=76)
2    1     INDEX (RANGE SCAN) OF 'I_DUH_N' (NON-UNIQUE) (Cost=1 Car d=2)

No problem here - but when we query the CHAR column C as shown below, the optimizer will perform an implicit cast, and thus no index can be used.

SQL> select * from duh where c=2 ;

        N C
--------- ------------------------------
2 2

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE (Cost=44 Card=123 Bytes=4674)
1    0   TABLE ACCESS (FULL) OF 'DUH' (Cost=44 Card=123 Bytes=4674)

All problematic queries were like the ones above. Sometimes you saw a little variation like this :

SQL> select * from duh where trim(c)='2' ;
N C
--------- ------------------------------
2 2

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE (Cost=44 Card=123 Bytes=46 74)
1    0   TABLE ACCESS (FULL) OF 'DUH' (Cost=44 Card=123 Bytes=4674)

This means the developers knew something was fishy about using CHAR as a NUMBER ;-)
Anyway - there's a quick fix for this - and it's called function based indexes (FBI). They were introduced in Oracle 8i, so, two thumbs up, I could use them. Well, I could use them after setting QUERY_REWRITE_ENABLED=TRUE and QUERY_REWRITE_INTEGRITY=TRUSTED
But I had already done this as I had seen a lot of 'WHERE UPPER(x)=UPPER(y)' type of where clauses - a developer's favorite performance killer !
So once I identified all the queries that were full table scanning these 'CHAR as NUMBER' tables, I could do the following with it :

SQL> create index i_duh_c_fbi on duh(to_number(c)) ;
Index created.
SQL> select * from duh where c=2 ;

        N C
--------- ------------------------------
2 2


Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=2 Bytes=76)
1    0   TABLE ACCESS (BY INDEX ROWID) OF 'DUH' (Cost=2 Card=2 Bytes=76)
2    1     INDEX (RANGE SCAN) OF 'I_DUH_C_FBI' (NON-UNIQUE) (Cost=1      Card=2)

BONUS! If you know that those tables were around 60 to 80Mb each and those queries were executed quite frequently, it doesn't take a genius to see what that does for IO and performance in general. The SAN could now cope with the number of IO's thrown to it. (see the hockey stick model)

There was, however, one other thing. These tables were also queried from remote databases, mainly one XE database running the web part of the application (APEX).
After the FBI's were implemented, some parts of the web app broke down. Looking at the user dumps, we found out that the remote server process crashed each time a table (or synonym) with an fbi on it was queried. Basically we had an ORA-07445 on the XE database session each time a remote query was launched on the fbi tables :

ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [_ddfnetCFull+1379] [PC:0xFD1F13] [ADDR:0x8A10068] [UNABLE_TO_READ] []

A brief search on Metalink clarified that this has something todo with 'hidden columns'.
Now ... I could only reproduce the ORA-07445 when the execution plan on the remote DB used the function based index.

So what happens if you create a function based index ? IS this really a kind of index or did the oracle kernel developers added some special trickery to the table ? Or does a function based index add a column to the table behind your back, materializing the result of the function, and indexes that ?

Let's look at the columns of table DUH without function based indexes :
SQL> select name,col#,intcol#,segcol#,property from sys.col$ where obj# = ( select obj# from sys.obj$ where name = upper('DUH') and type# = 2);

NAME                                COL#   INTCOL#   SEGCOL#  PROPERTY
------------------------------ --------- --------- --------- ---------
N                                      1         1         1         0
C                                      2         2         2         0

And let's dump a datablock of DUH (without fbi) to trace :

buffer tsn: 1 rdba: 0x00800821 (2/2081)
scn: 0x0000.00030bad seq: 0x01 flg: 0x02 tail: 0x0bad0601
frmt: 0x02 chkval: 0x0000 type: 0x06=trans data

Block header dump:  0x00800821
Object id on Block? Y
seg/obj: 0x2fce  csc: 0x00.30b73  itc: 1  flg: -  typ: 1 - DATA
fsl: 0  fnx: 0x0 ver: 0x01

Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   xid:  0x0009.004.0000001f    uba: 0x00c025af.0010.0a  --U-   45  fsc 0x0000.00030bad

data_block_dump
===============
tsiz: 0x7b8
hsiz: 0x6c
pbl: 0x0372a044
bdba: 0x00800821
flag=-----------
ntab=1
nrow=45
frre=-1
fsbo=0x6c
fseo=0x137
avsp=0xcb
tosp=0xcb
0xe:pti[0] nrow=45 offs=0
0x12:pri[0] offs=0x137
0x14:pri[1] offs=0x15c
0x16:pri[2] offs=0x181
...
0x6a:pri[44] offs=0x793
block_row_dump:
tab 0, row 0, @0x137
tl: 37 fb: --H-FL-- lb: 0x1 cc: 2
col  0: [ 2]  c1 02
col  1: [30]
31 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20
tab 0, row 1, @0x15c
tl: 37 fb: --H-FL-- lb: 0x1 cc: 2
col  0: [ 2]  c1 30
col  1: [30]
34 37 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20
tab 0, row 2, @0x181
tl: 37 fb: --H-FL-- lb: 0x1 cc: 2
col  0: [ 2]  c1 14
col  1: [30]
31 39 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20
...
tab 0, row 44, @0x793
tl: 37 fb: --H-FL-- lb: 0x1 cc: 2
col  0: [ 2]  c1 35
col  1: [30]
35 32 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20
end_of_block_dump

So - nothing exciting here !

Now lets do the same after we've created a function based index on 'to_number(c)' :
SQL> select name,col#,intcol#,segcol#,property from sys.col$ where obj# = ( select obj# from sys.obj$ where name = upper('DUH') and type# = 2);

NAME                                COL#   INTCOL#   SEGCOL#  PROPERTY
------------------------------ --------- --------- --------- ---------
N                                      1         1         1         0
C                                      2         2         2         0
SYS_NC00003$                           0         3         0    327976

Hmm - a system generated column !
Property 327976 or 0x50128=0x40000+0x10000+0x00100+0x00020+0x00008
means
0x40000 = virtual column is NLS dependent
0x10000 = virtual column
0x00100 = system-generated column
0x00020 = hidden column
0x00008 = virtual column

So we have a hidden, system generated, virtual column.
COL# is zero - this is consistent with the column being hidden.
And SEGCOL# is also zero - which tells us that the column is not materialized within the table's blocks and that's consistent with the column being virtual.

You can the property values from sql.bsq in $OH/rdbms/admin :
property      number not null,           /* column properties (bit flags): */
/* 0x0001 =       1 = ADT attribute column                   */
/* 0x0002 =       2 = OID column                             */
/* 0x0004 =       4 = nested table column                    */
/* 0x0008 =       8 = virtual column                         */
/* 0x0010 =      16 = nested table's SETID$ column           */
/* 0x0020 =      32 = hidden column                          */
/* 0x0040 =      64 = primary-key based OID column           */
/* 0x0080 =     128 = column is stored in a lob              */
/* 0x0100 =     256 = system-generated column                */
/* 0x0200 =     512 = rowinfo column of typed table/view     */
/* 0x0400 =    1024 = nested table columns setid             */
/* 0x0800 =    2048 = column not insertable                  */
/* 0x1000 =    4096 = column not updatable                   */
/* 0x2000 =    8192 = column not deletable                   */
/* 0x4000 =   16384 = dropped column                         */
/* 0x8000 =   32768 = unused column - data still in row      */
/* 0x00010000 =   65536 = virtual column                         */
/* 0x00020000 =  131072 = place DESCEND operator on top          */
/* 0x00040000 =  262144 = virtual column is NLS dependent        */
/* 0x00080000 =  524288 = ref column (present as oid col)        */
/* 0x00100000 = 1048576 = hidden snapshot base table column      */
/* 0x00200000 = 2097152 = attribute column of a user-defined ref */
/* 0x00400000 = 4194304 = export hidden column,RLS on hidden col */
/* 0x00800000 = 8388608 = string column measured in characters   */
/* 0x01000000 = 16777216 = virtual column expression specified    */
/* 0x02000000 = 33554432 = typeid column                          */
/* 0x04000000 = 67108864 = Column is encrypted                    */
/* 0x20000000 = 536870912 = Column is encrypted without salt       */
         
Let's double check the column hasn't materialized by dumping a block.

buffer tsn: 1 rdba: 0x00800821 (2/2081)
scn: 0x0000.00030bad seq: 0x01 flg: 0x02 tail: 0x0bad0601
frmt: 0x02 chkval: 0x0000 type: 0x06=trans data

Block header dump:  0x00800821
Object id on Block? Y
seg/obj: 0x2fce  csc: 0x00.30b73  itc: 1  flg: -  typ: 1 - DATA
fsl: 0  fnx: 0x0 ver: 0x01

Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   xid:  0x0009.004.0000001f    uba: 0x00c025af.0010.0a  --U-   45  fsc 0x0000.00030bad

data_block_dump
===============
tsiz: 0x7b8
hsiz: 0x6c
pbl: 0x0372a044
bdba: 0x00800821
flag=-----------
ntab=1
nrow=45
frre=-1
fsbo=0x6c
fseo=0x137
avsp=0xcb
tosp=0xcb
0xe:pti[0] nrow=45 offs=0
0x12:pri[0] offs=0x137
0x14:pri[1] offs=0x15c
0x16:pri[2] offs=0x181
...
0x6a:pri[44] offs=0x793
block_row_dump:
tab 0, row 0, @0x137
tl: 37 fb: --H-FL-- lb: 0x1 cc: 2
col  0: [ 2]  c1 02
col  1: [30]
31 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20
tab 0, row 1, @0x15c
tl: 37 fb: --H-FL-- lb: 0x1 cc: 2
col  0: [ 2]  c1 30
col  1: [30]
34 37 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20
tab 0, row 2, @0x181
tl: 37 fb: --H-FL-- lb: 0x1 cc: 2
col  0: [ 2]  c1 14
col  1: [30]
31 39 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20
...
tab 0, row 44, @0x793
tl: 37 fb: --H-FL-- lb: 0x1 cc: 2
col  0: [ 2]  c1 35
col  1: [30]
35 32 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20
end_of_block_dump

Nope - the table still has 2 physical columns. Just what we expected.

Let's dump a leaf block from the function based index... here we see the column is materialized.

buffer tsn: 1 rdba: 0x00800944 (2/2372)
scn: 0x0000.00030bc3 seq: 0x01 flg: 0x00 tail: 0x0bc30601
frmt: 0x02 chkval: 0x0000 type: 0x06=trans data

Block header dump:  0x00800944
Object id on Block? Y
seg/obj: 0x2fcf  csc: 0x00.30bbe  itc: 2  flg: -  typ: 2 - INDEX
fsl: 0  fnx: 0x0 ver: 0x01

Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   xid:  0x0000.000.00000000    uba: 0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x02   xid:  0x000b.00c.00000015    uba: 0x00000000.0000.00  ----    0  fsc 0x0000.00000000

Leaf block dump
===============
header address 57843804=0x372a05c
kdxcolev 0
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 117
kdxcofbo 270=0x10e
kdxcofeo 472=0x1d8
kdxcoavs 202
kdxlespl 0
kdxlende 0
kdxlenxt 8390981=0x800945
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 1892
row#0[1880] flag: -----, lock: 0
col 0; len 2; (2):  c1 02
col 1; len 6; (6):  00 80 08 21 00 00
row#1[1868] flag: -----, lock: 0
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  00 80 08 21 00 19
...
row#116[472] flag: -----, lock: 0
col 0; len 3; (3):  c2 02 11
col 1; len 6; (6):  00 80 08 23 00 1a
----- end of leaf block dump -----

So a function based index is just a normal index on a virtual hidden column Wink

Now all this sounds very familiar with ... Oracle 11g virtual columns !

Let's create a table in 11g with a virtual column ... just for fun.
create table duh11g ( n number, c char(30), v as (to_number(c)) ) ;

select name,col#,intcol#,segcol#,property from sys.col$ where obj# = ( select obj# from sys.obj$ where name = upper('DUH11G') and type# = 2)
SQL> /

NAME                                 COL#    INTCOL#    SEGCOL#   PROPERTY
------------------------------ ---------- ---------- ---------- ----------
N                                       1          1          1          0
C                                       2          2          2          0
V                                       3          3          0     327688

Property 327688 = 0x50008=0x40000+0x10000+0x00008
0x40000 = virtual column is NLS dependent
0x10000 = virtual column
0x00008 = virtual column

COL# is non-zero - so it is visable. Clearly it's not system generated, but it isn't materialised because SEGCOL# is zero.
Now for fun - let's create an index on the virtual column V.

SQL> create index i_duh11g_v on duh11g( v ) ;

Index created.

And now dump a leafblock of the index :

Block header dump:  0x004159db
Object id on Block? Y
seg/obj: 0x11587  csc: 0x00.ba00b  itc: 2  flg: -  typ: 2 - INDEX
fsl: 0  fnx: 0x0 ver: 0x01

 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0006.001.000002ed  0x00c0028d.01f3.01  -B--    1  fsc 0x0000.00000000
0x02   0x0009.002.000002b7  0x00c0168b.0216.04  ----  540  fsc 0x0000.00000000
Leaf block dump
===============
header address 682344540=0x28abc05c
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 1
kdxcoopc 0x87: opcode=7: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 1
kdxconro 540
kdxcofbo 1116=0x45c
kdxcofeo 1116=0x45c
kdxcoavs 0
kdxlespl 0
kdxlende 0
kdxlenxt 4282844=0x4159dc
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8032
row#0[1116] flag: ----S-, lock: 2, len=12
col 0; len 2; (2):  c1 02
col 1; len 6; (6):  00 41 59 c2 00 00
row#1[1128] flag: ----S-, lock: 2, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  00 41 59 c2 00 01
...
row#539[8019] flag: ----S-, lock: 2, len=13
col 0; len 3; (3):  c2 06 29
col 1; len 6; (6):  00 41 59 c4 00 ae
----- end of leaf block dump -----


What we got is that a normal index on a virtual column in 11g is the same as a function based index in 8i.
And if you use an fbi in 8i you have actually created a hidden virtual column. It doesn't add a physical column to your table, which of course would be very io intensive and could have let to row migration (and possible chaining).
So to get back to the bug - the work around is to created a view on the table in the remote database and then query the view as opposed to querying the actual table or synonym.
This bug is also fixed in 9.2.0.6 and 10g.... but then again ... we had to work with 8.1.7.4 Wink

 

Compliments

Kurt... A very good compilation..

The link to Anjo's blog isn't working. Do you have the correct one?

Thanks Raj

Anjo's link

Hi Raj,

 

Thanks for the update - it seems Anjo has migrated his blog from his own domain (miraclebenelux.nl) to wordpress - here's the new link :

 

http://akolk.wordpress.com/2009/02/19/oracle-on-32-bit-windows-4030-errorsthe/

Fun...

LOL

Cool




Howto's
See DUDE primer for info

Get Support

Europe

Belgium :
Kurt Van Meerbeeck
ORA600 bvba
E-mail
dude@ora600.be
Cell : +32 495 580714

Denmark :
Michael Môller
Service & Support Manager
Miracle AS
E-mail :
hra@miracleas.dk
Cell: +45 53 74 71 27


North America

USA :
Tim Gorman
Evdbt Inc
E-mail
tim@evdbt.com
Cell : +1 303 885 4526

Canada :
Pythian
E-mail
dude@pythian.com
Contact


Latin America

Brazil :
HBtec
E-mail
dude@hbtec.com.br
Cell : +55 47 88497639
Contact


Africa

South Africa :
Kugendran Naidoo
NRG Consulting
E-mail
k@nrgc.co.za
Cell : +27 82 7799275


East Asia Pacific

Australia
Andre Araujo
Pythian Australia
E-mail
dude@pythian.com
Cell : +61 2 9191 7427 ext. 1270