OraStory

Oracle Text – 5 million rows, 100 hours and counting

February 6, 2007 · 6 Comments

Last week, I was using DBMS_REFINITION to migrate a LONG column on a table to a CLOB so that I could use that column in a MULTI_COLUMN_DATASTORE for an Oracle Text search.

 Unfortunately, on a development system with nothing else happening, it took hours and hours. I left it running with the datafiles set to AUTOEXTEND ON and I think it took the best part of a day.

I knew that I would be off site on Friday and the following Monday so I set off an Oracle Text index creation, thinking that it might take a day or so, at worst.

However, as I write, it’s still going. So, started on the 1st, still going on the 6th. Five million rows. Seems a bit excessive to me.

And as it gets towards the end, although we’re not there yet, the indexing has slowed right down. Initially, each parallel process was indexing 100 documents every 6-8 seconds. Now each 100 documents takes nearly 1 minute so, practically, we’re never going to get to the finish.
CREATE INDEX i_myindex_ctx01
ON mytable(ctx_search_multistore_column)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS ('LEXER mylexer
STORAGE mystorage
DATASTORE mydatastore
SECTION GROUP mysection
FILTER CTXSYS.NULL_FILTER
STOPLIST CTXSYS.EMPTY_STOPLIST
TRANSACTIONAL
SYNC (EVERY "SYSDATE+1/24")
MEMORY 250M')
PARALLEL 4;

There must be some more optimisation that I can do – will need to research, but definitive advice is a bit thin on the ground.

For development purposes, I could make the table NOLOGGING but that probably wouldn’t reflect what I could do in production eventually.

Categories: oracle · oracle text

6 responses so far ↓

  • Michael Smith // February 6, 2007 at 12:11 pm | Reply

    Check to make sure that it is still indexing. I’ve run into occasional rows that I’ve had to delete and in.ert again. The logging option is great, just tail the log file to monitor the process. I went through 11.7 million rows in less than a day on a 2 processor xeon system. 5 mil should not take that long.

  • dombrooks // February 6, 2007 at 12:14 pm | Reply

    Thanks Michael. I am tailing the logs and can see that indexing performance has degraded significantly – from 100 docs per process every 6 to 10 seconds now down to nearly 1 minute.

  • Mihajlo Tekic // February 6, 2007 at 3:36 pm | Reply

    Check the execution path. Make sure that it is the right one.
    After migration from 8i to 10g I got different execution paths for the queries using CONTEXT operators, which results with significant performance downgrade.
    http://oracle-tech.blogspot.com/2006/10/domain-index-not-used-after-migrating.html

  • dombrooks // February 6, 2007 at 8:02 pm | Reply

    Thanks Teko. This issue relates to just the initial index creation on a context index, so no querying involved. The downgrade relates to what is visible from tailing the index creation logs.

  • Gary // February 6, 2007 at 9:44 pm | Reply

    Looking at the stats from the session(s) may be worth a try. Its one of those things that afterwards you think
    “Hmmm, if I knew what it was doing in a 5 minute timeframe when it was going fast, I’d compare it to what it is doing now.”
    I remember having a process start fast and slow right down that was due to undo being applied (and the longer the query ran, the more undo it had to apply), though I can’t see that being an issue with index creation unless there’s updates happening to that data at the same time.

  • Mihajlo Tekic // February 6, 2007 at 10:35 pm | Reply

    OOops… my fault.
    I guess I didn’t read it completely. :-)
    Sorry …
    I have had such a problems, at least not until today. :-)
    You may want to take look at your storage settings it may be an I/O related issue.
    Also I would check the CPU utilization.

Leave a Comment