Oracle Text – 5 million rows, 100 hours and counting

Last week, I was using DBMS_REFINITION to migrate a LONG column on a table to a CLOB so that I could use that column in a MULTI_COLUMN_DATASTORE for an Oracle Text search.

 Unfortunately, on a development system with nothing else happening, it took hours and hours. I left it running with the datafiles set to AUTOEXTEND ON and I think it took the best part of a day.

I knew that I would be off site on Friday and the following Monday so I set off an Oracle Text index creation, thinking that it might take a day or so, at worst.

However, as I write, it’s still going. So, started on the 1st, still going on the 6th. Five million rows. Seems a bit excessive to me.

And as it gets towards the end, although we’re not there yet, the indexing has slowed right down. Initially, each parallel process was indexing 100 documents every 6-8 seconds. Now each 100 documents takes nearly 1 minute so, practically, we’re never going to get to the finish.
CREATE INDEX i_myindex_ctx01
ON mytable(ctx_search_multistore_column)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS ('LEXER mylexer
STORAGE mystorage
DATASTORE mydatastore
SECTION GROUP mysection
FILTER CTXSYS.NULL_FILTER
STOPLIST CTXSYS.EMPTY_STOPLIST
TRANSACTIONAL
SYNC (EVERY "SYSDATE+1/24")
MEMORY 250M')
PARALLEL 4;

There must be some more optimisation that I can do – will need to research, but definitive advice is a bit thin on the ground.

For development purposes, I could make the table NOLOGGING but that probably wouldn’t reflect what I could do in production eventually.

About these ads

6 Responses to Oracle Text – 5 million rows, 100 hours and counting

  1. Michael Smith says:

    Check to make sure that it is still indexing. I’ve run into occasional rows that I’ve had to delete and in.ert again. The logging option is great, just tail the log file to monitor the process. I went through 11.7 million rows in less than a day on a 2 processor xeon system. 5 mil should not take that long.

  2. dombrooks says:

    Thanks Michael. I am tailing the logs and can see that indexing performance has degraded significantly – from 100 docs per process every 6 to 10 seconds now down to nearly 1 minute.

  3. Check the execution path. Make sure that it is the right one.
    After migration from 8i to 10g I got different execution paths for the queries using CONTEXT operators, which results with significant performance downgrade.
    http://oracle-tech.blogspot.com/2006/10/domain-index-not-used-after-migrating.html

  4. dombrooks says:

    Thanks Teko. This issue relates to just the initial index creation on a context index, so no querying involved. The downgrade relates to what is visible from tailing the index creation logs.

  5. Gary says:

    Looking at the stats from the session(s) may be worth a try. Its one of those things that afterwards you think
    “Hmmm, if I knew what it was doing in a 5 minute timeframe when it was going fast, I’d compare it to what it is doing now.”
    I remember having a process start fast and slow right down that was due to undo being applied (and the longer the query ran, the more undo it had to apply), though I can’t see that being an issue with index creation unless there’s updates happening to that data at the same time.

  6. OOops… my fault.
    I guess I didn’t read it completely. :-)
    Sorry …
    I have had such a problems, at least not until today. :-)
    You may want to take look at your storage settings it may be an I/O related issue.
    Also I would check the CPU utilization.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 68 other followers

%d bloggers like this: