Observed

jdavidb on 2004-07-08T16:09:06

Running massive deletes (on the order of tens of millions of records, probably a hundred million records when it's all done) on an Oracle table. I'm pulling the relevent key information in a select, looping for each record returned and sending the key information back in a delete, committing every 500 or 1000 rows. My DBA suggested I instead selected the ROWID as that is faster. The DBD::Oracle summary says a ROWID "can be treated as a string and used to rapidly (re)select rows."

However, switching to the use of ROWIDs appears to slow my program down about 50-100%. I haven't rigorously measured this, and there could of course be many other factors. But 500 rows in the original program take about .1333 seconds to delete on average, while 500 rows in the ROWID version of the program often take .3333 seconds.

Re: Observed

jplindstrom on 2004-07-08T18:03:57

Sounds weird. Is it the same if you do it from PL/SQL? Is the query plan what you would expect with/without the rowid in the query?

If performance is very important, you could perhaps consider these things:

Why first select and then delete? Is there a client-side decision whether a row should stay or go? It's probably more efficient to do it in the delete where clause.

Delete in parallel from many processes. Oracle can probably handle more load than serialized deletes. At least it can do somehting useful during idle network latency time.

Minimize the number of calls by doing a "delete from x where y in (pk1, pk2, pk3, pk4, ...). If you're lucky, maybe you can delete where x between y and z.

If most rows should go, it may be faster to copy the remaining rows to another table and rename it into place, then truncate the original table.

Re: Observed

jdavidb on 2004-07-08T19:57:00

If most rows should go, it may be faster to copy the remaining rows to another table and rename it into place, then truncate the original table.

Bingo. I think I should'a done that.

Delete in parallel from many processes.

I am. I've currently got one running for every day between June 3rd and June 16th, inclusive. Seems to be going much faster now than it was before (although I've been on June 3rd forever... fortunately, I suspect there are an abnormally enormous number of records on that day compared to the later days).

Why first select and then delete? ... It's probably more efficient to do it in the delete where clause.

Well, a single delete was blowing up due to exceeding the size of the rollback segment. I'm sure now that Oracle probably has some feature where I can say "commit every 500 rows" during a delete, and that's probably what I should be using. I originally tried this with a single delete statement and came back the next day to discover it had blown up.

Is it the same if you do it from PL/SQL?

Well, I should have mentioned my suspicion: that ROWID being an internal binary type is being serialized to text for the DBI and then reconverted in the delete statement. I think PL/SQL would solve that, and perhaps I should look that route. I figured I'd put out the early indications in people's minds that ROWID use in DBI may not be very efficient, though.

Thanks for the feedback. I think I'm going to reevaluate my approach.

Re: Observed

jplindstrom on 2004-07-08T22:04:10
a single delete was blowing up due to exceeding the size of the rollback segment

I remember Thomas Kyte (of http://asktom.oracle.com/ fame) recommending someone to... increase the rollback segment in a case like this :)

Here's a discussion about deleting gazillion rows where they also mention that it may be clever to drop the index before deleting.

Re: Observed

jdavidb on 2004-07-09T14:20:51

increase the rollback segment in a case like this

Well, yeah, there is the obvious solution, but there's also the issue of being one step further removed from those kinds of actions than I would like. Plus I worried that the rollback segment might need to be increased by more than one order of magnitude and that it wouldn't be practical. :)

Thanks for your help.