What happens when inserting a row during a long query

I am writing data-load code that retrieves data from a large, slow table in an Oracle database. I have read-only access to the data and no way of changing indexes or affecting query speed in any way.

My select statement takes 5 minutes to execute and returns about 300,000 rows. The system is constantly inserting large batches of new records, and I need to make sure I get all the latest ones, so I keep track of the timestamp of the last time I loaded the data.
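For context, this is roughly the incremental-load pattern I have in mind; the table name, column name, and bind variable are placeholders, not the real schema:

    -- :last_load_ts is the timestamp saved from the previous load run.
    SELECT *
      FROM big_slow_table
     WHERE created_ts > :last_load_ts
     ORDER BY created_ts;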

My question is: if my select statement runs for 5 minutes and new rows are inserted while the select is running, will those new rows appear in the query's results or not?

My gut tells me the answer is no, especially since most of those 5 minutes are spent transferring data from the database to the local environment, but I can't find documentation that directly addresses this behaviour.



1 answer


"If my select statement runs for 5 minutes and new rows are inserted while the selection is running, am I getting new rows or not as a result of the query?"

No. Oracle enforces strict isolation levels and does not permit dirty reads.

The default isolation level is Read Committed. This means the result set you get after five minutes will be identical to the one you would have got if Oracle could have handed you all the records in 0.0000001 seconds. Anything committed after the query started will not be included in the results; that includes updates to existing records as well as inserts.
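As an illustration only (the table and columns here are hypothetical), a two-session timeline shows the behaviour: a row committed after session A's query has started does not appear in session A's results.

    -- Session A: the long-running query starts at time T0
    SELECT COUNT(*) FROM big_slow_table;        -- still running...

    -- Session B: one minute later
    INSERT INTO big_slow_table (id, created_ts)
    VALUES (42, SYSTIMESTAMP);
    COMMIT;

    -- Session A's query finishes at T0 + 5 minutes: the count reflects
    -- the data as of T0, so row 42 is NOT included.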

Oracle does this by tracking changes to the table in the UNDO tablespace. If it can reconstruct the original image of the data from that information, your query runs to completion; if for any reason the undo information has been overwritten, your query fails with the dreaded ORA-01555: snapshot too old. That is right and proper: Oracle would rather throw an exception than hand us an inconsistent result set.



Note that this consistency applies at the statement level. If we run the same query twice within the same transaction, we can see two different result sets. If that is a problem (I think not in your case), then we need to switch from Read Committed to Serializable isolation.
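A minimal sketch of switching to Serializable isolation for a single transaction (the table name is again a placeholder); under Serializable, every statement in the transaction sees the database as it was when the transaction began:

    -- Must be the first statement of the transaction.
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

    SELECT COUNT(*) FROM big_slow_table;   -- query 1
    SELECT COUNT(*) FROM big_slow_table;   -- query 2: sees the same data as query 1

    COMMIT;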

The Oracle Concepts Guide covers concurrency and consistency in great depth; it is well worth reading.

So, to answer your question, take the timestamp from when you started the select. Specifically, take max(created_ts) from the table before running the query. That should protect you from the gap Alex mentions (if records are not committed at the moment they are inserted, there is a chance of missing records if you base the selection on the system timestamp). Although this does mean you are issuing two queries in the same transaction, which means you really do need Serializable isolation!
