Insert and update on failure, OR load first and then decide whether to insert or update

I have a web service in Java that receives a list of records that need to be inserted or updated in a database. I don't know in advance which records need an insert and which need an update.

Which approach gives the best performance:

  • Go through the list (a list of objects that carry the table's primary key) and try to insert each record into the database. If the insert fails, run an update.

  • Try to load the record from the database. If a result comes back, update it; if not, insert a new record.

  • Another variant? Tell me about it :)

In the first calls, I believe most of the entries will be new DB rows, but there will be a saturation point after which most of them will be updates.

I'm talking about a DB table that can reach over 100 million records in mature form.

What is your approach? Performance is my most important goal.

0




5 answers


If your database supports MERGE, I would think this is most efficient (and treats all data as one set).

See:



http://www.oracle.com/technology/products/oracle9i/daily/Aug24.html

https://web.archive.org/web/1/http://blogs.techrepublic%2ecom%2ecom/datacenter/?p=194

+4




If performance is your goal, first get rid of the word "iterate" from your vocabulary! Learn to do things in sets.



If you need to update or insert, always do the update first. Otherwise you can easily end up updating, by accident, the record you have just inserted. When you work this way, it helps to have an id you can look up to see whether the record exists. If the id exists then update, otherwise insert.

+1




It is important to understand the balance, or ratio, between the number of inserts and the number of updates in the list you receive. IMHO you should implement an abstract strategy that says "persist this to the database". Then create concrete strategies that, for example:

  • check for the primary key; if no record is found, insert, otherwise update
  • attempt an update and, if it fails, do an insert
  • others

Then pull the strategy to use (for example, as a fully qualified class name) from a config file. This way, you can easily switch from one strategy to another. If it is feasible, which may depend on your domain, you can add a heuristic that picks the best strategy based on the input objects in the set.
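As a rough illustration of the strategy idea, here is a minimal Java sketch. Everything in it is an assumption made for the example: `FakeTable` is an in-memory stand-in for the real table (a real version would wrap JDBC), and the `persist.strategy` property name, `PersistStrategy`, `InsertFirstStrategy`, `UpdateFirstStrategy` and `StrategyLoader` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical in-memory stand-in for a table keyed by primary key.
class FakeTable {
    private final Map<Integer, String> rows = new HashMap<>();

    // Mimics INSERT: fails on a duplicate primary key.
    void insert(int id, String value) {
        if (rows.containsKey(id)) {
            throw new IllegalStateException("duplicate key " + id);
        }
        rows.put(id, value);
    }

    // Mimics UPDATE: returns the number of rows affected (0 or 1).
    int update(int id, String value) {
        if (!rows.containsKey(id)) {
            return 0;
        }
        rows.put(id, value);
        return 1;
    }

    String get(int id) {
        return rows.get(id);
    }
}

// The abstract "persist this to the database" strategy.
interface PersistStrategy {
    void persist(FakeTable table, int id, String value);
}

// Optimistic insert: suits batches that are mostly new rows.
class InsertFirstStrategy implements PersistStrategy {
    public void persist(FakeTable table, int id, String value) {
        try {
            table.insert(id, value);
        } catch (IllegalStateException duplicate) {
            table.update(id, value);
        }
    }
}

// Optimistic update: suits batches that are mostly existing rows.
class UpdateFirstStrategy implements PersistStrategy {
    public void persist(FakeTable table, int id, String value) {
        if (table.update(id, value) == 0) {
            table.insert(id, value);
        }
    }
}

class StrategyLoader {
    // Instantiates the class named by the (hypothetical) "persist.strategy" property.
    static PersistStrategy fromConfig(Properties config) {
        String className = config.getProperty("persist.strategy");
        try {
            return (PersistStrategy) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load strategy " + className, e);
        }
    }
}
```

Switching from one strategy to the other is then a one-line change in the config file, which makes it cheap to measure both against real traffic.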

+1




MySQL supports this:

INSERT INTO foo
SET bar='baz', howmanybars=1
ON DUPLICATE KEY UPDATE howmanybars=howmanybars+1

      

+1




Option 2 will not be the most efficient. The database already performs this check when you do the actual insert or update, in order to enforce the primary key. By doing the check yourself, you pay for the table lookup twice, plus an extra round trip from your Java code. Work out which case is most likely and code optimistically for it.

Expanding on option 1, you can use a stored procedure to handle the insert / update. This example, in PostgreSQL syntax, assumes that the insert is the normal case.

CREATE FUNCTION insert_or_update(_id INTEGER, _col1 INTEGER) RETURNS void
AS $$
    BEGIN
        -- Optimistic path: assume the row is new.
        INSERT INTO
            my_table (id, col1)
        SELECT
            _id, _col1;
    EXCEPTION WHEN unique_violation THEN
        -- The key already exists, so update the row instead.
        UPDATE
            my_table
        SET
            col1 = _col1
        WHERE
            id = _id;
    END;
$$
LANGUAGE plpgsql;

      

You can also make the update the normal case and then check the number of rows affected by the update statement to determine whether the row is actually new and needs an insert.
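From the Java side, the update-first variant could look roughly like the sketch below. It reuses the `my_table (id, col1)` names from the function above; the class name `UpdateFirstDao` and the method `save` are hypothetical, and wrapping `SQLException` in an unchecked exception is just one possible choice.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class UpdateFirstDao {
    // Table and column names follow the my_table (id, col1) example.
    private static final String UPDATE_SQL = "UPDATE my_table SET col1 = ? WHERE id = ?";
    private static final String INSERT_SQL = "INSERT INTO my_table (id, col1) VALUES (?, ?)";

    // Returns true if a new row was inserted, false if an existing row was updated.
    static boolean save(Connection conn, int id, int col1) {
        try {
            try (PreparedStatement update = conn.prepareStatement(UPDATE_SQL)) {
                update.setInt(1, col1);
                update.setInt(2, id);
                if (update.executeUpdate() > 0) {
                    return false; // the row existed and has been updated
                }
            }
            // No row was affected, so the key is new: insert it.
            try (PreparedStatement insert = conn.prepareStatement(INSERT_SQL)) {
                insert.setInt(1, id);
                insert.setInt(2, col1);
                insert.executeUpdate();
                return true;
            }
        } catch (SQLException e) {
            throw new IllegalStateException("save failed for id " + id, e);
        }
    }
}
```

Note that another session could insert the same key between the two statements, in which case the insert fails with a unique violation, so concurrent writers would still need a retry or a transaction around the pair.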

As mentioned in some of the other answers, the most efficient way to handle this operation is in one batch:

  • Take all rows passed to the web service and add them to a temporary table
  • Update rows in the master table from the temp table
  • Insert new rows into the master table from the temp table
  • Dispose of the temporary table

The type of temporary table to use and the most efficient way to manage it will depend on the database you are using.
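The four batch steps might be sketched over JDBC as follows. The SQL is an assumption written in PostgreSQL style (`UPDATE ... FROM` in particular is not portable), `incoming` is a hypothetical temp table name, `my_table (id, col1)` is borrowed from the function above, and `BatchUpsert.saveAll` is a made-up name. Transaction handling is left to the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Map;

class BatchUpsert {
    // Assumed PostgreSQL-style SQL; adjust for the database actually in use.
    static final String CREATE_TEMP =
        "CREATE TEMPORARY TABLE incoming (id INTEGER PRIMARY KEY, col1 INTEGER)";
    static final String LOAD_TEMP = "INSERT INTO incoming (id, col1) VALUES (?, ?)";
    static final String UPDATE_EXISTING =
        "UPDATE my_table SET col1 = incoming.col1 FROM incoming "
        + "WHERE my_table.id = incoming.id";
    static final String INSERT_NEW =
        "INSERT INTO my_table (id, col1) SELECT i.id, i.col1 FROM incoming i "
        + "WHERE NOT EXISTS (SELECT 1 FROM my_table m WHERE m.id = i.id)";
    static final String DROP_TEMP = "DROP TABLE incoming";

    // rows maps primary key -> col1 value for every record in the request.
    static void saveAll(Connection conn, Map<Integer, Integer> rows) {
        try (Statement stmt = conn.createStatement()) {
            // Step 1: stage all incoming rows in a temporary table.
            stmt.executeUpdate(CREATE_TEMP);
            try (PreparedStatement load = conn.prepareStatement(LOAD_TEMP)) {
                for (Map.Entry<Integer, Integer> row : rows.entrySet()) {
                    load.setInt(1, row.getKey());
                    load.setInt(2, row.getValue());
                    load.addBatch();
                }
                load.executeBatch();
            }
            // Step 2: update rows that already exist in the master table.
            stmt.executeUpdate(UPDATE_EXISTING);
            // Step 3: insert rows that are not in the master table yet.
            stmt.executeUpdate(INSERT_NEW);
            // Step 4: dispose of the temporary table.
            stmt.executeUpdate(DROP_TEMP);
        } catch (SQLException e) {
            throw new IllegalStateException("batch save failed", e);
        }
    }
}
```

The update runs before the insert so that freshly inserted rows are not touched a second time. If a statement fails midway, this sketch leaves the temp table in place until the session ends or the caller cleans up.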

+1

