Merge Duplicate Oracle Merge

I have a table accessible via an oracle database link that I am trying to pull into a local database table because of the reasons.

MERGE INTO MEMBERSHIPS LOCAL
USING (
  SELECT DISTINCT
    REMOTE.GROUP_NAME        "GROUP_NAME",
    REMOTE.USER_ACCOUNT      "USERNAME",
    REMOTE.SOME_OTHER_COLUMN "COL3"
  FROM MEMBERSHIPS@link REMOTE
) REMOTE
ON (
  REMOTE.GROUP_NAME = LOCAL.GROUP_NAME AND
  REMOTE.USERNAME   = LOCAL.USERNAME
)
WHEN MATCHED THEN
  UPDATE SET
    LOCAL.COL3       = REMOTE.COL3
    LOCAL.UPDATED_AT = sysdate
WHEN NOT MATCHED THEN
  INSERT (ID, GROUP_NAME, USERNAME, COl3, CREATED_AT, UPDATED_AT)
  VALUES (MEMBERSHIPS_SEQ.NEXTVAL, REMOTE.GROUP_NAME, REMOTE.USERNAME, REMOTE.COl3, sysdate, sysdate)

      

Unfortunately, the owner of the original database did not lose much sleep worrying about data integrity, so there are 71 duplicates in 3 million copies which blows up my unique index by group name, username. The merge will be handled if I remove the uniqueness constraint, however these lines will explode the next time I execute the query with ORA-30926: unable to get a stable set of rows in the source tables

.

This is something that will run on a daily basis, so I need to find a way to ignore duplicates

EDIT:

I would have thought that clear-cut people would take care of my problem, but they are not. I still get duplicates:

SELECT DISTINCT
REMOTE.GROUP_NAME,
REMOTE.USER_ACCOUNT
COUNT(*)
FROM MEMBERSHIPS@link REMOTE
GROUP BY
REMOTE.GROUP_NAME,
REMOTE.USER_ACCOUNT
HAVING COUNT(*) > 1;

      

Shows 71 GROUP_NAME / USER_ACCOUNT combinations that are still duplicated.

+3


source to share


2 answers


I tried to reproduce the problem you described using two local tables. With the join as you provided it, I have no problem caused by duplicates in the original table (in Oracle 11.2.0.4). If I remove a keyword DISTINCT

from a subquery in a sentence USING

, then I get exactly the same problems you described - constraint violations on the first try, or ORA-30926 on the second attempt if I remove the unique constraint.

Two explanations I can think of for this are that (a) you are hitting some kind of error in Oracle, perhaps using DISTINCT on remote subqueries, or (b) the merge operator you are actually doing does not include DISTINCT

. (I also considered the possibility that NULL values ​​might cause unexpected results from DISTINCT operations, but I couldn't think of how this would happen.)



EDIT: Another thoughtful explanation: if two databases use different character sets, I wonder if it is possible that values ​​that are different in the original table are converted when transitioning to the same values?

+1


source


In a situation like this, you can always try to rank the rows to avoid duplicates, not DISTINCT. In this case, it will be something like this:



MERGE INTO MEMBERSHIPS LOCAL
USING (
  SELECT rank() over(partition by REMOTE.GROUP_NAME
        ,REMOTE.USER_ACCOUNT
        order by NVL(REMOTE.UPDATED_AT,REMOTE.CREATED_AT) DESC NULLS LAST) r,
    REMOTE.GROUP_NAME        "GROUP_NAME",
    REMOTE.USER_ACCOUNT      "USERNAME",
    REMOTE.SOME_OTHER_COLUMN "COL3"`
  FROM MEMBERSHIPS@link REMOTE ) REMOTE
ON (
  REMOTE.GROUP_NAME = LOCAL.GROUP_NAME 
  AND REMOTE.USERNAME   = LOCAL.USERNAME
  AND REMOTE.r = 1
)
WHEN MATCHED THEN
  UPDATE SET
    LOCAL.COL3       = REMOTE.COL3
    LOCAL.UPDATED_AT = sysdate
WHEN NOT MATCHED THEN
  INSERT (ID, GROUP_NAME, USERNAME, COl3, CREATED_AT, UPDATED_AT)
  VALUES (MEMBERSHIPS_SEQ.NEXTVAL, REMOTE.GROUP_NAME, REMOTE.USERNAME, REMOTE.COl3, sysdate, sysdate);

      

+1


source







All Articles