Merge Duplicate Oracle Merge
I have a table accessible via an oracle database link that I am trying to pull into a local database table because of the reasons.
MERGE INTO MEMBERSHIPS LOCAL
USING (
SELECT DISTINCT
REMOTE.GROUP_NAME "GROUP_NAME",
REMOTE.USER_ACCOUNT "USERNAME",
REMOTE.SOME_OTHER_COLUMN "COL3"
FROM MEMBERSHIPS@link REMOTE
) REMOTE
ON (
REMOTE.GROUP_NAME = LOCAL.GROUP_NAME AND
REMOTE.USERNAME = LOCAL.USERNAME
)
WHEN MATCHED THEN
UPDATE SET
LOCAL.COL3 = REMOTE.COL3
LOCAL.UPDATED_AT = sysdate
WHEN NOT MATCHED THEN
INSERT (ID, GROUP_NAME, USERNAME, COl3, CREATED_AT, UPDATED_AT)
VALUES (MEMBERSHIPS_SEQ.NEXTVAL, REMOTE.GROUP_NAME, REMOTE.USERNAME, REMOTE.COl3, sysdate, sysdate)
Unfortunately, the owner of the original database did not lose much sleep worrying about data integrity, so there are 71 duplicates in 3 million copies which blows up my unique index by group name, username. The merge will be handled if I remove the uniqueness constraint, however these lines will explode the next time I execute the query with ORA-30926: unable to get a stable set of rows in the source tables
.
This is something that will run on a daily basis, so I need to find a way to ignore duplicates
EDIT:
I would have thought that clear-cut people would take care of my problem, but they are not. I still get duplicates:
SELECT DISTINCT
REMOTE.GROUP_NAME,
REMOTE.USER_ACCOUNT
COUNT(*)
FROM MEMBERSHIPS@link REMOTE
GROUP BY
REMOTE.GROUP_NAME,
REMOTE.USER_ACCOUNT
HAVING COUNT(*) > 1;
Shows 71 GROUP_NAME / USER_ACCOUNT combinations that are still duplicated.
source to share
I tried to reproduce the problem you described using two local tables. With the join as you provided it, I have no problem caused by duplicates in the original table (in Oracle 11.2.0.4). If I remove a keyword DISTINCT
from a subquery in a sentence USING
, then I get exactly the same problems you described - constraint violations on the first try, or ORA-30926 on the second attempt if I remove the unique constraint.
Two explanations I can think of for this are that (a) you are hitting some kind of error in Oracle, perhaps using DISTINCT on remote subqueries, or (b) the merge operator you are actually doing does not include DISTINCT
. (I also considered the possibility that NULL values might cause unexpected results from DISTINCT operations, but I couldn't think of how this would happen.)
EDIT: Another thoughtful explanation: if two databases use different character sets, I wonder if it is possible that values that are different in the original table are converted when transitioning to the same values?
source to share
In a situation like this, you can always try to rank the rows to avoid duplicates, not DISTINCT. In this case, it will be something like this:
MERGE INTO MEMBERSHIPS LOCAL
USING (
SELECT rank() over(partition by REMOTE.GROUP_NAME
,REMOTE.USER_ACCOUNT
order by NVL(REMOTE.UPDATED_AT,REMOTE.CREATED_AT) DESC NULLS LAST) r,
REMOTE.GROUP_NAME "GROUP_NAME",
REMOTE.USER_ACCOUNT "USERNAME",
REMOTE.SOME_OTHER_COLUMN "COL3"`
FROM MEMBERSHIPS@link REMOTE ) REMOTE
ON (
REMOTE.GROUP_NAME = LOCAL.GROUP_NAME
AND REMOTE.USERNAME = LOCAL.USERNAME
AND REMOTE.r = 1
)
WHEN MATCHED THEN
UPDATE SET
LOCAL.COL3 = REMOTE.COL3
LOCAL.UPDATED_AT = sysdate
WHEN NOT MATCHED THEN
INSERT (ID, GROUP_NAME, USERNAME, COl3, CREATED_AT, UPDATED_AT)
VALUES (MEMBERSHIPS_SEQ.NEXTVAL, REMOTE.GROUP_NAME, REMOTE.USERNAME, REMOTE.COl3, sysdate, sysdate);
source to share