Click here to Skip to main content
15,868,162 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
Hi Everyone!

Good day to all of you. I just have a quick inquiry as to why below query is taking too long to complete. Actually, there are times it runs smoothly and times it never finishes at all.

It had me thinking maybe the problem is the query itself but database fine tuning could also help. Whenever it doesn't complete, I just manually run the query by performing the inner query first (getting the duplicates and excluding the MAX obj IDs) therefore deleting only duplicates with the MIN OBJ ID.

I am thinking of splitting into multiple scripts but suggestions are always welcome.

Here is the query:
SQL
DELETE FROM U_TABLE
WHERE   OBJ_ID IN (
    SELECT  OBJ_ID
    FROM    U_TABLE
    WHERE   (S_FIELD, P_FIELD) IN (
        SELECT  S_FIELD, P_FIELD
        FROM    U_TABLE
        GROUP BY S_FIELD, P_FIELD
        HAVING COUNT (*) > 1)
        )
        AND OBJ_ID NOT IN (
            SELECT MAX (OBJ_ID)
            FROM    U_TABLE
            GROUP BY S_FIELD, P_FIELD
            HAVING COUNT (*) > 1
            )

I'd like to thank you all who will help, in advance.
Posted
Updated 28-Jun-15 22:00pm
v2
Comments
jgakenhe 28-Jun-15 11:48am    
That is an ugly delete statement. Having to do 3 lookups to get the correct items will be costly. To speed it up, make sure there are indexes on S_FIELD and P_FIELD in each of the tables and remember that you should make the dataset smallest on your first selects.
serigraphie 28-Jun-15 11:56am    
I know right! The previous developer wrote this one and it just checks itself multiple times to get the OBJ_IDs desired. I understand the part "create indexes on S and P fields" however, how do I make the dataset smallest as per your advice? You mean re-write? If so, do you have any suggestions?

so you want to delete all the duplicates except the "youngest" one?
I would try it this way:


SQL
CREATE TABLE U_TABLE (OBJ_ID INT IDENTITY(1,1), S_FIELD INT, P_FIELD INT )

INSERT INTO U_TABLE (S_FIELD, P_FIELD) VALUES (1, 1)
INSERT INTO U_TABLE (S_FIELD, P_FIELD) VALUES (1, 1)
INSERT INTO U_TABLE (S_FIELD, P_FIELD) VALUES (1, 1)
INSERT INTO U_TABLE (S_FIELD, P_FIELD) VALUES (1, 2)
INSERT INTO U_TABLE (S_FIELD, P_FIELD) VALUES (1, 2)
INSERT INTO U_TABLE (S_FIELD, P_FIELD) VALUES (1, 2)

SELECT * FROM U_TABLE

WITH cte AS (
  SELECT ROW_NUMBER() OVER (PARTITION BY S_FIELD, P_FIELD
                            ORDER BY  OBJ_ID DESC ) RN
  FROM   U_TABLE)
DELETE FROM cte WHERE RN>1



this works with SQL-Server 2008 (you didn't mention your version) so if you like it, you need to check whether common table expressions will work for you
 
Share this answer
 
Comments
serigraphie 28-Jun-15 11:57am    
Hi Franzbe,

Thank you for your comment! This is actually PL/SQL so running in Oracle. I am not sure if above will work on that environment but I could try. Thank you, I will accept solution if it works for me.

Thanks a ton!
The solution from FranzBe is a good one, but the syntax is wrong for Oracle.
Try:
SQL
DELETE FROM U_TABLE t1
WHERE EXISTS (
    SELECT  1
    FROM    (
        SELECT  OBJ_ID
               ,ROW_NUMBER() OVER (PARTITION BY S_FIELD, P_FIELD ORDER BY  OBJ_ID DESC ) RN
        FROM    U_TABLE
        ) t2
    WHERE   t2.RN > 1
        AND t1.OBJ_ID = t2.OBJ_ID
    )
;

To get high performance you need two indices.
One on (S_FIELD, P_FIELD, OBJ_ID) and the other one on (OBJ_ID)
 
Share this answer
 
Comments
serigraphie 12-Jul-15 11:37am    
Hi Jorgen, thank you! I will try this one out!

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900