how to delete duplicates

Question

1.00/5 (1 vote)

See more:

OK. I have a table that with some duplicate records (Rows).

Example: employeeID 1001 is twice in the table with every column having the same data except ColumnX is different. So it looks like this. ColumnX shows blank and other one show 7 in the table.

How can I ONLY delete the one that is blank?

1001 ColumnX

1001 7

Thanks.

Posted 1-Feb-16 15:04pm

Member 11829482

Updated 1-Feb-16 15:10pm

v3

Add a Solution

Comments

Sergey Alexandrovich Kryukov 1-Feb-16 21:14pm

Why not preventing duplicates in first place, when the data is added?
—SA

Member 11829482 1-Feb-16 21:49pm

I am putting the raw data into temp table and doing some updates of the fields which is working fine except the issue above.

Philippe Mori 11-Feb-16 22:18pm

Well, the best solution to the problem avoid it as SA mentionned. You should always try your code on a copy of a database first and if the code does not work as intended, fix the code and if it works, then run it on the production database.

If you have already messed up the database, then it is up to you to write code that find those extraneous row and delete them. But try your code on a copy to ensure that the code works as intended.

Member 11829482 1-Feb-16 21:52pm

in addition, the way that data is, after doing some updates to the temp table, I am getting what I need but need to get rid of the above issue,

PIEBALDconsult 1-Feb-16 22:19pm

How about DELETE FROM table WHERE employeeID=1001 AND ColumnX IS NULL ?

Member 11829482 1-Feb-16 22:23pm

there are more than one record like this. So I was trying to count the double employee ids where Columnx is NULL and delete the. But Cant figure out how to construct it.

dan!sh 2-Feb-16 0:37am

Is there any unique column in the table?

Member 11829482 2-Feb-16 9:21am

employeeid is.

5 solutions

Add a Solution

Add your solution here

Treat my content as plain text, not as HTML

Preview 0

…

Existing Members

Sign in to your account

...or Join us

Download, Vote, Comment, Publish.

Your Email
Password
Forgot your password?

Your Email
This email is in use. Do you need your password?
Optional Password

I have read and agree to the Terms of Service and Privacy Policy
Please subscribe me to the CodeProject newsletters

When answering a question please:

Read the question carefully.
Understand that English isn't everyone's first language so be lenient of bad spelling and grammar.
If a question is poorly phrased then either ask for clarification, ignore it, or edit the question and fix the problem. Insults are not welcome.
Don't tell someone to read the manual. Chances are they have and don't get it. Provide an answer or move on to the next question.

Let's work to help developers, not make them feel stupid.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

JasonClark03 · Answer 1 · 2016-02-04T01:32:00

Solution 4

This article might be helpful for you to delete duplicate rows https://support.microsoft.com/en-us/kb/139444

Posted 4-Feb-16 1:32am

JasonClark03

Updated 4-Feb-16 1:33am

v2

sandeepmittal11 · Answer 2 · 2016-02-01T20:27:00

Solution 2

you can delete duplicate records using cte.
Refer below post

JavaScript

delete duplicate records in sql server

Posted 1-Feb-16 20:27pm

sandeepmittal11

cigwork · Answer 3 · 2016-02-05T11:01:00

SQL

-- A cut down version of the problem.
create table duplicates
(
 id varchar(10),
 colX varchar(10) null
)

insert into duplicates (id, colX) values ('1001','7')
insert into duplicates (id, colX) values ('1001',null)
insert into duplicates (id, colX) values ('1001',null)
insert into duplicates (id, colX) values ('1002','8')
insert into duplicates (id, colX) values ('1002',null)
insert into duplicates (id, colX) values ('1003','9')

And to identify rows to delete....

SQL

-- Identify duplicate ID values
-- Using a table variable but it could be a temp table.
declare @duplicateKey table (key1 varchar(10))

insert into @duplicateKey (key1)
select id from duplicates group by id having count(id) > 1

Now check we have only 1001, 1002 in our list of deletion candidates.

SQL

select * from @duplicateKey 

(2 row(s) affected)
key1
----------
1001
1002

SQL

-- Delete rows with identified IDs and "invalid" columns 

delete duplicates
from duplicates 
inner join @duplicateKey as deleteList
      on deleteList.key1 = duplicates.id 
      and duplicates.colX is null

-- Prove that we have only "valid" rows.
select * from duplicates

id         colX
---------- ----------
1001       7
1002       8
1003       9

However as was pointed out above you're better off preventing duplicates in the first place if at all possible.

johnwalker10 · Answer 4 · 2016-02-10T19:33:00

Solution 6

>> Select the distinct records into a new table
>> Truncate the old table
>> Merge the new table into the old table

Posted 10-Feb-16 19:33pm

johnwalker10

Member 11154151 · Answer 5 · 2016-02-01T18:54:00

Do you think something like this would be sufficient?

SQL

-- Lets say your table is called lala and has the following values:
create table lala (employeeID int, colx int)
insert into lala values (1001, null)
insert into lala values (1001, 7)
insert into lala values (1001, 8)
insert into lala values (1002, null)
insert into lala values (1002, 7)
insert into lala values (1002, 9)
insert into lala values (1002, 10)
insert into lala values (1003, 11)
insert into lala values (1004, null)
insert into lala values (1005, 12)
insert into lala values (1005, 20)
insert into lala values (1005, 21)
insert into lala values (1005, 22)
insert into lala values (1006, 24)
insert into lala values (1006, null)

-- Now you can define a second table to do the job:

SQL

create table #lala2 (employeeID int, colx int, inc int identity(1, 1))
insert into #lala2 (employeeID, colx)
select employeeID, colx from lala
order by colx desc -- if you want to delete the records with colx=null

-- Delete the multiple occurrencies and leave just one.
delete from #lala2
where exists (select 1
                from #lala2 maxinc
			   where maxinc.employeeID = lala2.employeeID
			     and maxinc.inc > lala2.inc)

delete from lala -- Delete old values
insert into lala select employeeID, colx from #lala2 -- Insert new values

drop table #lala2 -- Drop temp table