Click here to Skip to main content
15,887,683 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
I thought I had this but turns out it's dropping some it shouldn't.

I'm migrating data from an old table into a new table and the tables are dissimilar so am having to be a little creative since I'm SQL illiterate.

The table is defined as;
C#
public partial class User
{
    public int UserID { get; set; }
    public string Name { get; set; }
    public string ImageFile { get; set; }
    public string Password { get; set; }
    public string Email { get; set; }
    public string WebSite { get; set; }
    public string DisplayName { get; set; }
    public bool CanComment { get; set; }
    public bool IsActive { get; set; }
    public System.DateTime CreateDate { get; set; }
    public string Custom { get; set; }
}

And what I'm trying to do is delete all duplicate users using email. This is the query that I thought I had working but doesn't seem to be doing what I want?
SQL
delete from [User] where Email in
(select email from [User] group by Email having Count(Email) > 1)

Thanks for any help given
Posted
Comments
Maciej Los 12-Feb-15 1:52am    
Mike, is it SQL Server, MySQL, PostgreSQL, Access database?

I'd suggest to write query as follow:

SQL
SELECT u.*, ROW_NUMBER() OVER(ORDER BY Email, UserID) AS RowNo
FROM [User] as u

This should produce a recordset with RowNumbers.
To delete duplicates, use this:
SQL
DELETE
FROM (
    SELECT u.*, ROW_NUMBER() OVER(ORDER BY Email, UserID) AS RowNo
    FROM [User] as u
) AS t1
WHERE t1.RowNo>1


Another way is to use:
SQL
DELETE
FROM (
    SELECT u.*, ROW_NUMBER() OVER(PARTITION BY Email ORDER BY UserID) AS RowNo
    FROM [User] as u
) AS t1
WHERE t1.RowNo>1
 
Share this answer
 
v2
Comments
Mike Hankey 12-Feb-15 2:06am    
Great stuff thanks Maciej
Maciej Los 12-Feb-15 2:09am    
You're welcome, Mike ;)
Cheers!
Wendelius 12-Feb-15 15:26pm    
Good choices!
Maciej Los 12-Feb-15 15:28pm    
Thank you, Mika ;)
Just to throw you one more option, if UserId is unique then simply
SQL
DELETE FROM [User] a
WHERE EXISTS (SELECT 1
              FROM [User] b
              WHERE b.email = a.email
              AND   b.userid > a.userid)

or you can use %%physloc%% if no key is present. For more info, see How to remove duplicate rows in SQL Server 2008 when no key is present[^]
 
Share this answer
 
Comments
Maciej Los 12-Feb-15 15:25pm    
2 times 5!
Mike Hankey 12-Feb-15 15:27pm    
Thanks for the extra hand Mike I'll keep that one in mind also for the future.
Wendelius 12-Feb-15 15:33pm    
Glad to help you :)

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900