Click here to Skip to main content
15,886,362 members
Please Sign up or sign in to vote.
1.24/5 (3 votes)
See more:
C#
Below is my table

Name1	Name2	Distance
BG	CH	280
CH 	BG	280
CH 	HY	350
HY	CH	350
CH 	CBE	500

I want remove duplicates in it How can i do it?

In above table describes distance between two cities, name1 is city1 & name2 is City2

Urgent help and attention is required..,

Thanks in Advance
Posted
Updated 23-Jan-21 19:05pm
Comments
Sergey Alexandrovich Kryukov 1-Dec-15 14:00pm    
Why not preventing adding duplicates in first place?
—SA
tpkpradeep 1-Dec-15 14:21pm    
well good question.. :) customers of different type.., :)

Complicated...but possible. Have a look here: https://support.microsoft.com/en-us/kb/139444[^] - it explains the process, but I'd probable create an SQL function which took the two names, and returned a composite name that was ordered, so the BG, CH returned the same value as CH, BG. And I hope you have an ID column in there, because you are going to need one!
 
Share this answer
 
Comments
tpkpradeep 1-Dec-15 14:22pm    
I'd probable create an SQL function which took the two names, and returned a composite name that was ordered, so the BG, CH returned the same value as CH, BG

How can i perform this? I tried in different ways but i couldn't.., help me this.., please
OriginalGriff 1-Dec-15 14:33pm    
You know how to create an SQL function (as opposed to a Stored Procedure) don't you?
tpkpradeep 2-Dec-15 7:40am    
hi OriginalGriff
how about this..

select dup1.name1,dup1.tab1,dup1.dist
from
(
select t.rk
from
(
SELECT t1.name1 tab1,t2.name1,dense_rank() over (order by t1.name1) as rk
FROM #tmp_dup t1
inner join #tmp_dup t2 on t1.name1 = t2.name2 and t1.name2 = t2.name1
AND t1.Name1 NOT IN (SELECT name2 FROM #tmp_dup WHERE name2=t1.name1 AND name1 = t2.name2)
group by t1.name1,t2.name1
) as t
group by t.rk
having count(t.rk) = 1
) as dup
left outer join
(
SELECT t1.name1 tab1,t2.name1,dense_rank() over (order by t1.name1) as rk,t1.dist
FROM #tmp_dup t1
inner join #tmp_dup t2 on t1.name1 = t2.name2 and t1.name2 = t2.name1
AND t1.Name1 NOT IN (SELECT name2 FROM #tmp_dup WHERE name2=t1.name1 AND name1 = t2.name2)
group by t1.name1,t2.name1,t1.dist
) as dup1 on dup.rk = dup1.rk
tpkpradeep 1-Dec-15 15:26pm    
:) I know..,

I tried this

with cte as(
select name1,name2,
row_number() over
(
order by name1,name2
) as cnt
from
#tmp_dup
group by name1,name2
)


select t1.name1,t1.name2,t1.rk
from
(
select a.name1,a.name2,a.cnt as cnt,
dense_rank() over (order by a.name1) as rk
from cte a
where exists(
select 1 from cte b where a.name1 = b.name2 and a.name2 = b.name1)
) t1
group by t1.name1,t1.name2,t1.rk


i got output as below

name1 name2 rk
BG CH 1
CH BG 2
CH HY 2
HY CH 3

but i got strucked removing the value 2
The solution consists in first place to find exactly where the "bad registers" are. If we see the records (BG, CH, 280) and (CH,BG,280), the problems consists that are "equal" if they were inserted always both fields in alfabetical order that is (BG, CH,280). So a record is not good in first opinion, if name2 is
previous to name1.

The query:
SQL
SELECT 
	CASE WHEN t1.name1<>(CASE WHEN t1.name1<t1.name2 THEN t1.name1 ELSE t1.name2 end) THEN 'bad register' else 'good registerk' end as clase
	,t1.*
  FROM table1 t1

says that (CH, BG, 280), (HY,CH, 350) are 'wrong register' (what is right), but also says that (CH,CBE, 500) is a 'wrong register' (what is worng).
This last record really isnt´t a 'wrong register' because there isn´t a 'mirror' register (CBE, CH, 250)

So if we use the query:
SQL
SELECT 
	CASE WHEN t1.name1<>(CASE WHEN t1.name1<t1.name2 THEN t1.name1 ELSE t1.name2 end) THEN 'wrong register' else 'ok register' end as clase
	,t1.*
  FROM table1 t1
  JOIN table1 t2
  ON t1.name1=t2.name2 AND t1.name2=t2.name1

we only have the real 'wrong register' (because if there is no 'mirrow' register it says that it´s a 'ok register')

Then, the query:
SQL
SELECT t1.*
  FROM table1 t1
  JOIN table1 t2
  ON t1.name1=t2.name2 AND t1.name2=t2.name1
  WHERE t1.name1<>(CASE WHEN t1.name1<t1.name2 THEN t1.name1 ELSE t1.name2 end) 


just list the records that you need to delete.

or simplier:
SQL
SELECT t1.*
  FROM table1 t1
  JOIN table1 t2
  ON t1.name1=t2.name2 AND t1.name2=t2.name1
  WHERE t1.name1>t1.name2



Finaly the order:
SQL
delete TABLEdELETE
	FROM table1 tableDelete
	inner JOIN table1 t1
		on tabledelete.name1=t1.name1 and tabledelete.name2=t1.name2	
	inner JOIN  table1 t2
		ON ( t1.name1>t1.name2 AND t1.name1=t2.name2 AND t1.name2=t2.name1 ) 

deletes the register that you want.

Note: If your data are:
BG CH 1
CH BG 2
CH HY 2
HY CH 3

and you don´t want that the record CH, BG, 2 will be delete then use:
SQL
delete TABLEdELETE
	FROM table1 tableDelete
	inner JOIN table1 t1
		on tabledelete.name1=t1.name1 and tabledelete.name2=t1.name2 	
	inner JOIN  table1 t2
		ON ( t1.name1>t1.name2 AND t1.name1=t2.name2 AND t1.name2=t2.name1 AND t1.distance=t2.distance ) 
 
Share this answer
 
delete from table1 t1
WHERE t1.name1>t1.name2
 
Share this answer
 
Comments
CHill60 3-Dec-19 5:53am    
So, if 4 years ago the OP had a table that contained
Name1	Name2	Distance
BG	CH	280
CH 	BG	280
HY 	CH	350
CH 	CBE	500
You would have them remove the HY + CH line, despite the fact it is not duplicated.
select * from table1 where rowid in (select t1.rowid from table1 t1,table1 t2 where t1.Name1>t1.Name2 and t1.Name1=t2.Name2 and t2.Name1=t1.Name2 );

delete from table1 where rowid in (select t1.rowid from table1 t1,table1 t2 where t1.Name1>t1.Name2 and t1.Name1=t2.Name2 and t2.Name1=t1.Name2 );
 
Share this answer
 
Comments
CHill60 4-Dec-19 3:46am    
You posted a solution yesterday. Do not post multiple solutions to the same question - it is confusing for everyone and just looks like rep-point farming.
This solution is no better than your original one but is starting to look like a copy of Solution 4.
Always read the previous solutions and make sure you are bringing something new to the thread … unlikely after 4 years has passed.
Member 14676000 5-Dec-19 10:06am    
Above solution 4 is not a copy of solution 4,it is improved solution of 4 ,they never used rowid concept
I have come up with this below SQL query and it works.The main idea is to sort the values in the two columns.

SELECT Name1,Name2,distance from name_tbl INNER JOIN(
SELECT DISTINCT CASE WHEN name1<name2 THEN name1 ELSE name2 END as firstName,
CASE WHEN name1<name2 THEN name2 ELSE name1 END as secondName
FROM name_tbl
)t1
ON name_tbl.name1 = t1.firstName AND name_tbl.name2 = t1.secondName
 
Share this answer
 
v2
Comments
CHill60 25-Jan-21 4:00am    
I suspect the reason you are being downvoted is because the principle is no different to the "principle" given in Solutions 1, 4 and 6. You have just used a sub-query instead of a function or CTE.
I applaud your desire to help, but I would stick to answering newer posts, where the OP still needs help

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900