Given that
sno
is always going to be 1 (because you are partitioning by
sname
AND marks
) you could simplify this to
select distinct 1 as sno,sname, marks from @tbl
The key point being that it is your introduction of the row_number() that is actually causing the issue.
But I suspect you want
sno
to be something meaningful (correct me if I'm wrong)
If you want it to show the number of the mark per name then use
select row_number() over (partition by sname order by sname,marks) as sno, sname, marks
FROM (select distinct 1 as sno,sname, marks from @tbl) A
which will give you
sno sname mark
1 A 9
2 A 10
1 B 10
2 B 40
1 C 10
1 D 15
If you want to give each row a number then remove the partition altogether
select row_number() over (order by sname,marks) as sno, sname, marks
FROM (select distinct sname, marks from @tbl) A
sno sname mark
1 A 9
2 A 10
3 B 10
4 B 40
5 C 10
6 D 15
------------------ EDIT AFTER OP COMMENT ---------------
To only get the lines that are duplicated there are two ways you could try:
1. To get the sname and marks and the number of duplications try this
select sname, marks, max(sno) as numberOfDups from
(select row_number() over (partition by sname, marks order by sname,marks) as sno, sname, marks
FROM @tbl) A
where sno > 1
group by sname, marks
2. To get the full list of items that have duplicates try this:
;with cte as
(
select row_number() over (partition by sname, marks order by sname,marks) as sno, sname, marks
from @tbl
)
select *
from cte
inner join (select sname, marks, max(sno) as numberOfDup from cte where sno > 1 group by sname, marks) B on cte.sname = B.sname AND cte.marks = B.marks
The key is the INNER JOIN on a sub-query looking at the CTE a second time. It's not just a simple "give me the rows where sno > 1" - the join ensures that while we don't see the non-duplicated rows we
do see
all the rows where there has been a duplicate - i.e. A 10 where sno - 1 AND 2, not just 2
------------------ EDIT AFTER ANOTHER OP COMMENT ---------------
Introducing other columns which must be taken into account - the joins take "much time for the on conditions
Test data:
declare @tbl table (id int identity(1,1),sname varchar(10),marks int, other int)
insert into @tbl values('A',10,1),('B',10,1),('A',10,1),('B',40,1),('C',10,2),('D',15,2),('A',9,1), ('A',10,2)
My original solution would return
1 A 10 2 3
2 A 10 1 3
3 A 10 1 3
Which is incorrect when taking into account the
other
column.
We can use
Checksum [
^] to "combine" all of the columns that are relevent e.g.
;with cte1 as
(
select checksum(sname, marks, other) as sno, sname, marks, other from @tbl
)
select sname, marks, other
from cte1
inner join (select checksum(sname, marks, other) as sno from @tbl group by checksum(sname, marks, other) having count(*) > 1) A on cte1.sno = A.sno
results
A 10 1
A 10 1