Click here to Skip to main content
15,896,063 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
Im having a table with crores of records. Im having column called Narration in my table.
In Narration column data in the form of sentance. In between the sentance Im having unique Code.
I have to extract that code from the sentance.

Code having varying length from 15 to 20.
Code start with Letter and End with Numbers
Code appears in any part of sentance

How to extract code from the sentance.

Example for my requirement

Narration                                                         Code

two coll-EntryNo-1-NIRMLTW02140010                                NIRMLTW02140010
EntryNo 54 For Amount towards & received for CDDHLTW1203280005    CDDHLTW1203280005
CASH COLLECTION-EntryNo-1-PONAMPJ10210029                         PONAMPJ10210029
Being Charges Entry For Loan - RNGTAPJ1204020026 - RGT1274        RNGTAPJ1204020026
amt rceived pack  20686-EntryNo-1-GNGE1PJ1203300016               GNGE1PJ1203300016

I have to extract code like this.
Posted
Updated 28-Mar-13 20:39pm
v2
Comments
Hemant Singh Rautela 29-Mar-13 3:39am    
You have make some work yourself... I give you idea that ..
convert string in array of character then apply your filter conditions....using looping.

But 1 problem in your input string that your 'Code' not is a single word always. If it is alwlays single word (having space before & after) then it can make easy...

There are 2 possibilities:
1) Using PATINDEX() AND SUBSTRING() functions, but it's not optimized
2) Using Regular Expressions in T-SQL


Ad 1)
PATINDEX[^]
SUBSTRING[^]
Pattern Matching With SQL[^]

For example, below code:
SQL
DECLARE @tbl TABLE(Narration NVARCHAR(100))

INSERT INTO @tbl (Narration)
VALUES('two coll-EntryNo-1-NIRMLTW02140010')
INSERT INTO @tbl (Narration)
VALUES('EntryNo 54 For Amount towards & received for CDDHLTW1203280005')
INSERT INTO @tbl (Narration)
VALUES('CASH COLLECTION-EntryNo-1-PONAMPJ10210029')
INSERT INTO @tbl (Narration)
VALUES('Being Charges Entry For Loan - RNGTAPJ1204020026 - RGT1274')
INSERT INTO @tbl (Narration)
VALUES('amt rceived pack  20686-EntryNo-1-GNGE1PJ1203300016')

DECLARE @tmp TABLE (Narration NVARCHAR(100), Position INT, Code NVARCHAR(100))

DECLARE @pattern NVARCHAR(30)
SET @pattern = '%[^0-9][A-Z][^-][^ ][0-9]%[^A-Za-z]%'

INSERT INTO @tmp (Narration, Position, Code)
SELECT Narration, Position, Code
FROM (
    SELECT Narration, PATINDEX(@pattern, Narration) + 1 AS Position, SUBSTRING(Narration, PATINDEX(@pattern, Narration)+1, LEN(Narration) -PATINDEX(@pattern, Narration))AS Code
    FROM @tbl
    ) AS T

SELECT *
FROM @tmp


returns:
RowNoNarrationCode
1amt rceived pack 20686-EntryNo-1-GNGE1PJ1203300016No-1-GNGE1PJ1203300016
2Being Charges Entry For Loan - RNGTAPJ1204020026 - RGT1274 - RGT1274
3CASH COLLECTION-EntryNo-1-PONAMPJ10210029No-1-PONAMPJ10210029
4EntryNo 54 For Amount towards & received for CDDHLTW1203280005o 54 For Amount towards & received for CDDHLTW1203280005
5two coll-EntryNo-1-NIRMLTW02140010No-1-NIRMLTW02140010


Not satisfying? No matter. There is still hope and... solution. ;)

After that you need to parse ' ' (spaces) and '-' using CTE[^] to get correct code.

Here is an example:
SQL
CREATE TABLE #tmp (RowNo INT, Code NVARCHAR(100))

;WITH myCodes AS
(
	SELECT RowNo, LTRIM(RTRIM(LEFT(Code, CHARINDEX(' ',Code)-1))) AS myCode,
			LTRIM(RTRIM(RIGHT(Code, LEN(Code) - CHARINDEX(' ',Code)))) AS Remainder
	FROM @tmp
	WHERE NOT Code IS NULL AND CHARINDEX(' ',Code)>0
	UNION ALL
	SELECT RowNo, LTRIM(RTRIM(LEFT(Remainder, CHARINDEX(' ',Remainder)-1))) AS myCode,
			LTRIM(RTRIM(RIGHT(Remainder, LEN(Remainder) - CHARINDEX(' ',Remainder)))) AS Remainder
	FROM myCodes
	WHERE NOT Remainder IS NULL AND CHARINDEX(' ',Remainder)>0
	UNION ALL
	SELECT RowNo, Remainder AS myCode, NULL AS Remainder
	FROM myCodes
	WHERE NOT Remainder IS NULL AND CHARINDEX(' ',Remainder)=0  
)
INSERT INTO #tmp (RowNo, Code)
SELECT RowNo, myCode AS Code
FROM myCodes
WHERE LEN(myCode)>14 AND LEN(myCode)<21

;WITH newCodes AS
(
	SELECT RowNo, LTRIM(RTRIM(LEFT(Code, CHARINDEX('-',Code)-1))) AS myCode,
			LTRIM(RTRIM(RIGHT(Code, LEN(Code) - CHARINDEX('-',Code)))) AS Remainder
	FROM @tmp
	WHERE NOT Code IS NULL AND CHARINDEX('-',Code)>0
	UNION ALL
	SELECT RowNo, LTRIM(RTRIM(LEFT(Remainder, CHARINDEX('-',Remainder)-1))) AS myCode,
			LTRIM(RTRIM(RIGHT(Remainder, LEN(Remainder) - CHARINDEX('-',Remainder)))) AS Remainder
	FROM newCodes
	WHERE NOT Remainder IS NULL AND CHARINDEX('-',Remainder)>0
	UNION ALL
	SELECT RowNo, Remainder AS myCode, NULL AS Remainder
	FROM newCodes
	WHERE NOT Remainder IS NULL AND CHARINDEX('-',Remainder)=0  
)
INSERT INTO #tmp (RowNo, Code)
SELECT RowNo, myCode AS Code
FROM newCodes
WHERE LEN(myCode)>14 AND LEN(myCode)<21

SELECT *
FROM #tmp
ORDER BY RowNo

DROP TABLE #tmp


results:
RowNoCode
1GNGE1PJ1203300016
3PONAMPJ10210029
4CDDHLTW1203280005
5NIRMLTW02140010



Ad 2)
Regular Expressions Make Pattern Matching And Data Extraction Easier[^]
http://blogs.msdn.com/b/khen1234/archive/2005/05/11/416392.aspx[^]
xp_pcre - Regular Expressions in T-SQL[^]
 
Share this answer
 
v3
You can do similar like this :
SQL
declare @Str varchar(max)
set @Str='two coll-EntryNo-1-NIRMLTW02140010 EntryNo 54 For Amount towards & received for CDDHLTW1203280005
         CASH COLLECTION-EntryNo-1-PONAMPJ10210029 Being Charges Entry For Loan - RNGTAPJ1204020026 - RGT1274
         amt rceived pack  20686-EntryNo-1-GNGE1PJ1203300016 '

SELECT  substring(substring(@Str, PATINDEX('%-1-%', @Str) ,40),4,15) as Code
SELECT  substring(substring(@Str, PATINDEX('%For %', @Str) ,40),4,15) as Code
SELECT  substring(substring(@Str, PATINDEX('% - %', @Str) ,40),3,15) as Code


or you can Search from table also :

SQL
SELECT  substring(substring(narration, PATINDEX('%-1-%', narration) ,40),3,15) as Code
FROM TableName
WHERE PATINDEX('%-1-%', narration) <> 0
union all
SELECT  substring(substring(narration, PATINDEX('%For %', narration) ,40),4,15) as Code
FROM TableName
WHERE PATINDEX('%For %', narration) <> 0
union all
SELECT  substring(substring(narration, PATINDEX('% - %', narration) ,40),3,15) as Code
FROM TableName
WHERE PATINDEX('% - %', narration) <> 0
 
Share this answer
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900