I am extracting text from a PDF that contains both English and Urdu text. The English text is extracted as expected, but with the iTextSharp library the Urdu text comes out as special characters. Kindly guide me.

What I have tried:

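// Requires: using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
PdfReader reader = new PdfReader(pdfpath);

int pageNum = reader.NumberOfPages;
string text;

// Page numbers are 1-based; extraction here starts at page 177.
for (int i = 177; i <= pageNum; i++)
{
    // The Urdu text returned by this call shows up as special characters.
    text = PdfTextExtractor.GetTextFromPage(reader, i, new LocationTextExtractionStrategy());
}

reader.Close();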
Comments
Richard MacCutchan 8-Jun-19 4:01am
No, iTextSharp does not convert anything. You need to use the correct font and character set to display the Urdu characters.
Noman Suleman 8-Jun-19 4:26am
How can I change the font and character set?
Richard MacCutchan 8-Jun-19 5:01am
Assuming the PDF file displays the text in Urdu, you can get the details from the file. Alternatively, you just need to set the correct font and character set in your display code.
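
One way to check whether the problem lies in the display code rather than in the extraction is to look at the code points of the extracted string. The following is only a rough sketch: the class `UrduTextCheck` and its methods are hypothetical helpers, not part of iTextSharp, and the Unicode ranges are assumed to be the ones relevant for Urdu.

```csharp
using System;
using System.Linq;

// Hypothetical diagnostic helper (not part of iTextSharp).
public static class UrduTextCheck
{
    // True if the character falls in one of the Arabic-script ranges
    // assumed here to cover Urdu text.
    public static bool IsArabicScript(char c)
    {
        return (c >= '\u0600' && c <= '\u06FF')   // Arabic
            || (c >= '\u0750' && c <= '\u077F')   // Arabic Supplement
            || (c >= '\uFB50' && c <= '\uFDFF')   // Arabic Presentation Forms-A
            || (c >= '\uFE70' && c <= '\uFEFC');  // Arabic Presentation Forms-B (letters)
    }

    // Number of Arabic-script characters in the extracted text.
    public static int CountArabicScript(string text)
    {
        return text.Count(IsArabicScript);
    }

    // Renders each UTF-16 code unit as U+XXXX so the raw values can be
    // inspected independently of any font or console encoding.
    public static string ToCodePoints(string text)
    {
        return string.Join(" ", text.Select(c => "U+" + ((int)c).ToString("X4")));
    }
}
```

If `UrduTextCheck.CountArabicScript(text)` is greater than zero for a page, the extracted string probably does contain Urdu characters, and the special characters come from how it is displayed. In that case it may help to set `Console.OutputEncoding = System.Text.Encoding.UTF8`, or to write the text to a file with `File.WriteAllText(path, text, Encoding.UTF8)` and open it in an editor with an Urdu-capable font. If the count is zero, the font embedded in the PDF may not provide a usable mapping to Unicode, and changing the display font alone is unlikely to help.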

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)