I read a large text file containing Greek characters (Notepad++ reports its encoding as ANSI) and split it into shorter text files.
Most of the output files are ANSI, but some are not. This seems crazy, but it happens every time I try.

What I have tried:

The code is:
objStreamReader = New StreamReader(fileName, System.Text.Encoding.GetEncoding(28597))
DobjStreamWriter = New StreamWriter(WDfilename, False, System.Text.Encoding.GetEncoding(28597))
strLine = objStreamReader.ReadLine
Do While Not Dstrline Is Nothing
     DWstrLine = strline
     DobjStreamWriter.WriteLine(DWstrLine)
     Dstrline = DobjStreamReader.ReadLine
 loop

(The statement
DobjStreamWriter.WriteLine(DWstrLine, System.Text.Encoding.GetEncoding(28597))
does not work either.)

Code page 28597 is ISO-8859-7, which I think is Greek ANSI.

Any advice would be helpful.
Thank you in advance.
Updated 12-Aug-19 3:39am
Comments
phil.o 9-Aug-19 7:06am
   
Since the issue is with your output files, the code you should show us is the part that initializes and configures the StreamWriter. Please improve your question with the relevant code block.
perogr 9-Aug-19 7:34am
   
Thanks for your comment
Richard Deeming 9-Aug-19 12:06pm
   
strLine = objStreamReader.ReadLine
Do While Not Dstrline Is Nothing
     DWstrLine = strline

That doesn't make any sense to me. You don't initialize the variable you're using to control the loop until after you've checked whether it's Nothing.

You also seem to have two different StreamReader variables - objStreamReader, which you use for the first read, and DobjStreamReader which you use within the loop.

I'd expect to see:
Dim fileEncoding As System.Text.Encoding = System.Text.Encoding.GetEncoding(28597)
Using reader As New StreamReader(fileName, fileEncoding)
    Using writer As New StreamWriter(WDfilename, False, fileEncoding)
        Dim line As String = reader.ReadLine()
        Do While line IsNot Nothing
            writer.WriteLine(line)
            line = reader.ReadLine()
        Loop
    End Using
End Using
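
The same pattern could be extended to the splitting described in the question. The following is only a minimal sketch, not code from this thread: the helper name SplitFile, the linesPerFile limit, and the outputPrefix naming scheme are all hypothetical. The point it illustrates is that every writer is created with the same Encoding object as the reader, so no output part can end up in a different encoding.

```vb
Imports System
Imports System.IO
Imports System.Text

Module FileSplitter
    ' Hypothetical helper: copies sourcePath into numbered parts of at most
    ' linesPerFile lines each, using one encoding for reading and writing.
    Sub SplitFile(sourcePath As String, outputPrefix As String, linesPerFile As Integer)
        If linesPerFile < 1 Then Throw New ArgumentOutOfRangeException("linesPerFile")

        ' Code page 28597 = ISO-8859-7, as in the question.
        Dim fileEncoding As Encoding = Encoding.GetEncoding(28597)
        Dim partNumber As Integer = 0
        Dim linesInPart As Integer = 0
        Dim writer As StreamWriter = Nothing

        Try
            Using reader As New StreamReader(sourcePath, fileEncoding)
                Dim line As String = reader.ReadLine()
                Do While line IsNot Nothing
                    ' Open the first part, or roll over once the current part is full.
                    If writer Is Nothing OrElse linesInPart >= linesPerFile Then
                        If writer IsNot Nothing Then writer.Dispose()
                        partNumber += 1
                        linesInPart = 0
                        ' Assumed naming scheme: <prefix>_001.txt, <prefix>_002.txt, ...
                        Dim partPath As String = String.Format("{0}_{1:D3}.txt", outputPrefix, partNumber)
                        writer = New StreamWriter(partPath, False, fileEncoding)
                    End If
                    writer.WriteLine(line)
                    linesInPart += 1
                    line = reader.ReadLine()
                Loop
            End Using
        Finally
            ' Flush and close the last part even if reading fails.
            If writer IsNot Nothing Then writer.Dispose()
        End Try
    End Sub
End Module
```

A call such as SplitFile(fileName, "part", 1000) would then produce part_001.txt, part_002.txt and so on, all written through the same encoding.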
perogr 13-Aug-19 5:02am
   
Thank you very much.
The key was the initialization.
My code is very complicated and it took me a long time to try your solution.
I read a lot of EDI files, both header files and detail files, and I try to split them into shorter files.
Thanks again for your assistance.

Solution 1

If you're just "streaming", there's no need to be concerned about "encoding".

The "text" contains Greek Unicode or it doesn't.

Use the correct Font for the range of codes and it will display properly; otherwise not.

That's it.
   

Solution 2

perogr wrote:
Code page 28597 is ISO-8859-7, which I think is Greek ANSI.


No. ISO/IEC 8859-7 (Windows-28597) is used for Latin/Greek. The Windows-1253 code page is the one used for Greek ANSI.

Wiki wrote:
Windows code page 1253 ("Greek - ANSI"), commonly known by its IANA-registered name Windows-1253 or abbreviated as cp1253, is a Microsoft Windows code page used to write modern Greek. It is not capable of supporting the older polytonic Greek.

It is not fully compatible with ISO 8859-7 because a few characters, including the letter Ά, are located at different byte values:

(...)

Unicode is preferred for Greek in modern applications, especially as UTF-8 encoding on the Internet.
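
To make the difference concrete, here is a small sketch (not part of the original answer) that encodes the letter Ά with both code pages and prints the resulting byte. It assumes both code pages are available to the runtime: on .NET Framework they are built in, while on .NET Core / .NET 5+ the provider from the System.Text.Encoding.CodePages package would have to be registered first, as noted in the comment.

```vb
Imports System
Imports System.Text

Module EncodingComparison
    Sub Main()
        ' Assumption: on .NET Core / .NET 5+ uncomment the next line (requires
        ' the System.Text.Encoding.CodePages package); not needed on .NET Framework.
        ' Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)

        Dim iso As Encoding = Encoding.GetEncoding(28597) ' ISO-8859-7
        Dim win As Encoding = Encoding.GetEncoding(1253)  ' Windows-1253

        ' U+0386 GREEK CAPITAL LETTER ALPHA WITH TONOS
        Dim sample As String = ChrW(&H386)

        ' The same character maps to a different byte in each code page.
        Console.WriteLine("ISO-8859-7:   0x{0:X2}", iso.GetBytes(sample)(0)) ' 0xB6
        Console.WriteLine("Windows-1253: 0x{0:X2}", win.GetBytes(sample)(0)) ' 0xA2

        ' Bytes written as ISO-8859-7 but read back as Windows-1253 decode to a
        ' different character (here U+00B6, the pilcrow sign).
        Dim misread As String = win.GetString(iso.GetBytes(sample))
        Console.WriteLine("Misread as:   U+{0:X4}", AscW(misread(0))) ' U+00B6
    End Sub
End Module
```

So a file written with GetEncoding(28597) and then opened as Windows-1253 (which is what "ANSI" means on a Greek Windows system) will show wrong characters wherever the two code pages disagree.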
   

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



