|
When trying to transliterate the input, you should also know that the keyboard layout is quite different for each set of characters. Some languages also use many more characters than English does, and many write syllables (Japanese, Indic scripts including Thai and Khmer) or whole words (Chinese, Japanese).
|
|
|
|
|
I need to examine a text file containing Arabic text by counting the frequency of each character. How do I do this in VB.NET? Do I need to detect the file's code page before reading it? Do I need to convert the file's code page before doing anything? I am using VB.NET 2008.
|
|
|
|
|
My 2 cents: I don't think there is any reliable way to detect the code page; you ought to know it before opening the file. Plain text files don't carry any metadata describing their content.
So if you know the encoding, you can load the full file with System.IO.File.ReadAllText(path As String, encoding As Encoding) As String. From there you have a properly decoded string and you can start counting characters.
If you want something done fast, then do it right (Grissom, CSI)
Thanks for your reply, you just acknowledged my existence
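A minimal sketch of that call, assuming the code page is already known. The module name, the function name and the idea of passing the code page as a number are illustrative, not part of the original post. On the .NET Framework (which VB.NET 2008 targets) Encoding.GetEncoding(1256) works out of the box; on newer .NET versions the legacy code pages have to be registered first.

```vbnet
Imports System.IO
Imports System.Text

Module CodePageReader
    ' Reads the whole file and decodes it with the given Windows/DOS code page,
    ' for example 1256 (Windows Arabic) or 864 (DOS Arabic).
    ' The code page is an assumption supplied by the caller; nothing is detected here.
    Function ReadTextWithCodePage(ByVal path As String, ByVal codePage As Integer) As String
        Dim enc As Encoding = Encoding.GetEncoding(codePage)
        Return File.ReadAllText(path, enc)
    End Function
End Module
```

Something like ReadTextWithCodePage("C:\test.txt", 1256) would then return an ordinary .NET string that can be traversed character by character.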
|
|
|
|
|
What I want to do is count the frequency of the characters that appear in an Arabic text.
|
|
|
|
|
Have you tried it? Or can you be more specific?
A string in .NET is not just a list of bytes. In the file, every character takes one or more bytes, depending on the encoding used.
The method provided will read the file into a properly decoded string. All you have to do is traverse the string and count the characters.
|
|
|
|
|
All you have to do is read the file into a String, then iterate over the characters in the string and add them to a Dictionary collection. Since a Dictionary is a key/value pair collection, the "key" will be the character you are looking at and the "value" will be the count for that character. When you go to add a character to the collection, you first check whether it is already there; if so, get its value and increment it by one. If not, add the new key with a value of 1. Then move on to the next character...
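The steps above could look roughly like this in VB.NET. This is only a sketch: the module and function names are made up for illustration, and it counts Char values (UTF-16 code units), which is what iterating over a .NET string gives you.

```vbnet
Imports System.Collections.Generic

Module CharacterFrequency
    ' Counts how often each Char occurs in the given text.
    ' Key = the character, Value = number of occurrences.
    Function CountCharacters(ByVal content As String) As Dictionary(Of Char, Integer)
        Dim counts As New Dictionary(Of Char, Integer)()
        For Each c As Char In content
            Dim current As Integer
            If counts.TryGetValue(c, current) Then
                ' Already seen: increment the stored count.
                counts(c) = current + 1
            Else
                ' First occurrence: add the key with a count of 1.
                counts.Add(c, 1)
            End If
        Next
        Return counts
    End Function
End Module
```

Looping over the returned dictionary with For Each entry As KeyValuePair(Of Char, Integer) gives each character together with its count. Spaces and line breaks are counted too, so filter them out first if they are not wanted.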
|
|
|
|
|
Cool Smith wrote: i need to examine a text file containing Arabic writings
What format is it in? Ideally, it'd be UTF. It's important, since the encoding determines the length of a single character. Download a hex editor and open the text file with it. What do the first bytes look like in hex?
Cool Smith wrote: Do i need to detect the file code-page before reading?
There's no way of detecting it with good precision, but Notepad can take an educated guess. If you have any say in it, then it should be UTF. If you don't, ask which code page was used to write the files. There'll be a difference between Windows Arabic 1256 and DOS Arabic 864.
Cool Smith wrote: by counting each character frequency of each character, how do i do this in vb.net?
First, determine the encoding, and read the entire file as a string with that encoding. Then create a dictionary. Loop through the string by eating characters, adding each one to the dictionary as a key, or adding +1 to its value if it's already in the dictionary. When done eating, burp out the results.
I are Troll
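Looking at the first bytes can also be done in code. The sketch below only recognises the common Unicode byte-order marks; the names are illustrative. If it returns Nothing, the file has no BOM and the encoding still has to be known or guessed, which is the case for Windows-1256 and DOS 864 files.

```vbnet
Imports System.Text

Module BomSniffer
    ' Returns the encoding indicated by a byte-order mark at the start of the data,
    ' or Nothing when no known BOM is present (legacy code pages never have one).
    Function DetectBom(ByVal firstBytes As Byte()) As Encoding
        ' UTF-32 LE must be checked before UTF-16 LE: both start with FF FE.
        If firstBytes.Length >= 4 AndAlso firstBytes(0) = &HFF AndAlso firstBytes(1) = &HFE _
           AndAlso firstBytes(2) = 0 AndAlso firstBytes(3) = 0 Then
            Return Encoding.UTF32
        End If
        If firstBytes.Length >= 3 AndAlso firstBytes(0) = &HEF AndAlso firstBytes(1) = &HBB _
           AndAlso firstBytes(2) = &HBF Then
            Return Encoding.UTF8
        End If
        If firstBytes.Length >= 2 AndAlso firstBytes(0) = &HFF AndAlso firstBytes(1) = &HFE Then
            Return Encoding.Unicode ' UTF-16, little endian
        End If
        If firstBytes.Length >= 2 AndAlso firstBytes(0) = &HFE AndAlso firstBytes(1) = &HFF Then
            Return Encoding.BigEndianUnicode ' UTF-16, big endian
        End If
        Return Nothing
    End Function
End Module
```

A UTF-8 file saved without a BOM also yields Nothing here, so a missing BOM does not prove that the file uses a legacy code page.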
|
|
|
|
|
What format is it in? Ideally, it'd be UTF. It's important, since the encoding determines the length of a single character. Download a hex editor and open the text file with it. What do the first bytes look like in hex?
The thing is, the software will be examining different text files (*.txt only) that contain Arabic text. I found code here that can detect the code page of a file, and another article that can convert between different code pages.
First, determine the encoding, and read the entire file as a string with that encoding. Then create a dictionary. Loop through the string by eating characters, adding each one to the dictionary as a key, or adding +1 to its value if it's already in the dictionary. When done eating, burp out the results.
Can you give me pseudocode for this? I don't have any idea how to do it.
|
|
|
|
|
Cool Smith wrote: i found code here that can detect the code page of a file
Can you post a link to that article? I haven't read it yet
Cool Smith wrote: can you give me pseudo code for this
It'd go something like this (C#-style):
// Needs: using System; using System.Collections.Generic; using System.IO;
Dictionary<char, long> characterCounter = new Dictionary<char, long>();
string theFile = File.ReadAllText(@"C:\test.txt");
while (theFile.Length > 0)
{
    // Take the last character, then cut it off the end of the string.
    char currentCharacter = theFile[theFile.Length - 1];
    theFile = theFile.Remove(theFile.Length - 1, 1);
    if (characterCounter.ContainsKey(currentCharacter))
    {
        characterCounter[currentCharacter] = characterCounter[currentCharacter] + 1;
    }
    else
    {
        characterCounter.Add(currentCharacter, 1);
    }
}
foreach (KeyValuePair<char, long> entry in characterCounter)
{
    textBox1.Text += String.Format("char {0} occurs {1} times", entry.Key, entry.Value) + Environment.NewLine;
}
This could be a bit slow with large files, as it forces .NET to allocate memory for a new string on every pass. It'd be more efficient with a moving index. That'd go something more like this:
string theFile = File.ReadAllText(@"C:\test.txt");
int currentPos = 0;
while (currentPos < theFile.Length)
{
    char currentCharacter = theFile[currentPos];
    currentPos = currentPos + 1;
    ...
}
|
|
|
|
|
Here are the links:
CodePage File Converter
Detect Encoding for In- and Outgoing Text
I'll try your implementation and get back to you.
Besides, I found some hex-to-string code. Will it work well for recognizing the single characters in a joined character?
Private Function ConvertStringToHex(ByVal MyString As String) As String
    Dim Result As String = vbNullString
    If Len(MyString) = 0 Then
        Result = vbNullString
    Else
        For i As Integer = 0 To Len(MyString.Trim) - 1
            Dim MyChar As String = Mid(MyString.Trim, i + 1, 1)
            Result = Result + Xformat(Hex(Microsoft.VisualBasic.AscW(MyChar)))
        Next
    End If
    Return Result
End Function
Private Function ConvertHexToString(ByVal MyString As String) As String
    Dim Result As String = vbNullString
    If Len(MyString) = 0 Then
        Result = vbNullString
    Else
        For i As Integer = 0 To Len(MyString.Trim) - 1 Step 4
            Dim MyChar As String = Mid(MyString.Trim, i + 1, 4)
            Result = Result + Microsoft.VisualBasic.ChrW(Convert.ToInt32(MyChar, 16))
        Next
    End If
    Return Result
End Function
Function Xformat(ByVal xin As String) As String
    Dim retval As String = xin
    Select Case Len(xin)
        Case Is = 3
            retval = "0" & xin
        Case Is = 2
            retval = "00" & xin
        Case Is = 1
            retval = "000" & xin
    End Select
    Return retval
End Function
End Class
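On the joined-character question: in a correctly decoded string, Arabic letters that look connected on screen are normally still stored as separate Char values, because the joining is done by the text renderer. A shorter way to see the individual code units is sketched below. The names are illustrative, and the sketch assumes the file does not use precomposed ligature characters from the Arabic presentation-forms blocks, which would show up as a single Char.

```vbnet
Imports System.Collections.Generic

Module CodeUnitDump
    ' Returns every Char of the text as a 4-digit hexadecimal value,
    ' separated by spaces, so each stored letter can be inspected on its own.
    Function ToHexCodes(ByVal content As String) As String
        Dim parts As New List(Of String)()
        For Each c As Char In content
            parts.Add(AscW(c).ToString("X4"))
        Next
        Return String.Join(" ", parts.ToArray())
    End Function
End Module
```

For the four-letter word "سلام" this gives four separate values, one per letter, even though the letters are drawn joined.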
|
|
|
|
|
|
Cool Smith wrote: first am using vb.net not c#, i tried convertin to vb.net
You asked for pseudocode, and that's what it is.
Cool Smith wrote: Can you provide vb.net version?
No, since it's not my job. You could post your code however, and people could have a look. That is, if you explain where you're stuck.
|
|
|
|
|
How do you store/retrieve a PDF file in SQL Server using vb.net code?
|
|
|
|
|
Have a read of this article; there is a link in it to a code example that does what you're after:
loading binary data (files) in a database
As barmey as a sack of badgers
Dude, if I knew what I was doing in life, I'd be rich, retired, dating a supermodel and laughing at the rest of you from the sidelines.
|
|
|
|
|
I am working on a library that uses multiple threads to do all kinds of work. The work being done results in events being generated. Preferably I would like all communications to the outside world to be on a single thread, so most threading issues can be dealt with inside the library.
When using forms, the standard components have Invoke methods that let the thread that runs the form call the requested Delegate, so whatever needs to be done (raising an event, in my case) gets executed on that form thread.
Is there a way to do this generically, without forms or components? Say I have 1 main thread that spawns 10 or 20 worker threads, and whenever the workers have an event to raise, they let the main thread do this. Is there an easy way (that I just haven't found), or should I be creating some custom thread pool management code to achieve this?
Any thoughts?
|
|
|
|
|
The BackgroundWorker class has this built in: its ProgressChanged and RunWorkerCompleted handlers run on the thread that created the BGW.
Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum
Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.
|
|
|
|
|
Thanks for your time. I get your point, but that would make my library capable of raising only the standard ProgressChanged and RunWorkerCompleted events, and not the custom events used in my library.
Is there any description of how the BackgroundWorker class does its magic? That would allow me to re-create it and add my own events.
|
|
|
|
|
BGW's internals are pretty complex; it uses System.ComponentModel.AsyncOperation, System.Threading.SynchronizationContext and System.Threading.ThreadPool, and it basically boils down to a Delegate.Invoke.
You can see all that with a tool such as Reflector.
|
|
|
|
|
Meanwhile I ran some tests, and my conclusion is that I cannot use the BackgroundWorker.
The thing with the BackgroundWorker is that the described behaviour is only valid in a Windows Forms application. If you run it in a forms application, the ProgressChanged handler indeed runs on the creator thread (the UI thread), but if you use it in a console application, the ProgressChanged handler runs on the DoWork background thread.
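For the case without forms, one common approach is a small hand-rolled queue: the workers post delegates to it and a single thread takes them out and runs them. The sketch below is only an illustration of that idea, not how BackgroundWorker itself works; EventPump, Callback, Post, Shutdown and Run are all made-up names.

```vbnet
Imports System.Collections.Generic
Imports System.Threading

' Hypothetical helper: any thread may Post work, one thread executes it in Run.
Public Class EventPump
    Public Delegate Sub Callback()

    Private ReadOnly _queue As New Queue(Of Callback)()
    Private _stopped As Boolean = False

    ' Called from any worker thread: queue the callback and wake the pump thread.
    Public Sub Post(ByVal work As Callback)
        SyncLock _queue
            _queue.Enqueue(work)
            Monitor.Pulse(_queue)
        End SyncLock
    End Sub

    ' Called from any thread: Run returns once the remaining callbacks are done.
    Public Sub Shutdown()
        SyncLock _queue
            _stopped = True
            Monitor.PulseAll(_queue)
        End SyncLock
    End Sub

    ' Called on the thread that should raise the events; blocks until Shutdown.
    Public Sub Run()
        While True
            Dim work As Callback = Nothing
            SyncLock _queue
                While _queue.Count = 0 AndAlso Not _stopped
                    Monitor.Wait(_queue)
                End While
                If _queue.Count = 0 Then Return
                work = _queue.Dequeue()
            End SyncLock
            ' Run outside the lock so workers can keep posting meanwhile.
            work.Invoke()
        End While
    End Sub
End Class
```

A worker would call something like pump.Post(AddressOf RaiseMyEvent) instead of raising the event itself, while the main thread sits in pump.Run(). The price is that the main thread is dedicated to the pump for as long as it runs.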
|
|
|
|
|
In the program I'm developing for my company I need to add email capabilities. However I want to get a list of email addresses from our Exchange Server 2007 SP3 which is running on SBS 2008. I was originally going to try and pull it from Outlook, however there are 3 different versions across all our computers (Outlook 2003, 2007, and 2010), I want to avoid using Interop just to get email addresses, and upgrading everyone to 2010 isn't an option. The program is being developed in .NET 3.5. The email addresses I want to find are in the Global Address List, however I haven't been able to figure out how to do this so any guidance or direction would be greatly appreciated. Thanks in advance.
|
|
|
|
|
DisIsHoody wrote: The program is being developed in .NET 3.5. The email addresses I want to find are in the Global Address List
Start here
|
|
|
|
|
Never thought about trying to get the contact through the web access, need to save that page. Thanks. However I did find that I could get it using Redemption and not have to worry about HTTP.
|
|
|
|
|
Nice, bookmarked it! Thanks for sharing
|
|
|
|
|
DisIsHoody wrote: However I did find that I could get it using Redemption
God is Great \o/ :p
|
|
|
|
|
Hello everybody,
There are a lot of sources for keyboard hooks, but I want to develop a new enhancement on top of a keyboard hook. Now I am trying to capture Unicode characters.
The best article I have read on plain keyboard hooks is
Global Windows Hooks
but I want to convert it to Unicode.
So your help and suggestions are needed.
Thanks
If you can think then I Can.
|
|
|
|