|
Okay yes, but once I have that converted (either to a 32-bit int or a double-char string), I can't do anything with it.
I can't call char.IsWhiteSpace with it.
I can't do anything but print its value. Which is stupid.
Real programmers use butterflies
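For reference, the two converted forms mentioned above (a 32-bit int and a two-char string) can be sketched as follows. This is a minimal illustration, not the poster's code: the class name `SurrogateMath` and the method `Combine` are made up for the example, while `char.IsSurrogatePair`, `char.ConvertToUtf32` and `char.ConvertFromUtf32` are standard .NET calls.

```csharp
using System;

public static class SurrogateMath
{
    // Manual UTF-16 decoding of a surrogate pair into a 32-bit code point.
    // (Hypothetical helper; char.ConvertToUtf32(high, low) does the same job.)
    public static int Combine(char high, char low)
    {
        if (!char.IsSurrogatePair(high, low))
            throw new ArgumentException("Not a valid high/low surrogate pair.");

        // The high surrogate carries the top 10 bits, the low surrogate the
        // bottom 10 bits, and the result is offset by 0x10000.
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }
}
```

Going the other way, `char.ConvertFromUtf32(codePoint)` returns the two-char string for any code point above 0xFFFF.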
|
ASCII FTW! 128 characters should be enough for anyone!
I'm not sure there are any whitespace characters that would be encoded as a surrogate pair:
Whitespace character - Wikipedia
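One way to check this rather than take it on faith is to walk every code point above U+FFFF and look for the separator categories. The sketch below is only an illustration: the class and method names are hypothetical, and it assumes that "whitespace" here means the SpaceSeparator, LineSeparator and ParagraphSeparator categories (the control characters that char.IsWhiteSpace also accepts, such as tab and line feed, all sit in the 16-bit range anyway).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class SupplementaryWhitespaceScan
{
    // Returns every code point above U+FFFF whose category is one of the
    // three separator categories (Zs, Zl, Zp).
    public static List<int> FindSupplementarySeparators()
    {
        var hits = new List<int>();
        for (int cp = 0x10000; cp <= 0x10FFFF; cp++)
        {
            // In this range the result is always a two-char surrogate pair.
            string s = char.ConvertFromUtf32(cp);
            UnicodeCategory cat = CharUnicodeInfo.GetUnicodeCategory(s, 0);
            if (cat == UnicodeCategory.SpaceSeparator ||
                cat == UnicodeCategory.LineSeparator ||
                cat == UnicodeCategory.ParagraphSeparator)
            {
                hits.Add(cp);
            }
        }
        return hits;
    }
}
```

With the Unicode data as it currently stands, the returned list should come back empty, which would support the guess that no whitespace character needs a surrogate pair.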
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
And EBCDIC - everybody always forgets EBCDIC
|
And sixbit on DEC's PDP systems!
|
I've encountered a weird one, but I don't remember whether it was a surrogate or not. I just remember that it didn't print to the console properly (the console mangled it), or it didn't save to source as a literal value, or something like that.
It came up as whitespace, but only when I used these huge "not ranges" like [^a-z] (anything but a lowercase letter).
That "anything" part created ranges all throughout the 16-bit Unicode spectrum.
And that's when I ran into issues with one whitespace character.
|
Actually, Baudot is sufficient at 5 bits.
CQ de W5ALT
Walt Fair, Jr.PhD P. E.
Comport Computing
Specializing in Technical Engineering Software
|
That only works for 16-bit Unicode values, not these 32-bit surrogate pairs.
|
"So what the hell are you supposed to do with these values"
Hope they never appear.
|
Pretty much!
|
Idea: research using Unicode categories in your RegEx?
«One day it will have to be officially admitted that what we have christened reality is an even greater illusion than the world of dreams.» Salvador Dali
|
I am using those.
The category for surrogates is Surrogate. Not helpful.
Combining a high and a low surrogate gives you a 2-char string.
The 2-char string cannot be queried for its Unicode category in .NET, AFAIK.
|
honey the codewitch wrote: The 2 char string cannot be queried for its unicode category in .NET AFAIK
It is a mess, but check this against what you expect:
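// Requires: using System; using System.Globalization; using System.Windows.Forms; (for Keys)
public void PrintUniCodeRange(int sc, int ec)
{
    // Note: char.ConvertFromUtf32 throws for surrogate values (0xD800-0xDFFF), so keep those out of the range.
    for (int i = sc; i <= ec; i++)
    {
        // One char for 16-bit code points, a two-char surrogate pair above 0xFFFF.
        string ucString = char.ConvertFromUtf32(i);
        // Only values below 256 are looked up in the Keys enum.
        bool isKey = i < 256;
        string key = isKey ? ((Keys)Enum.Parse(typeof(Keys), i.ToString())).ToString() : "";
        // The (string, int) overload reads the whole surrogate pair starting at index 0.
        UnicodeCategory cat = Char.GetUnicodeCategory(ucString, 0);
        if (cat != UnicodeCategory.OtherNotAssigned)
        {
            Console.WriteLine($"#{i} | Unicode Category: {cat} {(isKey ? "! Keys Enum: " + key : "")}");
        }
    }
}
Calling the above with 8192 and 8233 as parameters: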
#8192 | Unicode Category: SpaceSeparator
#8193 | Unicode Category: SpaceSeparator
#8194 | Unicode Category: SpaceSeparator
#8195 | Unicode Category: SpaceSeparator
#8196 | Unicode Category: SpaceSeparator
#8197 | Unicode Category: SpaceSeparator
#8198 | Unicode Category: SpaceSeparator
#8199 | Unicode Category: SpaceSeparator
#8200 | Unicode Category: SpaceSeparator
#8201 | Unicode Category: SpaceSeparator
#8202 | Unicode Category: SpaceSeparator
#8203 | Unicode Category: Format
#8204 | Unicode Category: Format
#8205 | Unicode Category: Format
#8206 | Unicode Category: Format
#8207 | Unicode Category: Format
#8208 | Unicode Category: DashPunctuation
#8209 | Unicode Category: DashPunctuation
#8210 | Unicode Category: DashPunctuation
#8211 | Unicode Category: DashPunctuation
#8212 | Unicode Category: DashPunctuation
#8213 | Unicode Category: DashPunctuation
#8214 | Unicode Category: OtherPunctuation
#8215 | Unicode Category: OtherPunctuation
#8216 | Unicode Category: InitialQuotePunctuation
#8217 | Unicode Category: FinalQuotePunctuation
#8218 | Unicode Category: OpenPunctuation
#8219 | Unicode Category: InitialQuotePunctuation
#8220 | Unicode Category: InitialQuotePunctuation
#8221 | Unicode Category: FinalQuotePunctuation
#8222 | Unicode Category: OpenPunctuation
#8223 | Unicode Category: InitialQuotePunctuation
#8224 | Unicode Category: OtherPunctuation
#8225 | Unicode Category: OtherPunctuation
#8226 | Unicode Category: OtherPunctuation
#8227 | Unicode Category: OtherPunctuation
#8228 | Unicode Category: OtherPunctuation
#8229 | Unicode Category: OtherPunctuation
#8230 | Unicode Category: OtherPunctuation
#8231 | Unicode Category: OtherPunctuation
#8232 | Unicode Category: LineSeparator
#8233 | Unicode Category: ParagraphSeparator
|
Hmm, I wonder what my test was doing wrong, because I thought GetUnicodeCategory(string, int) was only returning single-char values for me. Maybe I had a bug.
|
Thank you! Turns out there was a bug in my code where I wasn't passing double-char strings in. They ended up as single chars.
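The difference that bug makes can be sketched like this. It is an illustration under assumptions, not the actual code from either post: the class and method names are hypothetical, and U+1D11E (musical symbol G clef) is just a convenient code point above 0xFFFF.

```csharp
using System;
using System.Globalization;

public static class SurrogateCategoryDemo
{
    // Category of a full code point, passed as a one- or two-char string.
    public static UnicodeCategory CategoryOf(int codePoint)
    {
        string s = char.ConvertFromUtf32(codePoint);
        return char.GetUnicodeCategory(s, 0);
    }

    // Category when only the first UTF-16 code unit is passed along,
    // which is what happens if the pair gets cut down to a single char.
    public static UnicodeCategory CategoryOfFirstUnitOnly(int codePoint)
    {
        string s = char.ConvertFromUtf32(codePoint);
        return char.GetUnicodeCategory(s[0]);
    }
}
```

For U+1D11E the first method reports OtherSymbol, while the second reports Surrogate, because a lone high surrogate carries no category of its own.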
|
What would we call it then? Here are some suggestions
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
AntiTwitter: @DalekDave is now a follower!
|
Some funny ones there
Yet it annoyed me (a lot more than it should have) that France and Poland didn't adhere to the use of the country's name!
|
Why would France do it like anyone else?
|
Telling that there wasn't one for Germany. DoneMark is very close to how it's actually pronounced natively (in Norwegian and Swedish, at any rate; Danish has been referred to as a throat disease rather than a language).
|
That website is a gem!
"It is easy to decipher extraterrestrial signals after deciphering Javascript and VB6 themselves.", ISanti
|
Frédéric Chopin - Impromptu no. 4 in C sharp minor, Op. posth. 66 (Fantaisie Impromptu)
If, a few years back, you had told me I'd be listening to Chopin, I'd have told you you were crazy.
I didn't care much for the many piano notes that don't always seem to have a head or tail.
But I heard this on the radio on my way to piano lessons.
It sounded familiar, but I don't know from where.
The first bars are the same as Beethoven's Moonlight Sonata, but it's the bars after that that sounded familiar.
Well, it's one of Chopin's most famous pieces so I probably just heard it here and there.
Funny thing is that it was published after Chopin's death even though Chopin explicitly stated he didn't want anything published after he passed.
So this is my first SOTW of 2020.
Chopin would be proud
|
We all grow up sooner or later.
Soon enough you'll be listening to Erik Satie and Joaquín Rodrigo.
And on that note: Your future
|
Jörgen Andersson wrote: Erik Satie
I've listened to Satie for most of my life, mostly because my father really likes it.
Jörgen Andersson wrote: Joaquín Rodrigo
Didn't know him, but that's something I could listen to anyway.
Jörgen Andersson wrote: And on that note: Your future
Oh hell no!
|