honey the codewitch wrote: What do we do when we have to insert surrogate characters into UTF-32 to support another 100,000 languages?
Well, we should be able to meet the requirements for a secure password ...
Obligatory Dilbert[^]
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
AntiTwitter: @DalekDave is now a follower!
|
Given our current spaceflight capabilities, I doubt that this will be a problem for a long time. Any and all intelligent aliens are proving their intelligence by staying a long way from Earth.
Freedom is the freedom to say that two plus two make four. If that is granted, all else follows.
-- 6079 Smith W.
|
We have at least one already: Klingon Unicode Fonts[^]
|
From what I remember, surrogate pairs are mostly used for special symbols, and most language symbols fit in the Basic Multilingual Plane.
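A small sketch of that point, since .NET strings are UTF-16: a code point inside the Basic Multilingual Plane takes one `char`, anything above U+FFFF takes a surrogate pair. The class and method names here are made up for illustration; only the `char.ConvertFromUtf32` and `char.ConvertToUtf32` calls are framework APIs.

```csharp
using System;

public static class SurrogateDemo
{
    // Number of UTF-16 code units (C# chars) needed to store one code point.
    // char.ConvertFromUtf32 throws for values above U+10FFFF and for
    // values inside the surrogate range U+D800..U+DFFF.
    public static int Utf16Length(int codePoint)
    {
        return char.ConvertFromUtf32(codePoint).Length;
    }

    // True when the code point lies outside the Basic Multilingual Plane.
    public static bool NeedsSurrogatePair(int codePoint)
    {
        return codePoint > 0xFFFF;
    }

    // Encodes a code point to UTF-16 and decodes it again.
    public static int RoundTrip(int codePoint)
    {
        string s = char.ConvertFromUtf32(codePoint);
        return char.ConvertToUtf32(s, 0);
    }
}
```

For example, `SurrogateDemo.Utf16Length(0x4E2D)` (a CJK ideograph) is 1, while `SurrogateDemo.Utf16Length(0x1F600)` (an emoji) is 2.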
It does not solve my Problem, but it answers my question
modified 19-Jan-21 21:04pm.
|
That's true for Earth. But wait till we add a zillion more languages, which is my point =)
When I was growin' up, I was the smartest kid I knew. Maybe that was just because I didn't know that many kids. All I know is now I feel the opposite.
|
If we assume, say, 1000 characters per language, that means another 100 million characters, which should not be a problem, since a 32-bit range should handle 4G characters. At 1K characters per language, UTF-32 should accommodate 4M languages. That is, unless I am looking at this incorrectly, which is certainly possible.
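The arithmetic is easy to check with a few lines. This is only a sketch, and the class and member names are made up. One caveat worth showing next to it: Unicode as currently defined stops at U+10FFFF, so UTF-32 as specified covers 1,114,112 code points (before subtracting surrogates and other reserved values), not the full 32-bit range.

```csharp
using System;

public static class CodeSpaceMath
{
    // Assumption taken from the post: roughly 1000 characters per language.
    public const long CharsPerLanguage = 1000;

    // Every value a raw 32-bit unit could hold: 2^32.
    public const long Raw32BitValues = 1L << 32;

    // Code points Unicode defines today: U+0000 through U+10FFFF.
    public const long UnicodeCodePoints = 0x10FFFF + 1;

    // How many languages of CharsPerLanguage characters fit in a code space.
    public static long LanguagesThatFit(long codeSpace)
    {
        return codeSpace / CharsPerLanguage;
    }

    // Characters needed for a given number of new languages.
    public static long CharsNeeded(long languages)
    {
        return languages * CharsPerLanguage;
    }
}
```

So the raw 32-bit figure of about 4.29 million languages holds, but within the range Unicode actually defines, 100,000 new languages at 1000 characters each (100 million characters) would not fit.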
"They have a consciousness, they have a life, they have a soul! Damn you! Let the rabbits wear glasses! Save our brothers! Can I get an amen?"
|
You're probably right. There are a bunch of reserved bitfields, though, but yeah, even then.
Forgive me, it's early here. =)
|
Things will get really interesting if/when we encounter a civilization that does not utilize language in the written ways that we do. We might come across one that has no eyes and does not sense light. It might communicate telepathically and have no written language, only thoughts. That could be really weird.
|
The working group committee on Unicode already scrapped three preliminary recommendations on that.
|
honey the codewitch wrote: What do we do when we have to insert surrogate characters into UTF-32 to support another 100,000 languages?
Drugs.
«One day it will have to be officially admitted that what we have christened reality is an even greater illusion than the world of dreams.» Salvador Dali
|
I mean, my first thought was UTF-64, but drugs? I suppose drugs could work.
|
If we have a Universal Translator or Babel Fish we won't care what the original language looked like / sounds like / brain-waves like because we would hear / see / think it in our own ways, so Unicode would be overkill even for Earth languages. Then all we would need is a universal programming language if H T CW hasn't completed it first.
|
Pah! The Psilons solved that issue aeons ago!
|
The emoji jungle would finally be cleared up, because the Unicode guys & gals would finally have something useful to do rather than keeping themselves occupied for the sake of keeping themselves occupied.
|
If you can't decide to get your child a pet or a toy for Christmas, have you considered a rattlesnake?
|
Is this for the good children or the bad ones?
|
Reminds me of Halloween: Venom giving out my favorite favor to the tykes => mashed potatoes (by the scoop). Well, they did say "Trick OR Treat". If they don't like it and cry, we could viper cheeks dry and let them slither off to the next door
Ravings en masse^
"The difference between genius and stupidity is that genius has its limits." - Albert Einstein
"If you are searching for perfection in others, then you seek disappointment. If you seek perfection in yourself, then you will find failure." - Balboos HaGadol Mar 2010
|
Fangs for the advice - venom out shopping I will pick one up (carefully, of course)!
I, for one, like Roman Numerals.
|
Asp someone else, I don't know.
|
All of this because Microsoft doesn't expose the tables they use behind char.IsLetter() and the like.
Like, did they think nobody would ever need to use the Unicode category codes for things?
Their enumeration isn't even flags.
// Walk every UTF-16 code unit (BMP only) and bucket it by Unicode general category.
// The bound is <= so that U+FFFF itself is included.
for (var i = 0; i <= char.MaxValue; ++i)
{
    char ch = unchecked((char)i);
    var uc = char.GetUnicodeCategory(ch);
    switch (uc)
    {
        // Each character goes into its two-letter category and its one-letter major class.
        case UnicodeCategory.ClosePunctuation:
            _AddTo(working, "Pe", ch);
            _AddTo(working, "P", ch);
            break;
        case UnicodeCategory.ConnectorPunctuation:
            _AddTo(working, "Pc", ch);
            _AddTo(working, "P", ch);
            break;
        case UnicodeCategory.Control:
            _AddTo(working, "Cc", ch);
            _AddTo(working, "C", ch);
            break;
        case UnicodeCategory.CurrencySymbol:
            _AddTo(working, "Sc", ch);
            _AddTo(working, "S", ch);
            break;
        case UnicodeCategory.DashPunctuation:
            _AddTo(working, "Pd", ch);
            _AddTo(working, "P", ch);
            break;
        case UnicodeCategory.DecimalDigitNumber:
            _AddTo(working, "Nd", ch);
            _AddTo(working, "N", ch);
            break;
        // (listing truncated here; the remaining UnicodeCategory cases are not shown)
    }
}
|
And on that note, why couldn't they expose the Number class?
There are many times I would have liked that.
|
I'm just happy to have BigInteger.
Now if only they'd give us a BigReal.
|
You're a programmer. Write it yourself.
Here's a start:
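// Skeleton only: the interface members still have to be implemented.
class BigReal : IComparable<BigReal>, IEquatable<BigReal>, IConvertible, IFormattable
{
    public bool sign { get; private set; }            // sign of the value
    public long biasedExponent { get; private set; }  // exponent, stored with a bias
    public List<ulong> mantissa { get; private set; } // arbitrary-length mantissa words
}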
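If hand-rolling the mantissa words is the scary part, one shortcut is to lean on BigInteger for the mantissa. The sketch below is a different, simpler layout than the skeleton above (a hypothetical `BinaryReal` with value = Mantissa * 2^Exponent and the sign carried by the mantissa). It is exact, with no rounding or precision limit, so mantissas grow with every operation; a real BigReal would need a rounding step.

```csharp
using System;
using System.Numerics;

// Hypothetical simplification: value = Mantissa * 2^Exponent.
public readonly struct BinaryReal
{
    public BigInteger Mantissa { get; }
    public long Exponent { get; }

    public BinaryReal(BigInteger mantissa, long exponent)
    {
        // Normalize so the mantissa is odd (or zero): one representation per value.
        if (mantissa.IsZero)
        {
            exponent = 0;
        }
        else
        {
            while (mantissa.IsEven)
            {
                mantissa >>= 1;
                exponent++;
            }
        }
        Mantissa = mantissa;
        Exponent = exponent;
    }

    // Multiply mantissas, add exponents; BigInteger does the heavy lifting.
    public static BinaryReal operator *(BinaryReal a, BinaryReal b)
        => new BinaryReal(a.Mantissa * b.Mantissa, a.Exponent + b.Exponent);

    // Align both operands to the smaller exponent, then add mantissas.
    public static BinaryReal operator +(BinaryReal a, BinaryReal b)
    {
        long e = Math.Min(a.Exponent, b.Exponent);
        BigInteger ma = a.Mantissa << (int)(a.Exponent - e);
        BigInteger mb = b.Mantissa << (int)(b.Exponent - e);
        return new BinaryReal(ma + mb, e);
    }

    // Lossy conversion, for inspection only.
    public double ToDouble() => (double)Mantissa * Math.Pow(2.0, Exponent);
}
```

For example, `new BinaryReal(3, -1)` is 1.5 and `new BinaryReal(5, -2)` is 1.25; their product has mantissa 15 and exponent -3.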
|
I don't have the math skills for that.
Ever done an efficient multiply on an arbitrary length FP?
I know it's not easy.
|
The best algorithm depends on the size(s) of the mantissa and the size of the machine word. See the MPFR package for details.
I have written multiplication algorithms using the standard (schoolbook) O(N^2) method, Karatsuba's O(N^lg3) method, and the Fourier-transform O(N*lg(N)*lg(lg(N))) method.
Recently, an algorithm was described for an O(N*lg(N)) method, but I can't say that I understand it.
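To make the Karatsuba idea concrete, here is a rough sketch on top of BigInteger. It splits each operand into a high and a low half and uses three half-size multiplications instead of four, which is where the O(N^lg3) bound comes from. The class name, the 64-bit cutoff, and the restriction to non-negative operands are all assumptions of this sketch; a serious implementation would work on raw word arrays and tune the cutoff.

```csharp
using System;
using System.Numerics;

public static class KaratsubaDemo
{
    // Below this size the sketch falls back to the built-in multiply.
    // The value is an arbitrary assumption, not a tuned threshold.
    const int ThresholdBits = 64;

    public static BigInteger Multiply(BigInteger x, BigInteger y)
    {
        if (x.Sign < 0 || y.Sign < 0)
            throw new ArgumentException("This sketch handles non-negative operands only.");

        // Rough size estimate in bits (rounded up to whole bytes).
        int bits = Math.Max(x.ToByteArray().Length, y.ToByteArray().Length) * 8;
        if (bits <= ThresholdBits)
            return x * y;

        int half = bits / 2;
        BigInteger mask = (BigInteger.One << half) - 1;

        // x = x1 * 2^half + x0, y = y1 * 2^half + y0
        BigInteger x0 = x & mask, x1 = x >> half;
        BigInteger y0 = y & mask, y1 = y >> half;

        BigInteger z0 = Multiply(x0, y0);                     // low * low
        BigInteger z2 = Multiply(x1, y1);                     // high * high
        BigInteger z1 = Multiply(x0 + x1, y0 + y1) - z0 - z2; // cross terms

        return (z2 << (2 * half)) + (z1 << half) + z0;
    }
}
```

A quick sanity check is to compare `KaratsubaDemo.Multiply(a, b)` against `a * b` for a few large values.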
The big problems are actually division and square root. They can be implemented using an fma (fused multiply add), which calculates A*B+C with only one rounding (for the addition).
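As an illustration of division built from multiply-add steps, here is a Newton-Raphson reciprocal in plain double precision rather than arbitrary precision. Each iteration is two operations of the form A*B+C; on runtimes that provide `Math.FusedMultiplyAdd`, each could be a single fused call. The class name, the input range restriction, and the iteration count are assumptions of this sketch.

```csharp
using System;

public static class NewtonReciprocal
{
    // Approximates 1/d for d in [0.5, 1) using only multiply-add steps.
    public static double Reciprocal(double d, int iterations = 5)
    {
        if (d < 0.5 || d >= 1.0)
            throw new ArgumentOutOfRangeException(nameof(d));

        // Classic linear starting guess for d in [0.5, 1).
        double x = 48.0 / 17.0 - 32.0 / 17.0 * d;

        for (int i = 0; i < iterations; i++)
        {
            double e = 1.0 - d * x; // residual, shape A*B+C (an fma would round once)
            x = x + x * e;          // correction, shape A*B+C again
        }
        return x;
    }

    // a / b for b in [0.5, 1), computed as a * (1/b).
    public static double Divide(double a, double b)
    {
        return a * Reciprocal(b);
    }
}
```

The error roughly squares on every pass, so a handful of iterations is enough at this precision; the same iteration scales to big mantissas by doubling the working precision each pass.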