|
Apologies for the shouting but this is important.
When answering a question please:
- Read the question carefully
- Understand that English isn't everyone's first language so be lenient of bad spelling and grammar
- If a question is poorly phrased then either ask for clarification, ignore it, or mark it down. Insults are not welcome
- If the question is inappropriate then click the 'vote to remove message' button
Insults, slap-downs and sarcasm aren't welcome. Let's work to help developers, not make them feel stupid.
cheers,
Chris Maunder
The Code Project Co-founder
Microsoft C++ MVP
|
|
|
|
|
For those new to message boards please try to follow a few simple rules when posting your question.
- Choose the correct forum for your message. Posting a VB.NET question in the C++ forum will end in tears.
- Be specific! Don't ask "can someone send me the code to create an application that does 'X'". Pinpoint exactly what it is you need help with.
- Keep the subject line brief, but descriptive. eg "File Serialization problem"
- Keep the question as brief as possible. If you have to include code, include the smallest snippet of code you can.
- Be careful when including code that you haven't made a typo. Typing mistakes can become the focal point instead of the actual question you asked.
- Do not remove or empty a message if others have replied. Keep the thread intact and available for others to search and read. If your problem was answered then edit your message and add "[Solved]" to the subject line of the original post, and cast an approval vote to the one or several answers that really helped you.
- If you are posting source code with your question, place it inside <pre></pre> tags. We advise you also check the "Encode HTML tags when pasting" checkbox before pasting anything inside the PRE block, and make sure "Ignore HTML tags in this message" check box is unchecked.
- Be courteous and DON'T SHOUT. Everyone here helps because they enjoy helping others, not because it's their job.
- Please do not post links to your question in one forum from another, unrelated forum (such as the lounge). It will be deleted.
- Do not be abusive, offensive, inappropriate or harass anyone on the boards. Doing so will get you kicked off and banned. Play nice.
- If you have a school or university assignment, assume that your teacher or lecturer is also reading these forums.
- No advertising or soliciting.
- We reserve the right to move your posts to a more appropriate forum or to delete anything deemed inappropriate or illegal.
cheers,
Chris Maunder
The Code Project Co-founder
Microsoft C++ MVP
|
|
|
|
|
Unicode is 1 byte per character; that’s the Latin characters and the other symbols found on a standard keyboard.
Multibyte is Latin, Greek, Russian and everything else that exceeds the initial 256 symbols.
Is that how it works?
|
|
|
|
|
|
I thought a C++ char has the size of one byte. How can something that is greater than 1 byte (Unicode, Multibyte) fit into a char?
|
|
|
|
|
It cannot; it is using an “encoding”, the most popular by far being UTF-8[^].
Mircea
|
|
|
|
|
Mircea Neacsu wrote: the most popular by far being UTF-8[^].
I'd say that the most popular file storage format is UTF-8.
As a working format, in RAM, UTF-16 is very common. E.g. it is the format used by all Windows APIs, which is more or less to say all Windows programs. Java uses UTF-16 in RAM, as do a lot of other modern languages.
It must be said that not all software that claims to use UTF-16 fully handles UTF-16 - only the BMP ("Basic Multilingual Plane"), so that all supported characters will fit in one 16-bit code unit. BMP didn't have space for latecomer alphabets, like for a number of African or Asian languages. Most developers said "But my program isn't aimed at such markets, so I'll ignore the UTF-16 surrogates, for handling such characters as two 16-bit code units. I can treat text as if all characters are of equal width, 16 bits".
But a new situation has arisen: Emojis have procreated to a number far exceeding the number of WinDings. They do not all fit in BMP, so a number of them have been allocated in other planes than BMP. Don't expect the end user to know which emojis are defined in which planes and refrain from using non-BMP emojis! If you are not prepared for them, your code may mess up the text badly.
Writing your own complete UTF-16 interpreter is not recommended. Use library functions! There is more to UTF-16 than just alternative planes: Some character codes are nonspacing, or combining (typically an accent and a character). So you cannot deduce the number of print positions from the length of the UTF-16 string - not even after considering control characters.
For "trivial" strings limited to Western alphabets, there usually is a fairly close correspondence between the number of UTF-16 code units and the number of positions. You can pretend that it is exact, but look out for cases that need to be treated as exceptions. I suspect that is what a lot of programmers do. 99.9% of Western text is free of exceptional cases, so the fixed-code-width assumption holds. Until, of course, emojis become common e.g. in file names. Note that UTF-32 does not provide an ultimate solution to all problems: You still may have to relate to nonspacing or combining characters!
Religious freedom is the freedom to say that two plus two make five.
|
|
|
|
|
I have to confess that I am a convert to the UTF-8 religion as preached in the UTF-8 Everywhere[^] manifesto. So much so that I've written a series of articles[^] on CP about using UTF-8 in Windows (you can find the whole series here[^]).
Some of your assertions are open to interpretation:
Quote: So you cannot deduce the number of print positions from the length of the UTF-16 string - not even after considering control characters.
Why would that be interesting from a programming point of view? From a typographical point of view, sure, but as programmers we don't usually concern ourselves with such minutiae.
The subject of emojis is another pet peeve of mine so allow me a bit of a roundabout. Some evolutionary solutions have been reinvented many times: flight has been reinvented by insects, birds, mammals, you name it. However there are some crucial points in evolution that happened only once. Photosynthesis or eukaryotic cells are prime examples but so is alphabetic writing. Moving from pictographic writing, where a symbol represented a whole word, to one where a symbol represented a sound, was a magnificent achievement of the human spirit that opened the path to what we now call Western civilization. Now, if you buy at least some of my arguments, you can see how disappointed I am when this whole evolutionary path is turned back by the spread of emojis. No longer do we need the magic words of a Shakespearean sonnet when we can just put a heart and a smiley face. Bleah!
Mircea
|
|
|
|
|
|
I understand. Thank you guys
|
|
|
|
|
You are welcome.
"In testa che avete, Signor di Ceprano?"
-- Rigoletto
|
|
|
|
|
Addendum:
Found the attached.
Will it fly?
Please, somebody with Qt experience, make some comments...
#include <QDebug>
#include <QCoreApplication>
#include <QObject>
#include <QTimer>

class Foo : public QObject
{
    Q_OBJECT
public:
    Foo( QObject* parent = nullptr ) : QObject( parent )
    {}

private:
    // Emits the three signals in sequence, logging before each one.
    void doStuff()
    {
        qDebug() << "Emit signal one";
        emit signal1();
        qDebug() << "Emit finished";
        emit finished();
        qDebug() << "Emit signal two";
        emit signal2();
    }

signals:
    void signal1();
    void signal2();
    void finished();

public slots:
    void slot1()
    {
        qDebug() << "Execute slot one";
    }
    void slot2()
    {
        qDebug() << "Execute slot two";
    }
    void start()
    {
        doStuff();
        qDebug() << "Bye!";
    }
};

int main( int argc, char** argv )
{
    QCoreApplication app( argc, argv );
    Foo foo;
    QObject::connect( &foo, &Foo::signal1, &foo, &Foo::slot1 );
    QObject::connect( &foo, &Foo::signal2, &foo, &Foo::slot2 );
    QObject::connect( &foo, &Foo::finished, &app, &QCoreApplication::quit );
    // Run start() once the event loop is up.
    QTimer::singleShot( 0, &foo, &Foo::start );
    return app.exec();
}

// Needed because the Q_OBJECT class is declared in main.cpp itself.
#include "main.moc"
I am well aware that some "contributors" do not like me asking Qt questions and are hard of hearing.
(They do have the choice NOT to reply...)
So this is a QtCreator question (they can ignore it).
I have my "send data using serial connection" working.
In pseudo code:
    while looping through the string
        extract a data byte from the string
        convert it to QByteArray
        send the data to the serial port
        delay 200 ms
Since the loop runs inside a single event handler, it physically sends the data, and posts my debug messages, only AFTER the loop has completed.
I am asking for suggestions / opinions on how to split this event into executable parts.
Or whether there is another way to modify my code so that it physically sends the data in the correct timing sequence.
Thanks
modified 7-Sep-24 15:42pm.
|
|
|
|
|
|
As I'm sure we all know, there are basically three ways to handle currency values in code.
1) Can store the value as cents in an integer. So, $100.00 would be 10,000. Pro: You don't have to worry about floating point precision issues. Con: Harder to mentally glance at without converting it in your head. Con: Depending on size of integer may significantly reduce available numbers to be stored.
2) Can use a float with a large enough precision. Pro: Easy to read. Con: Rounding issues, precision issues, etc.
3) Can use a struct with a dollars and cents. Pro: Same benefit as integer with no number loss. Pro: Easy to read mentally. Con: Have to convert back and forth or go out of the way for calculations.
Historically, I've always gone with either 1 or 2 with just enough precision to get by. However, I'm working on a financial app and figured... why not live a little.
So, I'm thinking about using 128-bit ints and shift its "offset" by 6, so I can store up to 6 decimal places in the int. For a signed value, this would effectively max me out at ~170 nonillion (170,141,183,460,469,231,731,687,303,715,884.105727). Now, last I checked there's not that much money in the world. But, this will be the only app running on a dedicated machine and using 1GB of RAM is completely ok. So, it's got me thinking...
Here's the question... any of y'all ever use 128-bit ints and did you find them to be incredibly slow compared to 64-bit or is the speed acceptable?
Jeremy Falcon
modified 2-Sep-24 14:37pm.
|
|
|
|
|
If you're not concerned about speed, then the decimal::decimal[32/64/128] numerical types might be of interest. You'd need to do some research on them, though. It's not clear how you go about printing them, for example. Latest Fedora rawhide still chokes on
#include <iostream>
#include <decimal/decimal>

int main()
{
    std::decimal::decimal32 x = 1;
    std::cout << x << '\n';
}
where the compiler produces a shed load of errors at std::cout << x, so the usefulness is doubtful. An alternative might be an arbitrary precision library like GMP.
A quick test of a loop adding 1,000,000 random numbers showed very little difference between unsigned long and __uint128_t. For unsigned long the loop took 0.0022 seconds, and for __int128_t it took 0.0026 seconds. Slower, but not enough to not consider them as a viable data type. But as with the decimal::decimal types, you would probably have to convert to long long for anything other than basic math.
"A little song, a little dance, a little seltzer down your pants"
Chuckles the clown
|
|
|
|
|
Well, just so you know I'm not using C++ for this. But the ideas are transferable, for instance a decimal type is basically a base-10 floating-point number. Which, in theory sounds great, but as you mentioned it's slow given the fact there's no FPU-type hardware support for them on a typical desktop CPU.
I was more interested in peeps using 128-bit integers in practice rather than simply looping. I mean ya know, I can write a loop.
While I realize 128-bit ints still have to be broken apart for just about every CPU to work with, I was curious to know if peeps have noticed any huge performance bottlenecks with doing heavy maths with them in a real project.
Not against learning a lib like GMP if I have to, but I think for my purposes I'll stick with ints, in a base 10 fake fixed-point fashion, as they are fast enough. It's only during conversions in and out of my fake fixed-point I'll need to worry about the hit if so.
So the question was just how much slower is 128-bit compared to 64-bit... preferably in practice.
Jeremy Falcon
|
|
|
|
|
You mean 64-bit CPUs can't deal natively with 128-bit integers?
You had me at the beginning thinking that it was a real possibility.
The difficult we do right away...
...the impossible takes slightly longer.
|
|
|
|
|
I'm too tired to know if this is a joke or not. My brain is pooped for the day.
Richard Andrew x64 wrote: You had me at the beginning thinking that it was a real possibility. Any time I can crush your dreams. You just let me know man. I got you.
Jeremy Falcon
|
|
|
|
|
FYI I wasn't joking.
The difficult we do right away...
...the impossible takes slightly longer.
|
|
|
|
|
Ah, I haven't played with ASM since the 16-bit days and it was only a tiny bit back then to help me debug C code really. So, this may be old and crusty info...
But, yeah typically in a 64-bit CPU the general-purpose registers don't go any wider than 64 bits. Now, there are extended instruction sets (SSE, SSE2, etc.) with wider registers, but those pack several smaller values into one register rather than treating it as a single 128-bit integer.
One notable exception is that x86 CPUs still carry the x87 FPU, which can process 80-bit wide floats natively, even on a 64-bit CPU. AFAIK, there's no native 128-bit integer arithmetic on mainstream 64-bit CPUs.
Which means, if I got a 128-bit number, any programming language that compiles it will have to treat that as two 64-bit values in the binary. Good news is, it's a loooooooooot easier to do with integers than floats. Say for instance, a quadruple precision float that's 128 bits is over 100 times slower than an 80-bit float. With an integer, you're just one bit shift away from getting the high word.
Stuff like the C runtime will have a native 128-bit type, but the binary will still have to break it down into two under the hood.
Jeremy Falcon
|
|
|
|
|
I would use 64-bit integers, representing cents.
My 0x0000000000000002.
"In testa che avete, Signor di Ceprano?"
-- Rigoletto
|
|
|
|
|
That's what I'm leaning towards, but I'd want to go to at least a tenth of a mill (4 decimal places) as that's the minimum resolution most accounting software has. So, looking to see if 128-bit is viable so I can go to 6 decimal places and not have to worry about it for a while. It's a dedicated machine for this app, so using 1GB RAM isn't an issue. Speed is the only concern.
Jeremy Falcon
modified 3-Sep-24 9:07am.
|
|
|
|
|
Even with such a constraint, a 64-bit integer gives you a plethora of dollars.
"In testa che avete, Signor di Ceprano?"
-- Rigoletto
|
|
|
|
|
Unless you're tracking the national US debt.
Jeremy Falcon
|
|
|
|
|
Hmm, unless my math is wrong, per Double-precision floating-point format - Wikipedia[^]:
Quote: Integers from −2^53 to 2^53 (−9,007,199,254,740,992 to 9,007,199,254,740,992) can be exactly represented.
US national debt is around $33.17T = 33,170,000,000,000. In cents that is about 3.3 × 10^15, which is below 2^53 ≈ 9.0 × 10^15, so you can accurately represent the US national debt down to the cent using just a regular double-precision number. At 0.1 cents it would be about 3.3 × 10^16, which no longer fits.
Mircea
|
|
|
|
|