|
That all seems to mesh very well with my reality and understanding.
I've just not had the expendable moola to grab new silicon in a bit, and keeping Intel's offerings straight in one's head is an exercise in futility.
Nothing against AMD (though ARM, because of its sordid history with Windows, can kick rocks). I just like Intel because it's literally all I've ever had, and there's a degree of comfort/security in that (likely a false sense).
I have built machines for others with AMD Ryzen though and they seem to have worked out just fine.
All this does seem to make calculating total processor usage a more complex algorithm, though - if not a near-useless metric in context?
|
|
|
|
|
Most of those metrics are useless by themselves. As hardware gets more complicated, so do the numbers, and the circumstances in which those numbers present themselves.
I've found that if I want to bench a system, I look at what other people are using to bench, and then bench my own against that baseline. The ones I use right now are:
Running Cyberpunk Bench (DLSS and Raytracing benchmark)
Running TimeSpy (General DirectX 12 and CPU gaming perf bench)
And Cinebench R23 - for CPU performance
That won't tell you everything; the first two of those benches are very gaming-oriented and focus on GPU performance. What running them tells me is that my desktop and laptop are pretty comparable at the resolutions I play at on each, but my lappy slightly beats my desktop in multicore performance.
What I'd like is for other people to compile the same large C++ codebase I am on other machines, which would give me a nice real-world metric for the purpose I built this machine for (C++ compile times).
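If anyone wants to try, here's a minimal sketch of what I mean by a compile-time benchmark - just wall-clock timing the build command. The build.cmd name is a placeholder for whatever actually drives your build (msbuild, cmake --build, ninja, etc.):

using System;
using System.Diagnostics;

class BuildTimer
{
    static void Main()
    {
        // Time a single cold run of the build. Run it twice and take
        // the second number if you want warm-cache figures instead.
        var sw = Stopwatch.StartNew();

        // "build.cmd" is a placeholder for the real build driver.
        var proc = Process.Start("cmd.exe", "/c build.cmd");
        proc.WaitForExit();

        sw.Stop();
        Console.WriteLine($"Build finished in {sw.Elapsed.TotalSeconds:F1}s (exit code {proc.ExitCode})");
    }
}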
As it is, I would buy an AMD laptop (power efficiency), but Intel is my go-to for desktops at this point, primarily due to general-purpose single-core performance. My laptop is also an Intel, but if I were buying again, I'd have waited for the AMD version of it and gotten better battery life.
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
modified 8-Feb-24 11:21am.
|
|
|
|
|
I can tell you that, if you aren't already, making sure all the related bits ride on an SSD would be one of the biggest things I can think of to speed up linking.
|
|
|
|
|
That's why I run two Samsung 990 Pro NVMe drives - among the fastest on the market.
I also run my RAM at 6000 MT/s / CL32 - the stock DDR5 spec for Intel is something like 4800.
My CPU on this machine is an i5-13600K. I would have gone with the i9-13900K, but I built this to be an air-cooled system, and 250W was too rich for my blood - at least with this cooler - and this i5 is a sleeper, with single-core performance comparable to the i9. I have the i9-13900HX in my laptop, which is basically the same as the desktop version but retargeted to 180W instead of 250W.
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
What you may want to try regarding the RAM...
It's a real PITA, because you'll lock up/bluescreen your machine, but rather than aiming for just the highest clock rate possible, try to tighten the latency timings, or look for sticks with excellent latency timings out of the box.
These things tend to be somewhat inversely related (clock speed vs. CAS latency and the other timings).
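To put rough numbers on that (my arithmetic, not anything off a spec sheet): the true first-word latency in nanoseconds is the CAS latency divided by the memory clock, and since the memory clock is half the transfer rate, that works out to CL * 2000 / (MT/s). A quick illustration:

using System;

class RamLatency
{
    // True first-word latency: CAS cycles divided by the memory clock.
    // The memory clock is half the transfer rate, hence the 2000 factor.
    static double LatencyNs(int cl, int megatransfers) => cl * 2000.0 / megatransfers;

    static void Main()
    {
        Console.WriteLine(LatencyNs(32, 6000)); // DDR5-6000 CL32 -> ~10.7 ns
        Console.WriteLine(LatencyNs(40, 4800)); // DDR5-4800 CL40 -> ~16.7 ns
        Console.WriteLine(LatencyNs(28, 5200)); // slower clock, tighter CL -> ~10.8 ns
    }
}

So a slower kit with tight timings can land at essentially the same real latency as a faster, looser one.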
I won't be so upset that I just got a corptop with an i5 then (coming from an i7).
|
|
|
|
|
I don't play the silicon lottery, because I've lost too much time to intermittent RAM failures.
I use an XMP profile because the stick was tested at those timings, and CL32 is pretty tight.
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
Even just different sticks might have better latency timings... and lowering the clock - even underclocking the RAM - to get tight latencies can yield a noticeable framerate improvement in at least some games (it kinda depends on what they have going on).
For some of the same reasons that's the case, I suspect it would be the case for linking too.
|
|
|
|
|
Like I said, I don't play the silicon lottery, as losing it is a giant time sink.
I run my RAM at what it was tested for at the factory. The XMP profile has it at 6000/CL32, and it's rock solid. I also know that, since it wasn't the fastest RAM that vendor offered when I bought it, it had already failed the binning tests for the faster kits.
So I'm not messing with the timings. Frankly, my time is too valuable to waste running down system errors due to memory corruption.
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
I never played that either. But I have played at trying to extract what I can from whatever I get. Maybe we have different definitions of silicon lottery - I call it buying/returning chips until you get a good bin.
> Frankly, my time is too valuable to waste
Gotta be your judgement call on that one... I looked for ya, and it doesn't seem like you'll find much better than the stock XMP profile.
|
|
|
|
|
Yeah. I've been burned before by bad memory, so I guess I'm extra cautious these days. That was a week of hair-pulling (it was a bad stick rather than a clocking issue, but the same class of problem).
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
There have been some bad bugs in the XMP/BIOS relationship where enabling XMP would force a lower voltage than what the RAM wanted. I wonder if you ran into that.
|
|
|
|
|
There are three different major resources in a computer system:
- CPU
- Memory
- Disk
If you have files open on a network, then the network will be a fourth major resource.
Each thread in the system can only be using one of these resources at a time, with multiple threads receiving time slices of that resource. So if a thread is waiting for the disk, it's not using the CPU. In fact, it's this very concept that allows virtualization to work at all. Personally, I used this concept to partition a task that took close to six hours when run serially, and completed it in about two hours.
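A minimal sketch of the idea (the file names and the Crunch function are hypothetical stand-ins): while one unit of work waits on the disk, another uses the CPU, so the wall-clock total trends toward the larger of the two costs rather than their sum.

using System;
using System.IO;
using System.Threading.Tasks;

class OverlapDemo
{
    // Hypothetical CPU-bound stage.
    static long Crunch(byte[] data)
    {
        long sum = 0;
        foreach (var b in data) sum += b;
        return sum;
    }

    static async Task Main()
    {
        // Start the next file's read (disk-bound) before crunching
        // the current one (CPU-bound), so neither resource sits idle.
        string[] files = { "a.bin", "b.bin", "c.bin" };
        Task<byte[]> pending = File.ReadAllBytesAsync(files[0]);

        for (int i = 0; i < files.Length; i++)
        {
            byte[] data = await pending;
            if (i + 1 < files.Length)
                pending = File.ReadAllBytesAsync(files[i + 1]);

            Console.WriteLine($"{files[i]}: {Crunch(data)}");
        }
    }
}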
|
|
|
|
|
obermd wrote: So if a thread is waiting for the disk, it's not using the CPU
Just to be difficult *cough* I/O Completion Ports *cough*
Seriously though, I'm being a bit pedantic, but on modern OSes and hardware so many things are asynchronous - without even using threads - that a single thread can potentially be using multiple resources virtually at the same time, and when it does enter a wait state, it will be woken by the first of the resources it's waiting on to become ready.
That changes the calculus of how what you say actually plays out, even if what you say is ... "basically" true. In essence, you're not wrong, but you're simplifying, maybe to a fault.
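A quick sketch of what I mean - one thread, two reads in flight at once, and the thread only parks when it actually awaits (the paths are placeholders):

using System.IO;
using System.Threading.Tasks;

class OneThreadManyIos
{
    static async Task Main()
    {
        // Both reads are handed to the OS immediately; no extra
        // application threads are spun up for them.
        Task<byte[]> read1 = File.ReadAllBytesAsync(@"c:\temp\one.dat");
        Task<byte[]> read2 = File.ReadAllBytesAsync(@"d:\temp\two.dat");

        // This thread is free to do CPU work right here while both
        // I/Os proceed "at the same time" from its point of view.

        byte[][] results = await Task.WhenAll(read1, read2);
        System.Console.WriteLine(results[0].Length + results[1].Length);
    }
}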
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
I'm simplifying based on what I had available (VAXBasic and Digital Command Language) to make the adjustments (6+ hours => 2 hours).
This simplification is also a good high-level view of what's available in a system, and of the realization that a thread can only do one thing at a time. Threads that spawn off asynchronous tasks are still only doing one thing at a time, as the spawned task always executes on another thread. It may be that that thread is a hardware resource for IO, but it is still another thread, and another task attempting to use the same resource will have to wait.
IO Completion Ports use hardware signals that the OS monitors to let applications offload the actual IO details to an OS-level thread. On commodity systems you can have a small number of concurrent IOs occurring, and depending on the hardware and where the actual resource conflicts lie, that number can be one.
From a high level, I've written SQL Server based applications that had to back off on SQL errors of any sort (concurrency, timeout, etc.). The first fault (SQL error) split the task into 10 tasks to attempt concurrently. The second SQL error (double fault) used the .NET Framework's ReaderWriterLock class to force a complete back-off, so the erroring SQL statement would be the only insert/update query executing on the server at that time. Yes, it was a huge penalty hit, but it was necessary to ensure data integrity. These threads normally took the lock as Read, but in the double-fault situation the thread that was going to attempt the final insert/update/delete would take the Write lock and wait for all the other readers to complete before attempting. Of course, any new readers also waited for the Write lock to be released.
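A minimal sketch of just the double-fault part of that pattern, using the newer ReaderWriterLockSlim in place of the original ReaderWriterLock (the ExecuteStatement call is a placeholder, not the original code):

using System;
using System.Threading;

class BackOffPattern
{
    static readonly ReaderWriterLockSlim Gate = new ReaderWriterLockSlim();

    // Placeholder for the real SQL execution.
    static void ExecuteStatement(string sql) { /* ... */ }

    static void RunWithBackOff(string sql)
    {
        try
        {
            // Normal case: many statements run concurrently, each
            // holding only the read side of the gate.
            Gate.EnterReadLock();
            try { ExecuteStatement(sql); }
            finally { Gate.ExitReadLock(); }
        }
        catch (Exception)
        {
            // Double fault: take the write side so this statement
            // runs completely alone, at a large throughput cost.
            Gate.EnterWriteLock();
            try { ExecuteStatement(sql); }
            finally { Gate.ExitWriteLock(); }
        }
    }
}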
|
|
|
|
|
Do you know how thread waits are actually implemented in the kernel? Does the thread actually stop processing instructions - IOW, is the wait hardware based? I've heard the term "spinning" used in relation to threads, so I wonder if a waiting thread simply polls a value and, if it's not the value it wants, loops.
The difficult we do right away...
...the impossible takes slightly longer.
|
|
|
|
|
I mean, as far as I know, a kernel can wait and wake on a number of different conditions, some of them directly interrupt related - as in, your drive finishes fetching and triggers a CPU interrupt, which eventually wakes up a thread to dispatch the waiting data.
Another option is the scheduler puts the waiting thread to sleep, and awakens it on a software condition (such as a mutex being released) as opposed to an interrupt.
When a thread waits, the scheduler does not schedule it for execution. It is effectively "asleep", not spinning in a loop or anything. It does not poll. If it did, 3000 threads would quickly overwhelm the kernel.
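To make that concrete, here's a toy contrast (my own illustration, not kernel code): the blocking wait leaves the thread descheduled at ~0% CPU, while the spin version keeps a core pegged doing nothing useful.

using System;
using System.Threading;

class WaitVsSpin
{
    static volatile bool Ready;
    static readonly ManualResetEventSlim Signal = new ManualResetEventSlim();

    static void Main()
    {
        // Blocking wait: the kernel parks this thread; it consumes
        // essentially no CPU until Set() makes it runnable again.
        var sleeper = new Thread(() => { Signal.Wait(); Console.WriteLine("woken"); });

        // Spin wait: the thread stays scheduled, re-checking a flag
        // and pinning a core at ~100% the whole time.
        var spinner = new Thread(() => { while (!Ready) { } Console.WriteLine("spun"); });

        sleeper.Start();
        spinner.Start();

        Thread.Sleep(2000); // watch Task Manager: one core is pegged
        Signal.Set();
        Ready = true;

        sleeper.Join();
        spinner.Join();
    }
}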
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
Greetings Kind Regards
I have no experience with threads. May I please posit an inquiry as to their use for one of my current projects? In particular, I am currently running console-level test code which runs autonomously, needing no user interaction, and is estimated to run for several days. It is of the form below.
test_0();
test_1();
test_2();
...
The calls do not depend on any that come before, and share no resource other than the CPU. They merely perform integer arithmetic. They are called in sequence just as shown. No call to test_x is performed until the one prior is complete. So my inquiry is: would the overall test complete sooner if each of the calls were in its own thread?
Kind Regards
|
|
|
|
|
Up to a point, yes - that point being when the number of threads is roughly equal to the number of virtual or logical "CPU cores" you have (hyperthreading effectively doubles that number on many CPUs, because each core - each P-core, at least - can run two threads).
Any threads created beyond that are no longer truly concurrent, but they may still make sense to create in some cases if they'll spend time waiting (I/O bound) - which is not your current situation, since yours is all CPU bound.
Bottom line: if your chip has hyperthreading and 4 cores, it can run 8 threads concurrently. After that it is divvying up one or more cores to do multiple tasks, which is extra overhead without extra benefit - again, in your situation.
What I would do is use System.Threading.ThreadPool and simply call QueueUserWorkItem to run each of those functions, and let the ThreadPool figure out when to schedule them. Something like the sketch below.
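A minimal sketch, assuming your test_N functions are parameterless methods (the CountdownEvent is just one way to know when they've all finished):

using System;
using System.Threading;

class ParallelTests
{
    static void Main()
    {
        // The tests to run; order no longer matters since they
        // don't depend on each other.
        Action[] tests = { test_0, test_1, test_2 /*, ... */ };

        using (var done = new CountdownEvent(tests.Length))
        {
            foreach (var test in tests)
            {
                var t = test; // capture for the closure
                ThreadPool.QueueUserWorkItem(_ => { t(); done.Signal(); });
            }
            done.Wait(); // block until every test has completed
        }
    }

    // Stand-ins for your real tests.
    static void test_0() { }
    static void test_1() { }
    static void test_2() { }
}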
Check out my IoT graphics library here:
https://honeythecodewitch.com/gfx
And my IoT UI/User Experience library here:
https://honeythecodewitch.com/uix
|
|
|
|
|
I had the same question about CPU usage, as seen through the Task Manager window, a few years back, but didn't get really curious about it until I got a new computer running Windows 10. Because of the number of cores, mainly, and the uptick in the amount of memory I had at my disposal, I began to work on this new machine with Task Manager always open. This new view was quite an upgrade, incidentally; the more cores and the more memory, the more windows-in-windows ... somewhat of a phenomenon. I recall, back in the days of using that Borland C compiler, being able to watch a debugging session (through WYSIWYG, perhaps/even!) after having set a few stops. All those glittering gold flashes. Very entertaining.
Anyway, it wasn't until I discovered Task Scheduler that my eyes were really opened! And that addresses what I think is the issue: background processes "ready" to dart into action. Notice all the tasks there in TS, primed to go off.
Many other shadow processes can be seen scratching their sigils on the various system folders. There's C:\Windows\temp, for one - always a side-show of files of unknown origin entraining themselves on the retina. There's AppData\Local\tmp ... for a good time, tune in to that channel. What I found useful is just to delete the contents of these folders now and again.
Truth be told, my start-up time for Windows 10 went instantly from more than five minutes to a blazing minute or two after realizing that I could do that willy-nilly and get away with disabling some of the tasks scheduled by ad hoc installers - after being watchful for a while there, too.
[EDIT]:
Let me also add that under Computer Management > Services, stopping questionable processes - either by switching their Startup Type to Manual (from Running) or to Disabled - doesn't always show itself accurately. For instance, when a service is set to Manual it can still start up on its own (via some unknown process). And not only that, but with the Computer Management console open, the Startup Type - switched to Running at the behest of the unknown process - will even ONLY show Manual.
[END EDIT]
So I guess my answer to the "why" is "Because it's there"
modified 6-Feb-24 19:15pm.
|
|
|
|
|
threads and cores, gentlemen, threads and cores /penguin
------------------------------------------------
If you say that getting the money
is the most important thing
You will spend your life
completely wasting your time
You will be doing things
you don't like doing
In order to go on living
That is, to go on doing things
you don't like doing
Which is stupid.
|
|
|
|
|
I might have missed the question, but if it's "what does N% of time in Task Manager mean for a task?"...
This is just a random (semi-educated) guess: it's the amount of time, or the number of executions, that the task is allocated within a set window of time.
Again, oversimplified, but a single CPU core only runs one thing at a time. There is a scheduler that orders the many things waiting to execute. So over 1 second it can go: oh, task 123 has done 103456 executions; then some maths to say that was .2 seconds, or 20% of 1 second.
What unit of time it uses, I am not sure. It could be milliseconds, or whatever the CPU maker decides to hand back to the Windows operating system.
Add on multiple cores, and in the main Task Manager view, 100% is the total across all the cores (see the sketch below).
If instead you are saying you are running performance tests and not hitting over 60, change the performance test that you are running.
The program could have 100 threads, but only be able to run them across 2 cores instead of 4.
"i7" alone does not fully indicate what the CPU is.
Also, adding to the other comments on here: the GPU can be what's limiting things and what the CPU is waiting on.
Idle is not bad. Idle is good; it means that the CPU has more headroom than the application can use. Or the application needs changing to utilise more - and parallel programming is hard.
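For what it's worth, here's a rough sketch of the kind of maths I mean - sampling a process's CPU time over a window and dividing by window length times core count. This mirrors the idea, not Task Manager's actual implementation:

using System;
using System.Diagnostics;
using System.Threading;

class CpuPercent
{
    static void Main()
    {
        // This process is mostly sleeping, so expect a figure near
        // zero; point it at a busy PID (Process.GetProcessById) for
        // more interesting numbers.
        var proc = Process.GetCurrentProcess();

        TimeSpan before = proc.TotalProcessorTime;
        Thread.Sleep(1000); // the sampling window
        proc.Refresh();
        TimeSpan after = proc.TotalProcessorTime;

        // CPU time used, divided by window length times core count,
        // gives the Task Manager style "% of total" figure.
        double percent = (after - before).TotalMilliseconds
                         / (1000.0 * Environment.ProcessorCount) * 100.0;

        Console.WriteLine($"{percent:F1}% of total CPU");
    }
}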
|
|
|
|
|
If you want to see what happens when the CPU runs at 100%, may I recommend the following tool:
Free Stress Test Tool HeavyLoad | JAM Software[^]
I've used it for a number of reasons, the most recent being to help answer the question "have I installed the replacement heat pipe and fan in my laptop properly, and has it made things better?"
|
|
|
|
|
The Task Manager inserts a task of its own, with a stub that includes a known computation - the performance percentage is the relationship between the CPU time the stub is allocated vs. that of the other competing processes.
There are a number of reasons why you might see less than 100% CPU. One is concurrency: not all tasks can proceed in parallel, or the software might not split the work up effectively. Sometimes a task needs to wait for something else to complete, and if that happens a lot, it drags down CPU utilization.
Some tasks are IO or network bound. In those cases the CPU waits for the network or peripheral to respond which also affects the CPU utilization.
Sometimes it's just poorly written software.
Hope that helps.
|
|
|
|
|
Greetings Kind Regards
In the spirit of possible helpfulness, I present a recent discovery of a fast means of typing, id est stenography.
Home | Open Steno Project[^]
Also utilized by an actual programmer.
STANOGRAPHER[^]
|
|
|
|