How high a resolution do you need? You certainly need nothing better than your eyes can see!
In the days of analog silver photography, it was established that a normal eye could distinguish line pairs - one black line, one white - at 1/1500 of the viewing distance. At 3 m, you could distinguish a pattern of 1 mm white and 1 mm black lines. So with perfectly centered lines, pixels of 1 mm square would do. A 4K (3840 pixel) screen 3.84 m wide would have pixels right at the limit of the resolution of your eyes.
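The arithmetic behind this rule fits in a few lines. A minimal sketch (the helper names are mine, not from any standard):

```python
# Arithmetic for the 1/1500-of-viewing-distance rule described above.
def line_pair_width_m(viewing_distance_m):
    # Smallest black+white line pair a normal eye resolves, per the rule.
    return viewing_distance_m / 1500.0

def matching_4k_width_m(viewing_distance_m, horizontal_pixels=3840):
    # Screen width at which 4K pixels exactly hit the eye's limit.
    pixel_m = line_pair_width_m(viewing_distance_m) / 2.0
    return horizontal_pixels * pixel_m

print(matching_4k_width_m(3.0))  # ~3.84 m at a 3 m viewing distance
print(matching_4k_width_m(1.0))  # ~1.28 m at 1 m
```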
If the line pattern were not perfectly aligned and centered on the pixels, you would not see them as sharp lines. At half a pixel of displacement, everything would be 50% grey! In the early days, it was customary to assume that a line pair on average required 3 (rather than 2) scan lines to be properly displayed.
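The half-pixel effect is easy to check numerically. A toy simulation, box-sampling a 1 mm line pattern with 1 mm wide pixels:

```python
import numpy as np

SUB = 1000  # subsamples per mm
# 1 mm white / 1 mm black line pattern, a few periods long
pattern = np.tile(np.concatenate([np.ones(SUB), np.zeros(SUB)]), 10)

def sample_pixels(offset_mm, n_pixels=4):
    # Average the pattern over n 1 mm-wide pixels, shifted by offset_mm.
    start = int(offset_mm * SUB)
    chunk = pattern[start:start + n_pixels * SUB].reshape(n_pixels, SUB)
    return chunk.mean(axis=1)

print(sample_pixels(0.0))  # aligned: alternating 1.0 / 0.0, crisp lines
print(sample_pixels(0.5))  # half-pixel shift: every pixel 0.5, uniform grey
```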
On the other hand: for moving objects (including moving line patterns), when a white point moves gradually out of a pixel and over to the neighbour pixel, the original pixel gets gradually darker and the new one brighter. The brain interprets this as a more or less continuous movement between the two neighbouring pixels, "emulating" a higher resolution. If you freeze a video on a single frame, the resolution always appears much lower than when the image is moving. For practical purposes, this more than makes up for the 3-scan-lines-per-line-pair rule.
In a still picture, silver grains were irregularly located, so identifying specific grains at the edge of your eye's resolution was somewhat random. On LCD screens, pixels have fixed positions and are aligned in regular rows, so they are more visible. But again: with moving images, where one pixel fades out as the neighbouring one fades in, you won't see a white pixel snapping over in a single jump; the sliding motion covers up the strict alignment of strictly square pixels.
There are other considerations, though. At a 3 m viewing distance, a 4K screen no wider than 3.84 m has a resolution matching your eyes. If you move in to a 1 m viewing distance, then a 4K screen has a poorer resolution than your eyes if it is more than 1.28 m wide. Today we sit a lot closer to the screen than we did a generation ago (and movies are shot with wide-angle lenses to match, for perspective). Yet... a 1.28 m (about 4 ft) wide 4K screen at 1 m (40 in) distance matches the resolution of the eyes of a "standard" (young adult) person. I think we are close enough...
What about resampling? Scaling up plain HD material to 4K, when you've got a 4K screen?
First: if any part of the image is an even, uniform color/brightness, it doesn't matter whether it is a single pixel, four quarter-size pixels of the same color, or nine ninth-size pixels of the same color. For smooth surfaces, resampling to a higher resolution is not a big issue.
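The flat-surface case is trivial to verify: nearest-neighbour upscaling of a uniform patch reproduces it exactly. A tiny sketch:

```python
import numpy as np

flat = np.full((2, 2), 0.7)           # a uniform 2x2 patch
up2 = np.kron(flat, np.ones((2, 2)))  # each pixel becomes four quarter-size pixels
up3 = np.kron(flat, np.ones((3, 3)))  # or nine ninth-size pixels

print(np.array_equal(up2, np.full((4, 4), 0.7)))  # True: same uniform surface
print(np.array_equal(up3, np.full((6, 6), 0.7)))  # True
```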
Modern video compression methods are in a sense analog, not digital. They do not compress pixel values directly, but see them as points on curves, or rather 3D surfaces, trying to do a curve/surface fit to those points. When you do this on small tiles of the image, it is surprisingly successful! What is compressed is the set of coefficients for the mathematical (continuous) functions that generate these surfaces. When unpacking, you in principle regenerate the continuous surface from the coefficients and sample it at whatever resolution you require. If the surface perfectly matches the original (analog) image, any display resolution is valid. If there are slight variations across the tile, those 2x2 or 3x3 pixels you generate for a single original one will vary slightly, according to the mathematical function, giving a smoother surface together with the neighbouring pixels.
For the surface to be reasonably "correct" you may need many coefficients for a high-degree mathematical function, in particular if the tile covers a sharp edge. Good encoders know how to manage their "bit budget" so that few bits are wasted on even surfaces, leaving more for sharp edges etc. For the curve/surface fitting to match the raw image samples properly to coefficients, a sufficiently high resolution is required in the raw image; but since the encoded data are coefficients for continuous functions, this doesn't dictate the resolution after unpacking.
So properly encoded material, even though presented as plain HD, can be resampled to 4K with an image quality very close to what native 4K material would provide. In principle, the (continuous) mathematical functions (re)generating the surface should be the same; whether the display unit samples them at 2K or 4K should be rather irrelevant.
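A toy 1-D illustration of that principle (not any real codec's transform; `resample_tile` is a made-up helper): treat a tile's DCT coefficients as a continuous cosine series, then sample that same series on a finer grid by zero-padding the coefficients and inverting at a longer length.

```python
import numpy as np
from scipy.fft import dct, idct

def resample_tile(x, factor):
    """Upsample a 1-D tile by evaluating its cosine series on a finer grid.

    A toy stand-in for the idea above: the DCT coefficients describe a
    continuous cosine surface; zero-padding them and inverting at a longer
    length samples that same surface more densely.
    """
    N = len(x)
    M = factor * N
    coef = dct(x, norm='ortho')
    padded = np.zeros(M)
    padded[:N] = coef * np.sqrt(M / N)  # rescale for the longer transform
    return idct(padded, norm='ortho')

tile = np.array([0.2, 0.3, 0.5, 0.9, 0.8, 0.4, 0.3, 0.2])  # one 8-sample "tile"
up = resample_tile(tile, 2)  # 16 samples of the same underlying surface
```

The upsampled tile preserves the average brightness, and a flat tile comes back exactly flat, while varying tiles get smoothly interpolated intermediate values rather than blocky duplicates.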
If the encoder needs a 4K raw image to generate the high-order coefficients for the surface functions: go ahead with it! If that leads to a TV signal pretending to be a 2K image, but a 4K decoder looks at the high-order coefficients and decides to make slight differences between each of the 2x2 pixels that would have been a single one in a 2K image, that is just the way it should be. "2K should be enough for anybody". Or at least 4K, with high-quality encoding. 8K is just a show-off; it goes way beyond your eye's resolution.
And then look at those smartphone screens: they certainly go way beyond your eye's resolution!