The simplest way is to use an ffmpeg wrapper or a mencoder wrapper (a web search will turn up many more). You can of course use the DirectShow tools directly in C#, but that is not easy.
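To give a rough idea of what such a wrapper does: it is mostly code that builds an ffmpeg command line and runs it as a child process. Here is a minimal sketch in Python, kept short for illustration (in C# the same idea would go through `System.Diagnostics.Process`). The names `build_ffmpeg_command` and `convert` are made up for this example, and it assumes an `ffmpeg` executable is on your PATH.

```python
import shutil
import subprocess


def build_ffmpeg_command(src, dst, copy_streams=False):
    """Return the argument list for a basic ffmpeg conversion (hypothetical helper)."""
    cmd = ["ffmpeg", "-y", "-i", src]  # -y: overwrite the output file without asking
    if copy_streams:
        # Copy the already-encoded streams into the new container, no re-encoding.
        cmd += ["-c", "copy"]
    cmd.append(dst)  # ffmpeg infers the output container from the file extension
    return cmd


def convert(src, dst, copy_streams=False):
    """Run ffmpeg as a child process; raises CalledProcessError if ffmpeg fails."""
    if shutil.which("ffmpeg") is None:  # assumption: ffmpeg is installed separately
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_ffmpeg_command(src, dst, copy_streams), check=True)
```

Calling `convert("input.avi", "output.mp4")` would let ffmpeg pick its default codecs for the target container, while `copy_streams=True` only works when the target container can hold the source streams. Real wrappers add progress parsing, codec options and error handling on top of this.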
The simplest answer to your theoretical question: a raw video stream consists of image frames, like a film strip. Every frame is like a standalone BMP file, with no compression and every pixel present. Compression codecs either compress each frame on its own, as MJPEG does, or they also exploit the differences between consecutive frames. The compression itself is another topic. Video files also involve containers: some containers can hold certain compression types, others can hold different ones. When converting between formats, you can in some cases copy the encoded data straight from one container to another, but in most cases you have to rebuild the original raw stream piece by piece by decoding the input file, and then compress it with the output format's codec. In the case of a non-linear transform, you can process these pieces in parallel.
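As a toy illustration of the two compression approaches, the Python sketch below compares compressing a frame on its own with compressing only its difference from the previous frame. Everything in it is an assumption made for the demo: the frames are tiny synthetic 24-bit RGB buffers, `zlib` stands in for a real video codec, and the helper names are invented.

```python
import random
import zlib

WIDTH, HEIGHT, BYTES_PER_PIXEL = 64, 48, 3  # assumed tiny 24-bit RGB frame


def raw_frame_size(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size in bytes of one uncompressed frame: every pixel is stored."""
    return width * height * bytes_per_pixel


def frame_delta(prev, cur):
    """Byte-wise difference between two frames of equal size (mod 256)."""
    return bytes((c - p) % 256 for p, c in zip(prev, cur))


def apply_delta(prev, delta):
    """Rebuild the current frame from the previous frame and the delta."""
    return bytes((p + d) % 256 for p, d in zip(prev, delta))


# Frame 1 is noise, so compressing it on its own gains almost nothing.
rng = random.Random(0)
frame1 = bytes(rng.randrange(256) for _ in range(raw_frame_size(WIDTH, HEIGHT)))

# Frame 2 is the same picture with the first 10 pixels (30 bytes) set to black.
frame2 = bytearray(frame1)
frame2[0:30] = bytes(30)
frame2 = bytes(frame2)

intra_size = len(zlib.compress(frame2))   # frame compressed on its own
delta = frame_delta(frame1, frame2)       # mostly zeros: little changed
inter_size = len(zlib.compress(delta))    # only the change is compressed

print("raw frame bytes:      ", len(frame2))
print("compressed standalone:", intra_size)
print("compressed delta:     ", inter_size)
```

Because almost nothing changes between the two frames, the delta is nearly all zeros and compresses far better than the frame itself. Real codecs use motion estimation and lossy transforms rather than a plain byte difference, so treat this only as a picture of the idea. For scale, `raw_frame_size(1920, 1080)` is 6,220,800 bytes, which is why raw video is rarely stored as-is.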