Post by markfm » 17 Dec 2004 01:38
At "Decode" you have brought the audio and video back to baseband -- decompressed, at whatever frame rate/resolution/sample rate is available for the information. At a simple level, you may be getting video coming in at 640x480 resolution, 30 frames per second.
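To get a feel for how much data "baseband" video is, here is a rough back-of-the-envelope calculation in Python. The 12 bits per pixel figure is my assumption (it corresponds to a common YUV 4:2:0 layout); the decoder in your setup may hand over a different pixel format, so treat the result as an order of magnitude only.

```python
def raw_video_bits_per_second(width, height, fps, bits_per_pixel=12):
    """Uncompressed video data rate in bits per second.

    bits_per_pixel=12 is an assumption (YUV 4:2:0); RGB24 would be 24.
    """
    return width * height * fps * bits_per_pixel


# The 640x480, 30 fps example from the text.
rate = raw_video_bits_per_second(640, 480, 30)
print(rate, "bits per second of raw video")
```

Under that assumption the decoded picture runs at roughly 110 Mbit/s, which is why the later Encode stage matters so much.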
Filter is where you work on the data. In video you may want to overlay a logo graphic, insert a timestamp, change the resolution, or output at only 15 frames per second (so you only "wake up" and run the encode at 15 fps, even though the "source" baseband video updates the picture at 30 fps). In audio you can apply equalization and similar processing -- that is, filter the baseband signal.
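The 30 fps to 15 fps case can be sketched in a few lines of Python. This is only a toy model of frame dropping, not how the software actually implements it; `decimate` and its parameters are names I made up for illustration.

```python
def decimate(frames, src_fps, dst_fps):
    """Drop frames so that roughly dst_fps out of every src_fps survive.

    Hypothetical helper: an accumulator decides which frames to pass on,
    so it also works for ratios that are not a clean 2:1.
    """
    if dst_fps <= 0 or dst_fps > src_fps:
        raise ValueError("dst_fps must be in the range 1..src_fps")
    kept = []
    acc = 0
    for frame in frames:
        acc += dst_fps
        if acc >= src_fps:
            acc -= src_fps
            kept.append(frame)  # this is where the encoder would "wake up"
    return kept


one_second = list(range(30))           # frame numbers 0..29 at 30 fps
halved = decimate(one_second, 30, 15)  # every second frame survives
print(len(halved), "frames kept")
```

The encoder then only ever sees the frames in `halved`, even though the source keeps updating at the full rate.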
Encode is just that -- you encode the baseband signals, audio and video separately, running each through a compression engine. For instance, mpeg4, DIV3, or WMV2 for video; mpga or mp3 for audio. At this stage you select the nominal bits per second that the different encoders (CODECs) operate at, so that the output encoded video may, as an example, be 1024 kbps, while the audio is output at 192 kbps.
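The bit rates you pick here translate directly into output size. A small Python sketch of the arithmetic, assuming 1 kbps means 1000 bits per second and ignoring whatever overhead the container adds later (both are my simplifications):

```python
def estimate_size_bytes(video_kbps, audio_kbps, seconds):
    """Approximate payload size for the given nominal bit rates.

    Assumes 1 kbps = 1000 bits/s and no container overhead.
    """
    total_bits = (video_kbps + audio_kbps) * 1000 * seconds
    return total_bits // 8


# The example rates from the text: 1024 kbps video plus 192 kbps audio.
per_minute = estimate_size_bytes(1024, 192, 60)
print(per_minute, "bytes per minute")
```

So the example settings come to about 9 MB per minute of material, before the container adds its own bytes.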
Mux is mixing the audio and video together into a container format. If you use the wxWindows interface, the "Encapsulation method" is where you select which multiplexer to use. At the Encode stage the software is just putting out streams of compressed data, in separate pipes of encoded audio and video. The Mux stage is where you obey an externally recognized "file" format -- the ordering of the blocks of audio and video, the size of the chunks of data, and the location and contents of the various format fields that are defined for the different container types (MPEG TS, ASF, ...).
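As a very rough picture of what "mixing into a container" means, here is a Python toy that interleaves two separate packet streams by timestamp. It is not any real container format (MPEG TS and ASF define far more than ordering); `Packet` and `mux` are hypothetical names, and the timestamps are made up.

```python
import heapq
from typing import NamedTuple


class Packet(NamedTuple):
    pts_ms: int     # presentation time in milliseconds (illustrative)
    stream: str     # "video" or "audio"
    payload: bytes  # compressed data from the encoder


def mux(video, audio):
    """Interleave two already time-ordered packet streams by timestamp."""
    return list(heapq.merge(video, audio, key=lambda p: p.pts_ms))


video = [Packet(0, "video", b"v0"), Packet(33, "video", b"v1"), Packet(66, "video", b"v2")]
audio = [Packet(10, "audio", b"a0"), Packet(34, "audio", b"a1"), Packet(58, "audio", b"a2")]

for packet in mux(video, audio):
    print(packet.pts_ms, packet.stream)
```

A real multiplexer additionally wraps each chunk in the headers and fields its format requires; the interleaving is just the part that is easiest to show.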
Depending on what you wish to do, you can actually bypass some of the stages. If your source is a file and you are happy with the underlying CODECs, the bit rates, and the compression methods, you can go right from Demux to Mux; in that case the compressed output of the demuxer goes immediately into the multiplexer (the "Encapsulation method"), which repackages it for Output. When you finally wish to Display the video locally, you Decode and Filter (you can add things like a logo graphic locally), but you don't then have to Encode and Mux again.
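The three paths described here can be written down as a small lookup in Python. The goal names ("transcode", "remux", "display") are labels I picked for illustration; the stage names are the ones used in this post.

```python
def plan(goal):
    """Return the stages a stream passes through for a given goal.

    The goal strings are hypothetical labels, not settings in the software.
    """
    if goal == "transcode":   # change codec, bit rate, or add filters
        return ["Demux", "Decode", "Filter", "Encode", "Mux", "Output"]
    if goal == "remux":       # keep the compressed data, change the container
        return ["Demux", "Mux", "Output"]
    if goal == "display":     # watch locally, nothing is re-encoded
        return ["Demux", "Decode", "Filter", "Display"]
    raise ValueError("unknown goal: " + repr(goal))


for goal in ("transcode", "remux", "display"):
    print(goal + ":", " -> ".join(plan(goal)))
```

The point of the short paths is that skipping Decode and Encode avoids both the CPU cost and the quality loss of compressing a second time.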
Apologies if my description is a bit cryptic. I'm not a hardcore developer type; I've just been playing with the software, and this is how I explain it to my peers. The real devs may describe it completely differently. It's just a model that I use to understand the sequence of events, the "20,000-foot" view.