Over the last several weeks I have been involved in a project (for Airgain, Inc.) in which we need to produce the highest possible quality streaming video using specific encodings and formats. In the process I have discovered (and corrected) several minor problems in the vlc 0.9.8a player that contribute to distorted or otherwise "unreliable" playback, as well as some problems in the live555 library with respect to the proper use of 'select'.
After applying our fixes, vlc is able to flawlessly play streaming video via RTSP [using UDP]. On a related note, we also had to fix problems in the open source streaming media server, as well as in the live555 library, which made finding (and correcting) these problems all the more difficult.
Fix #1: frame buffer size [verified still present in the latest 'git' repository fetch]
In the modules/demux/live555.cpp file, function 'SessionsSetup', the track buffer is ALWAYS pre-allocated to 64k bytes [this value is actually hard-coded]. However, it appears that the intent was to determine how large the buffer SHOULD be by calculating the correct size in 'i_buffer'. The patch simply uses 'i_buffer's value in lieu of 65536, as follows:
// Airgain begin - increase video buffer to 'i_buffer' bytes (see above)
// tk->i_buffer = 65536;
// tk->p_buffer = (uint8_t *)malloc( 65536 );
tk->i_buffer = i_buffer;
tk->p_buffer = (uint8_t *)malloc( i_buffer );
// Airgain end
Without the patch applied, each frame that exceeds the last allocated buffer size causes dropped packets and corresponding video distortion [sometimes bad enough to crash the player]. Using errors [that lose data] to drive memory allocation is bad practice anyway, so allocating 'i_buffer' bytes rather than 65536 fixes this problem for even the highest bandwidth HD movies that we've tried so far.
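For anyone who wants to try the sizing idea outside of vlc, here is a minimal standalone sketch. It is not the vlc code: 'track_buf_t', 'track_buf_init', 'track_buf_free' and 'TRACK_BUF_MIN' are hypothetical names of mine, and both the 64k floor and the malloc check are my own assumptions on top of the patch above (the patch itself just uses 'i_buffer').

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-track state; NOT the real vlc struct. */
typedef struct {
    uint8_t *p_buffer;
    size_t   i_buffer;
} track_buf_t;

/* Assumption: keep the old hard-coded size as a lower bound. */
#define TRACK_BUF_MIN ((size_t)65536)

/* Allocate the track buffer from the computed size, never below the floor.
 * Returns 0 on success, -1 if the allocation fails. */
static int track_buf_init(track_buf_t *tk, size_t i_computed)
{
    assert(tk != NULL);

    size_t i_size = i_computed > TRACK_BUF_MIN ? i_computed : TRACK_BUF_MIN;

    tk->p_buffer = malloc(i_size);
    if (tk->p_buffer == NULL) {
        tk->i_buffer = 0;
        return -1;
    }
    tk->i_buffer = i_size;
    return 0;
}

/* Release the buffer and reset the bookkeeping. */
static void track_buf_free(track_buf_t *tk)
{
    assert(tk != NULL);
    free(tk->p_buffer);
    tk->p_buffer = NULL;
    tk->i_buffer = 0;
}
```

The point is only that the size is decided once, up front, from the computed value, instead of being discovered later through truncated frames.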
Fix #2: correct I-frame timing
[NOTE: in the 'git' source tree from today I noticed some effort that may in some cases alleviate this problem, specifically Getcc and the information it collects, though it's likely the base cause of the problem goes uncorrected without the proposed patches]
This was very difficult to find. It appears that when streaming H.264 video contains B frames, if the content is 'just right' you could end up applying a B frame's timing (or even a preceding P frame's timing) to an I frame, causing 'shudders' in the playback. This is infrequent, although I was able to prove that it was actually happening through the use of debug messages.
Correcting the problem required patching two files: include/vlc_block.h and modules/packetizer/h264.c.
In modules/packetizer/h264.c the function ParseNALBlock attempts to correct a problem where dts and pts are indicated as 'zero' for an I frame, by capturing the frame block's FIRST dts and pts and applying them to everything in the frame block. Unfortunately, H.264 allows you to have multiple timings within the same block, so this method can't always work. Instead, a patch to block_ChainExtract corrects this problem by finding the first non-zero dts/pts within the same chain and using THAT instead. So, if the first entry of an I frame has a dts/pts of zero, the next entry probably won't, and so you use the NEXT entry's dts/pts value via the patch (which will be the correct value).
The function block_ChainExtract (include/vlc_block.h) is patched as follows:
static size_t block_ChainExtract( block_t *p_list, void *p_data, size_t i_max )
{
    size_t i_total = 0;
    uint8_t *p = (uint8_t*)p_data;
    // Airgain begin
    block_t *p_list0 = p_list;
    // Airgain end
    while( p_list && i_max )
    {
        size_t i_copy = __MIN( i_max, p_list->i_buffer );
        memcpy( p, p_list->p_buffer, i_copy );
        i_max -= i_copy;
        i_total += i_copy;
        p += i_copy;
        // Airgain begin
        if(!p_list0->i_dts && !p_list0->i_pts)
        {
            p_list0->i_dts = p_list->i_dts;
            p_list0->i_pts = p_list->i_pts;
        }
        // Airgain end
        p_list = p_list->p_next;
    }
    return i_total;
}
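To show just the timestamp part of that patch in isolation, here is a small self-contained sketch. 'demo_block_t' and 'demo_chain_fix_head_ts' are hypothetical names of mine, and the struct is a cut-down stand-in that keeps only the fields needed (it is not vlc's block_t). One difference to be aware of: the real patch only looks at blocks that are actually copied (it stops when 'i_max' runs out), while this sketch walks the whole chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced block type: only the fields this sketch needs. */
typedef struct demo_block_t {
    struct demo_block_t *p_next;
    int64_t i_dts;
    int64_t i_pts;
} demo_block_t;

/* If the head of the chain has dts == 0 and pts == 0, copy the timing of
 * the first later block that has a non-zero dts or pts.
 * Returns 1 if the head was updated, 0 otherwise. */
static int demo_chain_fix_head_ts(demo_block_t *p_head)
{
    assert(p_head != NULL);

    /* The head already carries timing: leave it alone. */
    if (p_head->i_dts != 0 || p_head->i_pts != 0)
        return 0;

    for (demo_block_t *p = p_head->p_next; p != NULL; p = p->p_next) {
        if (p->i_dts != 0 || p->i_pts != 0) {
            p_head->i_dts = p->i_dts;
            p_head->i_pts = p->i_pts;
            return 1;
        }
    }
    return 0; /* no block in the chain had usable timing */
}
```

With a chain of three blocks where only the last one has timing (say dts 9000, pts 12000), the head ends up with dts 9000 and pts 12000, and a second call leaves it unchanged.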
Then, a corresponding set of patches in modules/packetizer/h264.c undoes the workaround that uses 'i_frame_dts' and 'i_frame_pts' in lieu of the actual dts/pts, as follows:
(in ParseNALBlock, near the end)
    /* Append the block */
    if( p_frag )
        block_ChainAppend( &p_sys->p_frame, p_frag );
    *pb_used_ts = false;
    if( p_sys->i_frame_dts < 0 && p_sys->i_frame_pts < 0 )
    {
        // Airgain begin - if B frame or i_frag_pts is zero, do NOT assign frame [pd]ts !
        if(i_frag_pts && p_sys->slice.i_frame_type !=
        {
        // Airgain end
            p_sys->i_frame_dts = i_frag_dts;
            p_sys->i_frame_pts = i_frag_pts;
            *pb_used_ts = true;
        // Airgain begin
        }
        // Airgain end
    }
    return p_pic;
}
(in OutputPicture, near the end)
    else
    {
        p_pic = block_ChainGather( p_sys->p_frame );
    }
    // Airgain begin - do not assign i_[pd]ts from frame [pd]ts unless mine are <= 0
    if(p_pic->i_dts <= 0 && p_pic->i_pts <= 0)
    {
    // Airgain end
        p_pic->i_dts = p_sys->i_frame_dts;
        p_pic->i_pts = p_sys->i_frame_pts;
    // Airgain begin
    }
    // Airgain end
    p_pic->i_length = 0; /* FIXME */
    p_pic->i_flags |= p_sys->slice.i_frame_type;
    p_sys->slice.i_frame_type = 0;
    p_sys->p_frame = NULL;
    p_sys->i_frame_dts = -1;
    p_sys->i_frame_pts = -1;
    p_sys->b_slice = false;
    return p_pic;
}
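The two h264.c changes boil down to two small decisions, which can be written out on their own as below. This is only a sketch of the logic as described in the patch comments, not vlc code: 'demo_ts_t', 'demo_should_capture' and 'demo_output_ts' are hypothetical names, and the 'b_is_b_frame' flag is my stand-in for the frame-type test in ParseNALBlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair of timestamps; negative means "not captured yet". */
typedef struct {
    int64_t i_dts;
    int64_t i_pts;
} demo_ts_t;

/* ParseNALBlock side: capture the fragment's timing as the frame timing
 * only if nothing has been captured yet, the fragment pts is non-zero,
 * and the frame is not a B frame. */
static bool demo_should_capture(demo_ts_t frame, int64_t i_frag_pts,
                                bool b_is_b_frame)
{
    return frame.i_dts < 0 && frame.i_pts < 0
        && i_frag_pts != 0 && !b_is_b_frame;
}

/* OutputPicture side: fall back to the captured frame timing only when
 * the gathered picture has no usable timing of its own. */
static demo_ts_t demo_output_ts(demo_ts_t pic, demo_ts_t frame)
{
    if (pic.i_dts <= 0 && pic.i_pts <= 0)
        return frame;
    return pic;
}
```

In other words, the picture's own dts/pts (as repaired by the block_ChainExtract patch) wins whenever it is present, and the captured frame timing is only a fallback.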
Fix #3: (X11 only) - using XFlush + usleep to send commands to XServer and thread-switch _before_ XSync
One observation that I made regarding X11 video performance is that OCCASIONALLY you would end up with delay times around the 'XSync' call that exceeded the frame to frame timing by several times over (as long as 100msec at times). Additionally there was an accumulation of memory footprint for the X Server. It appeared as if the calls to the X Server were being made in such a way that normal background tasks were being pre-empted until "everything backed up". The problem manifested itself as infrequent "stops" and frame drops in high bandwidth video playback, and occasionally as "tears" where part of a frame was rendered on one vertical scan interval, and the rest on subsequent scan interval(s).
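If you want to check for this kind of delay yourself, one way is to time the call against the frame interval. The following is a rough sketch under my own assumptions and is not what vlc does: 'demo_now_us', 'demo_call_overran', 'demo_slow_call' and 'demo_fast_call' are hypothetical helpers, and the two stand-in calls take the place of the real display call so that the sketch does not need an X server.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock reading in microseconds. */
static int64_t demo_now_us(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Stand-in for a display call that stalls: sleeps for about 20 ms. */
static void demo_slow_call(void *arg)
{
    (void)arg;
    struct timespec req = { 0, 20 * 1000 * 1000 };
    nanosleep(&req, NULL);
}

/* Stand-in for a display call that returns immediately. */
static void demo_fast_call(void *arg)
{
    (void)arg;
}

/* Run fn(arg), store the elapsed time in *pi_elapsed_us (if non-NULL),
 * and return 1 if it took longer than i_budget_us, 0 otherwise. */
static int demo_call_overran(void (*fn)(void *), void *arg,
                             int64_t i_budget_us, int64_t *pi_elapsed_us)
{
    int64_t i_start = demo_now_us();
    fn(arg);
    int64_t i_elapsed = demo_now_us() - i_start;

    if (pi_elapsed_us != NULL)
        *pi_elapsed_us = i_elapsed;
    return i_elapsed > i_budget_us;
}
```

For 30 frames per second the budget would be roughly 1000000 / 30 microseconds; a 100 msec stall is about three frame intervals at that rate.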
[I had an earlier post with respect to this patch but I'm repeating it again anyway as part of this, since it's relevant and not in the latest 'git' repository fetch]
In the file modules/video_output/x11/xcommon.c, function 'DisplayVideo', a call to XSync correctly follows all of the calls that prepare (but do not necessarily send) a set of commands for the X server. Unfortunately, the effects of making this call in this way include the occasional "garbage collection" or other 'interruptive' task that causes dropped frames to occur when they otherwise should not. The proposed well-tested patch, which attempts to work around an apparent problem in the X11 library, is as follows:
(modules/video_output/x11/xcommon.c, function DisplayVideo, near the end)
    }
    // Airgain begin - XFlush to send commands, usleep to force context switch.
    // This is likely to help client/server interaction
    XFlush( p_vout->p_sys->p_display ); // force commands to be sent
    usleep(500); // then switch contexts before waiting
    // Airgain end
    /* Make sure the command is sent now - do NOT use XFlush !*/
    XSync( p_vout->p_sys->p_display, False );
    vlc_mutex_unlock( &p_vout->p_sys->lock );
}
Please implement these patches (or their equivalent functionality) at the earliest opportunity. Thanks.