Using GPU Decoding On Multiple Simultaneous Videos
Posted: 11 Feb 2014 02:39
Hi all, I am having some issues with GPU decoding. I have a workstation that can handle 8 simultaneous decodes in eight separate instances of VLC on the CPU alone, but that maxes the CPU out at around 90%, so I am trying to offload the work to the GPU instead.

Hardware: a quad-core Xeon E5606 @ 2.13 GHz with 48 GB RAM, running Win7 64-bit, with two AMD FirePro W7000s. The cards drive a 2x2 monitor pixel space at 1920x1080 each, with the top two monitors dedicated to video.

When I run all eight streams in separate VLC instances (a mosaic just makes it worse), it skips frames like crazy; some streams get 1-2 FPS if I'm lucky, while others seem to vary. With six instances there are no issues at all: everything plays smooth as butter without any hiccups, and monitoring CPU/RAM/GPU usage shows nothing is stressed (<15%, <15%, <30% respectively). The instant I add one additional stream, I get really bad frame rates and my GPU usage actually drops below 10%.

I have tried multiple VLC versions, the most stable for me being 1.1.11 (which runs the most instances flawlessly), and also the newest, 2.1.2, which incidentally only manages 5 smooth streams and skips frames on the 6th. It also does not appear to be the hard drive struggling to read the files. The cards are on the newest Catalyst drivers from AMD, and I also tested the setup with an NVidia card and had the same issue. What could be causing this?
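For reference, this is roughly the equivalent of what I'm doing to bring up the instances, written as a small Python sketch; the vlc.exe path and the file names are placeholders, and the hardware-decode switch spelling is from memory, so treat it as approximate rather than exact:

```python
import subprocess

VLC = r"C:\Program Files\VideoLAN\VLC\vlc.exe"                    # assumed install path
FILES = [r"D:\video\stream{}.ts".format(i) for i in range(1, 9)]  # hypothetical test files

# Launch one VLC instance per file with GPU decoding enabled.
procs = []
for f in FILES:
    procs.append(subprocess.Popen([
        VLC,
        "--ffmpeg-hw",   # GPU (DXVA2) decode switch on 1.1.x; on 2.1.x I believe it's --avcodec-hw=dxva2
        f,
    ]))

# Keep the launcher alive until every instance is closed.
for p in procs:
    p.wait()
```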
I have tried the following to resolve it:
- unchecked the overlay option
- turned up verbosity; the log indicates it's a "picture too late to be displayed" issue, IIRC
- disabled the audio track
- renamed the sampling DLLs
- changed Windows power saving to none
- set the H.264 profile to baseline instead of high
- tried all of the output modules
- set the ffmpeg thread count to 4, 2, and 1 (rough command lines for these tweaks are sketched after this list)
- set ffmpeg's skip loop filter to all
- changed the input demuxer to the autoformat demuxer
- disabled SpeedStep and C-States in the BIOS
- tried running the two W7000s linked in CrossFire mode with all of the above
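These are roughly the switches I've been passing per instance (appended to the Popen argument list in the launcher sketch above); the spellings are the 2.1.x long forms as I remember them, so treat them as approximate rather than verified:

```python
extra_opts = [
    "-vvv",                        # full verbosity -> the "picture too late to be displayed" messages
    "--no-overlay",                # overlay unchecked
    "--no-audio",                  # audio track disabled
    "--avcodec-threads=1",         # also tried 2 and 4 (1.1.x calls this --ffmpeg-threads, I think)
    "--avcodec-skiploopfilter=4",  # skip the loop filter for everything
    "--vout=directdraw",           # cycled through the other output modules as well
]
```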
I can tell that GPU decoding does work, but for some reason it won't get past that threshold. I can open another VLC window with GPU decoding turned OFF and it works okay, and I can also open instances of Windows Media Player and play the file without issues while the 6 GPU streams are up. I did notice, though, that when I bring up the 7th VLC instance, window redraw becomes extremely slow (click-dragging windows is sluggish, and context menus take a long time to appear). Not sure if it matters, but for some reason 1.1.11 prefers DirectDraw and 2.1.2 prefers OpenGL. Any ideas on the bad frame rate? Thanks in advance.
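For that mixed test, the extra non-GPU instance is launched like this (again just a sketch with placeholder paths; --avcodec-hw=none is the 2.1.x spelling as I understand it, and on 1.1.x the equivalent would be the negated boolean --no-ffmpeg-hw):

```python
import subprocess

# One additional instance with GPU decoding forced off, alongside the 6 GPU streams.
subprocess.Popen([
    r"C:\Program Files\VideoLAN\VLC\vlc.exe",   # assumed install path
    "--avcodec-hw=none",                        # software decoding for this instance only
    r"D:\video\stream7.ts",                     # hypothetical file
])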