Fixing RTL (Mainly Hebrew and Arabic) subtitles problems

Posted: 15 Nov 2005 17:01
by barakori
The short story: I modified modules/misc/freetype.c to better support bidi languages. I'm not a VLC developer. Who can I send this to so that it's included in the next version of VLC?
I know of the following problems with RTL subtitles (I've tested this with Hebrew on Windows & Linux):
  • When a single subtitle spans multiple lines, the line order is reversed. E.g. when there are two lines, the second line is shown at the top, and the first line at the bottom.
  • Some punctuation characters are displayed at the beginning of the line (right-hand side, since it's RTL) instead of the end of the line (left-hand side). For example, "well," is shown like this: ",well".
  • When a hyphen is used between a number and a Hebrew word (common in Hebrew syntax, not so common in Hebrew subtitles), the ordering of the number, hyphen and Hebrew word is not as expected.
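For reference, the characters involved in these three problems belong to different Unicode bidirectional classes, which is a large part of why they move around. Below is a minimal Python sketch using only the standard `unicodedata` module. It is an illustration, not part of the VLC patch; `bidi_classes` is a hypothetical helper and the sample string is made up.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, Unicode bidi class) pairs for each character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# Hypothetical sample: a digit, a hyphen, a Hebrew letter (bet), and a comma.
sample = "5-\u05d1,"

for ch, cls in bidi_classes(sample):
    print(f"U+{ord(ch):04X} {cls}")
```

Hebrew letters are strong right-to-left (`R`) and digits are `EN`, while the hyphen (`ES`) and the comma (`CS`) are weak types whose final direction depends on their neighbours and on the main direction of the paragraph.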
I got the VLC source from SVN and found out the problem is in freetype.c, which uses a call to fribidi to reorder the text:
  • The line reordering problem seems to be a fribidi problem, but this might indeed be the official Unicode handling for multiline text. I changed the code in freetype.c to reorder each line, and not the whole subtitle text. This solved the line ordering problem.
  • The punctuation ordering problem is solved by calling fribidi with LTR as the main direction. This is a work-around. Most players (under Windows) don't perform bidi themselves but let the system do it, and they don't support setting the main direction of the subtitles. As a result, subtitles are authored in logical order (RTL text) but for an LTR main direction. This makes authoring subtitles a little harder, but everyone's already doing it.
  • The hyphen problem happens because Microsoft's bidi algorithm is not fully compatible with the Unicode bidi algorithm. Since most Hebrew subtitles are generated and viewed on Windows, the hyphen is authored so that it displays properly using Microsoft's bidi algorithm. Fribidi is Unicode compatible, which means it orders the hyphen differently. I didn't fix this problem in freetype.c, but I have solved it in the past (in a Java utility I once wrote).
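The first two fixes can be sketched in a few lines of Python. This is a toy model, not the actual change to freetype.c and not fribidi: `toy_visual_line` and `reorder_per_line` are hypothetical helpers, and the model only handles lines made of Hebrew or Arabic letters, spaces and punctuation (no digits, no Latin text). It shows that reordering the whole subtitle at once flips the line order while reordering each line separately does not, and how the main direction decides where leading punctuation ends up.

```python
import unicodedata

def is_rtl_letter(ch):
    # Strong right-to-left classes: R (e.g. Hebrew) and AL (Arabic letters).
    return unicodedata.bidirectional(ch) in ("R", "AL")

def toy_visual_line(text, base_rtl):
    """Toy logical-to-visual reordering (assumption: only RTL letters,
    spaces and punctuation; a stand-in for a real bidi library)."""
    if base_rtl:
        # Everything resolves to RTL, so the whole string is mirrored.
        return text[::-1]
    rtl_positions = [i for i, ch in enumerate(text) if is_rtl_letter(ch)]
    if not rtl_positions:
        return text
    first, last = rtl_positions[0], rtl_positions[-1]
    # LTR main direction: leading/trailing punctuation stays in place,
    # only the span from the first to the last RTL letter is mirrored.
    return text[:first] + text[first:last + 1][::-1] + text[last + 1:]

def reorder_per_line(subtitle, base_rtl):
    """Reorder each line on its own so the line order is preserved."""
    return "\n".join(toy_visual_line(line, base_rtl)
                     for line in subtitle.split("\n"))

# Two-line subtitle in logical order: alef-bet, then gimel-dalet.
subtitle = "\u05d0\u05d1\n\u05d2\u05d3"
whole = toy_visual_line(subtitle, base_rtl=True)      # second line comes out first
per_line = reorder_per_line(subtitle, base_rtl=True)  # line order kept

# Comma authored at the logical start, as in LTR-prepared subtitles.
authored = ",\u05e9\u05dc\u05d5\u05dd"
as_rtl = toy_visual_line(authored, base_rtl=True)   # comma lands on the right
as_ltr = toy_visual_line(authored, base_rtl=False)  # comma stays on the left
```

In the returned visual strings, index 0 is the leftmost glyph on screen. With an RTL main direction the authored comma ends up at the right edge (the start of the line for an RTL reader), which matches the symptom described above; with an LTR main direction it stays at the left edge, where the subtitle author intended it.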

My question is: What do I do to suggest my fix to the VLC developer team? Also, I made the change on Linux (where it's much easier to set up a development environment for VLC), but I'd really like a Windows VLC version with this feature, and I don't know how to build a Windows version.
In the long run, full support for bidi in VLC would include adding two options in the subtitles preferences:
  • Use RTL paragraph order for RTL subtitles - this would be set to false until more players really support RTL languages. BTW, I checked with a friend who speaks Arabic, and it seems that Arabic subtitles also assume LTR order.
  • Fix Microsoft bidi problems - this would be set to true by default to handle hyphens properly.
This would make VLC play Hebrew (and Arabic, as far as I could check) subtitles properly, and allow for *real* support for RTL Unicode-based subtitles.

One final thing - I have the "fixed" freetype.c source, and would have published it here, but it doesn't look like I can add attachments to messages in this forum.

Barak

Posted: 15 Nov 2005 20:03
by ipkiss
You can send a patch (generated using the 'svn diff' command) to the vlc-devel@videolan.org mailing-list. It will be reviewed, and if everything is fine, it will be integrated in svn (hence available in the windows nightly builds too).

As for how to build a windows binary, detailed instructions are in the INSTALL.win32 file.

How do I get the nightly build?

Posted: 14 Dec 2005 10:35
by barakori
Okay, my patch was approved and implemented. Now please tell me where I can find the nightly builds... I looked in all the VLC tabs and the developers area but couldn't find it. Maybe it's my eyes...

Posted: 14 Dec 2005 14:29
by ARTillery
thanks.

I use Windows XP Pro SP2 with the default system fonts (Arial and Tahoma for the most part).

u can also send me the fix or tell me where can I get it to try it out if u like.

Posted: 14 Dec 2005 23:10
by dionoea
The nightly builds are hosted at: http://nightlies.videolan.org

Hebrew fix - what about MAC

Posted: 16 Jan 2006 16:05
by skipper
when will we see a MAC update too ?

Last letter display first

Posted: 12 Feb 2006 09:54
by assaf
It seems that the issues with Hebrew subtitles are not fully resolved.

When displaying a Hebrew SRT subtitle file in VLC 0.8.4 (and trunk build vlc-0.8.5-svn-20060212-0000-win32.exe), the last letter of every paragraph is displayed at the start of the paragraph.

Is anybody else seeing this?

Posted: 13 Feb 2006 03:20
by The DJ
Can you make a screenshot and point out the issue ?

Posted: 13 Feb 2006 20:38
by assaf
Screenshots created using VLC's screenshot feature did not capture the subtitles. Is there some hidden config I should enable?

Alternatively, is there a free utility that is able to capture an AVI being played inside VLC?

Posted: 13 Feb 2006 20:41
by The DJ
cmd-shift-3

Or the Grab.app in your Utilities.

Posted: 19 Feb 2006 10:09
by assaf
I am a Windows user, so no cmd key (ctrl-shift-3 didn't seem to work). I don't think there is a Grab.app on the Windows platform either.
I also tried capturing with Snagit v8, but nothing worked.
So sorry, I can't provide a screen capture... :-(

Posted: 21 Feb 2006 00:17
by ipkiss
Disable overlay in the preferences (in the Video section... you might need to activate the Advanced preferences). Then you can take screenshots without any problem.
Also note that VLC is able to take snapshots, even with overlay activated (I think the default hotkey for that is Ctrl-Shift-S).

Posted: 21 Feb 2006 12:14
by assaf
OK, finally - please find below a capture of a frame with the last letter appearing first.

Image

Posted: 21 Feb 2006 15:55
by The DJ
Could you write out what it should look like in this particular case ?

Posted: 21 Feb 2006 17:19
by Guest
It should look like this:
Image

Posted: 03 Mar 2006 08:11
by assaf758
Trying again - it should look like this:
Image

Posted: 03 Mar 2006 10:33
by The DJ
Is this the same problem for EVERY line?
Even when the sub is split over two lines? Or is it only for the last line or something?

Posted: 05 Mar 2006 00:32
by Tombigel
First of all - Barak, thank you so much, I've been waiting for BiDi features in VLC for a long time. When your patch is fully embedded, all other players will be thrown out the window(s)....

About the thing with the last letter appearing first: I had the same problem, and simply changing the encoding from "system default" to ISO-8859-8 in preferences->input/codecs->others codecs->subtitles fixed it.

btw

Did you check how your patch works on UTF-8 subtitles? I don't remember ever seeing any UTF-8 encoded subtitles, but I'm sure there are some out there and it should be checked.
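As an illustration of why the subtitle encoding setting matters, here is a small Python sketch that decodes the same bytes under different encodings. The byte string is a made-up sample (four Hebrew letters stored as ISO-8859-8), and the sketch does not reproduce the last-letter bug itself; it only shows that a wrong guess about the encoding changes which characters the renderer receives.

```python
# Hypothetical sample: four Hebrew letters encoded as ISO-8859-8 bytes.
raw = b"\xf9\xec\xe5\xed"

as_hebrew = raw.decode("iso-8859-8")  # Hebrew letters U+05E9 U+05DC U+05D5 U+05DD
as_latin1 = raw.decode("latin-1")     # accented Latin letters instead of Hebrew

# The same text stored as UTF-8 uses two bytes per Hebrew letter.
as_utf8_bytes = as_hebrew.encode("utf-8")

print([f"U+{ord(ch):04X}" for ch in as_hebrew])
print([f"U+{ord(ch):04X}" for ch in as_latin1])
print(len(raw), len(as_utf8_bytes))
```

A UTF-8 file would need to be decoded as UTF-8 before any bidi reordering; feeding its bytes through a single-byte Hebrew decoder would give the reordering code the wrong characters to work with.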

Posted: 05 Mar 2006 00:57
by Tombigel
Sorry for the double post, but I wasn't registered before.

I've noticed a minor bug (using the 20060304 nightly build) -
When there are 2 subtitle lines and subtitle justification is set to "center", the bottom line is aligned to the center but the top line is aligned to the left of the bottom one.
Image

Posted: 05 Mar 2006 19:13
by assaf758
Setting to ISO-8859-8 solved the issue as Tombigel reported - thanks!

BTW, I also see the two-line justification issue.

Posted: 06 Mar 2006 22:39
by The DJ
Yes, the 2-line justification issue is a known problem introduced when I added more advanced SSA support. It will be fixed.

I'm interested to know what the default text encoding was, btw. It should be reported in the messages dialog when you leave it at the default. On Mac it's always Latin-1; on Windows and Linux it is detected automatically (for nightlies, that is).

Posted: 04 Jun 2006 19:16
by Cooler
First of all, thank you barakori....
I use vlc-0.8.6-svn-20060604-0001-win32 to display Arabic.
The problem is that the Arabic letters are displayed divided, when they should be connected to each other.
Thanks in advance.
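Arabic letters change shape depending on their position in a word, so a renderer that draws every letter in its isolated form would produce a "divided" look like the one described here. The minimal Python sketch below, using only the standard `unicodedata` module, lists the four presentation forms Unicode defines for one letter (beh). It illustrates the concept only and says nothing about how VLC or freetype.c handles shaping.

```python
import unicodedata

# Unicode "presentation forms" of the Arabic letter beh (U+0628).
forms = ["\ufe8f", "\ufe90", "\ufe91", "\ufe92"]

for form in forms:
    base = unicodedata.normalize("NFKC", form)  # maps back to the plain letter
    print(f"U+{ord(form):04X} {unicodedata.name(form)} -> U+{ord(base):04X}")
```

Subtitle files normally store only the plain letter (U+0628 here); choosing the connected form for each position is the job of a shaping step before the glyphs are drawn.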

I can't get the Hebrew Subtitles to work.

Posted: 06 Jun 2006 08:37
by IlyaPittel
Hi,

I'm new to VLC 0.8.5. I am running Mac OS 10.4.6.

I can't get the Hebrew subtitles to run; I keep getting error messages. Can someone please give me a step-by-step guide on how to set it up?
I was also wondering if VLC can play DVDs from any region, meaning: if my Mac is set to region 1, can VLC bypass this and play region 2 DVDs?

Thank you
Ilya

Posted: 19 Jun 2006 19:30
by Guest
I still can't get Arabic subtitles to work. Can anyone help me solve this issue? The letters show, but they are divided rather than connected. Thanks in advance.

two lines problem of BIDI subtitles

Posted: 08 Aug 2006 00:09
by BIDI guest
Is the two-line problem with BIDI subtitles fixed in VLC media player 0.8.5?
It happens in 0.8.4a.