Fixing RTL (Mainly Hebrew and Arabic) subtitles problems

barakori
New Cone
Posts: 4
Joined: 15 Nov 2005 15:48

Postby barakori » 15 Nov 2005 17:01

The short story: I modified modules/misc/freetype.c to better support bidi languages. I'm not a VLC developer. Who can I send this to so that it's included in the next version of VLC?
I know of the following problems with RTL subtitles (I've tested this with Hebrew on Windows & Linux):
  • When a single subtitle spans multiple lines, the line order is reversed. E.g. when there are two lines, the second line is shown at the top, and the first line at the bottom.
  • Some punctuation characters are displayed at the beginning of the line (right-hand side, since it's RTL) instead of the end of the line (left-hand side). For example, "well," is shown like this: ",well".
  • When a hyphen is used between a number and a Hebrew word (common in Hebrew syntax, not so common in Hebrew subtitles), the ordering of the number, hyphen and Hebrew word is not as expected.
I got the VLC source from SVN and found out the problem is in freetype.c, which uses a call to fribidi to reorder the text:
  • The line reordering problem seems to be a fribidi problem, but this might indeed be the official Unicode handling for multiline text. I changed the code in freetype.c to reorder each line, and not the whole subtitle text. This solved the line ordering problem.
  • The punctuation ordering problem is solved by calling fribidi with LTR as the main direction. This is a work-around. Since most players (under Windows) don't perform bidi, but let the system do it, the subtitles are prepared in logical order (RTL), but with main order (LTR), since players don't support setting the main order of the subtitles. This makes authoring subtitles a little harder, but everyone's already doing it.
  • The hyphen problem happens because Microsoft's bidi algorithm is not fully compatible with the Unicode bidi algorithm. Since most Hebrew subtitles are generated and viewed on Windows, the hyphen is authored so that it displays properly using Microsoft's bidi algorithm. Fribidi is Unicode compatible, which means it orders the hyphen differently. I didn't fix this problem in freetype.c, but I have solved it in the past (in a Java utility I once wrote).
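
The per-line change described in the first bullet can be illustrated with a minimal Python sketch. This is not the freetype.c patch: `reorder_run` is a hypothetical stand-in that simply reverses a string where the real code calls fribidi, and uppercase ASCII letters stand in for Hebrew characters. Under that simplified model, it shows why reordering the whole subtitle at once swaps the lines, while reordering line by line keeps them in place.

```python
def reorder_run(text: str) -> str:
    """Hypothetical stand-in for a bidi reordering call (logical -> visual).

    Assumption: the text is purely RTL, so the visual order is just the
    reverse of the logical order. Real bidi reordering is far more involved.
    """
    return text[::-1]


def reorder_whole(subtitle: str) -> str:
    # The newline is treated like any other character, so the lines swap places.
    return reorder_run(subtitle)


def reorder_per_line(subtitle: str) -> str:
    # Each line is reordered on its own; the original line order is kept.
    return "\n".join(reorder_run(line) for line in subtitle.split("\n"))


subtitle = "FIRST LINE\nSECOND LINE"
print(reorder_whole(subtitle).split("\n"))     # second line ends up on top
print(reorder_per_line(subtitle).split("\n"))  # first line stays on top
```

Splitting on the newline before reordering is the only difference between the two functions, which mirrors the fix described above.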

My question is: What do I do to suggest my fix to the VLC developer team? Also, I made the change on Linux (where it's much easier to set up a development environment for VLC), but I'd really like a Windows VLC version with this feature, and I don't know how to build a Windows version.
In the long run, full support for bidi in VLC would include adding two options in the subtitles preferences:
  • Use RTL paragraph order for RTL subtitles - this would be set to false until more players really support RTL languages. BTW, I checked with a friend who speaks Arabic, and it seems that Arabic subtitles also assume LTR order.
  • Fix Microsoft bidi problems - this would be set to true by default to handle hyphens properly.
This would make VLC play Hebrew (and Arabic, as far as I could check) subtitles properly, and allow for *real* support for RTL Unicode-based subtitles.

One final thing - I have the "fixed" freetype.c source, and would have published it here, but it doesn't look like I can add attachments to messages in this forum.

Barak

ipkiss
Big Cone-huna
Posts: 695
Joined: 23 Nov 2003 01:49

Postby ipkiss » 15 Nov 2005 20:03

You can send a patch (generated using the 'svn diff' command) to the vlc-devel@videolan.org mailing list. It will be reviewed, and if everything is fine, it will be integrated into svn (and hence available in the Windows nightly builds too).

As for how to build a Windows binary, detailed instructions are in the INSTALL.win32 file.

barakori

How do I get the nightly build?

Postby barakori » 14 Dec 2005 10:35

Okay, my patch was approved and implemented. Now please tell me where I can find the nightly builds... I looked in all the VLC tabs and the developers area but couldn't find it. Maybe it's my eyes...

ARTillery

Postby ARTillery » 14 Dec 2005 14:29

thanks.

I use Windows XP Pro SP2 with the default system fonts (Arial and Tahoma for the most part).

You can also send me the fix, or tell me where I can get it, if you'd like me to try it out.

dionoea
Cone Master
Posts: 5157
Joined: 03 Dec 2003 23:09
Location: Paris, France

Postby dionoea » 14 Dec 2005 23:10

The nightly builds are hosted at: http://nightlies.videolan.org
Antoine Cellerier
dionoea
(Please do not use private messages for support questions)

skipper

Hebrew fix - what about Mac

Postby skipper » 16 Jan 2006 16:05

When will we see a Mac update too?

assaf

Last letter display first

Postby assaf » 12 Feb 2006 09:54

It seems that the issues with Hebrew subtitles are not fully resolved.

When displaying a Hebrew SRT subtitle file in VLC 0.8.4 (and in trunk build vlc-0.8.5-svn-20060212-0000-win32.exe), the last letter of every paragraph is displayed at the start of the paragraph.

Is anybody else seeing this?

The DJ
Cone Master
Posts: 5987
Joined: 22 Nov 2003 21:52
VLC version: git
Operating System: Mac OS X
Location: Enschede, Holland

Postby The DJ » 13 Feb 2006 03:20

Can you take a screenshot and point out the issue?
Don't use PMs for support questions.

assaf

Postby assaf » 13 Feb 2006 20:38

Screenshots created using VLC's screenshot feature did not capture the subtitles. Is there some hidden config I should enable?

Alternatively, is there a free utility that is able to capture an AVI being played inside VLC?

The DJ

Postby The DJ » 13 Feb 2006 20:41

cmd-shift-3

Or the Grab.app in your Utilities.

assaf

Postby assaf » 19 Feb 2006 10:09

I am a Windows user, so no cmd key (ctrl-shift-3 didn't seem to work). I don't think there is a Grab.app on Windows either.
I also tried capturing with Snagit v8, but nothing worked.
So sorry, I can't provide a screen capture. :-(

ipkiss

Postby ipkiss » 21 Feb 2006 00:17

Disable overlay in the preferences (in the Video section... you might need to activate the Advanced preferences). Then you can take screenshots without any problem.
Also note that VLC is able to take snapshots, even with overlay activated (I think the default hotkey for that is Ctrl-Shift-S).

assaf

Postby assaf » 21 Feb 2006 12:14

OK, finally - please find below a capture of a frame with the last letter appearing first.

Image

The DJ

Postby The DJ » 21 Feb 2006 15:55

Could you write out what it should look like in this particular case?

Guest

Postby Guest » 21 Feb 2006 17:19

It should look like this:
Image

assaf758
New Cone
Posts: 2
Joined: 21 Feb 2006 17:25

Postby assaf758 » 03 Mar 2006 08:11

Trying again - it should look like this:
Image

The DJ

Postby The DJ » 03 Mar 2006 10:33

Is this the same problem for EVERY line?
Even when the subtitle is split over two lines? Or does it only happen on the last line, or something like that?

Tombigel

Postby Tombigel » 05 Mar 2006 00:32

First of all - Barak, thank you so much, I've been waiting for BiDi features in VLC for a long time. Once your patch is fully integrated, all other players will be thrown out the window(s)....

About the last letter appearing first: I had the same problem, and simply changing the encoding from "system default" to ISO-8859-8 in preferences->input/codecs->others codecs->subtitles fixed it.
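
To see what the encoding setting changes, here is a small Python sketch, assuming the subtitle file is stored as ISO-8859-8. The sample word is made up for illustration. It shows only that the same bytes decode to Hebrew letters under ISO-8859-8 and to unrelated Latin letters under Latin-1. It does not claim to explain why the last letter moved in the case above.

```python
# Sample Hebrew word ("shalom"), written with escapes; chosen only for illustration.
word = "\u05e9\u05dc\u05d5\u05dd"

# The bytes as they would sit in an ISO-8859-8 encoded .srt file.
raw = word.encode("iso8859_8")

# Decoding with the matching charset recovers the Hebrew text.
as_hebrew = raw.decode("iso8859_8")

# Decoding the same bytes as Latin-1 yields accented Latin letters instead.
as_latin1 = raw.decode("latin-1")

# Re-encoding as UTF-8 stores each Hebrew letter as two bytes.
as_utf8 = as_hebrew.encode("utf-8")

print(raw)
print(as_latin1)
print(len(as_utf8))
```

The same round trip (decode as ISO-8859-8, encode as UTF-8) is one way to produce a UTF-8 test file for checking how the patch behaves with UTF-8 subtitles.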

btw

Did you check how your patch works on UTF-8 subtitles? I don't remember ever seeing any UTF-8 encoded subtitles, but I'm sure there are some out there and it should be checked.

Tombigel
Blank Cone
Posts: 70
Joined: 05 Mar 2006 00:34

Postby Tombigel » 05 Mar 2006 00:57

Sorry for the double post, but I wasn't registered before.

I've noticed a minor bug (using the 20060304 nightly build):
When there are two subtitle lines and subtitle justification is set to "center", the bottom line is centered but the top line is aligned to the left edge of the bottom one.
Image

assaf758

Postby assaf758 » 05 Mar 2006 19:13

Setting to ISO-8859-8 solved the issue as Tombigel reported - thanks!

BTW, I also see the two-line justification issue.

The DJ

Postby The DJ » 06 Mar 2006 22:39

Yes, the two-line justification issue is a known problem, introduced when I added more advanced SSA support. It will be fixed.

I'm interested to know what the default text encoding was, by the way. It should be reported in the messages dialog when you leave it at the default. On Mac it's always Latin-1; on Windows and Linux it is detected automatically (for nightlies, that is).

Cooler

Postby Cooler » 04 Jun 2006 19:16

First of all, thank you barakori....
I use vlc-0.8.6-svn-20060604-0001-win32 to display Arabic.
The problem is that the Arabic letters are displayed separately, when they should be connected to each other.
Thanks in advance.

IlyaPittel

I can't get the Hebrew Subtitles to work.

Postby IlyaPittel » 06 Jun 2006 08:37

Hi,

I'm new to VLC 0.8.5. I am running Mac OS 10.4.6.

I can't get the Hebrew subtitles to run; I keep getting error messages. Can someone please give me a step-by-step guide on how to set it up...
I was also wondering if VLC can play DVDs from any region, meaning if my Mac is set to region 1, can VLC bypass this and play region 2 DVDs?

Thank you
Ilya

Guest

Postby Guest » 19 Jun 2006 19:30

I still can't get Arabic subtitles to work. Can anyone help me solve this issue? The letters show, but they are separated rather than connected. Thanks in advance.

BIDI guest

Two-line problem with BIDI subtitles

Postby BIDI guest » 08 Aug 2006 00:09

Is the two-line problem with BIDI subtitles fixed in VLC media player 0.8.5?
It happens in 0.8.4a.

