vlc 1.1.0 subtitles auto mode regression

temp4746
Blank Cone
Posts: 17
Joined: 09 Jun 2010 22:17

vlc 1.1.0 subtitles auto mode regression

Postby temp4746 » 10 Jun 2010 00:20

In VLC 1.0.5, VLC seems to correctly auto-detect a Hebrew Windows-1255 encoded .srt file and displays it correctly.
In VLC 1.1.0-rc1, VLC seems to incorrectly auto-detect a Hebrew Windows-1255 encoded .srt file and displays gibberish; I have to manually set the subtitle encoding in the settings dialog to make it work.

EDIT: This happens with 1.1.0 final too.

OS: Windows 7 Ultimate 32-bit.
Last edited by temp4746 on 22 Jun 2010 15:48, edited 3 times in total.

VLC_help
Mega Cone Master
Posts: 25661
Joined: 13 Sep 2006 14:16

Re: vlc 1.1.0-rc1 subtitles auto mode regression

Postby VLC_help » 10 Jun 2010 17:28

Have you tried RC2?

temp4746
Blank Cone
Posts: 17
Joined: 09 Jun 2010 22:17

Re: vlc 1.1.0-rc1 subtitles auto mode regression

Postby temp4746 » 10 Jun 2010 22:51

I downloaded from here: http://www.videolan.org/vlc/releases/1.1.0-RC.html
The link from the news post in the main site seems to indicate that this is RC1.
I guess trying RC2 will require compiling it myself, which is quite a nasty thing to do under Windows. :-|

VLC_help
Mega Cone Master
Posts: 25661
Joined: 13 Sep 2006 14:16

Re: vlc 1.1.0-rc1 subtitles auto mode regression

Postby VLC_help » 11 Jun 2010 23:38


temp4746
Blank Cone
Posts: 17
Joined: 09 Jun 2010 22:17

Re: vlc 1.1.0 subtitles auto mode regression

Postby temp4746 » 22 Jun 2010 15:47

I tested this issue with 1.1.0 final, and I'm seeing exactly the same behaviour.

Lotesdelere
Cone Master
Posts: 9967
Joined: 08 Sep 2006 04:39
Location: Europe

Re: vlc 1.1.0 subtitles auto mode regression

Postby Lotesdelere » 22 Jun 2010 17:45

Reset Preferences and Cache.

temp4746
Blank Cone
Posts: 17
Joined: 09 Jun 2010 22:17

Re: vlc 1.1.0 subtitles auto mode regression

Postby temp4746 » 22 Jun 2010 17:48

Lotesdelere wrote:
> Reset Preferences and Cache.
I already did that :-|

Damien
Blank Cone
Posts: 17
Joined: 11 May 2010 15:23

Re: vlc 1.1.0 subtitles auto mode regression

Postby Damien » 22 Jun 2010 20:53

The same here.

viewtopic.php?f=34&t=76048&start=15#p251748

People (who don't know what to do) will use another player, that simple. :-|

temp4746
Blank Cone
Posts: 17
Joined: 09 Jun 2010 22:17

Re: vlc 1.1.0 subtitles auto mode regression

Postby temp4746 » 22 Jun 2010 23:45

Sad that it did work correctly for me in 1.0.5
Someone should really look at the Auto encoding detection code...

secarica
Blank Cone
Posts: 95
Joined: 25 Oct 2005 00:27
Location: Romania, Earth

Re: vlc 1.1.0 subtitles auto mode regression

Postby secarica » 23 Jun 2010 17:11

The same happens here (i.e. autodetection is broken). The system is set to Romanian; subtitles should display in CP-1250 (Central European) but display wrongly in CP-1252 (Western). Windows Vista 64-bit.

With 1.0.5 it works OK, even with a non-installed 1.0.5 version (just unzipped).

Cristi
... I think it's too hard to think

Jean-Baptiste Kempf
Site Administrator
Posts: 37523
Joined: 22 Jul 2005 15:29
VLC version: 4.0.0-git
Operating System: Linux, Windows, Mac
Location: Cone, France

Re: vlc 1.1.0 subtitles auto mode regression

Postby Jean-Baptiste Kempf » 23 Jun 2010 18:17

If someone could do a proper bug report, that would be amazing...
Jean-Baptiste Kempf
http://www.jbkempf.com/ - http://www.jbkempf.com/blog/category/Videolan
VLC media player developer, VideoLAN President and Sites administrator
If you want an answer to your question, just be specific and precise. Don't use Private Messages.

VLC_help
Mega Cone Master
Posts: 25661
Joined: 13 Sep 2006 14:16

Re: vlc 1.1.0 subtitles auto mode regression

Postby VLC_help » 23 Jun 2010 20:09

I can, if someone provides me with a sample subtitle file =)

secarica
Blank Cone
Posts: 95
Joined: 25 Oct 2005 00:27
Location: Romania, Earth

Re: vlc 1.1.0 subtitles auto mode regression

Postby secarica » 23 Jun 2010 20:43

VLC_help wrote:
> I can, if someone provides me with a sample subtitle file =)
A sample may not be enough, since a particular sample has to match a particular code page setting at the system level.
For example, take this one. It is 8-bit, CP-1250.

With all auto, row #2 in VLC 1.0.5 displays "Traducerea şi adaptarea *** etc." (which is correct).
Same row in VLC 1.1.0 displays "Traducerea ºi adaptarea *** etc." (which is wrong).

Row #4 in VLC 1.0.5 displays "Te simţi bine, dragule ?" (which is correct).
Same row in VLC 1.1.0 displays "Te simþi bine, dragule ?" (which is wrong).
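
The two wrong rows are what one would expect if CP-1250 bytes were read as CP-1252. A minimal Python sketch of that effect follows; it is only an illustration, not VLC code, and the byte values 0xBA and 0xFE are the CP-1250 codes assumed to sit behind the two affected letters:

```python
# Illustration only: one byte sequence, two candidate code pages.
# 0xBA and 0xFE are the bytes behind the two letters that display wrongly.
samples = [b"Traducerea \xbai adaptarea", b"Te sim\xfei bine, dragule ?"]


def decode_both(raw):
    """Return (CP1250 reading, CP1252 reading) of the same raw bytes."""
    return raw.decode("cp1250"), raw.decode("cp1252")


for raw in samples:
    central_european, western = decode_both(raw)
    print(f"CP1250: {central_european} | CP1252: {western}")
```

The file content is identical in both cases; only the code page chosen for decoding differs.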

Cristi

VLC_help
Mega Cone Master
Posts: 25661
Joined: 13 Sep 2006 14:16

Re: vlc 1.1.0 subtitles auto mode regression

Postby VLC_help » 24 Jun 2010 21:04


Rémi Denis-Courmont
Developer
Posts: 15266
Joined: 07 Jun 2004 16:01
VLC version: master
Operating System: Linux

Re: vlc 1.1.0 subtitles auto mode regression

Postby Rémi Denis-Courmont » 26 Jun 2010 19:46

temp4746 wrote:
> In VLC 1.0.5, VLC seems to correctly auto-detect a Hebrew Windows-1255 encoded .srt file and displays it correctly.
> In VLC 1.1.0-rc1, VLC seems to incorrectly auto-detect a Hebrew Windows-1255 encoded .srt file and displays gibberish; I have to manually set the subtitle encoding in the settings dialog to make it work.
The Hebrew translation does not currently define a default encoding. Looking at the changes, it has not been maintained for several years. We can add CP1255. But if the translation is totally outdated anyway, you might prefer to use English and set the subtitle encoding manually :-| .

I will fix it manually in VLC 1.1.1 but there is only so much the developers can do without active translators.
Rémi Denis-Courmont
https://www.remlab.net/
Private messages soliciting support will be systematically discarded

Rémi Denis-Courmont
Developer
Posts: 15266
Joined: 07 Jun 2004 16:01
VLC version: master
Operating System: Linux

Re: vlc 1.1.0 subtitles auto mode regression

Postby Rémi Denis-Courmont » 26 Jun 2010 19:56

temp4746 wrote:
> Sad that it did work correctly for me in 1.0.5
> Someone should really look at the Auto encoding detection code...
The "code" just tries to decode the subtitle as UTF-8 (unless you've disabled UTF-8 autodetection), then falls back to a locale-defined character encoding. If you use VLC in English, then the default is CP1252. Microsoft uses that as the character encoding for English and other Western European languages.

The autodetection logic has been basically the same since VLC 0.8.5. In earlier versions, the code would fall back to the local system character encoding. This used to work mostly well in the last century. But most systems have switched to Unicode by default nowadays, so that trick would not work anymore.

From VLC 0.8.5 through 1.0.6, the default values were hard-coded in the VLC source code. It turned out to be a bad idea as the number of supported languages exploded. VLC has almost 70 translations nowadays. From VLC 1.1.0 onward, the default character encodings are specified in the message translation files. Unfortunately, some VLC translations are currently unmaintained (Hebrew is one example). There you go...
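
The fallback order described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not VLC's actual C code: `decode_subtitle` and `LANGUAGE_DEFAULTS` are hypothetical names, and the table merely stands in for the per-language defaults that 1.1.0 reads from the translation files.

```python
# Sketch of the fallback order described above; not VLC's actual code.
# LANGUAGE_DEFAULTS is a hypothetical table standing in for the values that
# VLC 1.1.0 reads from its translation files.
LANGUAGE_DEFAULTS = {"en": "cp1252", "ro": "cp1250", "he": "cp1255"}


def decode_subtitle(raw, ui_language, utf8_autodetect=True):
    """Try UTF-8 first, then the legacy code page tied to the UI language."""
    if utf8_autodetect:
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            pass  # not valid UTF-8, fall through to the legacy guess
    fallback = LANGUAGE_DEFAULTS.get(ui_language, "cp1252")
    return raw.decode(fallback, errors="replace")


hebrew_bytes = "\u05e9\u05dc\u05d5\u05dd".encode("cp1255")
print(decode_subtitle(hebrew_bytes, "he"))  # decoded with the Hebrew default
print(decode_subtitle(hebrew_bytes, "en"))  # same bytes, Western default
```

With the interface language set to English, the same Hebrew bytes go through the CP1252 fallback, which is the gibberish case reported in this thread.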

temp4746
Blank Cone
Posts: 17
Joined: 09 Jun 2010 22:17

Re: vlc 1.1.0 subtitles auto mode regression

Postby temp4746 » 26 Jun 2010 20:08

Rémi Denis-Courmont wrote:
> temp4746 wrote:
> > Sad that it did work correctly for me in 1.0.5
> > Someone should really look at the Auto encoding detection code...
> The "code" just tries to decode the subtitle as UTF-8 (unless you've disabled UTF-8 autodetection), then falls back to a locale-defined character encoding. If you use VLC in English, then the default is CP1252. Microsoft uses that as the character encoding for English and other Western European languages.
>
> The autodetection logic has been basically the same since VLC 0.8.5. In earlier versions, the code would fall back to the local system character encoding. This used to work mostly well in the last century. But most systems have switched to Unicode by default nowadays, so that trick would not work anymore.
>
> From VLC 0.8.5 through 1.0.6, the default values were hard-coded in the VLC source code. It turned out to be a bad idea as the number of supported languages exploded. VLC has almost 70 translations nowadays. From VLC 1.1.0 onward, the default character encodings are specified in the message translation files. Unfortunately, some VLC translations are currently unmaintained (Hebrew is one example), and still many bugs are not reported in due time (during test and release candidate cycles). There you go...
There is one strange thing though...

I used VLC 1.0.5 set to English, with subtitle encoding on auto, and the exact same .srt was displayed correctly with the proper encoding.

Under the exact same circumstances in VLC 1.1.0, I get gibberish.

It's still logical to use the system-defined encoding: even though many systems are Unicode, they still have an encoding defined for non-Unicode programs and files.

secarica
Blank Cone
Posts: 95
Joined: 25 Oct 2005 00:27
Location: Romania, Earth

Re: vlc 1.1.0 subtitles auto mode regression

Postby secarica » 26 Jun 2010 20:21

Rémi Denis-Courmont wrote:
> The "code" just tries to decode the subtitle as UTF-8 (unless you've disabled UTF-8 autodetection), then falls back to a locale-defined character encoding. If you use VLC in English, then the default is CP1252. Microsoft uses that as the character encoding for English and other Western European languages. [...] From VLC 1.1.0 onward, the default character encodings are specified in the message translation files. Unfortunately, some VLC translations are currently unmaintained (Hebrew is one example).
This is probably the expected behaviour, but unfortunately it is contradicted by reality.

I use my system fully in Romanian (locale and UI language), the VLC interface is set to auto and displays correctly in Romanian, and the gettext entry is translated as msgctxt "GetACP" / msgid "CP1252" -> msgstr "CP1250". Auto mode for subtitles still does not work, simple as that.

VLC 1.0.5 displays characters in CP1250 (correct).
VLC 1.1.0 displays characters in CP1252 (wrong).

Cristi

Rémi Denis-Courmont
Developer
Posts: 15266
Joined: 07 Jun 2004 16:01
VLC version: master
Operating System: Linux

Re: vlc 1.1.0 subtitles auto mode regression

Postby Rémi Denis-Courmont » 26 Jun 2010 20:30

temp4746 wrote:
> It's still logical to use the system-defined encoding: even though many systems are Unicode, they still have an encoding defined for non-Unicode programs and files.
So you would check for UTF-8 and then fall back to, err, UTF-8, which is the default character encoding on most operating systems. The whole point of the 0.8.5 change was to solve this idiocy. It makes much more sense to default to a legacy character set for the user language.

If you're using VLC in English and watching non-Unicode subs in another language, you're asking for trouble. There will always be a failure scenario where the user plays a sub in a different character set than what the auto mode expects, no matter what the logic is. That's why we have the manual settings. And we even have user-friendly categories for the choices these days.

secarica
Blank Cone
Posts: 95
Joined: 25 Oct 2005 00:27
Location: Romania, Earth

Re: vlc 1.1.0 subtitles auto mode regression

Postby secarica » 26 Jun 2010 20:46

Rémi Denis-Courmont wrote:
> So you would check for UTF-8 and then fall back to, err, UTF-8, which is the default character encoding on most operating systems.
All Windows versions newer than NT 4.0 have an 8-bit setting that matches the chosen locale (the so-called Language for non-Unicode programs). A program that knows it is reading an 8-bit text file should check that setting first. Usually, when a user sees the wrong character encoding in text subtitles, that is the first place to check (and change if necessary). This is an approach at the operating system level, not at the application level.
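
As a rough illustration of reading that setting, the Win32 function GetACP returns the system ANSI code page number. A minimal Python sketch follows; the helper name `system_ansi_codepage` is made up for this example, and on non-Windows systems it simply returns None:

```python
# Sketch only: read the "Language for non-Unicode programs" code page.
# GetACP is a real Win32 call in kernel32; the helper name is made up here.
import ctypes
import sys


def system_ansi_codepage():
    """Return e.g. 'cp1250' on Windows, or None on other platforms."""
    if sys.platform != "win32":
        return None
    return f"cp{ctypes.windll.kernel32.GetACP()}"


print(system_ansi_codepage())
```

On a system configured for Romanian, the expectation is that this reports the Central European code page, but that depends on the machine's settings.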

Cristi

Rémi Denis-Courmont
Developer
Posts: 15266
Joined: 07 Jun 2004 16:01
VLC version: master
Operating System: Linux

Re: vlc 1.1.0 subtitles auto mode regression

Postby Rémi Denis-Courmont » 26 Jun 2010 21:33

secarica wrote:
> All Windows versions newer than NT 4.0 have an 8-bit setting that matches the chosen locale (the so-called Language for non-Unicode programs).
If you configure VLC 1.1.1 to use the same language as your system, then you will get a code page that matches.

No matter what we do, there will always be a problem if you configure VLC in one language on a system in another language. By definition, there are two conflicting choices here. You can't expect VLC to fix it for you in 100% of cases. VLC selects audio and subtitle tracks in the configured language (when possible), so it should, and does, follow the same practice for the default character encoding.

Besides, that is the non-Windows-specific policy; as an open-source developer, I am not going to write code that can only work correctly on a retarded proprietary expensive operating system. You've decided to use the only OS that still thinks we are in the eighties as far as character sets are concerned; you deal with it. The current code is doing you a favor in the most likely case that Windows, VLC and subtitles all use the same language.
secarica wrote:
> Usually, when a user sees the wrong character encoding in text subtitles, that is the first place to check (and change if necessary). This is an approach at the operating system level, not at the application level.
That's not true. First, only knowledgeable users would ever know of this, and most Windows users aren't knowledgeable. Second, the most logical place would be the subtitle area in the open dialog, if only it had a widget to select the encoding.

secarica
Blank Cone
Posts: 95
Joined: 25 Oct 2005 00:27
Location: Romania, Earth

Re: vlc 1.1.0 subtitles auto mode regression

Postby secarica » 26 Jun 2010 22:36

Rémi Denis-Courmont wrote:
> If you configure VLC 1.1.1 to use the same language as your system, then you will get a code page that matches.
Have you read my message above? Subtitle encoding in VLC 1.1.0 does not work properly with everything set to auto.
Rémi Denis-Courmont wrote:
> No matter what we do, there will always be a problem if you configure VLC in one language on a system in another language.
That is not the case on my system. I cannot speak for others' configurations.
Rémi Denis-Courmont wrote:
> You've decided to use the only OS that still thinks we are in the eighties as far as character sets are concerned; you deal with it.
I have nothing to decide; almost 100% of the subtitles I download for my language are in 8-bit encodings. If I used a Mac or Linux to view those subtitles, the program there would have to know how to handle my 8-bit encoded subtitle files.

Things are more complex or obscure. For example, one of the programs used for translating and creating subtitles is Subtitles Translator, which cannot handle Unicode at all. The same goes for several other subtitle-specific programs (on Windows). The OS has nothing to do with this, except perhaps that it still allows 8-bit-only applications to run.
Rémi Denis-Courmont wrote:
> secarica wrote:
> > Usually, when a user sees the wrong character encoding in text subtitles, that is the first place to check (and change if necessary). This is an approach at the operating system level, not at the application level.
> That's not true. First, only knowledgeable users would ever know of this, and most Windows users aren't knowledgeable. Second, the most logical place would be the subtitle area in the open dialog, if only it had a widget to select the encoding.
Generally true (I suppose), but not completely true here in my country (Romania), where questions and prompt answers on this matter are common on our large forums.

Cristi

Jean-Baptiste Kempf
Site Administrator
Posts: 37523
Joined: 22 Jul 2005 15:29
VLC version: 4.0.0-git
Operating System: Linux, Windows, Mac
Location: Cone, France

Re: vlc 1.1.0 subtitles auto mode regression

Postby Jean-Baptiste Kempf » 27 Jun 2010 00:11

The real question is: what has changed between 1.0.5 and 1.1.0 and when was the first regression?

secarica
Blank Cone
Posts: 95
Joined: 25 Oct 2005 00:27
Location: Romania, Earth

Re: vlc 1.1.0 subtitles auto mode regression

Postby secarica » 27 Jun 2010 00:51

Jean-Baptiste Kempf wrote:
> The real question is: what has changed between 1.0.5 and 1.1.0 and when was the first regression?
VLC 1.0.5 does not have this portion of gettext:

Code:

#. xgettext:
#. The Windows ANSI code page most commonly used for this language.
#. VLC uses this as a guess of the subtitle files character set
#. (if UTF-8 and UTF-16 autodetection fails).
#. Western European languages normally use "CP1252", which is a
#. Microsoft-variant of ISO 8859-1. That suits the Latin alphabet.
#. Other scripts use other code pages.
#.
#. This MUST be a valid iconv character set. If unsure, please refer
#. the VideoLAN translators mailing list.
#: modules/codec/subtitles/subsdec.c:296
msgctxt "GetACP"
msgid "CP1252"
msgstr "CP1250"
Perhaps, for some reason, VLC uses the msgid part instead of the msgstr part?
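
For reference, a gettext-style context lookup returns the msgid itself whenever no msgstr is found, which would produce exactly the CP1252 result. A minimal Python sketch of that fallback, assuming Python 3.8+ for `pgettext`; the `Catalog` class is a hypothetical stand-in for a compiled translation file, not VLC code:

```python
# Sketch only: how a context lookup falls back to the msgid.
# Catalog is a hypothetical stand-in for a compiled translation file.
import gettext


class Catalog(gettext.NullTranslations):
    def __init__(self, entries):
        super().__init__()
        self._entries = entries  # {(msgctxt, msgid): msgstr}

    def pgettext(self, context, message):
        # Use the translated msgstr if present, else the untranslated msgid.
        translated = self._entries.get((context, message))
        return translated or super().pgettext(context, message)


romanian = Catalog({("GetACP", "CP1252"): "CP1250"})
untranslated = Catalog({})

print(romanian.pgettext("GetACP", "CP1252"))
print(untranslated.pgettext("GetACP", "CP1252"))
```

If the catalog is not loaded, or is queried without the right context, the lookup behaves like the untranslated case.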

Cristi

Rémi Denis-Courmont
Developer
Posts: 15266
Joined: 07 Jun 2004 16:01
VLC version: master
Operating System: Linux

Re: vlc 1.1.0 subtitles auto mode regression

Postby Rémi Denis-Courmont » 27 Jun 2010 02:54

secarica wrote:
> Rémi Denis-Courmont wrote:
> > If you configure VLC 1.1.1 to use the same language as your system, then you will get a code page that matches.
> Have you read my message above? Subtitle encoding in VLC 1.1.0 does not work properly with everything set to auto.
Emphasis modified. I wonder who should blame the other one for not reading.

