Not supported M3U in UTF-8 codepage

Victorian
Blank Cone
Posts: 37
Joined: 05 Oct 2008 11:10
Location: Chuvashia

Post by Victorian » 31 Oct 2009 11:48

After upgrading VLC media player to 1.0.3, playlists encoded in UTF-8 are no longer displayed correctly.
This is incompatible with previous versions of the program. Re-encoding the playlist to code page 1251 makes the list display correctly.
For example, look at the files in UTF-7, UTF-8 and UTF-16 and compare the channel names shown on the VLC screen with those in the playlist.
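The re-encoding workaround described here can be sketched in a few lines of Python. This is only an illustration, not part of VLC: the function name `recode_playlist` is hypothetical, and characters that do not exist in code page 1251 are replaced with "?".

```python
def recode_playlist(data: bytes, src: str = "utf-8-sig", dst: str = "cp1251") -> bytes:
    """Re-encode raw playlist bytes from one character encoding to another.

    "utf-8-sig" decodes UTF-8 with or without a leading byte-order mark.
    Characters missing from the target code page become "?".
    """
    text = data.decode(src)
    return text.encode(dst, errors="replace")
```

The result can then be written back to a .m3u file in binary mode.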
Last edited by Victorian on 31 Oct 2009 13:03, edited 1 time in total.

Rémi Denis-Courmont
Developer
Posts: 15326
Joined: 07 Jun 2004 16:01
VLC version: master
Operating System: Linux

Re: Not supported M3U in UTF-8 codepage

Post by Rémi Denis-Courmont » 31 Oct 2009 13:01

AFAIK, the encoding of M3U files is not well defined. Most software assumes the local system character encoding (so does VLC).
Rémi Denis-Courmont
https://www.remlab.net/
Private messages soliciting support will be systematically discarded

Post by Victorian » 31 Oct 2009 13:15

Rémi Denis-Courmont wrote: AFAIK, the encoding of M3U files is not well defined.
It is good that encoding 1251 is recognized.
However, UTF-16 (little endian) could also be recognized, because the high byte of Latin characters is zero. The program already reads and correctly recognizes the string "#EXTINF", so the high byte of the first character in the line is zero.
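The zero-byte idea can be sketched as follows. This is a rough heuristic for illustration only, with a hypothetical function name. It fails as soon as the first characters are outside Latin-1, for example a Cyrillic title at the start of the data.

```python
def guess_utf16_le(data: bytes) -> bool:
    """Guess UTF-16 LE by looking for zero high bytes in the first four code units.

    In UTF-16 LE, a Latin character such as "#" is stored as its code
    followed by a zero byte, so every second byte of "#EXTINF" is zero.
    """
    sample = data[:8]
    if len(sample) < 2:
        return False
    low_bytes = sample[0::2]   # bytes at even offsets
    high_bytes = sample[1::2]  # bytes at odd offsets
    return all(b != 0 for b in low_bytes) and all(b == 0 for b in high_bytes)
```

For instance, `"#EXTINF".encode("utf-16-le")` starts with the bytes `23 00 45 00`, which this check accepts, while the same string in an 8-bit code page is rejected.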

I think it is also possible to detect UTF-8, although in this case the differences matter only for extended characters in national character sets.
Rémi Denis-Courmont wrote: Most software assumes the local system character encoding (so does VLC).
At least the program loads and reads a UTF-16 playlist and even displays the listed streams. But the Russian characters are not displayed correctly, although the system would display them correctly if the Unicode string were passed to it unchanged.
Last edited by Victorian on 31 Oct 2009 13:33, edited 1 time in total.

Post by Rémi Denis-Courmont » 31 Oct 2009 13:28

Rémi Denis-Courmont wrote: AFAIK, the encoding of M3U files is not well defined.
Victorian wrote: It is good that encoding 1251 is recognized.
However, UTF-16 (little endian) could also be recognized, because the high byte of Latin characters is zero.
Yes and no. What if the first character in the M3U file is not part of Latin-1? The byte-order-mark is the only sane way to autodetect UTF-16 in text files, but there is no point in doing this for M3U files as many existing pieces of software will explode if you do that. Anyway, M3U is not supposed to contain UTF-16; it's always 8-bits, either Latin-1, UTF-8 or local character set depending on the authoring software used.
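Byte-order-mark detection for UTF-16 can be sketched like this. It is an illustration only, not VLC's behaviour, and the function name is hypothetical. It uses the BOM constants from Python's standard `codecs` module and ignores UTF-32, whose little-endian BOM begins with the same two bytes.

```python
import codecs
from typing import Optional


def utf16_byte_order(data: bytes) -> Optional[str]:
    """Return "little" or "big" if data starts with a UTF-16 BOM, else None."""
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "little"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "big"
    return None
```

Unlike the zero-byte guess, this check does not depend on which characters the file contains, only on whether the authoring software wrote a BOM.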
Victorian wrote: I think it is also possible to detect UTF-8, although in this case the differences matter only for extended characters in national character sets.
It is not always possible to autodetect UTF-8. For instance, while most valid UTF-8 byte sequences are actually UTF-8 encoded, any UTF-8 byte sequence is also a valid Latin-1 byte sequence.
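The ambiguity can be shown with a small Python sketch (the helper name is hypothetical). A strict UTF-8 decode can reject data, but a Latin-1 decode never can, because every byte value maps to some Latin-1 character.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data is a valid UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# A title taken from the playlist in this thread, stored as UTF-8 bytes.
utf8_bytes = "Воскресный тропарь".encode("utf-8")

# The same bytes also decode as Latin-1 without any error, producing
# unreadable text, so a successful decode alone proves nothing.
mojibake = utf8_bytes.decode("latin-1")
```

A validity check therefore only says that data could be UTF-8; it cannot rule out an 8-bit encoding.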
Rémi Denis-Courmont wrote: Most software assumes the local system character encoding (so does VLC).
Victorian wrote: At least the program loads and reads a UTF-16 playlist and even displays the listed streams. But the Russian characters are not displayed correctly, although the system would display them correctly if the Unicode string were passed to it unchanged.
If you really care about Unicode, you should not use brain-damaged Microsoft operating systems. Linux and MacOS are using UTF-8 for everything, which causes much fewer problems with VLC.

Post by Victorian » 31 Oct 2009 13:51

Rémi Denis-Courmont wrote: If you really care about Unicode, you should not use brain-damaged Microsoft operating systems. Linux and MacOS are using UTF-8 for everything, which causes much fewer problems with VLC.
I agree, because VLC is a program for x86.
One could simply convert the UTF-16 Unicode text into the required characters according to the national interface language set in the VLC settings.

The choice between CP1251 and UTF-8 is more complicated. Providers have already established the practice of using CP1251 in playlists, but the switch between UTF-8 and CP1251 in VLC breaks compatibility with previous versions. I just wonder how Latin-based encodings and UTF-8 can be supported at the same time.
AFAIK, addresses so far use only Latin characters, and extended characters are used only in lines that start with the "#" sign.
Therefore an option to choose the encoding could be added to the program itself, set to UTF-8 by default in VLC 1.0.x.
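The proposal can be sketched as a small parser in which the encoding is a user-selectable option. This is a hypothetical illustration, not VLC code. It assumes, as stated above, that address lines are plain ASCII and that only "#EXTINF" lines carry national characters.

```python
from typing import List, Optional, Tuple


def parse_m3u(data: bytes, text_encoding: str = "utf-8") -> List[Tuple[Optional[str], str]]:
    """Parse raw M3U bytes into (title, address) pairs.

    text_encoding plays the role of the proposed user option: it is applied
    only to the title part of "#EXTINF:<duration>,<title>" lines.
    """
    entries = []
    title = None
    for raw in data.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(b"#EXTINF:"):
            # Everything after the first comma is the display title.
            _, _, raw_title = line.partition(b",")
            title = raw_title.decode(text_encoding, errors="replace")
        elif line.startswith(b"#"):
            continue  # other comment or directive lines are skipped
        else:
            # Address lines are assumed to be ASCII only.
            entries.append((title, line.decode("ascii", errors="replace")))
            title = None
    return entries
```

Calling it with `text_encoding="cp1251"` or `text_encoding="utf-8"` changes only how the titles are decoded. The addresses come out the same either way.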

Post by Rémi Denis-Courmont » 31 Oct 2009 14:05

VLC 1.1.0 supports Apple's .M3U8 extension which forces UTF-8 encoded M3U files.
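The extension-based rule can be sketched as follows. This is not VLC's actual implementation and the function name is hypothetical: a .m3u8 file is treated as UTF-8, and anything else falls back to the local system encoding.

```python
import locale
from pathlib import PurePath


def playlist_encoding(filename: str) -> str:
    """Pick a text encoding for a playlist based on its file extension."""
    if PurePath(filename).suffix.lower() == ".m3u8":
        return "utf-8"  # the .m3u8 extension forces UTF-8
    # Plain .m3u: assume the local system character encoding.
    return locale.getpreferredencoding(False)
```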

Really, M3U is just screwed. Any format that does not specify the character set to begin with, is screwed. You should really use XSPF instead nowadays.

Post by Victorian » 31 Oct 2009 14:41

Rémi Denis-Courmont wrote: You should really use XSPF instead nowadays.
Well, this format makes it possible to specify a "strange" path to files that contains mixed characters from different character sets.
But I have not found a way there to add a comment, exactly as in M3U, to display the channel or file name.
I just tried saving an M3U playlist in XSPF format, and the comments disappeared.
Can you suggest how to save the playlist with the names?

For example, I have a list containing a file name with Russian, English and Greek characters that cannot fit together in the Latin character set or in CP1251.
Here is the file encoded in CP1251, which has no Greek letters.
Here is the file in UTF-8, where the path is correct, but which is not supported in the new VLC 1.0.3.
Here is the XSPF file, which has no descriptions.
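In XSPF, the name of an entry normally goes into a `<title>` element inside each `<track>`. A minimal sketch of writing such a file by hand with Python's standard library follows. The function name is hypothetical, and this says nothing about what VLC 1.0.3 itself writes when saving.

```python
import xml.etree.ElementTree as ET

XSPF_NS = "http://xspf.org/ns/0/"


def build_xspf(tracks) -> bytes:
    """Build a UTF-8 XSPF document from (title, location) pairs."""
    ET.register_namespace("", XSPF_NS)  # serialize XSPF as the default namespace
    playlist = ET.Element(f"{{{XSPF_NS}}}playlist", {"version": "1"})
    track_list = ET.SubElement(playlist, f"{{{XSPF_NS}}}trackList")
    for title, location in tracks:
        track = ET.SubElement(track_list, f"{{{XSPF_NS}}}track")
        ET.SubElement(track, f"{{{XSPF_NS}}}location").text = location
        ET.SubElement(track, f"{{{XSPF_NS}}}title").text = title
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(playlist, encoding="utf-8", xml_declaration=True)
```

Because the XML declaration names the encoding, titles in Russian and Greek can sit side by side in one file.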

Post by Rémi Denis-Courmont » 31 Oct 2009 14:50

Then try file:// UTF-8 URIs. Use percent-encoding for non-ASCII bytes. That should work.
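Such a URI can be produced with Python's standard `urllib.parse.quote`, which percent-encodes the UTF-8 bytes of every non-ASCII character. The helper name below is hypothetical, and the sketch assumes a Windows path with a drive letter.

```python
from urllib.parse import quote


def windows_path_to_file_uri(path: str) -> str:
    """Turn a Windows path such as C:\\dir\\file into a file:// URI."""
    posix_style = path.replace("\\", "/")
    # Keep "/" and the drive colon readable; every other reserved or
    # non-ASCII character is percent-encoded as UTF-8.
    return "file:///" + quote(posix_style, safe="/:")
```

Spaces come out as `%20`, and each Cyrillic or Greek letter becomes two percent-encoded bytes.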

Post by Victorian » 31 Oct 2009 15:03

Rémi Denis-Courmont wrote: Then try file:// UTF-8 URIs. Use percent-encoding for non-ASCII bytes. That should work.
Excellent, it works!
Thank you!

File IP-TV_CP1251.m3u
...
#EXTINF:-1,InterAz
http://172.20.0.11:9000/tv/interaz
#EXTINF:0,Воскресный тропарь
file:///C:/%D0%A5%D1%80%D0%B8%D1%81%D1%82%D0%B8%D0%B0%D0%BD%D1%81%D1%82%D0%B2%D0%BE/%D0%9F%D1%80%D0%B0%D0%B2%D0%BE%D1%81%D0%BB%D0%B0%D0%B2%D0%BD%D0%BE%D0%B5/%D0%9F%D0%B5%D1%81%D0%BD%D0%BE%D0%BF%D0%B5%D0%BD%D0%B8%D1%8F/%D0%A1%D0%B5%D1%80%D0%B1%D1%81%D0%BA%D0%B8%D0%B5%20%D0%B8%20%D0%92%D0%B8%D0%B7%D0%B0%D0%BD%D1%82%D0%B8%D0%B9%D1%81%D0%BA%D0%B8%D0%B5%20%D0%BF%D0%B5%D1%81%D0%BD%D0%BE%D0%BF%D0%B5%D0%BD%D0%B8%D1%8F/%D0%94%D0%B8%D0%B2%D0%BD%D0%B0%20%D0%89%D1%83%D0%B1%D0%BEj%D0%B5%D0%B2%D0%B8%D1%9B/%D0%94%D0%B8%D0%B2%D0%BD%D0%B0%20%D0%89%D1%83%D0%B1%D0%BEj%D0%B5%D0%B2%D0%B8%D1%9B%20%D0%B8%20%D0%A1%D1%82%D1%83%D0%B4%D0%B8%D1%8F%20%D0%9C%D0%B5%D0%BB%C3%B3%D0%B4%D0%B8%20-%20%CE%A7%CF%81%CE%B9%CF%83%CF%84%CE%BF%CF%82%20%CE%91%CE%BD%CE%B5%CF%83%CF%84%CE%B9.flv
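To check what such a line contains, the percent-encoding can be reversed with the standard `urllib.parse.unquote`. The two segments below are copied from the path above, and the helper name is hypothetical.

```python
from urllib.parse import unquote


def decode_uri_segment(segment: str) -> str:
    """Decode a percent-encoded URI segment whose bytes are UTF-8."""
    return unquote(segment, encoding="utf-8", errors="strict")


# First directory name of the path (Cyrillic).
cyrillic = decode_uri_segment(
    "%D0%A5%D1%80%D0%B8%D1%81%D1%82%D0%B8%D0%B0%D0%BD%D1%81%D1%82%D0%B2%D0%BE"
)
# A word from the file name (Greek).
greek = decode_uri_segment("%CE%A7%CF%81%CE%B9%CF%83%CF%84%CE%BF%CF%82")
```

Both scripts decode correctly from one ASCII-only line, which is why the playlist itself can stay in CP1251.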

Post by Victorian » 31 Oct 2009 18:28

Victorian wrote: I just tried saving an M3U playlist in XSPF format, and the comments disappeared.
This is a bug!

It is not normal that descriptions are not kept when a loaded playlist is saved in XSPF format, and that manually created descriptions are not saved either.

However, as it turned out, an XSPF file prepared by hand can be sorted and saved again without problems. The descriptions are preserved in this case.

Is it possible to fix this?

P.S. After loading the XSPF file, it turned out that the sort order is lost, but the descriptions are preserved.

Post by Victorian » 31 Oct 2009 20:40

Another bug.

VLC 1.0.3 reads a playlist encoded in CP1251 for the Russian language, but saves it in UTF-8.

Post by Victorian » 14 Dec 2009 17:09

Is it possible to use *.m3u files in both encodings, OEM (e.g. CP1251) and UTF-8, distinguishing them by the BOM, as is done in FAR Manager?
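The behaviour asked about here could look roughly like the sketch below: treat the file as UTF-8 only when it starts with the UTF-8 BOM, otherwise fall back to a configured 8-bit code page. This is a hypothetical illustration. It describes neither VLC nor FAR Manager internals.

```python
import codecs


def decode_playlist(data: bytes, fallback: str = "cp1251") -> str:
    """Decode playlist bytes as UTF-8 if a BOM is present, else as fallback."""
    if data.startswith(codecs.BOM_UTF8):  # bytes EF BB BF
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    return data.decode(fallback)
```

A UTF-8 file saved without a BOM would still be read in the fallback code page, so this only works if the authoring software writes the BOM.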

