flotul wrote: ↑Wed Feb 12, 2025 6:06 pm
This code will give different results when run in LB or LBB.
You can see from your hex dump that the text you are reading from the MP3 file is encoded in Unicode (UTF-16) format. Indeed there is an explicit UTF-16 BOM (Byte Order Mark) FF FE immediately preceding the text.
If you make allowance in your code for it being UTF-16, you should find that LB and LBB behave the same way (LBB has built-in support for UTF-8 but not for UTF-16; LB4 supports neither).
Is there a better way to extract this type of data with LBB please?
There's nothing wrong with the method you have used. Your mistake is that you expected the text to be in ASCII/ANSI format but it isn't. I haven't looked at the MP3 specification but it may be that other text encodings are supported, in which case your program would need to be able to adapt to those too.
How much effort you put into decoding UTF-16 will depend on whether your program needs to support accented characters, foreign alphabets (e.g. Cyrillic, Greek), right-to-left printing languages (e.g. Hebrew) and/or complex scripts (e.g. Arabic). Unicode text handling is a very complicated subject!
One approach you could consider is using the Windows WideCharToMultiByte API function to convert the UTF-16 text to UTF-8, and then use LBB's built-in support for UTF-8 to print it out.
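As a rough illustration of what that conversion produces, here is a minimal sketch. It is in Python rather than Liberty BASIC, only so that it can be run without Windows. It does not call WideCharToMultiByte; it uses Python's built-in codecs for the equivalent UTF-16 to UTF-8 step. The function name `utf16_field_to_utf8` and the sample bytes are made up for the example.

```python
def utf16_field_to_utf8(raw: bytes) -> bytes:
    """Convert a BOM-prefixed UTF-16 text field to UTF-8 bytes.

    Assumes the field starts with a byte order mark (FF FE or FE FF),
    as in the hex dump discussed above.
    """
    # The "utf-16" codec reads the BOM, picks the byte order and drops the BOM.
    text = raw.decode("utf-16")
    # Text fields may be null-terminated; keep only what precedes the null.
    text = text.split("\x00", 1)[0]
    return text.encode("utf-8")


# Hypothetical sample: FF FE BOM followed by "Café" in little-endian UTF-16.
sample = b"\xff\xfe" + "Café".encode("utf-16-le")
print(utf16_field_to_utf8(sample))
```

The accented character survives as a two-byte UTF-8 sequence, which is the form LBB's built-in UTF-8 support expects.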
flotul wrote: ↑Thu Feb 13, 2025 8:40 am
I'll have a look at the API but my inexperience there will probably keep me away from that solution.
Fair enough. The API is the best approach if you want to retain accents and foreign-language characters, and you should have no trouble finding existing Liberty BASIC code to call it that you can copy. But if you don't need them, you can just do a crude UTF-16 to UTF-8 (or UTF-16 to ASCII) conversion yourself in BASIC code.
I don't know what part of the world you are from, but dealing with international character sets is commonplace in most regions, though sadly not in the USA, where it can be something of a culture shock.
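The crude UTF-16 to ASCII conversion mentioned above could look something like the sketch below. It is written in Python purely to show the byte-level logic; the same steps map onto mid$ and asc in BASIC. The function name `crude_utf16le_to_ascii` and the choice of "?" as a replacement character are assumptions of the example, not anything from the MP3 format.

```python
def crude_utf16le_to_ascii(raw: bytes) -> str:
    """Crudely reduce BOM-prefixed little-endian UTF-16 text to ASCII.

    Any character outside the ASCII range is replaced with '?'.
    Input without an FF FE BOM is assumed to be single-byte text already.
    """
    if raw[:2] != b"\xff\xfe":
        return raw.decode("latin-1")
    out = []
    # Walk the 16-bit units after the BOM: low byte first, then high byte.
    for i in range(2, len(raw) - 1, 2):
        low, high = raw[i], raw[i + 1]
        if low == 0 and high == 0:
            break  # null terminator: end of the text
        if high == 0 and low < 128:
            out.append(chr(low))  # plain ASCII character
        else:
            out.append("?")  # accented or non-Latin character
    return "".join(out)


print(crude_utf16le_to_ascii(b"\xff\xfe" + "Café".encode("utf-16-le")))
```

Checking the high byte as well as the low byte is what stops a Cyrillic or Greek letter from being mistaken for an unrelated ASCII character.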
I don't have LBB loaded on my current PC, but I have coded this. I'm not sure I understand why there would be a difference when running under LBB. As I see it, while the text in the file may be UTF-16 encoded, the data bytes are still just bytes, so the tag names and size bytes can be read as normal. The difficulty is only in reading the text, which, as Richard has clarified, starts with the FF FE marker. Looking at the MP3 file, we can see that the text uses two bytes to define each character.
This code seems to extract what you want, though I have fudged TCON and TBPM because I don't yet fully understand the encoding for those tags; they can be found, however.
There is still some work to do, but it may help you on the way. (We may need to handle FE FF as well as FF FE; that is easy enough, as it only means stepping through the bytes in a different order.)
I do understand that this is the easy kind of UTF decoding; other encodings are, as Richard points out, far more complex.
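filedialog "Open media file", "*.mp3", fileName$
if fileName$="" then end 'the dialog was cancelled
open fileName$ for input as #title
'read at most the first 1028 bytes, where the tags are expected
l=lof(#title)
if l>1028 then l=1028
s$=input$(#title,l)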
t=instr(s$,"TIT2")
p=instr(s$,"TPE1")
c=instr(s$,"TCON")
b=instr(s$,"TBPM")
if p>0 then
'find the length of the performer's name
l=asc(mid$(s$,p+7,1))+asc(mid$(s$,p+6,1))*256+asc(mid$(s$,p+5,1))*65536+asc(mid$(s$,p+4,1))*16777216
perfo$=unitoasc$(mid$(s$,p+11,l-1))
else
perfo$="Unknown"
end if
if t>0 then
l=asc(mid$(s$,t+7,1))+asc(mid$(s$,t+6,1))*256+asc(mid$(s$,t+5,1))*65536+asc(mid$(s$,t+4,1))*16777216
title$=unitoasc$(mid$(s$,t+11,l-1))
else
title$="Unknown"
end if
if c>0 then
'fudge: reads a fixed 12 bytes instead of using the frame's size bytes
content$=unitoasc$(mid$(s$,c+11,12))
else
content$="Unknown"
end if
if b>0 then
'fudge: reads a fixed 3 bytes instead of using the frame's size bytes
bpm$=unitoasc$(mid$(s$,b+11,3))
else
bpm$="Unknown"
end if
close #title
print perfo$
print title$
print content$
print bpm$
m$=GetShortPathName$(fileName$)
'open song
r$=mciSendString$("open "+m$+" alias song")
'r$=mciSendString$("open "+m$+" type MpegVideo alias song")
'set song volume
vol=500
'get song length
songlength = VAL(mciSendString$("status song length"))
min=int(songlength/1000/60)
sec=int(songlength/1000-min*60)
songmin$=right$("00"+str$(min),2)
songsec$=right$("00"+str$(sec),2)
'play song
r$=mciSendString$("setaudio song volume to "+str$(vol))
r$=mciSendString$("play song")
wait
function GetShortPathName$(lPath$)
lPath$=lPath$+chr$(0)
sPath$=space$(256)
lenPath=len(sPath$)
calldll #kernel32, "GetShortPathNameA",lPath$ as ptr,_
sPath$ as ptr,lenPath as long,r as long
GetShortPathName$=left$(sPath$,r)
end function
function mciSendString$(s$)
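buffer$=space$(1024)+chr$(0)
'the length passed must not exceed the size of the buffer itself
calldll #winmm,"mciSendStringA",s$ as ptr,buffer$ as ptr,_
1024 as long, 0 as long, r as long
'keep only the text before the terminating null
buffer$=left$(buffer$,instr(buffer$,chr$(0))-1)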
if r>0 then
buffer2$=space$(129)
calldll #winmm,"mciGetErrorStringA", r as long, buffer2$ as ptr,_
128 as ulong, r as boolean
mciSendString$=buffer2$
else
mciSendString$=buffer$
end if
end Function
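function unitoasc$(u$)
'text starting with the FF FE marker is little-endian UTF-16
if left$(u$,2)=chr$(hexdec("FF"))+chr$(hexdec("FE")) then
'step through the double-byte unicode, keeping the first byte of each pair
for n=3 to len(u$) step 2
'stop at the terminating null before it gets appended
if mid$(u$,n,1)=chr$(0) then exit for
unitoasc$=unitoasc$+mid$(u$,n,1)
next
else
'no marker: assume single-byte text and return it unchanged
unitoasc$=u$
end if
end function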
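
For comparison, the frame layout that the listing above relies on can be sketched as follows. This is in Python rather than Liberty BASIC, purely to show the byte arithmetic. It assumes an ID3v2.3-style frame: a 4-byte ID, a 4-byte big-endian size, 2 flag bytes, then 1 encoding byte followed by the text. The function name `read_text_frame` is made up for the example, and ID3v2.4 files, which store sizes differently, are not handled.

```python
import struct
from typing import Optional


def read_text_frame(data: bytes, frame_id: bytes) -> Optional[str]:
    """Find a text frame such as TIT2 in data and return its text.

    Returns None when the frame ID is not found.
    """
    pos = data.find(frame_id)
    if pos < 0 or pos + 11 > len(data):
        return None
    # ">I" is an unsigned 32-bit big-endian number: the same value the
    # BASIC code builds with asc(...)*256, *65536 and *16777216.
    (size,) = struct.unpack(">I", data[pos + 4 : pos + 8])
    encoding = data[pos + 10]
    # The size counts the encoding byte, so the text is size-1 bytes long.
    body = data[pos + 11 : pos + 10 + size]
    if encoding == 1:
        # UTF-16 with a BOM: the codec accepts both FF FE and FE FF.
        text = body.decode("utf-16", errors="replace")
    else:
        # Anything else is treated here as single-byte Latin-1 text.
        text = body.decode("latin-1")
    return text.split("\x00", 1)[0]
```

Reading the length from the size bytes in this way works for TCON and TBPM as well, since they are ordinary text frames, so the fixed lengths used for them in the listing should not be needed.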