Conclusion:
The firmware has no .ogg Vorbis support; it supports .mp3/.wma/.aa only, regardless of the published specifications.
The FM radio tuner chip is physically missing on the .US and .CA models. The player shows up as a 20GB USB Mass Storage device, but simply dragging audio files onto it does not work; it needs a special utility to regenerate the database of song names. The internal hardware uses the same chip as the iPod mini, iPod Photo and iRiver H10/H320, made by PortalPlayer.
A friend of mine got a Samsung YH-925 mp3 player. The main reason for getting that model was that it claims to be able to play Ogg Vorbis and Audible (some weird proprietary audio-book format) files.
It has a colour 160x100 screen, Up/Down/Left/Right buttons as well as play, stop and the like. Inside there is a 20GB hard disk. I think it also comes in other sizes of 10GB and 5GB, with model names YH-920 and something else. It claims to be able to act as a USB host to another USB Mass Storage device, such as a pendrive or digital camera, for copying off and downloading files.
Plugging it into the first Debian box made the USB stack hang and there were feelings that this was probably an expensive Ebay-paperweight. Plugging it into another Ubuntu laptop had it magically pop up on the desktop as an 18.7GB USB Disk. Groovy.
You can drag files onto the gadget just like an external hard-disk drive, but this won't actually help the device to play them. It won't see the audio files unless they are listed in its special database inside the System directory.
The database file is named System/DATA/PP5000.dat, with the index files being called System/DATA/PP5000_????.idx, and the list of headers is PP5000.hdr
    $ hexdump -C System/DATA/PP5000.hdr | head -6
    00000000  00 00 00 00 00 00 00 00  53 00 79 00 73 00 74 00  |........S.y.s.t.|
    00000010  65 00 6d 00 5c 00 44 00  41 00 54 00 41 00 5c 00  |e.m.\.D.A.T.A.\.|
    00000020  50 00 50 00 35 00 30 00  30 00 30 00 2e 00 64 00  |P.P.5.0.0.0...d.|
    00000030  61 00 74 00 00 00 00 00  00 00 00 00 00 00 00 00  |a.t.............|
This file contains UCS-2 formatted data (MS Windows stylie Unicode limited to 16 bits per character). For some reason, decoding as UCS-2 didn't work, so let's use UTF-16 and force the byte order (endianness):
    $ recode UTF-16LE..UTF-8 < System/DATA/PP5000.hdr | strings
    System\DATA\PP5000.dat
    System\DATA\PP5000.hdr
    System\DATA\PP5000_@DEV.idx
    System\DATA\PP5000_FPTH.idx
    System\DATA\PP5000_FNAM.idx
    System\DATA\PP5000_FRMT.idx
    System\DATA\PP5000_TPE1.idx
    System\DATA\PP5000_TALB.idx
    System\DATA\PP5000_TCON.idx
    System\DATA\PP5000_TIT2.idx
    (LPTX\
    (HLPTX
Okay, so from that file we can see all the other parts of the database that the software in the player is expecting to find.
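The same recode-and-strings trick can be sketched in Python. This is only a rough equivalent, assuming the text is printable ASCII stored as UTF-16LE (as the hexdump suggests); the function name `utf16le_strings` and the minimum run length are made up for illustration.

```python
import re


def utf16le_strings(data: bytes, min_chars: int = 4) -> list[str]:
    """Return runs of printable ASCII characters that are stored as UTF-16LE.

    Each such character is one byte in 0x20..0x7e followed by a 0x00 byte.
    Like `strings`, this is a heuristic scan and does not check alignment.
    """
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_chars)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


if __name__ == "__main__":
    # Path taken from the hexdump command above; adjust to the mount point.
    with open("System/DATA/PP5000.hdr", "rb") as handle:
        for text in utf16le_strings(handle.read()):
            print(text)
```

Unlike piping through recode, this works on the raw bytes, so binary fields between the strings cannot upset the decoder.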
    $ cat System/DATA/PP5000.dat | recode UTF-16LE..UTF-8 | strings | head -6
    System\MUSIC\
    Yepp-1 groove.mp3
    Samsung Electronics Co., Ltd.
    Yeppie
    Funk
    Yepp groove
Looks like directory Path, Filename, Copyright, Artist, Genre and Style at a guess. Searching around for a bit for 'pp5000.dat' I must have mistyped, and eventually came across a site detailing the file format of the Philips HDD100 and Philips HDD120 Audio Jukeboxes. Bingo: identical, with the files called 'db5000' instead of 'pp'. So what does 'PP' stand for?
    $ strings FW_YH925.mi4 | head -7
    PPOS
    portalplayer
    PP5020AF-05.11-SM05-02.13-GS01-01.00-DT
    2004.11.22 (Build 38)
    Digital Media Platform
    Copyright(c) 1999 - 2003 PortalPlayer, Inc.
    All rights reserved.
PP is therefore PortalPlayer, who make various all-in-one System-on-Chip (SoC) designs for portable media players. Wikipedia tells us that the same chip (PP5020) is also behind lots of other media players: iPod mini, iPod Photo (remember the colour screen) and iRiver H10. The '5000' part comes from the series number, so let's log in to Wikipedia and add the Samsung YH-925 and Philips HDD100 now that we know about them.
To identify this device a bit more specifically, its USB ID is 04e8:5024:

    $ lsusb -v
    ...
    P: Vendor=04e8 ProdID=5024 Rev= 0.01
    S: Manufacturer=Samsung
    S: Product=Digital Audio Player
Samsung had a few previous products called the 'YEPP'. So a bit of Googling brought up a program called Sulu, which stands for 'Samsung Uproar Linux Utility'; this seems to be based around Microsoft's MTP (Media Transfer Protocol) for talking to media players that don't want to show up as a hard disk. I tried faffing around and patching it to add the USB ID and input/output endpoints grabbed from lsusb -v. No success: the YH-925 really is a nice simple USB player that still needs its own database filling in every time music is added, just like the iPod and several others.
Now to start dreaming about iPodLinux and PodZilla... {dreaming}. However, see the comment about finding the bootloader ROM image below!
The Playlists seem to be other ways of grouping a load of songs for playing one after another. They live in System/Params/Playlist Name.plp and are nice and easy to start processing. Again, these are stored in UCS-2 form:

    $ cat 'System/Parms/Now Playing.plp' | recode UTF-16LE..UTF-8 | cat -v
    PLP PLAYLIST^M
    VERSION 1.20^M
    ^M
    HDD, System\MUSIC\Yepp_2_funk.mp3^M
    HDD, System\MUSIC\Yepp-1 groove.mp3^M

Which is a header followed by a blank line, then various 'HDD, ' plus filename entries, all terminated MS Windows style with carriage return + newline (\r\n).
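A minimal Python sketch of reading and writing that playlist layout follows. It assumes only what the `cat -v` output shows: UTF-16LE text, CRLF line endings, the two header lines, a blank line and then 'HDD, ' entries. The function names are made up, and whether the player accepts a file written this way is untested.

```python
PLP_HEADER = ["PLP PLAYLIST", "VERSION 1.20", ""]
ENTRY_PREFIX = "HDD, "


def parse_plp(raw: bytes) -> list[str]:
    """Return the track paths listed in a .plp playlist."""
    text = raw.decode("utf-16-le").removeprefix("\ufeff")  # tolerate a BOM
    lines = text.split("\r\n")
    if lines[:2] != PLP_HEADER[:2]:
        raise ValueError("not a PLP playlist")
    return [line[len(ENTRY_PREFIX):]
            for line in lines[2:]
            if line.startswith(ENTRY_PREFIX)]


def build_plp(paths: list[str]) -> bytes:
    """Serialise track paths back into the same layout (assumed, untested on the player)."""
    lines = PLP_HEADER + [ENTRY_PREFIX + path for path in paths]
    return ("\r\n".join(lines) + "\r\n").encode("utf-16-le")
```

For example, `parse_plp(open('System/Parms/Now Playing.plp', 'rb').read())` should give back the two Yepp tracks shown above.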
A suitable workflow for a command-line tool to put music on the device is just to load it on by hand, then scrape all the MP3/OGG files on the device for their names and ID3 tags and fill in the database. A suitable workflow for a graphical client might be to present it as a special GNOME/kioslave window, allow files to be dragged to and from it, display the various columns and update the database on the go rather than starting from scratch and overwriting the database each time.
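The first half of the command-line workflow (finding the files) might look something like this in Python. It is a sketch only: the extension list and the function name `device_paths` are assumptions, and it stops short of reading ID3 tags or writing the database. The (directory, filename) split mirrors the 'System\MUSIC\' and 'Yepp-1 groove.mp3' strings seen in the '.dat' file.

```python
import os

# Assumption: which extensions are worth listing; the firmware may accept fewer.
AUDIO_EXTENSIONS = {".mp3", ".wma", ".ogg"}


def device_paths(mount_point: str, subdir: str = "System/MUSIC") -> list[tuple[str, str]]:
    """Return (directory, filename) pairs using backslash-separated device paths."""
    found = []
    root = os.path.join(mount_point, subdir)
    for dirpath, _dirnames, filenames in os.walk(root):
        relative = os.path.relpath(dirpath, mount_point)
        device_dir = relative.replace(os.sep, "\\") + "\\"
        for name in filenames:
            if os.path.splitext(name)[1].lower() in AUDIO_EXTENSIONS:
                found.append((device_dir, name))
    return sorted(found)
```

Called as `device_paths('/media/usbdisk')` (the mount point is hypothetical), it returns the rows that a database writer would then need to fill in with tag data.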
But, before we start playing around and breaking things, we need to check that it works with the software under MS Windows so that if it doesn't play .Oggs (the original purpose) it can be sold to somebody else. So, where to find an MS Windows XP machine?...
The device shows up under Windows XP as a Mass Storage Device. Microsoft's media player knows how to upload/download files to the player; I presume this is using their "Media Transfer Protocol" (MTP) system, so the player itself may be updating the database, or it may be the Windows driver twiddling bits. The Microsoft Media Player didn't want to add OGG files to the device since it doesn't know what they are. I didn't install the Napster client, but this is the likely next step as it is an MTP-compatible program and, since it was supplied by Samsung with the device, should know about Audible and OGG files.
The Samsung Media Studio program seems only to be concerned with photo-editing and transferring tiny thumbnail piccys to/from.
I can't make EasyH10 grok the databases; it wants a ''Model Template'' included in the top-level directory of the device. This file starts with MDEL and seems to have something that looks like the '.hdr' header file appended. This would make sense, as the header file contains all the necessary information to generate index files and contains the field descriptions. Despite having the source, I can't work out what is going on, since EasyH10 doesn't appear to read the 'MDEL' part, only to write it!
EasyH10 is already somewhat well-designed as it allows templates to be loaded for about a dozen different varieties of iRiver H10 size and firmware combinations.
There is a blog entry about a python utility for the HDD100 which turns out to be very similar; go and fetch:
wget http://kvota.net/hacks/philips-hdd100/{create-index,ID3,mp3}.py
I've had the most success hacking with this. It is written in Python so great for trying things out; but this was originally designed to produce the database files for the Philips player and is purely write-only; it has no idea how to parse the files and instead just uses various hard-coded pointers. (Which are different for the YH-925GS).
By writing something that actually understands the format of the '.hdr' file it should be possible to have something that generically works with most player types, regardless of whether they add or remove fields.
EasyH10 has a fairly full specification of their reading of the H10 database.
The changes between the Philips HDD100 and YH-925 format so far are:
Which | Filename | Use |
---|---|---|
HDD100 | db5000.hdr | Database schema |
YH-925 | PP5000.hdr | Database schema |
HDD100 | db5000.dat | Data |
YH-925 | PP5000.dat | Data |
HDD100 | DB5000_????.IDX | Index |
YH-925 | PP5000_????.idx | Index |

Which | Index | Use |
---|---|---|
both | @DEV | hidden |
both | FNAM | name |
both | FPTH | path |
YH-925 | FRMT | format |
both | TALB | album |
both | TCON | genre |
both | TIT2 | title |
both | TPE1 | artist |
HDD100 | TRCK | tracknumber |
HDD100 | XSRC | source |
George Deka emailed to say that his m:robe (he didn't say the model) uses the same storage format and that there are some scripts up on www.mrobe.org, but from a quick glance they look fairly well hidden and require registration and passwords!
Let's knock up some Python code to have an oogle at the '.hdr' file:
    # 'sdb' is the base class from the rest of the script (not shown here).
    # Each field is (name, type, length).

    class PPOS_hdr_entry(sdb):
        fields = [('id', int, 4),
                  ('type', int, 4),
                  ('length', int, 4),
                  ('foo5', int, 4),
                  ('foo6', int, 4),
                  ('indexed', int, 4),
                  ('foo7', int, 4),
                  ('foo8', int, 4),
                  ('idx_filename', unicode, 256)]
        allowed = {'field_type': {1: unicode, 2: int}}

    class PPOS_hdr_header(sdb):
        fields = [('id', int, 4),
                  ('type', int, 4),
                  ('datafile', unicode, 256),
                  ('foo4', int, 4),
                  ('headerfile', unicode, 256),
                  ('foo6', int, 4),
                  ('rows', int, 4),
                  ('inactive', int, 4),
                  ('columns', int, 4)]
id | type | datafile | foo4 | headerfile | foo6 | rows | inactive | columns |
---|---|---|---|---|---|---|---|---|
0 | 0 | System\DATA\PP5000.dat | 0 | System\DATA\PP5000.hdr | 1064 | 4 | 0 | 18 |

id | type | length | foo5 | foo6 | indexed | foo7 | foo8 | idx_filename |
---|---|---|---|---|---|---|---|---|
61441 | 2 | 4 | 0 | 0 | 1 | 0 | 0 | System\DATA\PP5000_@DEV.idx |
61442 | 1 | 128 | 0 | 0 | 1 | 0 | 0 | System\DATA\PP5000_FPTH.idx |
61443 | 1 | 128 | 0 | 0 | 1 | 0 | 0 | System\DATA\PP5000_FNAM.idx |
61450 | 2 | 4 | 0 | 0 | 1 | 0 | 0 | System\DATA\PP5000_FRMT.idx |
61445 | 2 | 4 | 0 | 0 | 0 | 0 | 0 | |
61446 | 2 | 4 | 0 | 0 | 0 | 0 | 0 | |
61447 | 2 | 4 | 0 | 0 | 0 | 0 | 0 | |
60 | 1 | 40 | 0 | 0 | 1 | 0 | 0 | System\DATA\PP5000_TPE1.idx |
28 | 1 | 40 | 0 | 0 | 1 | 0 | 0 | System\DATA\PP5000_TALB.idx |
31 | 1 | 20 | 0 | 0 | 1 | 0 | 0 | System\DATA\PP5000_TCON.idx |
46 | 1 | 40 | 0 | 0 | 1 | 0 | 0 | System\DATA\PP5000_TIT2.idx |
67 | 2 | 4 | 0 | 0 | 0 | 0 | 0 | |
78 | 2 | 4 | 0 | 0 | 0 | 0 | 0 | |
61449 | 2 | 4 | 0 | 0 | 0 | 0 | 0 | |
57344 | 2 | 4 | 0 | 0 | 0 | 0 | 0 | |
57345 | 1 | 40 | 0 | 0 | 0 | 0 | 0 | |
131 | 1 | 10 | 0 | 0 | 0 | 0 | 0 | |
132 | 1 | 64 | 0 | 0 | 0 | 0 | 0 | |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
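For anyone without the 'sdb' helper, roughly the same '.hdr' parse can be sketched with the standard `struct` module in Python 3. The field names follow the classes above. Two things are assumptions rather than facts from the dump: that the integers are little-endian 32-bit values, and that the 256 in the string fields counts 16-bit characters (512 bytes) rather than bytes. `STR_CHARS`, `Column` and `parse_hdr` are names made up for this sketch.

```python
import struct
from dataclasses import dataclass

STR_CHARS = 256            # assumption: string lengths count UTF-16 code units
STR_BYTES = STR_CHARS * 2  # change to 256 if the lengths turn out to be bytes

# id, type, datafile, foo4, headerfile, foo6, rows, inactive, columns
HEADER_FMT = f"<II{STR_BYTES}sI{STR_BYTES}sIIII"
# id, type, length, foo5, foo6, indexed, foo7, foo8, idx_filename
ENTRY_FMT = f"<IIIIIIII{STR_BYTES}s"


def _text(raw: bytes) -> str:
    """Decode a fixed-size UTF-16LE field and cut it at the first null."""
    return raw.decode("utf-16-le", errors="replace").split("\x00", 1)[0]


@dataclass
class Column:
    id: int
    type: int          # 1 = string, 2 = integer, per the 'allowed' mapping above
    length: int
    indexed: bool
    idx_filename: str


def parse_hdr(data: bytes):
    """Return (datafile, headerfile, rows, inactive, [Column, ...])."""
    head = struct.unpack_from(HEADER_FMT, data, 0)
    rows, inactive, columns = head[6], head[7], head[8]
    offset = struct.calcsize(HEADER_FMT)
    cols = []
    for _ in range(columns):
        f = struct.unpack_from(ENTRY_FMT, data, offset)
        cols.append(Column(f[0], f[1], f[2], bool(f[5]), _text(f[8])))
        offset += struct.calcsize(ENTRY_FMT)
    return _text(head[2]), _text(head[4]), rows, inactive, cols
```

If the parsed filenames come out garbled or shifted, the string-length assumption is the first thing to revisit.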
More fiddling, and from that information we can get it to automatically generate a schema for the '.dat' file; this should enable it to work on all players using a similar format, as it doesn't have to know the exact details of each one and can use the meta-data from .hdr to calculate them. A major awkwardness turned out to be the variable-length strings: those have to be read until you hit an aligned Unicode null terminator (two null bytes in a row on a 16-bit boundary).
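The alignment matters because a naive search for two zero bytes can land in the middle of a character. A small Python 3 sketch of the aligned read (the function name is made up, and it assumes the string starts on an even offset relative to `start`):

```python
def read_utf16_cstring(buf: bytes, start: int) -> tuple[str, int]:
    """Read a null-terminated UTF-16LE string beginning at `start`.

    Steps two bytes at a time so the terminator is only recognised on a
    character boundary. Returns (text, offset just past the terminator).
    """
    pos = start
    while pos + 1 < len(buf):
        if buf[pos] == 0 and buf[pos + 1] == 0:
            return buf[start:pos].decode("utf-16-le"), pos + 2
        pos += 2
    raise ValueError("unterminated string")
```

With plain ASCII text like 'S.y.s.t.' the high byte of the last character and the low byte of the terminator already form two zero bytes in a row one byte too early, which is why `bytes.find(b'\x00\x00')` on its own cuts strings short.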
The really interesting bit is that there are no fields that change to mark whether it is an MP3 file or the proprietary WMA. The 'Format' field, which I was expecting to hold this, is empty, although there is something mysterious in the 14th field. A good test now would be just to cat an Ogg Vorbis file over one of the existing files on the device and confirm whether it's possible to play it.
No such luck. Catting the Vorbis file into both a '.wma' and a '.mp3' just causes the player to skip them and refuse to play. Catting the existing .wma's and .ogg's between each other doesn't work either... I think there is a bigger issue here, presumably a checksum or length hiding somewhere. There is a length in the database above. The following seems to lead in that direction: it changes the length of the track and causes it to stop working.
    echo 'hello' >> "Beethoven's Symphony No. 9 (Scherzo).wma" # causes file to be skipped
However, doing the following also causes the track to be skipped, even though it doesn't change the length:
    dd bs=1 count=613638 if=~/Desktop/Amarantine.ogg of="Beethoven's Symphony No. 9 (Scherzo).wma"
Confirm: did 'dd'ing a WMA into a same-size '.mp3' still work?
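One caveat with the dd experiment: dd truncates its output file unless conv=notrunc is given, so the length only stays the same if the count matches the original size exactly. A hedged Python 3 sketch of a same-length overwrite that cannot change the size (the function name is made up for illustration):

```python
import os


def overwrite_in_place(target: str, source: str) -> int:
    """Overwrite the start of `target` with bytes from `source`, keeping its size.

    At most len(target) bytes are copied; if `source` is shorter, the tail of
    `target` is left untouched. Returns the number of bytes written.
    """
    size = os.path.getsize(target)
    with open(source, "rb") as src:
        payload = src.read(size)
    with open(target, "r+b") as dst:  # 'r+b' opens for update without truncating
        dst.write(payload)
    assert os.path.getsize(target) == size
    return len(payload)
```

This makes it easier to separate "the length changed" from "the contents changed" when working out why a track gets skipped.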
Time to reboot to MS Windows again, install Napster and try to get an .ogg into it the "proper" way.
I spent about 3 hours massaging Windows again late into the night. There are a number of ways to access the device:

- With files placed in System/MUSIC, it is possible to run the Samsung Recovery Utility and ask it to Rebuild Database. This will re-scan the meta-data (ID3 Artist/Title/Album/Length) from all the files found in System/MUSIC and make a new database. It claims that it will take a long time, as it processes every file rather than just updating the meta-data for the files that you've added.
- ... System/MUSIC (and presumably System/AUDIBLE). I have not worked out whether the driver is sending MTP commands to the device or whether it's modifying the database files on the drive. It's possible that the player only speaks the MTP protocol when reflashed with the [evil] Janus lock-in firmware.
- ... (.jpg renamed .jpx).
- ... yh-925-db-0.1.py. You might need to modify the file, as it hunts for 'System.backup/...' and looks in there. It should really take the '.hdr' on the command line and give you the rest automatically.

I believe the hardware is capable of running software that can play Ogg Vorbis. The software for the YH-925 is stored in the firmware file that the device loads on boot (Q: does it reflash itself or just load from the hard disk?). I cannot work out how to get Oggs registered in the database...
Initially MS Media Player refused to play Ogg files. After I downloaded and installed the Win32 Vorbis codecs ("DirectShow Filters") it will now play the audio, as will the Napster client and the Samsung Music Studio, which gain the same ability. None of the programs will transfer Ogg files. Media Player says that it can't transfer the file and that it doesn't know how to convert the Ogg into a suitable format. Music Studio says "Your devices does not support this file. Ca..." and helpfully cuts off the end of the error message in a non-scrollable line. Googling for this message turns up nothing.
Renaming an .ogg to .mp3 doesn't fool the system. Renaming the file and placing it into System/MUSIC and rebuilding the database with the Windows-based recovery tool does not fool it either.
I'm puzzled; let's try the Samsung HQ Forum and see if we can get any helpful pointers from there. This is the post I sent asking for instructions.
In the meantime, I've come across a few versions of the firmware; u1.46 is what came on the device (it was exported to Finland). It currently has 1.58EU on it, which shows the Radio feature (but it doesn't seem to work; maybe it doesn't have the chip on the board?). I've also seen 1.61CA linked to on a webpage, but this hasn't been tried on the device. They all need renaming to FW_YH925.mi4 and placing in the root folder; the system then uses this file the next time it boots.
EU firmware (with Radio support).
One thing I did find somewhere is a copy of the Bootloader ROM file that contains the base functionality for the gadget, this likely includes FAT filesystem support for finding the firmware image to load for the OS. More importantly, it includes the base USB Mass storage device that allows the system always to be mounted as an external hard-disk. This is named BL_YH925.rom and is 120072 bytes long, to fit in a 1Mbit flash.
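As a quick sanity check on the size claim, assuming '1Mbit' means 2^20 bits:

```python
ROM_BYTES = 120072               # size of BL_YH925.rom quoted above
FLASH_BITS = 1 * 1024 * 1024     # assumption: 1 Mbit = 2**20 bits
FLASH_BYTES = FLASH_BITS // 8    # 131072 bytes
SPARE_BYTES = FLASH_BYTES - ROM_BYTES

print(f"{ROM_BYTES} of {FLASH_BYTES} bytes used, {SPARE_BYTES} spare")
```

So the image fits with about 11000 bytes to spare.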
This file is definitely interesting and we can find lots of strings in it, including the annoying error messages about not unplugging the system before unmounting from Windows; if I knew how to reflash the device then I'd replace the 'Windows' strings with 'Linux' or something OS-neutral. If it's any help, this boot-loader file was originally found in System/Params/BL_YH925.rom, which is where the Playlists normally are...
Some interesting strings are: 'AKM Codec Test Started' (AKM appear to make the chip that does the actual decoding); '6005 MP3 Player' and 'PRTLPLYR6005' alongside '5020' (an internal product name, or a reference platform/prototype board from PortalPlayer?); 'SYSTEM\FW_YH925.MI4' and 'SYSTEM\PP5020.BIN' (two locations to look for the main firmware?); 'SW3 Boot FAT32 HDD', 'SW5 Run Diagnostics' and various references to 'SW11' and 'MENU' buttons. The question is which are the numbered buttons: do they even appear on the outside of the case, or are they purely debugging contacts hidden on the PCB?
During this process, the player has had to be unplugged and replugged each time. The message saying that you should 'safely unplug' (unmount) the device always seems to show up, so you'll just have to unplug it anyway and ignore the scary messages. Make sure you do something like pumount usbdisk ; sync so that the filesystem stays sane, as VFAT isn't journalled.
The device is also prone to occasional freezes/crashes with the u1.46 firmware that came with it; I haven't observed the 1.58EU firmware for long enough to comment.
I've had to reset the device (by putting a pencil into the recessed reset button on the back) a couple of times after it has locked up. Doing this turns the player off and resets the database. Now none of the tracks show up in the menus, presumably because they have been zeroed. The menu Settings->About->Tracks still says 4, which is what Windows left it as.
I think a reset only touches the database and doesn't go trashing the contents of the hard disk in any way, so your data is probably safe even if it hangs on you. You might need to keep a copy of the meta-data, so having a backup copy of System/DATA stored on the player's hard disk would be useful for when you don't want to rebuild the database.
A list of interesting strings from the Windows driver and player; a couple of references to Audible and the YH-820, YH-920 and YH-925.
Apparently a utility called 'ihpfirm', made by somebody called Dave Hooper ('stripwax'), will unencrypt the firmware image. It is designed for an m68k/ColdFire firmware file and doesn't unpack the ARM image out of the box.
Audible seems to be a container for four different compression schemes. Mono MP3 being the highest quality and VoiceAge's acelp.net codecs (seem to be related to G.729) for the lower qualities. audiocoding.com, audible.com, digitalpreservation.gov.