Here's a video example featuring 3 cutscenes: the first cutscene, the courtroom cutscene, and the cutscene where we meet Maero. Still not perfect, but I think it's better. You can watch the video before checking the description if you want to decide for yourself. Worth noting that the difference is a little more obvious in-game; the video compression reduced the audio quality somewhat.
I'm aiming to do an experimental release of music4.vpp_pc in Mods In Progress over the weekend, now that I've worked out the audio corruption issue. Turns out I'm just an idiot who screwed up when stitching the .xwb together with a hex editor. (All versions of the XACT build tool that make files compatible with SR2 have a memory leak, which I worked around by building two partial wavebanks and manually stitching them together.)
music4 contains the main cutscenes, gameplay player voice clips, and the ambient music loops (like the ones that play in stores and clubs). I want to get any potential kinks in the process worked out before I tackle the other audio packfiles, especially the main audio.vpp_pc, as that contains over 200 wavebanks and doing even one wavebank is still fairly time-consuming. I'm considering whether the process can be automated, but there's still one manual step in my way -- updating the XACT project file the build tool reads. It's XML, but I'm not in the mood to write a parser script to update it via CLI, so for now I'm still using the GUI tool provided in the OP.
UPDATE: The biggest stumbling block I'm encountering is that in the process of extracting and converting the Xbox audio, there's a rounding error in computing the sample rate. Files range from 44098Hz to 44101Hz. The problem is that SoX's IEC 60908 de-emphasis effect will reject anything that isn't 16-bit 44100Hz. I can re-encode the files to get around that, but it adds another layer of loss to an already hopelessly lossy situation. I've written a script for my hex editor that replaces bytes 24-27 (the uint32 in the WAV header that defines the sample rate) with 44100Hz - everything else be damned - and it seems to work well without introducing noticeable loss. That said, the script is, frankly, slow as fucking balls. It takes about 5 seconds per file, and I'm not looking forward to using it on audiobanks like voc_sp that have over 8000 wavs in them. There has to be a faster way to bulk change the same couple of bytes to the same value across multiple files. If anyone has a suggestion it'd be very welcome, preferably CLI tools.
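For anyone wanting to try the byte patch without a hex editor, here's a rough Python sketch of the same idea. It's not the script I'm actually using. It assumes the files have a canonical RIFF/WAVE header with the `fmt ` chunk right after the RIFF header (so the sample rate really does sit at bytes 24-27, little-endian), and like my hex editor script it leaves every other header field alone. The function names and the `TARGET_RATE` constant are made up for the example.

```python
import struct
from pathlib import Path

SAMPLE_RATE_OFFSET = 24  # bytes 24-27 of a canonical RIFF/WAVE header
TARGET_RATE = 44100      # assumed target; change if your files need another rate


def patch_sample_rate(path, rate=TARGET_RATE):
    """Overwrite the sample-rate field in place. Returns True if the file changed."""
    with open(path, "r+b") as f:
        header = f.read(28)
        # Skip anything that is not a WAV with 'fmt ' directly after the RIFF header,
        # because the fixed offset would point at the wrong bytes.
        if (len(header) < 28 or header[0:4] != b"RIFF"
                or header[8:12] != b"WAVE" or header[12:16] != b"fmt "):
            return False
        (current,) = struct.unpack_from("<I", header, SAMPLE_RATE_OFFSET)
        if current == rate:
            return False  # already correct, nothing to write
        f.seek(SAMPLE_RATE_OFFSET)
        f.write(struct.pack("<I", rate))  # little-endian uint32
        return True


def patch_tree(root, rate=TARGET_RATE):
    """Patch every .wav under root (recursively). Returns the number of files changed."""
    return sum(patch_sample_rate(p, rate) for p in Path(root).rglob("*.wav"))
```

Calling `patch_tree("extracted/voc_sp")` (a hypothetical folder name) would walk the folder and rewrite only the four sample-rate bytes in each file, so it should be a lot faster than opening each file in a hex editor. Back up the wavs first, since it edits them in place.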
UPDATE 2: As N69 found, most of the audio is sampled at 24000Hz. I'm upsampling it to 48000Hz to apply the de-emphasis shelving filter, then resampling it back down. We'll see how that works. On the plus side, it means I don't have to use my awful hex editor script as much.
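In case it helps anyone follow along, the upsample / de-emphasis / downsample round trip can be sketched as a single SoX effects chain driven from Python. This is a guess at a reasonable setup, not my exact command line: it assumes `sox` is on the PATH and uses SoX's `rate` and `deemph` effects, and the helper names and default rates are just for illustration.

```python
import subprocess


def deemph_command(src, dst, work_rate=48000, final_rate=24000):
    """Build a SoX command: resample up, apply de-emphasis, resample back down."""
    return [
        "sox", str(src), str(dst),
        "rate", str(work_rate),   # upsample to a rate the de-emphasis filter accepts
        "deemph",                 # CD de-emphasis shelving filter
        "rate", str(final_rate),  # back down to the original rate
    ]


def run_deemph(src, dst):
    """Run the chain on one file; raises CalledProcessError if SoX fails."""
    subprocess.run(deemph_command(src, dst), check=True)
```

Doing all three steps in one SoX invocation avoids writing an intermediate 48000Hz file to disk for every clip.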