The scripts and infrastructures that are used to operate the Be-Music Surge non-stop BMS radio station.
Be-Music Surge was previously unmaintained for 2 years. The reason that I got unmotivated to maintain the radio station is because it took me a lot of time and effort to process a BMS event for inclusion in the radio station. For example, it took about 2 minutes to render a single BMS archive. That means processing a big event means I need to run it overnight, while processing a smaller event can still takes hours, which is not practical to do often. Also, I originally built Be-Music Surge by stringing many tools together. For example, music is being streamed by the Music Player Daemon which maintains its own music database. Getting it to play songs the way I want means writing even more code to synchronize between the file storage, MPD’s song database, and the station’s own database.
Fast-forward for two years, we now have Google Cloud Run which lets me run compute tasks (such as rendering the BMS archive to MP3) on Google’s cloud infrastructures. The most important feature is that I can run hundreds of tasks simultaneously and get billed by the number of seconds I’m using the service. With this, now I can render 100 hours of music, and I only have to wait for 20 minutes!
I also built shouter, a simple application that requests the next song to play via a JSON-based HTTP endpoint, and then streams that song to the Icecast server. It doesn’t maintain any database. Due to this decoupled nature, the shouter client, the song files, the song database, and the code that manages the playlist can all be run on different providers for most ease-of-use and cost-effectiveness.
-
The shouter client runs on the same DigitalOcean (referral link) machine that runs Icecast. That machine doesn’t have enough storage space to store all the songs, though.
-
The music files are stored in Google Cloud Storage. Tasks running on Cloud Run can write to Cloud Storage without hassle, making it the ideal storage target. However, the bandwidth cost is expensive. Therefore, the storage contents is mirrored to Feral Hosting to save bandwidth costs.
-
The song database is hosted on MongoDB Atlas which has a free tier. This means I don’t have to manage a database server myself.
-
The playlist management code is run on Cloud Functions for Firebase, so that I don’t have to manage or deploy any server here. This also makes it very easy for me to deploy and update the playlist management code by just running
firebase deploy
.
The song database runs on MongoDB.
The rendered song files are stored on Google Cloud Storage.
A service on Cloud Run that downloads a BMS archive from a URL, renders it into MP3, and uploads it to Google Cloud Storage.
Manages the song database and invokes the worker when new BMS songs need to be processed.
This section describes how I manage the radio station, such as adding new songs to the library. This assumes that all components set up already.
-
Download the BMS package. I would download an BMS Event’s package. I usually do this on a seedbox, because there are more storage on there and the network speed is much better.
-
For Google Drive links, I use the
gdrive_download
script. -
For Dropbox links, I just use plain ol
wget
. -
For other links that needs manual downloads, I use a cloud VPS to download the package (via remote desktop) and then transfer it over to the seedbox.
-
-
Extract the BMS package. I need to repack the BMS package into one-zip-file-per-song format. So first, I have to extract the whole package.
-
For
.rar
or.7z
archive, I use the7z
command-line tool. This kind of archives are pretty easy to work with, since they have better support for Unicode file names. -
For
.zip
archive, this is a bit tricky because the files may be encoded in an arbitrary encoding (most frequently CP932). To add to the offense, the encoding used is not stored in the zip file’s header and decompression tools doesn't provide a switch to specify the encoding.For this I use
bsdtar
to first extract the files:bsdtar xvf <file>.zip
. I use it because it doesn't attempt to convert the filename’s encoding. It treats the file name as binary string, and writes it as-is to the file system.Then, I use
convmv
to fix the encoding issue:convmv -f CP932 -t UTF-8 -r .
.convmv
is actually pretty smart, and will skip converting filenames if it is already in utf-8 encoding.
-
-
Re-package as one-song-per-zip-file archives. Given the extracted files in the directory
raw
(relative to the current working directory), I run this Ruby script to put them inzip
.require 'pathname' require 'open3' files = Dir.glob("raw/**/*.{bms,bme,bml,bmson,pms}", File::FNM_CASEFOLD | File::FNM_EXTGLOB) dirs = files.reject { |f| Dir.exist?(f) }.map { |f| File.dirname(f) }.uniq dirs = dirs.reject { |d| dirs.any? { |e| d.start_with?(e + '/') } } dirs.sort! def make_temp_name Pathname.new('tmp').mkpath Pathname.new('tmp').join("tmp-#{Time.now.to_f}.zip").to_s end class Task def initialize(in_dir) @in_dir = in_dir end def output 'zip/' + @in_dir.sub(/^raw\//, '').gsub('/', '__') + '.zip' end def to_s "Converting #{@in_dir} => #{output}" end def run! if !File.exist?(output) Pathname.new(output).parent.mkpath temp = make_temp_name # , "-mx=0" system "7z", "a", temp, "./#{@in_dir}/*" or raise "Task #{self} failed: 7z error" system "mv", temp, output or raise "Task #{self} failed: mv error" end end end tasks = dirs.map { |d| Task.new(d) } tasks.each_with_index do |t, i| puts "[#{i + 1}/#{tasks.length}] #{t}" t.run! end
-
Make the zip file downloadable. Each ZIP file will need a public URL. I do this by symlinking the
zip
directory to thepublic_html
directory on the seedbox. -
Generate a URL list. I then create a
.json
file with the list of URLs to import. For example,mutualfaith3.urls.json
would look like this:[ "https://<seedbox-host>/bms/mutualfaith3/5C01-aberrant_sorcery.zip", "https://<seedbox-host>/bms/mutualfaith3/Aradia.zip", "https://<seedbox-host>/bms/mutualfaith3/Balam_katastrophe%5BOverKill%5D.zip", ... ]
To automate this process, I put the zip file on the seedbox web server (which generates a directory listing) and then run the command in
manager
directory:ruby scripts/scrape_urls.rb <event>
-
Import the URLs into the song database. Inside
manager
I run:node src/index.js import -f <event>.urls.json
-
Perform the batch rendering. These are done in
manager
folder.-
First I run
node src/index.js work -f
to process the new songs. -
Some songs will fail to render the first time but will do just fine the second time. Run
node src/index.js work -f --retry
to attempt re-rendering failed songs.
This will account for more than 95% of the songs. The are few songs that needs to be fixed manually.
-
-
Perform manual (per-song) rendering. I run
node src/index.js server
(in another terminal tab) to open a web server with a UI that lets me inspect the rendering result. I go tohttp://localhost:8080/#/event/<event>
It will show the songs that are not rendered.
-
To render them on the cloud, I run
node src/index.js work -f -s <song_id>
. -
Due to Google Cloud Run’s memory limitation (of 2 GB), some songs will fail to render because they require more than 2 gigabytes of RAM to process. To fix this I have to run a renderer worker locally (
./scripts/run server
inworker
) and thennode src/index.js work -f -s <song_id> --local
.To run worker server locally for the first time:
- Create
worker/.env
with relevant environment variables. - Create
worker/private
directory and put in the service account key file in there. - Run in
worker
run./scripts/build
to build Docker image.
- Create
Some problems found when rendering:
-
Large chart files: Bemuse has a limit of 1MB for BMS files. Some charts are ridiculously larger than that. To fix, patch
bemuse-tools
to accept larger file sizes.docker exec bmsurge-renderer sed -i.bak 's/1048576/999999999/' node_modules/bemuse-tools/lib/indexer.js
-
Only PMS files found: Just rename
.pms
files to.bms
.
-
-
Adding event metadata. I use MongoDB VS Code extension and open a MongoDB Playground. Then I run a MongoDB Shell script to insert new event information. The entries can be scraped from the venue site using Data extration scripts posted on the wiki.
use('bmsurge') db.events.insertOne({ _id: 'mutualfaith3', date: '2020-08-02', title: 'Mutual Faith 3', url: 'https://revlishsart.wixsite.com/mutualfaith3', entries: [...] })
-
Rendered-song-to-entry matching. We want to display links to the corresponding BMS submission entries in the venue when we play it on the radio. So we need to match each rendered song to an entry in the BMS venue.
With the management server open, I can see the songs that needs matching in Unassigned items section. The Available entries to match number verifies that the event metadata (the previous step) is properly entered to the database (otherwise it displays 0).
To speed up this process, I can click on the Auto-match button to match the BMS header data to the entry automatically.
In this example I can now see that 36 entries out of 39 are matched properly, while 3 are not and require manual matching.
Non-matching happens when the metadata in the BMS headers does not match the metadata in the venue. Maybe they are spelled differently or have “(BMS Edit)” added to the title.
Once everything is matched up, I can click the Save changes button to save everything to the songs database.
-
Synchronize the rendered songs to the seedbox. The renderer worker script start the rendered files in Google Cloud Storage. However, there are costs associated with bandwidth these files from the radio station. To reduce the operation costs, I synchronize the files from Google Cloud to my seedbox, and configure the radio broadcast to stream songs from the seedbox instead.
gsutil -m rsync -d gs://bmsurge-renders/ public/mp3/
-
Update the songlist on the radio station. To keep things decoupled (and to save costs), the broadcasting system (
dj
) does not read from the database directly. It said it has an internal songlist which must be updated when we update the song library.node src/index.js songlist --update -o private/_songlist.json
-
Update the search index. This allows songs to be requested.
node src/index.js songlist --index='2020-08-01T00:00:00.000Z' --update
-
Update the server status. I run a script that reads the song list and updates the server status accordingly.
- To generate server status, in
manager
, runruby scripts/generate_description.rb
.
- To generate server status, in