Help wanted: Scraping XMs
BotB Academy Project Dev
Level 22 Chipist
post #106536 :: 2019.01.06 11:41am
  VinCMG, MiDoRi and anewuser liēkd this
Hi folks,

Recently I've been playing around with a new algorithm for encoding module data that isn't based on the traditional sequence/pattern approach, but rather on a stream/dictionary model. Initial tests are promising, but at this point I need a large test case to verify the algorithm's efficiency. I'm thinking of testing against something like 1000 XMs. Now here's where I need help. Obviously, manually downloading the files is a dumb idea. However, I don't know anything about web crawling and scraping. So I was wondering if anybody here could hack up a script that can batch download the files. They should be full songs (so OHBs aren't ideal), and should be varied in style, size, and channel count. I'm not sure there are 1000 non-OHB XMs on BotB, so perhaps a visit to The Mod Archive would be a good idea. The script should work on Linux. Any takers?
Level 23 Chipist
post #106539 :: 2019.01.06 12:42pm
Modarchive has been hosting packs for a few years now.
Level 20 Pixelist
post #106563 :: 2019.01.07 8:49am :: edit 2019.01.07 8:50am
  anewuser and irrlicht project liēkd this
Just batch download from there with your FTP client of choice, or even wget.
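For the wget route, here is a rough sketch of what the call could look like, wrapped in Python so the arguments are easy to tweak. The URL and the target directory are placeholders made up for illustration, not real paths, and `build_wget_command` is a hypothetical helper; the options themselves are standard wget ones.

```python
import shlex


def build_wget_command(base_url, dest_dir, wait_seconds=1):
    """Build a wget argument list that recursively fetches only .xm files.

    base_url and dest_dir are supplied by the caller; nothing here is
    specific to any particular archive.
    """
    return [
        "wget",
        "--recursive",            # follow the directory tree below base_url
        "--no-parent",            # never climb above the starting directory
        "--accept", "xm,XM",      # keep only files with these suffixes
        "--wait", str(wait_seconds),  # pause between requests to be polite
        "--directory-prefix", dest_dir,
        base_url,
    ]


# Placeholder URL (assumption): substitute the real directory to mirror.
cmd = build_wget_command("ftp://example.invalid/modules/", "xm_corpus")
print(shlex.join(cmd))
```

The sketch only prints the command; passing `cmd` to `subprocess.run` would actually start the download.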
Level 22 Chipist
post #106565 :: 2019.01.07 9:13am
  anewuser and MiDoRi liēkd this
Ah, excellent. Totally forgot that modland exists, too. Thanks MiDoRi. Modarchive's torrents are overkill for what I have in mind but these will do just fine.
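Once the files are on disk, something like the following could check that the set really is varied in size and channel count. It's a minimal sketch, assuming the standard XM header layout (a 17-byte "Extended Module: " ID at the start and a 16-bit little-endian channel count at byte offset 68); the function names are made up for illustration.

```python
import struct
from pathlib import Path

XM_MAGIC = b"Extended Module: "  # ID text at the start of an XM file


def xm_channel_count(data):
    """Return the channel count from XM header bytes, or None if not an XM."""
    if len(data) < 70 or not data.startswith(XM_MAGIC):
        return None
    # Channel count: 16-bit little-endian value at byte offset 68.
    (channels,) = struct.unpack_from("<H", data, 68)
    return channels


def survey(directory):
    """Collect (file name, size in bytes, channel count) for each XM found."""
    rows = []
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file() and path.suffix.lower() == ".xm":
            with path.open("rb") as f:
                head = f.read(80)  # the fields we need sit in the first 70 bytes
            channels = xm_channel_count(head)
            if channels is not None:  # skip files that only look like XMs
                rows.append((path.name, path.stat().st_size, channels))
    return rows
```

Calling `survey("xm_corpus")` (directory name is a placeholder) returns a list that can be sorted or binned by size and channel count to see how evenly the test set is spread.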
