A bit of background first: there are a number of technical reasons why Archive.org struggles with Flash content.
- Flash files are less machine-readable, and the system resources required to process them are substantially higher.
- Plugins are insulated from the JavaScript Archive.org uses to redirect files.
Both contribute to link rot.
Thus the solution is:
1: Manual crawl
2: Specialized software
https://archive.org/details/MonkEmail
What is the process?
Compile a list of interface-related files.
Establish what files exist therein and compile a list.
Browse the Flash-based component(s) of the website in their entirety.
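As a rough aid for compiling that list, here is a minimal Python sketch that scans a SWF for strings that look like file references. It relies on the SWF header layout (a 3-byte signature, with "CWS" meaning the data after the 8-byte header is zlib-compressed and "FWS" meaning uncompressed). The extension list and the function names are my own assumptions, and LZMA-compressed ("ZWS") files are not handled. It will also miss any URL that the movie assembles at runtime, so it supplements the manual browse rather than replacing it.

```python
import re
import zlib

# Extensions worth flagging; this list is an assumption, extend it as needed.
REFERENCE_PATTERN = re.compile(rb"[\w./:-]+\.(?:swf|mp3|xml|html)", re.IGNORECASE)


def swf_body(data: bytes) -> bytes:
    """Return everything after the 8-byte SWF header, decompressed if needed."""
    signature = data[:3]
    if signature == b"FWS":  # uncompressed SWF
        return data[8:]
    if signature == b"CWS":  # zlib-compressed after the header
        return zlib.decompress(data[8:])
    raise ValueError(f"unsupported or non-SWF signature: {signature!r}")


def find_references(data: bytes) -> list[str]:
    """List the distinct file-like strings found in a SWF, sorted."""
    matches = REFERENCE_PATTERN.findall(swf_body(data))
    return sorted({m.decode("ascii", "replace") for m in matches})
```

To try it on a downloaded file, read the file in binary mode and pass the bytes to find_references, then compare the result against what you actually saw while browsing.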
Let's take this one as an example: “main25.swf”.
This also embeds “homepages.swf” and “newbutton.swf”; more on that later!
There are roughly 40 buttons. Luckily, more than half are for functionally identical menus. Even so, I will focus on a few specific items.
Here are some buttons with URLs:
Q f S R
Each of these embeds another file with a gimmick.
Q can either embed or navigate to an MP3.
R ultimately leads to a server-generated XML file, which supplies a randomly selected URL to navigate to.
This leads to /email.html (email.swf), which has 3 links:
/main8.html (main8.swf)
/faq.html (faq.swf)
an offsite myshopify.com account
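With this many files pointing at each other, it helps to keep the bookkeeping explicit. The sketch below records which file references which and walks the result breadth-first, so you can see what has been found and what still needs to be opened and inspected. The file names come from the walkthrough above; the edge from main25.swf to email.swf is an assumption made purely for illustration, and the function names are hypothetical.

```python
from collections import deque

# "Embeds" and "links to" are treated alike here.
# The main25.swf -> email.swf edge is an assumption for illustration.
REFERENCES = {
    "main25.swf": ["homepages.swf", "newbutton.swf", "email.swf"],
    "email.swf": ["main8.swf", "faq.swf"],
}


def reachable_files(start: str, references: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk: every file reachable from start, in visit order."""
    seen = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in references.get(current, []):
            if target not in seen:
                seen.append(target)
                queue.append(target)
    return seen


def not_yet_inspected(found: list[str], references: dict[str, list[str]]) -> list[str]:
    """Files that were discovered but whose own references are not recorded yet."""
    return [name for name in found if name not in references]
```

Each time a file from the "not yet inspected" list has been browsed, add its own entry to the table and run the walk again until the list is empty.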
Toons
/toons.html
This embeds several files and randomizes the order.
Notice that other items were seen several times over before the last item was first shown; this is an easy place to overlook things.
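To get a feel for how many reloads that can take, here is a small back-of-the-envelope sketch. It assumes each load shows one item picked uniformly at random, independently of earlier loads; I have not confirmed that the site behaves this way, and the item counts you plug in are up to you. Under that assumption this is the classic "coupon collector" problem, where seeing all n items takes n × (1 + 1/2 + ... + 1/n) loads on average.

```python
from fractions import Fraction


def expected_loads(item_count: int) -> float:
    """Average loads needed to see every item at least once (uniform random picks)."""
    harmonic = sum(Fraction(1, k) for k in range(1, item_count + 1))
    return float(item_count * harmonic)


def chance_item_missed(item_count: int, loads: int) -> float:
    """Probability that one particular item has not appeared after the given loads."""
    return (1 - 1 / item_count) ** loads
```

For a hypothetical 5 items, expected_loads(5) comes to 137/12, a little over 11 loads, which is why the last item can feel like it is never going to turn up.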
Once the source URL is known, a file can be assigned.
You may notice the section for “redirect” is blank. This is because it is used for more complicated parameters that would, for example, allow us to create a single rule to cover main1 – main26 without needing to reference each individually, though this post is long enough already.
Instead, watch it in action.
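For the curious, the idea of one rule covering main1 through main26 can be sketched with a regular expression. This is plain Python and not the syntax of the tool's “redirect” field, which is not shown here; the "archive/" folder and the function name are hypothetical.

```python
import re
from typing import Optional

# Accepts /main1.swf through /main26.swf and nothing else.
MAIN_RULE = re.compile(r"/main([1-9]|1[0-9]|2[0-6])\.swf")


def local_file_for(url_path: str) -> Optional[str]:
    """Map a requested path to a local file name, or None if the rule does not apply."""
    match = MAIN_RULE.fullmatch(url_path)
    if match is None:
        return None
    # "archive/" is a placeholder folder for the saved copies.
    return f"archive/main{match.group(1)}.swf"
```

A request for /main25.swf would map to archive/main25.swf, while /main27.swf falls outside the rule and returns None.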