Showing posts with label howto. Show all posts

Saturday, May 8, 2010

fsed.sh

After downloading the Toonami Reactor web files from the Internet Archive, I needed to strip out all the Wayback Machine code and markers.  The Wayback Machine adds JavaScript that rewrites HTML links to maintain "temporal integrity".  This had to be removed for Reactor 2.5 to function properly, and removing it by hand would have been too time-consuming and frustrating.  Cygwin provides access to many tools, such as sed, that allow bulk replacement of text.  However, sed (or this version at least) processes its input one line at a time, so a single pattern cannot match text that spans several lines.  The solution was to write a custom bash script that replaces newline characters ('\n') with the bell character ('\a'), runs sed, and then switches the bells back to newlines.  See code:

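#!/bin/bash
# fsed.sh
# Name: File sed
# Replaces all newlines with the bell (\a), performs sed,
# then switches back.

regex=$1
# Shift the argument array so that only the filenames remain.
shift
for i in "$@"
do
  # Write to a temporary file first: redirecting straight back into
  # "$i" can truncate the file before it has been read.
  tmp="$i.tmp.$$"
  tr '\n' '\a' < "$i" | sed "$regex" | tr '\a' '\n' > "$tmp" && mv "$tmp" "$i"
done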

Called like this: >fsed.sh "s/regex/replace/g" ./*.html
(Of course, this is designed to work on a mass of files.  And it's incredibly dangerous, since it rewrites every file it is given in place.  Use caution, and keep a backup copy.)

It's possible that characters other than the bell could serve as the placeholder (anything that never appears in the files themselves), but the bell worked well enough at the time.
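
To see the newline-to-bell trick in isolation, here is a minimal sketch that works on standard input instead of rewriting files. The function name multiline_sed and the sample <script> block are made up for illustration; they are not taken from the actual Wayback Machine markup.

```shell
#!/bin/bash
# Hypothetical helper: apply one sed expression across line boundaries.
# Newlines become bells, sed sees a single long line, bells become newlines.
multiline_sed() {
  tr '\n' '\a' | sed "$1" | tr '\a' '\n'
}

# Sample input: a three-line <script> block between two ordinary lines.
printf 'before\n<script>\nwayback stuff\n</script>\nafter\n' |
  multiline_sed 's/<script>.*<\/script>//'
```

The whole block is removed even though it spans several lines. Note that ".*" is greedy: if a file contains more than one such block, everything between the first opening tag and the last closing tag goes with it, so a tighter pattern is probably needed for real pages.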

Friday, March 12, 2010

Flash/Shockwave Game Download Tools

So, you may be wondering how I download Flash or Shockwave games that may contain dozens of dependent files on a server. Well, it's quite simple. I use a custom Java program based on the Jpcap (packet capture) library to spit out all the requests that are being sent over a network connection. I look through the requests, find the files I want, then download them with wget.


(Tcpdump allows you to see all the crud a modern Flash game loads. And download it!)

It works something like this:
java Tcpdump 2 >> .\output.txt
(The number "2" indicates which network interface to use; generally either a WiFi connection or Ethernet port.)

And ".\output.txt" will contain text such as this:
GET>> www.cartoonnetwork.com/games/naruto/battleforleafvillage/characters.xml
GET>> www.cartoonnetwork.com/games/naruto/battleforleafvillage/naruto3.swf
GET>> www.cartoonnetwork.com/games/naruto/battleforleafvillage/tracker.swf
I replace "GET>>" with wget instructions to make it into this:
wget -nc -x www.cartoonnetwork.com/games/naruto/battleforleafvillage/characters.xml
wget -nc -x www.cartoonnetwork.com/games/naruto/battleforleafvillage/naruto3.swf
wget -nc -x www.cartoonnetwork.com/games/naruto/battleforleafvillage/tracker.swf
The "-nc" flag prevents wget from overwriting files that already exist, and "-x" forces it to recreate the server's directory structure locally. These changes are saved to a .BAT file and run in the same directory as wget.
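
The "GET>>" replacement does not have to be done by hand in a text editor. As a rough sketch, assuming Cygwin (or any shell with sed) is available, something like the following would do it. The function name requests_to_wget and the output filename download.bat are my own placeholders.

```shell
#!/bin/bash
# Hypothetical helper: turn captured "GET>> host/path" lines into wget commands.
# -n together with the trailing "p" prints only the lines that matched,
# so anything in the capture that is not a GET line is dropped.
requests_to_wget() {
  sed -n 's/^GET>> */wget -nc -x /p'
}

# Typical use, assuming output.txt is in the current directory:
#   requests_to_wget < output.txt > download.bat
```

The capture still needs to be looked over first, since this converts every request in the file and not just the ones belonging to the game.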

On top of this, I have Mozilla Firefox equipped with the FlashBlock extension. This lets me control when a Flash or Shockwave item is loaded, which keeps my "output.txt" file neat and clean. Otherwise, dozens of unrelated requests would need to be discarded to obtain a game.

To make things easy, I've packaged together all of the files that I use (except for Java and FlashBlock, of course; that would just be silly). The WinPcap library must be installed before installing the Jpcap library as one is dependent on the other. The Windows version of wget is included since most people can't be bothered to search around for it. After that, the Tcpdump* application can be run from .BAT files if the command line is too esoteric.

I'm sure there are other/better tools out there, but this has worked fairly well for me. There's only one major pitfall: some games only load resources as they need them, meaning that you're required to play through the entire game to obtain all the files. Some filenames can be guessed, but others can't be.

Download:
*This is a misnomer in that my version doesn't dump whole TCP packets, but that's just the name I've kept ever since experimenting with an example program.