! Project Jehoover: file sharing for mobbed wifi networks. ! Summary: During the [SoS #1| http://shiversofsharing.org/ShiversOne] in London, we discovered that even a few peers uploading and downloading files accross a wireless segment completely hosed the bandwidth for everyone on the network. Using wires and a switch is a great temporary solution, but it isn’t exactly discreet, and someone has to bring a battery to power the switch. So, we’re talking about building a scheduling system for UDP broadcast file sharing in order to optimise file sharing over a multi-peer wireless segment such as you’d find on a wi-fi hotspot. ! Plan: This is a peer-to-peer file sharing application which will run a UDP client/server in a fast event loop using Twisted Python. A GUI that works more or less equally on Windows, OS X, Linux, BSD, etc. will be built using Glade-2 and PyGTk 2. ! Peers: Each peer would have a list of directories that it’s sharing. Every so often, each peer would broadcast UDP announcements that contain file metadata. Peers listen for other peer’s metadata broadcasts, and present this information to the user, who can then select which ones they want to download. Based on the user’s selections, peers announce their preferences for file broadcasts (ie. which files they want, in order of preference), ranked from 1-N. ! Protocol: The Jehoover protocol currently supports three message types: _ANNOUNCE_, _BROADCAST_, and _REQUEST_. Data is transferred in 1k blocks. Files are identified by their MD5 sum. * All Jehoover packets begin with a 32-bit message type and a 32-bit peer ID. * _ANNOUNCE_ packets consist of a variable-length list of (file ID, 32-bit file size in blocks, relative file path). * _BROADCAST_ packets consist of the MD5 sum of the file being sent, the 32-bit block number, the MD5 sum of the block, and the block itself. * _REQUEST_ packets consist of a variable-length list of (file ID, 32-bit block number). ! Interaction: Peers have two states, *LISTEN* and *SEND*. On joining the network, peers select a random 32-bit peer ID and enter the *LISTEN* state. * *LISTEN* ** Set a timeout of 2000 - 5000 ms. ** Wait for messages. ** If a message is received, reset the listen timeout and handle the message: *** _ANNOUNCE_: Add the announced file metadata to our list of available files and folders. *** _REQUEST_: If the peer has the requested block, then the block is queued for delivery on the next *SEND* phase. If the peer's ID is believed to be the next active ID in ascending order, then *SEND*. *** _BROADCAST_: If the MD5 sum matches the block, save the block to disk, and note the receipt. ** If the listen timed out, *SEND*. * *SEND* ** Send one or more _ANNOUNCE_ messages for some of the files we're sharing. ** Wait 500 ms for other _ANNOUNCE_ messages. If an _ANNOUNCE_ message is heard, check the sender ID. If the sender's ID is less than our ID but greater than the last sender ID, then *LISTEN*. ** If we have blocks queued to broadcast, _BROADCAST_ n of them. ** Send a single _REQUEST_ message for some of the blocks we want. ** *LISTEN*. ! User Interface: The user interface has 5 displays: * An ‘outbox’ which displays folders you want to share. * An ‘inbox’ which shows folders announced by other peers along with the contents of each folder, and the size of each file. * A ‘wanted’ window to which you can add files you want, and which allows you to shuffle the order in terms of you preference. * A ‘what’s currently being broadcast’ process bar that shows you how much of the currently broadcast file you’ve downloaded, and a ‘kill’ button allowing to stop that download * A display showing files you’ve downloaded. Downloaded files are automatically added to the list of shared files once their checksum is verified. ! Cute Things * Because one file is being scheduled at a time, you could even have a little bit of music – a party tune perhaps, that plays on the laptop of the person sharing the current file. Then everyone can turn to that person and admire them for bringing such a wonderful bit of data. * The Jehoover protocol would easily support an IRC-like chatroom environment, for discussing what the heck the files actually are. ! Technical Limitations * Jehoover will have to rely on the OS to provide a local IP address, e.g. via DHCP, zeroconf, whatever. * Word has it that performance on 802.11b is optimized when the packet size matches the MTU size, which defaults to 1500. Accounting for protocol overhead (e.g. SHA-1 sums , which take up 20 bytes), this makes 1400 - 1450 bytes the apparently optimal UDP block size. * SHA-1 was recently broken but better hashing algorithms aren't shipped with Python. It'll have to do. MD5 would probably work just as well (and maybe we should use it instead to save CPU time). * We could use BitTorrent-like accounting to decide which blocks to send rather than simply queuing all requests.