Saturday, December 3, 2011

SparkDC - automated file sharing over ssh

SparkDC is a simple command-line tool that automates file sharing and downloading over ssh among a connected set of users through a central server. This version is developed specifically for the IIT Delhi network. Nevertheless, the same hack can be extrapolated to any other network that has a common root folder and good upload and download speeds. Such networks are typically academic LANs and corporate networks. The code is on github.

How It Works
Here's a short description of how all this works. Consider a bunch of users who have started these scripts; in particular, consider users D (who wants the movie Inception) and S (who has the movie Inception). The script writes the names of all the files S has chosen to share to one Host folder on the network. This happens for every user, so we now know which files each user has. User D can search through these file names with a script, search.py. When D searches for, say, Inception, search.py greps over all the shared file names and finds that S has Inception (only the names are shared, as a meta file). When D wants to download Inception, the script, having found Inception at user S, creates a request meta file in user S's home folder. Since another script on S's machine is listening for requests, it uploads Inception to D's home folder. User D has now downloaded Inception to his home folder from user S's local machine.
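
To make that concrete, here is a minimal sketch of the search-and-request side. It assumes the shared file lists and request folders sit under a common Host folder; the folder names, file naming and helper functions below are my illustration for this post, not the actual code in search.py or get.py (which do the same grep-and-drop-a-file dance over ssh to the proxy server).

    # Minimal sketch (not the actual search.py/get.py): every user writes the
    # names of their shared files into one list per user under a common Host
    # folder; searching is a case-insensitive grep over those lists, and
    # downloading means dropping a request meta file for the sharer to pick up.
    import os

    HOST_DIR = "/home/ph1/ph1080614/spark_host"   # hypothetical common Host folder

    def search(query):
        """Return (user, filename) pairs whose shared file names match the query."""
        results = []
        for listing in os.listdir(HOST_DIR):            # one file list per user
            user = listing.replace(".filelist", "")
            with open(os.path.join(HOST_DIR, listing)) as f:
                for name in f:
                    if query.lower() in name.lower():
                        results.append((user, name.strip()))
        return results

    def request(user, filename, my_entry_no):
        """Drop a request meta file in the sharer's request folder; the listener
        on their machine then uploads the file to our home folder."""
        req_dir = os.path.join("/home", user, "spark_requests")   # hypothetical layout
        with open(os.path.join(req_dir, "%s__%s" % (my_entry_no, filename)), "w") as f:
            f.write("requested\n")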

Necessity is the mother of invention (little story)

On our campus we used to have LAN sharing through direct peer-to-peer protocols using standard software like ODC and DC++. But all of a sudden, the administration decided to block these protocols due to students' excessive addiction to LAN gaming and watching movies. They didn't, however, block shell access to our home spaces (around 8 GB for each student) on the cloud. So it was not very difficult to exploit these home spaces and automate file sharing and on-demand transfers with Python. I wrote these scripts around October 2011, within two days of the idea. The scripts were just able to do what they were meant to do, so there is no graphical interface, no aliases and such. These are minor glitches I didn't care to fix, as the LAN services were started again on the very day I was going to release this (yeah, exactly the same day :( FTW), so there isn't much of an audience now. If the LAN is ever blocked again, I am going to work on it :D. For further details about the implementation you can contact me on Twitter or Facebook.

Here are the instructions on how to use Spark, specifically for the IIT Delhi LAN.

If you can figure things out on your own, read the Short Instructions; otherwise, read the Detailed Instructions.

Short & sweet Instructions

1. Download Spark and extract it to a spark folder on your computer.
2. Set up ssh keys for the IIT Delhi ssh server, if you haven't already.
3. cd into the spark directory and start Spark with the command 'python start.py YOUR_ENTRY_NO FOLDER_YOU_WANT_TO_SHARE'. This program shares the files in the folder you specify, listens for requests from other users, and uploads automatically. Leave this running.
4. To start an interactive search and download prompt, open another terminal and run the command 'python get.py YOUR_ENTRY_NO FILE_NAME_YOU_WANT'.

That's it! You are rocking with Spark. Below, I have attached some screenshots of Spark for you.

Detailed Instructions
Downloading Spark

Head over to my github repository. You will see 'Download as zip' and 'Download as tar.gz' links on that page. Download one and extract it to a folder named spark anywhere. I haven't added any aliases (shortcut commands) to start the script, so Spark has to be started by cd-ing into this folder whenever you want to use it.

Initial set-up

1. Spark is a bundle of Python scripts, so you must have Python installed. Here you go for Python.
2. You must have your ssh keys set up. Whoa, if this sounds scary, just head over to the detailed instructions here. In the case of our campus, the server name would be something like ph1080614@ssh1.iitd.ernet.in. You can now log in to your proxy server without a password! I suggest you not add a passphrase, unless you have some hacker friend who messes with your computer.
3. Now we have to make a small edit to the file start.py. The default HOST (for CS, EE, PH) on the cloud is my folder. If the host changes in future, you can edit it in this file (line 16 of start.py); see the illustrative snippet after this list. (This is like adding a new Hub address in ODC on our LAN.) For other departments you may need a volunteer who can give restricted access to their home space. Anyone interested can contact me.
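
For illustration only, the relevant setting near the top of start.py might look roughly like the snippet below; the actual variable name, account and folder are whatever line 16 of start.py contains.

    # Illustrative only -- the real value is on line 16 of start.py.
    HOST = "ph1080614@ssh1.iitd.ernet.in"   # volunteer account hosting the shared metadata
    HOST_DIR = "~/spark_host"               # folder on that account holding everyone's file lists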

Using Spark

That's all the setup needed when you first download Spark. Now, here I explain how you can use Spark to share and download files from different users sitting anywhere in the institute or any hostel (yeah, that's right, anywhere!). Of course, the only catch is that sharing works only among users under the same HOST.

To use Spark, there are just 2 simple steps. First, we have to start Spark.

1. Open a terminal and change to the spark directory, then run the following command:

         'python start.py YOUR_ENTRY_NO FOLDER_YOU_WANT_TO_SHARE'

You can see how I started Spark on my computer in the screenshot below.

[Screenshot: starting Spark in a terminal]

Just like when you start ODC and tell it which folders are to be shared, you have to tell Spark which folder you want to share. Here I am sharing a folder called 'odc'. You can specify an absolute path, like '/home/sravan/odc', or a path under your home, like '~/odc'.
If you have made any typing mistakes, Spark will tell you what they are. Here, mine was successful and Spark is now running. This program (start.py) listens for requests from other users who are using Spark; if someone wants a file you have shared in odc, the program automatically uploads it to their folder. So we have to keep this terminal open, much like you keep ODC open even when you are not downloading or searching.
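
Roughly, the listening side works like the sketch below: poll a request folder on the proxy server, and scp any requested file from your shared folder to the requester's home folder there. The folder names, request-file naming and polling interval are assumptions for illustration, not the exact ones start.py uses.

    # Rough sketch of the listener in start.py (paths and naming are assumptions).
    import os
    import subprocess
    import time

    SERVER = "ph1080614@ssh1.iitd.ernet.in"   # your proxy account (example name)
    REQUEST_DIR = "spark_requests"            # hypothetical folder on the server where requests arrive
    SHARE_DIR = os.path.expanduser("~/odc")   # the folder you chose to share

    def pending_requests():
        """List request files waiting in our request folder on the server."""
        out = subprocess.check_output(["ssh", SERVER, "ls", REQUEST_DIR])
        return out.decode().split()

    def serve(request_name):
        """A request file is named '<requester>__<filename>'; upload that file to
        the requester's home folder on the server and remove the request."""
        requester, filename = request_name.split("__", 1)
        subprocess.check_call(["scp", os.path.join(SHARE_DIR, filename),
                               "%s:/home/%s/" % (SERVER, requester)])
        subprocess.check_call(["ssh", SERVER,
                               "rm %s/%s" % (REQUEST_DIR, request_name)])

    while True:                     # keep this terminal open, like keeping ODC running
        for req in pending_requests():
            serve(req)
        time.sleep(10)              # check for new requests every few seconds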

Now it's time to download some file! To search for and download a file (let's say, Inception), we have to start get.py, a script that searches the files all the users have shared and downloads the one you want to your proxy folder.

2. Type 'python get.py YOUR_ENTRY_NO FILE_NAME_YOU_WANT' to start an interactive search and download dialogue like the following.

[Screenshot: get.py interactive search and download prompt]

This is an 'interactive search program', so it pretty much guides you through downloading the file you want to your proxy folder. The speed will be around 1 MB/s. Note that the file is downloaded to your proxy folder, not to your computer. (I could have fixed this, but ODC was restarted on the very day I was testing and finishing; if ODC is blocked again in future, we can add this feature. Contact me for further details.)
Once the file is found and is being downloaded, Spark moves on to the next search, prompting you for a new file name. Sometimes Spark may give you cryptic error messages even though the file is being downloaded correctly to your proxy folder; you can ignore those messages after checking your proxy folder.
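
For the curious, the interactive loop in get.py has roughly the shape below (Python 2 style, building on the hypothetical search() and request() helpers from the sketch in the How It Works section; the real prompts, numbering and error handling differ).

    # Rough shape of the interactive prompt (illustrative; reuses the hypothetical
    # search() and request() helpers sketched earlier).
    def interactive(my_entry_no, first_query):
        query = first_query                       # the file name passed on the command line
        while query:
            hits = search(query)                  # grep over everyone's shared file names
            if not hits:
                print "no one seems to be sharing that"
            else:
                for i, (user, name) in enumerate(hits):
                    print "%d) %s  (shared by %s)" % (i, name, user)
                choice = int(raw_input("which one to download? "))
                user, name = hits[choice]
                request(user, name, my_entry_no)  # their listener uploads it to your proxy folder
                print "requested -- check your proxy folder in a bit"
            query = raw_input("next file to search for (blank to quit): ").strip()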

That's all; now you can share and download files from anywhere on campus!

Further development

As I have said, the same scripts with very little change (a couple of variables) would apply to any other network of shell users. This version has great scope for further improvement in terms of user interface and platform independence. A more general-purpose graphical version of the software could also be made. If this sounds applicable to any network you know and fun to build, I urge you to contact me on Twitter or Facebook!


3 comments:

  1. Just to get that clear: all the users are allowed shell access to that central user account, and you cannot control which user logged in, as they are all using the same account just with different keys... usually a bad idea.

  2. I think I wasn't clear in my writing. Each user has their own ID and key. There is one central server with multiple shell users. This is normally employed in campus networks.

  3. This is the comment thread on Hacker News: http://news.ycombinator.com/item?id=3554318
