Hack 78. Use Gmail as a Linux Filesystem
Repurpose your gig of Gmail as a networked
filesystem.
What I wouldn't give for a spare gig of networked filesystem on which
to stash a backup of my work in progress, or to use as an intermediary
between two firewalled systems that cannot reach each other directly.
GmailFS (http://richard.jones.name/google-hacks/gmail-filesystem/gmail-filesystem.html)
puts your gigabyte of Gmail storage to work for just such a purpose.
It provides a mountable Linux filesystem repurposing your Gmail
account as its storage medium.
GmailFS is a Python application that uses the FUSE (http://sourceforge.net/projects/avf) userland
filesystem infrastructure to provide a filesystem and the
libgmail (http://libgmail.sourceforge.net) library [Hack #80] to communicate with
Gmail.
GmailFS supports most file operations, such as read, write, open,
close, stat, symlink, link, unlink, truncate, and rename. This means
that you can use the lion's share of your favorite
Unix command-line tools (cp,
ls, mv, rm,
ln, grep, et al) to operate on
files stored on Google's Gmail servers.
So, what might you store on and do with the Gmail filesystem? About
anything that you would with any other (possibly unreliable)
networked filesystem built on a cool hack or three. Figure 6-28 shows the Firefox web browser launched from an
executable stored as a message in my Gmail account.
This is my first foray into Python and
I'm sure the code is far from elegant. That said,
the language has a reputation as an excellent choice for rapid
prototyping—and this was borne out in my experience. The first
working version of GmailFS took about two days of coding with an
additional day and a half spent on performance tuning and bug fixing.
Given that this includes the learning curve of the language itself,
the reputation seems well deserved.
A special mention should go to
libgmail and FUSE,
as both greatly contributed to the short development time.
(I'm particularly concerned with my attempts to
manipulate mutable byte arrays. I'm sure that there
must be a less clumsy way of doing it than the nasty
list -> array -> string path that I'm
currently using.)
So, do be careful using the GmailFS and certainly
don't use it for anything important.
6.11.1. Implementation Details
All meta-information in the GmailFS is stored in the subjects of
emails sent by the Gmail user to themselves.
This was not as good an idea as I'd first thought. I
thought I could speed things up by grabbing the message summary
without having to download the entire message. But because Gmail
elides subjects in the summary (abbreviates them and adds ellipses)
to fit them on the screen, it turned out that I needed to get the
full message anyway. (Yes, the message bodies are empty, but fetching
each message still adds considerable latency to operations such as
listing the contents of a large directory.)
The actual file data is stored in attachments. Files can span several
attachments, allowing file sizes greater than the maximum Gmail
attachment. File size should be limited only by the amount of free
space in your Gmail account.
There are three types of important structures in the
GmailFS:
Directory and file entry structures hold the parent path and name of
files or directories. Symlink information is also kept here. These
structures have a reference to the file's or
directory's inode (a data
structure holding information about where and how the file or
directory is stored).
Inode structures hold the kind of information usually found in a Unix
inode, such as mode, uid, gid, size, etc.
Data block structures are one of three types of messages GmailFS uses
to store information related to the filesystem. The subject of the
messages holding these structures contains a reference to the
file's inode as well as the current block number.
As GmailFS can store files longer than the maximum Gmail attachment
size, it uses block numbers to refer to the slice of the original
file that this data block message refers to. For example, if you have
a blocksize of 5 MB and a file 22 MB long, you will have five blocks
(5 MB, 5 MB, 5 MB, 5 MB, and 2 MB); the block numbers for these will
be 0, 1, 2, 3, and 4, respectively.
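The block arithmetic in that example can be sketched in a few lines of Python. This is only an illustration of the idea, not code taken from GmailFS; the names block_layout and block_for_offset are hypothetical.

```python
MB = 1024 * 1024


def block_layout(file_size, blocksize):
    """Return a list of (block_number, block_length) pairs for a file."""
    layout = []
    offset = 0
    block_number = 0
    while offset < file_size:
        # The last block holds only whatever is left of the file.
        length = min(blocksize, file_size - offset)
        layout.append((block_number, length))
        offset += length
        block_number += 1
    return layout


def block_for_offset(offset, blocksize):
    """Return (block_number, offset_within_block) for a byte offset."""
    return divmod(offset, blocksize)


if __name__ == "__main__":
    # The 22 MB file with a 5 MB blocksize from the text.
    for number, length in block_layout(22 * MB, 5 * MB):
        print(number, length // MB, "MB")
```

A read or write at a given byte offset only needs the message for the block that block_for_offset points at, which is presumably why the block number is carried in the subject line.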
All subject lines contain an fsname (filesystem
name) field that serves two purposes:
Prevents the injection of spurious data into the filesystem by
external attackers. As such, the fsname should be chosen with the
same care that you would exercise in choosing a password.
Allows multiple filesystems to be stored on a single Gmail account.
By mounting with different fsname options set, the user can create
distinct filesystems.
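To make the role of the fsname field concrete, here is a minimal sketch of building and checking subject lines. The subject layout shown (space-separated key=value pairs) and the helper names make_subject and parse_subject are invented for illustration; the real GmailFS subject format is not described here and will differ.

```python
def make_subject(fsname, kind, fields):
    """Build a subject line; this layout is hypothetical, not GmailFS's."""
    parts = ["fsname=" + fsname, "kind=" + kind]
    parts += ["%s=%s" % (key, fields[key]) for key in sorted(fields)]
    return " ".join(parts)


def parse_subject(subject, fsname):
    """Return the subject's fields, or None if it is not ours."""
    fields = {}
    for part in subject.split():
        key, sep, value = part.partition("=")
        if not sep:
            return None  # ordinary mail, not a filesystem message
        fields[key] = value
    if fields.get("fsname") != fsname:
        return None  # another filesystem, or a spurious message
    return fields


if __name__ == "__main__":
    subject = make_subject("zOlRRa", "data", {"inode": 7, "block": 2})
    print(subject)
    print(parse_subject(subject, "zOlRRa"))
    print(parse_subject(subject, "someOtherName"))
```

Anything that does not carry the expected fsname is simply ignored, which is what lets several filesystems share one account and what makes a guessable fsname dangerous.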
6.11.2. Installing the Hack
This isn't for the uninitiated. I
haven't provided newbie-focused step-by-step
installation instructions, because if you aren't
able to take care of some of these details yourself, you probably
shouldn't be mucking about in this hack. If
you're out of your depth, sit back, relax, and read
on for edification's
sake.
Before you begin, make sure you have Python 2.3 and the python2.3-dev
packages installed.
Install Version 1.3 of FUSE (http://sourceforge.net/projects/avf). Some
Linux distributions (such as Debian) make this available as a
package. If your distro doesn't,
you'll need to download the source (http://sourceforge.net/project/showfiles.php?group_id=21636)
and make and install it manually.
Next you'll need the Python FUSE bindings (http://richard.jones.name/google-hacks/gmail-filesystem/fuse-python.tar.gz).
Download and extract fuse-python.tar.gz and
follow the instructions in fuse-python/INSTALL.
The Python FUSE bindings are also available from
FUSE's CVS page (http://sourceforge.net/cvs/?group_id=21636),
but if you grab them from CVS, remember that, at the time of this
writing, the Python bindings don't work with the rest of the CVS
code; you still need to use FUSE 1.3.
Grab libgmail (http://sourceforge.net/project/showfiles.php?group_id=113492)
[Hack #80]. After unarchiving the
package, copy libgmail.py and
constants.py to somewhere Python can find them
(/usr/local/lib/python2.3/site-packages works
for Debian; others may vary).
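If you're unsure where "somewhere Python can find them" is on your system, a short check like the following can help. It is a sketch written for a current Python 3 interpreter rather than the Python 2.3 this hack targets, and find_module_file is a hypothetical helper name.

```python
import importlib.util
import sys


def find_module_file(name):
    """Return the file a module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    print("Directories searched for imports:")
    for directory in sys.path:
        print("  ", directory)
    # The two files copied from the libgmail archive.
    for name in ("libgmail", "constants"):
        print(name, "->", find_module_file(name) or "not found")
```

Run it with the same interpreter that will run gmailfs.py; if either module is reported as not found, copy the files into one of the directories listed.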
Finally, download GmailFS (http://richard.jones.name/google-hacks/gmail-filesystem/gmailfs.tar.gz)
itself and unarchive it. Copy gmailfs.py to
somewhere easily accessible
(/usr/local/bin/gmailfs.py, for example) and
mount.gmailfs (a modified version of mount.fuse
distributed with FUSE 1.3) to
/sbin/mount.gmailfs.
If you have an older version of Python interfering with the running
of GmailFS and would rather have it using a newer version, alter the
first line of gmailfs.py to point at
#!/path/to/newer/python2.3 rather than the
#!/usr/bin/env python default.
Take a moment to enjoy just how much you know about such things and
move on when you're ready.
6.11.3. Running the Hack
All that remains is to mount your Gmail
filesystem.
You can do so via fstab or on the command line
using mount. To use fstab,
create an /etc/fstab entry that looks something
like this:
/usr/local/bin/gmailfs.py /path/of/mount/point gmailfs noauto,username=gmailuser,password=gmailpass,fsname=zOlRRa
Enter this as a single line. Replace gmailuser and
gmailpass with your Gmail username and
password, respectively. The value you pass to
fsname is the name you'd like to give
this Gmail filesystem.
It is important to choose a hard-to-guess name here. If others can
guess the fsname, they can corrupt your Gmail
filesystem by injecting spurious messages into your inbox (read:
sending you mail).
To mount the filesystem from the command line, use the following
command:
# mount -t gmailfs /usr/local/bin/gmailfs.py /path/of/mount/point \
-o username=gmailuser,password=gmailpass,fsname=zOlRRa
Again, replace gmailuser,
gmailpass, and
zOlRRa with your Gmail username, Gmail
password, and preferred filesystem name.
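The -o argument is an ordinary comma-separated list of key=value pairs. How mount.gmailfs and gmailfs.py actually process it isn't shown here, but the convention itself can be sketched as follows; parse_mount_options is a hypothetical helper, not part of GmailFS.

```python
def parse_mount_options(option_string):
    """Split 'a=1,b=2,flag' into a dict; bare flags map to True."""
    options = {}
    for item in option_string.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        # A value containing a comma would be cut short by the split above.
        options[key] = value if sep else True
    return options


if __name__ == "__main__":
    opts = parse_mount_options(
        "noauto,username=gmailuser,password=gmailpass,fsname=zOlRRa"
    )
    for key in sorted(opts):
        print(key, "=", opts[key])
```

One practical consequence of this format is that stray spaces around the equals signs or commas change what the option string means, so type it exactly as shown.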
At the time of this writing, both of these ways of mounting
have serious security issues. If you run a multiuser system, others
can easily see your Gmail username and password (/etc/fstab is
normally world-readable, and command-line arguments show up in
process listings). If this is a problem
for you, your only option at present is to modify
gmailfs.py itself, changing
DefaultUsername,
DefaultPassword, and
DefaultFsname as appropriate.
A future version of GmailFS (perhaps already out by the time you read
this) will load these values from configuration files in your home
directory.
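As a rough idea of what loading these values from a per-user configuration file could look like, here is a minimal sketch using Python's standard configparser module. The file layout, the [gmailfs] section name, and the helper names are all assumptions made for illustration; they do not describe any actual GmailFS release.

```python
import configparser
import os


def load_credentials(path):
    """Read username, password, and fsname from an INI-style file."""
    # Interpolation is disabled so a '%' in a password is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):
        raise FileNotFoundError(path)
    section = parser["gmailfs"]  # hypothetical section name
    return section["username"], section["password"], section["fsname"]


def is_private(path):
    """True if group and others have no permissions on the file."""
    return (os.stat(path).st_mode & 0o077) == 0
```

A file read this way would contain a [gmailfs] section with username, password, and fsname entries. Keeping it readable only by its owner (for example, mode 600) is what addresses the multiuser exposure described above, which is why the sketch includes the is_private check.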
Figure 6-28 shows my mounted
gmailfs filesystem in action.
6.11.4. Things You Should Know
There are a few things you should know as you start strolling about
and storing things on your Gmail filesystem:
GmailFS also has a blocksize option, the default being 5 MB. Files
smaller than the blocksize will use only the amount of space
required to store the file, not the full
blocksize. Note that any files created during a previous
mount with a different blocksize will retain their original blocksize
until deleted. For most applications you will make best use of your
bandwidth by keeping the blocksize as large as possible.
When you delete files, GmailFS will place the files in the trash. The
libgmail library does not currently support purging items from the
trash, so you will have to do this manually through the regular Gmail
web interface.
To avoid seeing the messages created for your Gmail filesystem in
your inbox, you probably want to create a filter (http://gmail.google.com/support/bin/answer.py?answer=6579&query=filter&topic=&type=f)
to automatically archive GmailFS messages as they arrive in your
inbox. The best approach is probably to search for the
fsname value; it'll be in the
subject of all your GmailFS messages.
6.11.5. Outstanding Issues
At the time of this writing, there are some outstanding issues with
GmailFS that you should be aware of:
I don't recommend storing your only copy of anything
important on GmailFS for the following two reasons:
GmailFS is currently a 0.2 release and should be treated as such. You
can depend on its being undependable.
There's no cryptography involved, so your files will
all be stored in plain text on Google's Gmail
servers. This will no doubt make some of you nervous.
Performance is acceptable for uploading and downloading very large
files (obviously dependent on your having decent bandwidth). However,
operations such as listing the contents of a large directory, which
requires many round trips, are extremely slow. The poor performance
here is largely independent of bandwidth and is related to having to
grab entire messages instead of being able to use message summaries.
I haven't done any testing where GmailFS opens the
same file multiple times and performs subsequent operations on the
file. I suspect it will behave badly.
If all of this doesn't dissuade you from giving
GmailFS a whirl, have at it and enjoy. Just be sure to visit the
GmailFS page (http://richard.jones.name/google-hacks/gmail-filesystem/gmail-filesystem.html)
to find out what's new and grab the latest
instructions and code.
6.11.6. See Also
—Richard Jones