
15.6 Python Restricted Execution Mode

In prior chapters, I've been careful to point out the dangers of running arbitrary Python code that was shipped across the Internet. There is nothing stopping a malicious user, for instance, from sending a string such as os.system('rm *') in a form field where we expect a simple number; running such a code string with the built-in eval function or exec statement may, by default, really work -- it might just delete all the files in the server or client directory where the calling Python script runs!

Moreover, a truly malicious user can use such hooks to view or download password files, and otherwise access, corrupt, or overload resources on your machine. Alas, where there is a hole, there is probably a hacker. As I've cautioned, if you are expecting a number in a form, you should use simpler string conversion tools such as int or string.atoi instead of interpreting field contents as Python program syntax with eval.
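As a minimal sketch of that advice, the following converts a form field with int and treats anything else as invalid input. It is written for Python 3 rather than the older dialect used in this chapter's listings (string.atoi no longer exists there), and the helper name parse_count is hypothetical:

```python
def parse_count(field_text):
    """Return the field as an int, or None if it is not a plain integer."""
    try:
        return int(field_text)      # int() only parses numeric text; it never runs code
    except ValueError:
        return None                 # caller decides how to report bad input


if __name__ == '__main__':
    print(parse_count('42'))                  # a well-formed number
    print(parse_count("os.system('rm *')"))   # rejected, never executed
```

Because the field text is only ever parsed as a number, a hostile string is simply reported as bad input.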

But what if you really want to run Python code transmitted over the Net? For instance, you may wish to put together a web-based training system that allows users to run code from a browser. It is possible to do this safely, but you need to use Python's restricted execution mode tools when you ask Python to run the code. Python's restricted execution mode support is provided in two standard library modules, rexec and Bastion. rexec is the primary interface to restricted execution, while Bastion can be used to restrict and monitor access to object attributes.

On Unix systems, you can also use the standard resource module to limit things like CPU time and memory consumption while the code is running. Python's library manual goes into detail on these modules, but let's take a brief look at rexec here.

15.6.1 Using rexec

The restricted execution mode implemented by rexec is optional -- by default, all Python code runs with full access to everything available in the Python language and library. But when we enable restricted mode, code executes in what is commonly called a "sandbox" model -- access to components on the local machine is limited. Operations that are potentially unsafe are either disallowed or must be approved by code you can customize by subclassing. For example, the script in Example 15-8 runs a string of program code in a restricted environment and customizes the default rexec class to restrict file access to a single, specific directory.

Example 15-8. PP2E\Internet\Other\restricted.py
#!/usr/bin/python
import rexec, sys
Test = 1
if sys.platform[:3] == 'win':
    SafeDir = r'C:\temp'
else:
    SafeDir = '/tmp/'

def commandLine(prompt='Input (ctrl+z=end) => '):
    input = ''
    while 1:
        try:
            input = input + raw_input(prompt) + '\n'
        except EOFError:
            break
    print # clear for Windows
    return input

if not Test:
    import cgi                         # run on the web? - code from form
    form  = cgi.FieldStorage()         # else input interactively to test
    input = form['input'].value
else:
    input = commandLine()

# subclass to customize default rules: default=write modes disallowed
class Guard(rexec.RExec):
    def r_open(self, name, mode='r', bufsz=-1):
        if name[:len(SafeDir)] != SafeDir:
            raise SystemError, 'files outside %s prohibited' % SafeDir
        else:
            return open(name, mode, bufsz)

# limit system resources (not available on Windows)
if sys.platform[:3] != 'win':
    import resource            # at most 5 cpu seconds
    resource.setrlimit(resource.RLIMIT_CPU, (5, 5))

# run code string safely
guard = Guard()
guard.r_exec(input)      # ask guard to check and do opens

When we run Python code strings with this script on Windows, safe code works as usual, and we can read and write files that live in the C:\temp directory, because our custom Guard class's r_open method allows files with names beginning with "C:\temp" to proceed. The default r_open in rexec.RExec allows all files to be read, but all write requests fail. Here, we type code interactively for testing, but it's exactly as if we received this string over the Internet in a CGI script's form field:

C:\...\PP2E\Internet\Other>python restricted.py
Input (ctrl+z=end) => x = 5
Input (ctrl+z=end) => for i in range(x): print 'hello%d' % i,
Input (ctrl+z=end) => hello0 hello1 hello2 hello3 hello4

C:\...\PP2E\Internet\Other>python restricted.py
Input (ctrl+z=end) => open(r'C:\temp\rexec.txt', 'w').write('Hello rexec\n')
Input (ctrl+z=end) =>

C:\...\PP2E\Internet\Other>python restricted.py
Input (ctrl+z=end) => print open(r'C:\temp\rexec.txt', 'r').read()
Input (ctrl+z=end) => Hello rexec

On the other hand, attempts to access files outside the allowed directory fail in our custom class, as do inherently unsafe operations such as opening sockets, which rexec puts out of bounds by default:

C:\...\PP2E\Internet\Other>python restricted.py
Input (ctrl+z=end) => open(r'C:\stuff\mark\hack.txt', 'w').write('BadStuff\n')
Input (ctrl+z=end) => Traceback (innermost last):
  File "restricted.py", line 41, in ?
    guard.r_exec(input)      # ask guard to check and do opens
  File "C:\Program Files\Python\Lib\rexec.py", line 253, in r_exec
    exec code in m.__dict__
  File "<string>", line 1, in ?
  File "restricted.py", line 30, in r_open
    raise SystemError, 'files outside %s prohibited' % SafeDir
SystemError: files outside C:\temp prohibited

C:\...\PP2E\Internet\Other>python restricted.py
Input (ctrl+z=end) => open(r'C:\stuff\mark\secret.py', 'r').read()
Input (ctrl+z=end) => Traceback (innermost last):
  File "restricted.py", line 41, in ?
    guard.r_exec(input)      # ask guard to check and do opens
  File "C:\Program Files\Python\Lib\rexec.py", line 253, in r_exec
    exec code in m.__dict__
  File "<string>", line 1, in ?
  File "restricted.py", line 30, in r_open
    raise SystemError, 'files outside %s prohibited' % SafeDir
SystemError: files outside C:\temp prohibited

C:\...\PP2E\Internet\Other>python restricted.py
Input (ctrl+z=end) => from socket import *
Input (ctrl+z=end) => s = socket(AF_INET, SOCK_STREAM)
Input (ctrl+z=end) => Traceback (innermost last):
  File "restricted.py", line 41, in ?
    guard.r_exec(input)      # ask guard to check and do opens
  ...part omitted...
  File "C:\Program Files\Python\Lib\ihooks.py", line 324, in load_module
    exec code in m.__dict__
  File "C:\Program Files\Python\Lib\plat-win\socket.py", line 17, in ?
    _realsocketcall = socket
NameError: socket
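One caveat about the Guard class in Example 15-8: its r_open decides by comparing the leading characters of the filename with SafeDir. A plain prefix test like that also accepts names such as C:\temp\..\stuff\hack.txt or C:\tempfoo\x, which begin with the right characters but refer to places outside the directory. The following Python 3 sketch shows one possible stricter test that resolves the path first; the function names naive_is_inside and is_inside are hypothetical, and this is an illustration rather than part of the book's example:

```python
import os


def naive_is_inside(safe_dir, name):
    # the same kind of string-prefix comparison that Guard.r_open performs
    return name[:len(safe_dir)] == safe_dir


def is_inside(safe_dir, name):
    """True only if name resolves to safe_dir itself or to a path beneath it."""
    safe = os.path.realpath(safe_dir)       # collapse '..' and follow symlinks
    target = os.path.realpath(name)
    try:
        return os.path.commonpath([safe, target]) == safe
    except ValueError:                      # e.g., different drives on Windows
        return False


if __name__ == '__main__':
    for name in ('/tmp/rexec.txt', '/tmp/../etc/passwd', '/tmpfoo/x'):
        print(name, naive_is_inside('/tmp', name), is_inside('/tmp', name))
```

The prefix test accepts all three names, while the resolved-path test accepts only the first.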

And what of that nasty rm * problem? It works in normal Python mode, like everything else, but not in restricted mode. There, Python disallows some potentially dangerous attributes of the os module, such as system (for running shell commands):

C:\temp>python
>>> import os
>>> os.system('ls -l rexec.txt')
-rwxrwxrwa   1 0        0             13 May  4 15:45 rexec.txt
0
>>>
C:\temp>python %X%\Part2\internet\other\restricted.py
Input (ctrl+z=end) => import os
Input (ctrl+z=end) => os.system('rm *.*')
Input (ctrl+z=end) => Traceback (innermost last):
  File "C:\PP2ndEd\examples\Part2\internet\other\restricted.py", line 41, in ?
    guard.r_exec(input)      # ask guard to check and do opens
  File "C:\Program Files\Python\Lib\rexec.py", line 253, in r_exec
    exec code in m.__dict__
  File "<string>", line 2, in ?
AttributeError: system

Internally, restricted mode works by taking away access to certain APIs (imports are controlled, for example) and changing the __builtins__ dictionary in the module where the restricted code runs to reference a custom and safe version of the standard __builtin__ built-in names scope. For instance, the custom version of the name __builtins__.open references a restricted version of the standard file open function. rexec also keeps customizable lists of safe built-in modules, safe os and sys module attributes, and more. For the rest of this story, see the Python library manual.
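The __builtins__ substitution can be illustrated without rexec itself, which was disabled in Python 2.3 and is absent from Python 3. The Python 3 sketch below runs a code string in a namespace whose __builtins__ entry is a small hand-built dictionary. The names SAFE_NAMES, guarded_open, and run_with_custom_builtins are hypothetical, and this shows only the mechanism: on its own it is not a secure sandbox, because determined code can still reach other objects through introspection.

```python
import builtins

# hypothetical whitelist of built-in names the code string may use
SAFE_NAMES = ('abs', 'int', 'len', 'print', 'range', 'str')


def guarded_open(name, mode='r', *args, **kwargs):
    """Stand-in for open that refuses anything but read modes."""
    if mode not in ('r', 'rt', 'rb'):
        raise PermissionError('write modes are not allowed: %r' % mode)
    return open(name, mode, *args, **kwargs)


def run_with_custom_builtins(code):
    """Run a code string that sees only the whitelisted built-ins."""
    safe_builtins = {name: getattr(builtins, name) for name in SAFE_NAMES}
    safe_builtins['open'] = guarded_open     # replaces the real open
    scope = {'__builtins__': safe_builtins}  # no __import__, so imports fail
    exec(code, scope)
    return scope


if __name__ == '__main__':
    run_with_custom_builtins("for i in range(3): print('hello%d' % i)")
    try:
        run_with_custom_builtins('import os')
    except ImportError as exc:
        print('import refused:', exc)
```

Because the substitute dictionary has no __import__ entry, the import statement fails before os is ever loaded, and any call to open goes through the guarded version.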

Restricted execution mode is not necessarily tied to Internet scripting. It can be useful any time you need to run Python code of possibly dubious origin. For instance, we will use Python's eval and exec built-ins to evaluate arithmetic expressions and input commands in a calculator program later in the book. Because user input is evaluated as executable code in this context, there is nothing preventing a user from knowingly or unknowingly entering code that can do damage when run (e.g., they might accidentally type Python code that deletes files). However, the risk of running raw code strings becomes more prevalent in applications that run on the Web, since they are inherently open to both use and abuse. Although JPython inherits the underlying Java security model, pure Python systems such as Zope, Grail, and custom CGI scripts can all benefit from restricted execution of strings sent over the Net.
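For the calculator case in particular, one alternative to running raw strings is to parse the input and evaluate only the arithmetic it contains. The Python 3 sketch below does this with the standard ast module. It is not the calculator program from later in the book; the function name eval_arithmetic and the choice of supported operators are assumptions made for illustration:

```python
import ast
import operator

# the only operators this sketch agrees to evaluate
_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def eval_arithmetic(text):
    """Evaluate +, -, *, / over numeric literals; reject everything else."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError('unsupported expression')   # calls, names, attributes...
    return walk(ast.parse(text, mode='eval'))


if __name__ == '__main__':
    print(eval_arithmetic('2 + 3 * 4'))
    try:
        eval_arithmetic("os.system('rm *')")
    except ValueError as exc:
        print('rejected:', exc)
```

Function calls, names, and attribute references never reach an evaluator at all, so a string such as os.system('rm *') is rejected rather than run.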
