Python SDK

This page covers the details of installing Rookout under various configurations. If you run into difficulties deploying Rookout, this is the place to look.

Python

The Python SDK lets you fetch debug data from a running application in real time. Install it by running the following command:


pip install rook

Setup

Start the SDK within your application:


import rook

if __name__ == "__main__":
    rook.start(token='[Your Rookout Token]',
               labels={"env": "dev"})  # Optional, see Labels page below Projects
    # Your program starts here :)

The SDK should be imported just before the application begins executing. This is because Python has no clean way to tell when a module has finished defining its classes.

For pre-forking servers please read the relevant section.

SDK API

start


start(token=None,
      host=None,
      port=None,
      debug=None,
      throw_errors=None,
      log_to_stderr=None,
      labels=None,
      git_commit=None,
      git_origin=None,
      fork=None,
      **kwargs)

The start method is used to initialize the SDK in the background and accepts the following arguments:

| Argument | Environment Variable | Default Value | Description |
| --- | --- | --- | --- |
| token | ROOKOUT_TOKEN | None | The Rookout token for your organization. |
| host | ROOKOUT_CONTROLLER_HOST | None | If you are using a Rookout ETL Controller, this is its hostname. |
| port | ROOKOUT_CONTROLLER_PORT | None | If you are using a Rookout ETL Controller, this is its port. |
| debug | ROOKOUT_DEBUG | False | Set to True to increase the log level to debug. |
| throw_errors | None | False | Set to True to throw an exception if start fails (the error message will not be printed to the console). |
| labels | ROOKOUT_LABELS | | A dictionary of key:value labels for your application instances. Use k:v,k:v format for environment variables. |
| git_commit | ROOKOUT_COMMIT | None | String that indicates your git commit. |
| git_origin | ROOKOUT_REMOTE_ORIGIN | None | String that indicates your git remote origin. |
| proxy | ROOKOUT_PROXY | None | URL of the proxy server. |
| fork | ROOKOUT_ENABLE_FORK | None | Set to True to force running in forked child processes. |
| sources | ROOKOUT_SOURCES | None | Sources information (see info below). Replaces ROOKOUT_COMMIT and ROOKOUT_REMOTE_ORIGIN. |
| quiet | ROOKOUT_QUIET | False | Set to True to stop informative log messages from being written to standard output and standard error. |
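The labels argument takes a dictionary, while ROOKOUT_LABELS expects the same data in k:v,k:v form. The sketch below shows one way to convert between the two. The helper name format_labels is hypothetical and not part of the SDK, and sorting the keys is only a choice made here to keep the output stable.

```python
import os


def format_labels(labels):
    """Render a labels dict in the k:v,k:v form described for ROOKOUT_LABELS.

    Hypothetical helper, not part of the Rookout SDK. Keys are sorted so the
    resulting string does not depend on dictionary ordering.
    """
    return ",".join("{}:{}".format(key, value) for key, value in sorted(labels.items()))


# Example: export the labels before the SDK is started.
os.environ["ROOKOUT_LABELS"] = format_labels({"env": "dev", "service": "api"})
```

With the dictionary above, ROOKOUT_LABELS is set to env:dev,service:api.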

restart


restart(labels=None)

The restart method is used to change the SDK labels:

| Argument | Environment Variable | Default Value | Description |
| --- | --- | --- | --- |
| labels | ROOKOUT_LABELS | | A dictionary of key:value labels for your application instances. Use k:v,k:v format for environment variables. |

flush


flush()

The flush method allows explicitly flushing the Rookout logs and messages.
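One place an explicit flush may help is a short-lived script that could exit before pending messages are sent. The following is a minimal sketch under that assumption, not a documented Rookout pattern. The function name flush_rookout is hypothetical, and the import is guarded so the script still runs where the SDK is not installed.

```python
import atexit


def flush_rookout():
    """Flush pending Rookout logs and messages, if the SDK is available.

    Hypothetical helper. Returns True when rook.flush() was called and
    False when the SDK could not be imported.
    """
    try:
        import rook
    except ImportError:
        return False  # SDK not installed; nothing to flush
    rook.flush()
    return True


# Ask the interpreter to run the flush when the process exits normally.
atexit.register(flush_rookout)
```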

Test connectivity

To make sure the SDK was properly installed in your Python (virtual) environment, and test your configuration (environment variables only), run the following command:


python -m rook

Source information

To enable automatic source fetching, information about the source control must be specified.

Environment Variables or Start Parameters

Use the environment variables or start parameters as described above in the API section.

Git Folder

Rookout gets the source information from the .git folder if both of the following apply:

  1. The .git folder is present at any of the parent directories of where the application is running (searching up the tree).
  2. No environment variables or start parameters are set for source information.
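To make the upward search concrete, the sketch below walks from a starting directory toward the filesystem root and returns the first .git folder it finds. It only illustrates the lookup described above and is not the SDK's actual implementation. The function name find_git_dir is hypothetical.

```python
import os


def find_git_dir(start_dir):
    """Return the nearest .git folder at or above start_dir, or None.

    Illustration only; the Rookout SDK may perform this search differently.
    """
    current = os.path.abspath(start_dir)
    while True:
        candidate = os.path.join(current, ".git")
        if os.path.isdir(candidate):
            return candidate
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root without a match
            return None
        current = parent


# Example: check whether the current working directory is inside a Git checkout.
git_dir = find_git_dir(os.getcwd())
```

If git_dir is None, the conditions listed above are not met, and source information would have to come from the environment variables or start parameters instead.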

Supported Python versions

| Implementation | Versions |
| --- | --- |
| CPython | 2.7, 3.5, 3.6, 3.7, 3.8, 3.9, 3.10, 3.11 |
| PyPy | 6.0.0 |
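A deployment script could check the running interpreter against the CPython versions listed above before installing the SDK. This is a rough sketch: the name is_supported_cpython is hypothetical, and PyPy is deliberately left out because its version is reported differently.

```python
import platform
import sys

# CPython (major, minor) versions taken from the table above.
SUPPORTED_CPYTHON = {
    (2, 7), (3, 5), (3, 6), (3, 7), (3, 8), (3, 9), (3, 10), (3, 11),
}


def is_supported_cpython(version_info=None, implementation=None):
    """Return True if the interpreter is a CPython version listed as supported.

    Hypothetical helper. Defaults to inspecting the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    if implementation is None:
        implementation = platform.python_implementation()
    return implementation == "CPython" and tuple(version_info[:2]) in SUPPORTED_CPYTHON
```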

Rookout was tested on pip versions 9+.

Note: We recommend avoiding production deployments of Rookout on Windows OS.

Dependencies

The Python SDK contains native extensions. For most common interpreter and OS configurations, pre-built binaries are provided. For other configurations, a build environment is needed to successfully install Rookout.

If you encounter an error similar to the following example, be sure to install the environment specific build tools specified below:


Could not find <Python.h>. This could mean the following:
* You're on Ubuntu and haven't run `apt-get install python-dev`.
* You're on RHEL/Fedora and haven't run `yum install python-devel` or
`dnf install python-devel` (make sure you also have redhat-rpm-config
installed)
* You're on Mac OS X and the usual Python framework was somehow corrupted
(check your environment variables or try re-installing?)
* You're on Windows and your Python installation was somehow corrupted
(check your environment variables or try re-installing?)

The following commands install the build environment on macOS:


xcode-select --install
# If installing for PyPy on macOS, installing pkg-config is also required:
brew install pkg-config

Serverless and PaaS deployments

Integrating Rookout into a serverless application

Rookout provides an easy-to-use wrapper that works for most common serverless runtimes:


from rook.serverless import rookout_serverless

@rookout_serverless(
    token="[Your Rookout Token]",
    labels={"env": "dev"}
)
def lambda_handler(event, context):
    return "Hello world"

Note: Adding the Rookout SDK will slow down your Serverless cold-start times. Please make sure your timeout is no less than 10 seconds.

Building

If you are running your application on a Serverless or PaaS (Platform as a Service), you must build your package in an environment similar to those used in production. If you are running on a Windows or Mac machine (or using an incompatible Linux distribution) you may encounter some issues here.

Many Serverless frameworks (such as AWS SAM) have built-in support for this and will work out of the box.

If you need to set up your own build, we recommend using Docker, with a command line such as:


docker run -v `pwd`:`pwd` -w `pwd` -i -t lambci/lambda:build-python2.7 pip install -r requirements.txt

For more information check out this blog post.

Configuration for special use cases

Python Spark (PySpark) applications

  1. Import the SDK as usual in the main function that runs on the Spark driver.
  2. To import the SDK on Spark executors, run spark-submit with --conf spark.python.daemon.module=rook.pyspark_daemon.
  3. If running under YARN, specify the ROOKOUT_TOKEN environment variable for your application master and executor nodes like so:

spark-submit --conf spark.python.daemon.module=rook.pyspark_daemon --conf spark.yarn.appMasterEnv.ROOKOUT_TOKEN=[Your Rookout Token] --conf spark.executorEnv.ROOKOUT_TOKEN=[Your Rookout Token]

Pre-forking (Celery, Gunicorn, etc.)

Several popular application servers and frameworks for Python load the application code during startup and then fork() the process multiple times into worker processes.

The Rookout SDK should automatically detect if you are using one of those application servers or frameworks, and run itself in forked processes. You can also set the fork argument in the SDK API or the ROOKOUT_ENABLE_FORK environment variable to True to force that behavior.

uWSGI applications

For uWSGI applications, you must enable threads by adding --enable-threads to the command line or enable-threads = true to the uWSGI ini file. You can read more about it here.

In addition, you must start Rookout at each worker separately using the postfork decorator. See the sample snippet below. You can read more about it here.


try:
    from uwsgidecorators import postfork

    # Run Rookout after the fork
    @postfork
    def run_rookout():
        import rook
        rook.start(token='[Your Rookout Token]')
except ImportError:
    # If there's no uWSGI, run Rookout normally
    import rook
    rook.start(token='[Your Rookout Token]')

Multiple Sources

Use the environment variable ROOKOUT_SOURCES to initialize the SDK with information about the sources used in your application.

ROOKOUT_SOURCES is a semicolon-separated list with a source control repository and revision information. This will allow Rookout to automatically fetch your application's source code from the right revision, and also additional dependencies' sources. When using Git the repository is a URL (remote origin) and the revision is a full commit hash.

For example, suppose the application is built from https://github.com/Rookout/tutorial-python at commit e3f4f9634e3445c36c39b473beca11ce456202df and uses the Flask package (https://github.com/pallets/flask) from its master branch:


ROOKOUT_SOURCES=https://github.com/Rookout/tutorial-python#e3f4f9634e3445c36c39b473beca11ce456202df;https://github.com/pallets/flask#master
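When the list of sources is produced by a build script, the value can be assembled programmatically. The sketch below assumes the repository#revision entry format seen in the example above. The helper names build_rookout_sources and parse_rookout_sources are hypothetical and not part of the SDK.

```python
def build_rookout_sources(sources):
    """Join (repository, revision) pairs into a ROOKOUT_SOURCES value.

    Hypothetical helper; assumes each entry has the form repository#revision
    and that entries are separated by semicolons.
    """
    return ";".join("{}#{}".format(repo, revision) for repo, revision in sources)


def parse_rookout_sources(value):
    """Split a ROOKOUT_SOURCES value back into (repository, revision) pairs."""
    pairs = []
    for entry in value.split(";"):
        if not entry:
            continue  # tolerate a trailing semicolon
        repo, _, revision = entry.partition("#")
        pairs.append((repo, revision))
    return pairs


sources_value = build_rookout_sources([
    ("https://github.com/Rookout/tutorial-python",
     "e3f4f9634e3445c36c39b473beca11ce456202df"),
    ("https://github.com/pallets/flask", "master"),
])
```

Here sources_value matches the example value shown above and could be exported as ROOKOUT_SOURCES before the application starts.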