Notebook instance management

This page explains how to create a notebook instance in Wherobots. The steps below walk through launching and configuring an instance for running Jupyter notebooks with your chosen runtime, libraries, and resources.

Note

Before creating a notebook instance, keep the following points about configuration and resources in mind:

  • The default disk size for the Executor and the Driver is 20 GB.
  • A notebook instance is usually ready within 5 minutes of clicking Start. If the instance remains stuck in STARTING status, destroy it and re-create it. If the problem persists, contact Wherobots support.

Start a notebook instance

Wherobots launches a notebook instance with a single click. Clicking the Start button gives you a Jupyter notebook environment without any additional configuration.

Wherobots Quick Start

Five default runtime sizes are available for a quick notebook launch. Each larger size allocates more memory, cores, and other resources: the smallest runtime supports basic workloads, while the largest is intended for intensive jobs. Choose the size that matches your workload.

Configuring Spark and Adding Dependencies

Spark Configuration

Add any custom Spark configuration before starting the notebook instance. The input must be valid JSON. The example below configures anonymous access to a public S3 bucket.

{"spark.hadoop.fs.s3a.bucket.<YOUR_BUCKET_NAME>.aws.credentials.provider" : "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider"}
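Because the field expects valid JSON, it can help to generate the string programmatically rather than typing it by hand. The sketch below builds the same anonymous-access setting for one or more buckets and serializes it with Python's standard json module. The helper name anonymous_s3_conf and the bucket name my-public-bucket are illustrative assumptions, not part of Wherobots.

```python
import json


def anonymous_s3_conf(bucket_names):
    """Return a JSON string that enables anonymous S3A access per bucket.

    Hypothetical helper for illustration; only the property name and the
    provider class come from the example configuration above.
    """
    provider = "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider"
    conf = {
        f"spark.hadoop.fs.s3a.bucket.{name}.aws.credentials.provider": provider
        for name in bucket_names
    }
    # json.dumps guarantees the result is syntactically valid JSON.
    return json.dumps(conf, indent=2)


if __name__ == "__main__":
    # "my-public-bucket" is a placeholder bucket name.
    print(anonymous_s3_conf(["my-public-bucket"]))
```

The printed string can be pasted into the Spark Configuration field as-is.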

Adding Dependencies

You can install multiple additional libraries as needed. Packages can come from two sources.

File Source

File source configuration allows you to add dependencies from the Wherobots File section.

Dependency file fields

Note

Refer to the storage layer management page to upload the files to the correct directory.

File Type

Two dependency file types are accepted: Python Wheel and JAR.

File Path

Specify the path to the library.
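A quick local check can catch an unsupported file before you enter its path. The sketch below is a hypothetical helper (dependency_file_type is not a Wherobots API) that classifies a path by its extension, assuming wheels end in .whl and JARs end in .jar. The sample paths are placeholders.

```python
from pathlib import PurePosixPath

# Extensions assumed for the two accepted dependency types.
ACCEPTED_TYPES = {".whl": "Python Wheel", ".jar": "JAR"}


def dependency_file_type(path):
    """Return the dependency type for a path, or raise ValueError."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix not in ACCEPTED_TYPES:
        raise ValueError(f"Unsupported dependency file: {path!r}")
    return ACCEPTED_TYPES[suffix]


if __name__ == "__main__":
    # Placeholder paths, not real Wherobots storage locations.
    for sample in ["libs/mylib-1.0-py3-none-any.whl", "libs/udfs.jar"]:
        print(sample, "->", dependency_file_type(sample))
```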

PyPI Source

Dependency PYPI fields

The notebook instance can install Python dependencies directly from the Python Package Index (PyPI). When the notebook instance starts, each listed library is fetched and installed from PyPI without any additional configuration. This gives the notebook environment access to the packages published on PyPI.

Library Name

Provide the specific name of the library you wish to install.

Library Version

Specify the version of the library you intend to use.
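Once the instance is running, you may want to confirm from a notebook cell that the requested version was actually installed. A minimal sketch using the standard library's importlib.metadata follows. The helper name check_version, and the package name and version in the usage line, are placeholders chosen for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def check_version(library_name, expected_version):
    """Return True if library_name is installed at expected_version."""
    try:
        installed = version(library_name)
    except PackageNotFoundError:
        # The library is not installed in this environment.
        return False
    return installed == expected_version


if __name__ == "__main__":
    # "shapely" and "2.0.2" are placeholders; use your own library and version.
    print(check_version("shapely", "2.0.2"))
```

The comparison is an exact string match, so it is only meaningful when a specific version was pinned in the Library Version field.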

Jupyter notebook environment

Now that your notebook instance is ready, refer to Jupyter notebook management for guidance on working with notebooks, including how to run code cells. Reviewing that guide is recommended before you start using your notebook instance.

Deleting a notebook instance

Once you have finished using the notebook instance, click the Destroy button. This shuts down the notebook instance and releases the resources it was using. Destroy notebook instances when they are no longer needed.

Note

For Community users, notebook instances shut down automatically after 2 hours of continuous use. To continue beyond the 2-hour limit, restart the notebook instance.


Last update: January 5, 2024 21:09:54