A runtime image provides the execution environment in which nodes are executed when a Jupyter notebook is processed as part of a pipeline. Elyra includes a number of runtime images for popular configurations, such as TensorFlow or PyTorch.
Should none of these images meet your needs, you can utilize a custom container image, as long as it meets the following pre-requisites:
curl is pre-installed and in the search path.
Elyra installs a set of required packages in this image prior to running Jupyter notebooks or Python scripts. If no built distribution is available for those packages, Python builds the packages from source, which might require additional software (like a compiler) to be pre-installed in the image. Building and installing those packages on the fly might add non-trivial overhead, so consider pre-installing these Elyra prerequisites in the container image.
Refer to the Additional considerations section for important implementation details.
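The curl prerequisite can be checked from inside a candidate image before the image is used in a pipeline. The following is a minimal sketch and not part of Elyra: the helper name missing_tools is hypothetical, and it only tests whether each named executable can be found on the search path.

```python
import shutil


def missing_tools(required=("curl",)):
    """Return the names in `required` that are not found on the search path."""
    return [tool for tool in required if shutil.which(tool) is None]


if __name__ == "__main__":
    # Intended to be run with the Python interpreter inside the container.
    missing = missing_tools()
    if missing:
        print("Missing from the search path:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

An empty result suggests the image satisfies the search path requirement for the listed tools; it does not verify anything else about the image.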
To create a custom container image and publish it on hub.docker.com you need Docker installed locally and a Docker id.
The default Python 3 Docker image has Python and curl pre-installed and is therefore a good starting point.
Create a file named Dockerfile and add the following content.
FROM python:3
COPY requirements.txt ./
RUN pip3 install --no-cache-dir -r requirements.txt
When you create a container image using this Dockerfile, the default Python 3 Docker image is loaded and the requirements listed in requirements.txt are pip-installed.
In the same directory, create a requirements.txt file and add the packages your notebooks depend on. For example, if your notebooks require the latest versions of pandas and NumPy, add the appropriate package names.
pandas
numpy
Note: If your notebooks require packages that are not pre-installed in this image, the notebooks need to pip-install them explicitly.
Open a terminal window in the directory that contains Dockerfile and requirements.txt.
Build the container image by running the docker build command in the terminal window, replacing my-runtime-image with the desired Docker image name.
docker build -t my-runtime-image .
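Once the image is built, it can be worth confirming that the packages from requirements.txt import cleanly before the image is published. The sketch below only assembles a docker run command line. The helper name import_check_command is hypothetical, and it assumes each entry is an importable module name, which can differ from the package name used in requirements.txt.

```python
import shlex


def import_check_command(image, modules):
    """Build a `docker run` argument list that imports each module in the image."""
    statement = "import " + ", ".join(modules)
    # --rm removes the temporary container once the check has finished.
    return ["docker", "run", "--rm", image, "python3", "-c", statement]


command = import_check_command("my-runtime-image", ["pandas", "numpy"])
# Print the command in a form that can be pasted into a terminal.
print(shlex.join(command))
```

Passing the list to subprocess.run(command, check=True) would execute the check and raise an error if any import fails inside the container.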
When a notebook is processed as part of a pipeline the associated container image is downloaded from the container registry stated in the URL of the image.
For example, the following steps publish the container image you've just created on Docker Hub using docker.
Run docker login and provide your Docker id and password.
docker login
Run docker images and locate the image id for your Docker image. The image id uniquely identifies your Docker image.
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
my-runtime-image latest 0d1bd98fdd84 2 hours ago 887MB
Tag the Docker image by running docker tag, replacing my-image-id, docker-id-or-org-id, and my-runtime-image as necessary. (docker-id-or-org-id is either your Docker id or, if you are a member of a team in an organization, the id of that organization.)
docker tag my-image-id docker-id-or-org-id/my-runtime-image:latest
Note: For illustrative purposes this image is tagged latest, which makes it the default image. If desired, replace the tag with a specific version number or identifier, such as Vx.y.z.
Publish the container image on Docker Hub by running docker push, replacing docker-id-or-org-id and my-runtime-image as necessary.
docker push docker-id-or-org-id/my-runtime-image:latest
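The tag and push steps share the same image reference, so a small helper can keep the two commands consistent when a version tag is used instead of latest. This is an illustrative sketch: publish_commands is a hypothetical name, and the image id shown is the one from the sample docker images output above.

```python
def publish_commands(image_id, namespace, name, tag="latest"):
    """Return the `docker tag` and `docker push` argument lists for one image."""
    reference = f"{namespace}/{name}:{tag}"
    return [
        ["docker", "tag", image_id, reference],
        ["docker", "push", reference],
    ]


# Placeholder values; replace them as described in the steps above.
for argv in publish_commands("0d1bd98fdd84", "docker-id-or-org-id", "my-runtime-image"):
    print(" ".join(argv))
```

Each argument list can be passed to subprocess.run to execute the step, provided docker login has already succeeded.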
Once the image is published on Docker Hub, you can create a runtime image configuration using the Elyra UI or the elyra-metadata CLI and reference the published docker-id-or-org-id/my-runtime-image:latest Docker image.
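Prior to notebook processing, Elyra modifies the associated container by changing the default execution command and installing additional packages. Please review the following considerations if your container image defines a CMD or ENTRYPOINT instruction or pre-installs Python packages.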
If a Dockerfile includes a CMD instruction, which is used to specify defaults for an executing container, you might have to customize your notebooks. When a notebook is processed as part of a pipeline, the CMD instruction is overridden, which might have side effects. The following examples illustrate two scenarios.
The CMD instruction launches an application that does not need to be running when the notebook is executed. For example, the official Python container images might launch the interactive Python shell by default, like so:
...
CMD ["python3"]
Notebooks will work as is because Python is explicitly run during notebook processing.
The CMD instruction launches an application or service that a notebook consumes. For example, a container image might by default launch an application that provides computational (or connectivity) services that notebooks rely on.
...
CMD ["python3", "/path/to/application-or-service"]
When the container is started to process a notebook, the referenced application is unavailable because it wasn't automatically started. If feasible, the notebook could launch the application in the background in a code cell like so:
import os
import time
# launch application in the background
os.system("python /path/to/application-or-service &")
# wait to allow for application initialization
time.sleep(2)
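A fixed two-second pause can be too short for a slow service and needlessly long for a fast one. If the application listens on a TCP port, which is an assumption that may not hold for every service, the notebook could poll that port instead. The function name wait_for_port and its default values are illustrative.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not accepting connections yet; pause briefly and try again.
            time.sleep(interval)
    return False
```

In a code cell this would replace time.sleep(2) after the launch command, for example by raising an error when wait_for_port("localhost", port) returns False, where port is whatever port the application is configured to use.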
If a container is configured to run as an executable by using the ENTRYPOINT instruction in the Dockerfile, you likely have to customize your notebook.
The ENTRYPOINT instruction launches an application or service that a notebook consumes.
ENTRYPOINT ["python3", "/path/to/application-or-service"]
When the container is launched to process a notebook the application or service is unavailable because it wasn't automatically started. If feasible, the notebook could launch the application or service in the background in a code cell like so:
import os
import time
# launch application in the background
os.system("python /path/to/application-or-service &")
# wait to allow for application initialization
time.sleep(2)
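Launching through os.system with a trailing & leaves the notebook without a handle to the background process. As an alternative sketch, subprocess.Popen returns a handle that a later cell can use to stop the service. The helper names start_background and stop_background are hypothetical, and the demo at the bottom starts a harmless sleeping process rather than a real service.

```python
import subprocess
import sys


def start_background(argv):
    """Start a process without waiting for it and return its handle."""
    return subprocess.Popen(argv)


def stop_background(process, timeout=10):
    """Ask the process to exit, then force it if it does not exit in time."""
    process.terminate()
    try:
        process.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        process.kill()
        process.wait()
    return process.returncode


if __name__ == "__main__":
    # Stand-in for ["python3", "/path/to/application-or-service"].
    service = start_background([sys.executable, "-c", "import time; time.sleep(60)"])
    print("running:", service.poll() is None)
    stop_background(service)
```

Stopping the service explicitly is optional when the container is discarded after the notebook finishes, but it keeps repeated local runs from accumulating stray processes.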
Elyra installs additional packages in the container prior to notebook processing. If a pre-installed package is not compatible with the version requirements defined in requirements-elyra.txt, it is replaced. You should review any version discrepancies as they might lead to unexpected processing results.
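One way to review such discrepancies ahead of time is to compare the versions already installed in the image with the pinned versions. The sketch below assumes the requirements file uses simple name==version pins, which is an assumption about its format; lines with other specifiers are skipped. All function names are hypothetical.

```python
from importlib import metadata


def parse_pins(lines):
    """Extract {package name: version} from lines of the form name==version."""
    pins = {}
    for raw in lines:
        # Drop trailing comments and environment markers before parsing.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_discrepancies(pins, lookup=installed_version):
    """Return {name: (installed, pinned)} for installed packages that differ."""
    report = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found is not None and found != wanted:
            report[name] = (found, wanted)
    return report
```

Running find_discrepancies(parse_pins(open(path))) inside the image, with path pointing at a local copy of the requirements file, lists the pre-installed packages that would be replaced.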