File Injection#
Files#
Local files can be injected into a task’s container image. You can define the file contents inline or pass a relative path. The file path must be relative to the directory where the spec file resides.
Note
File injection is not supported when submitting a workflow through the UI. Please use the CLI to submit workflows with local files. For more information, see Submit via CLI.
Inline#
The following example defines a file inline:
workflow:
  name: "inline-files"
  tasks:
    - name: task1
      image: ubuntu
      command: [sh]
      args: [/tmp/run.sh] # (1)
      files:
        - contents: | # (2)
            echo "Hello from task1!"
          path: /tmp/run.sh # (3)
1. Executes the file as a shell script.
2. The contents field is used to define the contents of the file.
3. The path field is used to designate where to create this file in the task’s container.
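Since files accepts a list, it should be possible to inject more than one file into the same task. The sketch below assumes this; the second file (/tmp/greeting.txt) and its contents are hypothetical:

workflow:
  name: "inline-files-multi"
  tasks:
    - name: task1
      image: ubuntu
      command: [sh]
      args: [/tmp/run.sh]
      files:
        - contents: | # script that reads the second injected file
            cat /tmp/greeting.txt
          path: /tmp/run.sh
        - contents: | # hypothetical second file
            Hello from task1!
          path: /tmp/greeting.txt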
Localpath#
The following example injects a file by its relative path on the host machine:
workflow:
  name: "localpath-files"
  tasks:
    - name: task1
      image: ubuntu
      command: [sh]
      args: [/tmp/run.sh]
      files:
        - localpath: files/my_script.sh # (1)
          path: /tmp/run.sh # (2)
1. The localpath field is used to designate the path of the file on the host machine.
2. The path field is used to designate where to create this file in the task’s container.
Warning
The localpath field only supports files, not directories.
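As a point of reference, the localpath in the example above is resolved relative to the directory containing the spec. A hypothetical layout (the spec file name workflow.yaml and the project folder name are assumptions):

project/
├── workflow.yaml        # the localpath-files spec shown above
└── files/
    └── my_script.sh     # referenced as localpath: files/my_script.sh

With this layout, the relative path files/my_script.sh resolves against the location of workflow.yaml.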
Folder#
If you want to transfer a local folder to a task, you can use the localpath attribute in the dataset input. This is useful for workflows that need a large amount of local data without requiring users to manually upload it to the cloud.
To provide a local file or directory as an input, use the localpath attribute in the dataset input:
inputs:
  - dataset:
      name: <name>
      localpath: <path>
The localpath attribute can be a file or a directory. If it is a directory, all files within
the directory will be uploaded to the dataset.
If the workflow is defined as follows:
tasks:
  - name: task-name
    ...
    inputs:
      - dataset:
          name: bucket/dataset_name
          localpath: test/folder
      - dataset:
          name: bucket/dataset_name
          localpath: test/folder2
      - dataset:
          name: bucket/dataset_name
          localpath: file.txt
      - dataset:
          name: bucket/dataset_name
          localpath: ./ # Current directory (e.g. /current/workdir)
the final workflow specification will be:
tasks:
  - name: task-name
    ...
    inputs:
      - dataset:
          name: bucket/dataset_name:1 # (1)
      - dataset:
          name: bucket/dataset_name:2 # (2)
      - dataset:
          name: bucket/dataset_name:3 # (3)
      - dataset:
          name: bucket/dataset_name:4 # (4)
1. The folder test/folder is uploaded to the dataset bucket/dataset_name:1.
2. The folder test/folder2 is uploaded to the dataset bucket/dataset_name:2.
3. The file file.txt is uploaded to the dataset bucket/dataset_name:3.
4. The folder /current/workdir is uploaded to the dataset bucket/dataset_name:4.
The uploaded datasets can be referenced in the task like so:
| Input | Reference |
|---|---|
| | |
| | |
| | |
| | |