cuda::experimental::stf::stream_task<>
Defined in include/cuda/experimental/__stf/stream/stream_task.cuh
-
template<>
class stream_task<> : public cuda::experimental::stf::task
Task with dynamic dependencies that uses CUDA streams (and events) to synchronize between the different tasks.
stream_task<> automatically selects a stream from an internal pool if needed, or takes a user-provided stream (by calling set_stream). All operations in a task are expected to be executed asynchronously with respect to that task’s stream.
This task type accepts dynamic dependencies, i.e. dependencies can be added at runtime by calling add_dep() or add_deps() prior to starting the task with start(). In turn, the added dependencies have dynamic types. It is the caller’s responsibility to access the correct types for each dependency by calling get<T>(index).
Public Types
-
enum class phase
Current task status.
We keep track of the status of the task so that we do not make API calls at an inappropriate time, such as setting the symbol once the task has already started, or releasing a task that was not started yet.
Values:
-
enumerator setup
-
enumerator running
-
enumerator finished
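The phase tracking described above can be illustrated with a small state machine. This is only a sketch under stated assumptions: `toy_task`, `require`, and `run_lifecycle` are hypothetical names invented for this example, and throwing `std::logic_error` is an illustrative choice, not the way the library reports misuse.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical mirror of the documented enumerators; not the library's code.
enum class phase { setup, running, finished };

// Toy task that only tracks its phase and a debug symbol.
class toy_task {
public:
    // Allowed only before the task has started.
    void set_symbol(std::string s) {
        require(phase::setup, "set_symbol called after start");
        symbol_ = std::move(s);
    }

    // setup -> running
    void start() {
        require(phase::setup, "start called twice");
        phase_ = phase::running;
    }

    // running -> finished; rejects a task that was never started.
    void end() {
        require(phase::running, "end called on a task that is not running");
        phase_ = phase::finished;
    }

    phase current_phase() const { return phase_; }

private:
    // Assumption for this sketch: misuse is reported by throwing.
    void require(phase expected, const char* msg) const {
        if (phase_ != expected) {
            throw std::logic_error(msg);
        }
    }

    phase phase_ = phase::setup;
    std::string symbol_;
};

// Walks the legal lifecycle once and returns the final phase.
inline phase run_lifecycle() {
    toy_task t;
    t.set_symbol("demo");
    t.start();
    assert(t.current_phase() == phase::running);
    t.end();
    return t.current_phase();
}
```

In this sketch, calling `set_symbol` after `start`, or `end` before `start`, is rejected, which matches the two examples of inappropriate calls given in the description.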
Public Functions
-
inline stream_task(backend_ctx_untyped ctx_, exec_place e_place = exec_place::current_device())
-
stream_task(const stream_task<>&) = default
-
stream_task &operator=(const stream_task<>&) = default
-
~stream_task() = default
-
inline cudaStream_t get_stream()
-
inline cudaStream_t get_stream(size_t pos)
-
inline stream_task &set_stream(cudaStream_t s)
-
inline stream_task &start()
-
inline void unset_current_place()
-
inline const exec_place &get_current_place()
-
inline stream_task &end_uncleared()
-
inline stream_task &end()
-
template<typename Fun>
inline void operator->*(Fun &&fun)
Run lambda function on the specified device.
The lambda must accept exactly one argument. If the type of the lambda’s argument is one of stream_task<>, stream_task<>&, auto, auto&, or auto&&, then *this is passed to the lambda. Otherwise, this->get_stream() is passed to the lambda. Dependencies need to be accessed separately.
- Template Parameters
Fun – Type of lambda
- Parameters
fun – Lambda function taking either a stream_task<> or a cudaStream_t as the only argument
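The argument-type dispatch described for operator->* can be sketched with a compile-time check. The types below are stand-ins invented for this example: `fake_stream` plays the role of cudaStream_t and `toy_task` the role of stream_task<>. Selecting the branch with `std::is_invocable_v` (C++17) is an assumption about one way to get the documented behaviour, not a statement about how the library implements it.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for cudaStream_t.
struct fake_stream {
    int id;
};

// Hypothetical stand-in for stream_task<>.
class toy_task {
public:
    explicit toy_task(int stream_id) : stream_{stream_id} {}

    fake_stream get_stream() const { return stream_; }

    // If the callable accepts the task itself, pass *this;
    // otherwise pass the task's stream.
    template <typename Fun>
    void operator->*(Fun&& fun) {
        if constexpr (std::is_invocable_v<Fun, toy_task&>) {
            std::forward<Fun>(fun)(*this);
        } else {
            std::forward<Fun>(fun)(get_stream());
        }
    }

private:
    fake_stream stream_;
};
```

With this sketch, `t->*[&](auto& self) { ... }` receives the task, while `t->*[&](fake_stream s) { ... }` receives the stream, because a `toy_task` is not convertible to a `fake_stream`.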
-
inline void populate_deps_scheduling_info() const
-
inline bool schedule_task()
Use the scheduler to assign a device to this task.
- Returns
true if the task’s time needs to be recorded
-
inline explicit operator bool() const
-
inline const ::std::string &get_symbol() const
Get the string attached to the task for debugging purposes.
-
inline void set_symbol(::std::string new_symbol)
Attach a string to this task, which can be useful for debugging purposes, or in tracing tools.
-
inline void add_dep(task_dep_untyped d)
Add one dependency.
-
inline void add_deps(task_dep_vector_untyped input_deps)
Add a set of dependencies.
-
template<typename ...Pack>
inline void add_deps(task_dep_untyped first, Pack&&... pack)
Add a set of dependencies.
-
inline const task_dep_vector_untyped &get_task_deps() const
Get the dependencies of the task.
-
inline task &on(exec_place p)
Specify where the task should run.
-
inline const exec_place &get_exec_place() const
Get and set the execution place of the task.
-
inline exec_place &get_exec_place()
-
inline void set_exec_place(const exec_place &place)
-
inline const data_place &get_affine_data_place() const
Get and set the affine data place of the task.
-
inline void set_affine_data_place(data_place affine_data_place)
-
inline const event_list &get_done_prereqs() const
Get the list of events which mean that the task was executed.
-
template<typename T>
inline void merge_event_list(T &&tail)
Add an event list to the list of events which mean that the task was executed.
-
inline instance_id_t find_data_instance_id(const logical_data_untyped &d) const
Get the identifier of a data instance used by a task.
We here find the instance id used by a given piece of data in a task. Note that this incurs a certain overhead because it searches through the list of logical data in the task.
-
template<typename T, typename logical_data_untyped = logical_data_untyped>
decltype(auto) get(size_t submitted_index) const
Generic method to retrieve the data instance associated to an index in a task.
If T is the exact type stored, this returns a reference to a valid data instance in the task. If T is constify<U>, where U is the type stored, this returns an rvalue of type T.
Calling this outside the start()/end() section will result in undefined behaviour.
Remark
One should not forget the “template” keyword when using this API with a task t:
T &res = t.template get<T>(index);
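The “template” keyword is required when the task’s type depends on a template parameter, because the compiler would otherwise parse the `<` after get as a less-than operator. The sketch below shows this with a stand-in class; `toy_task`, `add`, and `fetch` are hypothetical names for this example, and the `void*` storage is an assumption made only to keep it self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a task holding type-erased data instances.
class toy_task {
public:
    void add(void* instance) { slots_.push_back(instance); }

    // The caller is responsible for naming the correct type T.
    template <typename T>
    T& get(std::size_t index) const {
        return *static_cast<T*>(slots_.at(index));
    }

private:
    std::vector<void*> slots_;
};

// Task is a template parameter, so t.get is a dependent name and
// the call must be spelled with the `template` keyword.
template <typename T, typename Task>
T& fetch(Task& t, std::size_t index) {
    T& res = t.template get<T>(index);
    return res;
}
```

When the task has a concrete, non-dependent type, a plain `t.get<T>(index)` compiles as well; the keyword only matters inside templates and generic lambdas.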
-
inline void set_input_events(event_list _input_events)
-
inline const event_list &get_input_events() const
-
inline int get_unique_id() const
-
inline int get_mapping_id() const
-
inline size_t hash() const
-
inline void add_post_submission_hook(::std::vector<::std::function<void()>> &hooks)
-
inline event_list acquire(backend_ctx_untyped &ctx)
Start a task.
Acquires necessary resources and dependencies for a task to run.
SUBMIT = acquire + release at the same time …
This function prepares a task for execution by setting up its execution context, sorting its dependencies to avoid deadlocks, and ensuring all necessary data dependencies are fulfilled. It handles both small and large tasks by checking the task size and adjusting its behavior accordingly. Dependencies are processed to mark data usage, allocate necessary resources, and update data instances for task execution. This function also handles the task’s transition from the setup phase to the running phase.
Note
The function EXPECTs the task to be in the setup phase and the execution place not to be exec_place::device_auto.
Note
Dependencies are sorted by logical data addresses to prevent deadlocks.
Note
For tasks with multiple dependencies on the same logical data, only one instance of the data is used, and its access mode is determined by combining the access modes of all dependencies on that data.
- Parameters
ctx – The backend context in which the task is executed. This context contains the execution stack and other execution-related information.
tsk – The task to be prepared for execution. The task must be in the setup phase before calling this function.
- Returns
An event_list containing all the input events and any additional events generated during the acquisition of dependencies. This list represents the prerequisites for the task to start execution.
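The two notes about dependency ordering and duplicate dependencies can be sketched as a sort followed by a merge. Everything here is an assumption for illustration: `toy_dep`, `normalize_deps`, and the bit-flag encoding of access modes (combined with bitwise OR) are hypothetical and are not taken from the library.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical access-mode flags; assumption: modes combine with bitwise OR.
enum access_mode : unsigned { none = 0, read = 1, write = 2, rw = 3 };

// Hypothetical dependency record: an address standing in for the
// logical data, plus the requested access mode.
struct toy_dep {
    const void* data;
    unsigned mode;
};

// Sort by address so every task visits shared data in the same order,
// then fold duplicate entries into one with the combined access mode.
inline std::vector<toy_dep> normalize_deps(std::vector<toy_dep> deps) {
    std::sort(deps.begin(), deps.end(), [](const toy_dep& a, const toy_dep& b) {
        return std::less<const void*>{}(a.data, b.data);
    });

    std::vector<toy_dep> merged;
    for (const toy_dep& d : deps) {
        if (!merged.empty() && merged.back().data == d.data) {
            merged.back().mode |= d.mode;  // same data: combine modes
        } else {
            merged.push_back(d);
        }
    }
    return merged;
}
```

Taking locks in one global order (here, address order) is a standard way to avoid two tasks each holding a lock the other is waiting for, which is the deadlock the note refers to.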
-
inline void release(backend_ctx_untyped &ctx, event_list &done_prereqs)
Releases resources associated with a task and transitions it to the finished phase.
This function releases a task after it has completed its execution. It merges the list of prerequisites (events) that are marked as done, updates the dependencies for the task’s logical data, resets the execution context to its original configuration, and marks the task as finished.
After calling this function, the task is considered “over” and is transitioned from the running phase to the finished phase. All associated resources are unlocked and post-submission hooks (if any) are executed.
The function performs the following actions:
Merges the provided list of done_prereqs into the task’s list of prerequisites.
Updates logical data dependencies based on the access mode (read or write).
Ensures proper synchronization by setting reader/writer prerequisites on the logical data.
Updates internal structures to reflect that the task has become a new “leaf task.”
Resets the execution context (device, SM affinity, etc.) to its previous state.
Unlocks mutexes for the logical data that were locked during task execution.
Releases any references to logical data, preventing potential memory leaks.
Executes any post-submission hooks attached to the task.
The function also interacts with tracing and debugging tools, marking the task’s completion and declaring the task as a prerequisite for future tasks in the trace.
Note
After calling this function, the task is no longer in the running phase and cannot be modified.
Warning
The task must have completed all its work before calling this function. Failure to follow the task’s lifecycle correctly may lead to undefined behavior.
- Parameters
ctx – The context of the backend, which manages the execution environment.
done_prereqs – A list of events that must be marked as complete before the task can be released.
- Pre
The task must be in the running phase.
- Pre
The task’s list of prerequisites (dependencies) must be empty at the time of calling.
-
inline void clear()