cuda::experimental::stf::graph_task<>
Defined in include/cuda/experimental/__stf/graph/graph_task.cuh
-
template<>
class graph_task<> : public cuda::experimental::stf::task
This describes an untyped task within a CUDA graph.
A graph task is implemented as a child graph of the graph associated with the context. The body of the task lives in the child graph; CUDASTF adds the dependencies to and from that child graph, along with extra nodes that implement, for example, data transfers or allocations.
For graph tasks generated automatically by CUDASTF that consist of a single CUDA graph node, the node may be used directly rather than embedded in a child graph.
Public Types
-
enum class phase
Current task status.
We keep track of the status of the task so that we do not make API calls at an inappropriate time, such as setting the symbol once the task has already started, or releasing a task that has not started yet.
Values:
-
enumerator setup
-
enumerator running
-
enumerator finished
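As an illustration of why the phase is tracked, the following is a minimal sketch of a lifecycle guard. The class toy_task and its exception-based checks are hypothetical and are not the CUDASTF implementation; they only show how a setup/running/finished state can reject calls made at the wrong time.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a task that tracks its lifecycle phase.
// This is NOT the CUDASTF implementation, only an illustration.
class toy_task {
public:
  enum class phase { setup, running, finished };

  // Setting the symbol is only allowed before the task starts.
  void set_symbol(std::string s) {
    if (phase_ != phase::setup) {
      throw std::logic_error("set_symbol called after the task started");
    }
    symbol_ = std::move(s);
  }

  // Moves the task from setup to running.
  void start() {
    if (phase_ != phase::setup) {
      throw std::logic_error("start called outside the setup phase");
    }
    phase_ = phase::running;
  }

  // Ending a task that never started is rejected.
  void end() {
    if (phase_ != phase::running) {
      throw std::logic_error("end called on a task that is not running");
    }
    phase_ = phase::finished;
  }

  phase get_phase() const { return phase_; }
  const std::string& get_symbol() const { return symbol_; }

private:
  phase phase_ = phase::setup;
  std::string symbol_;
};
```

In this sketch, calling set_symbol after start(), or end() before start(), throws instead of silently corrupting state.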
Public Functions
-
graph_task() = delete
-
inline graph_task(backend_ctx_untyped ctx, cudaGraph_t g, size_t epoch, exec_place e_place = exec_place::current_device())
-
graph_task(graph_task&&) = default
-
graph_task &operator=(graph_task&&) = default
-
graph_task(graph_task&) = default
-
graph_task(const graph_task&) = default
-
graph_task &operator=(const graph_task&) = default
-
inline graph_task &start()
-
inline graph_task &end_uncleared()
-
inline graph_task &end()
-
inline void populate_deps_scheduling_info() const
-
inline bool schedule_task()
Use the scheduler to assign a device to this task.
- Returns
true if the task’s time needs to be recorded
-
template<typename Fun>
inline void operator->*(Fun &&f)
Invokes a lambda that takes either a cudaStream_t or a cudaGraph_t.
Dependencies must be set with add_deps manually before this call.
- Template Parameters
Fun – Type of lambda to call, must accept either a cudaStream_t or a cudaGraph_t as sole argument
- Parameters
f – lambda function to call
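One way to accept a callable that takes either of two handle types is to test its invocability at compile time. The sketch below assumes C++17 and uses the hypothetical stand-ins fake_stream and fake_graph so that it compiles without the CUDA runtime; invoke_body is also a made-up name, and this is not necessarily how CUDASTF performs the dispatch.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins so the sketch compiles without the CUDA runtime.
struct fake_stream { int id; };
struct fake_graph  { int id; };

// Calls f with whichever handle it accepts. Returns 's' or 'g' to show
// which branch was taken.
template <typename Fun>
char invoke_body(Fun&& f, fake_stream s, fake_graph g) {
  if constexpr (std::is_invocable_v<Fun, fake_stream>) {
    std::forward<Fun>(f)(s);
    return 's';
  } else {
    static_assert(std::is_invocable_v<Fun, fake_graph>,
                  "callable must accept a stream or a graph");
    std::forward<Fun>(f)(g);
    return 'g';
  }
}
```

A lambda taking fake_stream selects the first branch, a lambda taking fake_graph selects the second, and any other signature fails to compile with the static_assert message.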
-
inline cudaGraph_t &get_graph()
-
inline cudaGraphNode_t &get_node()
-
inline cudaGraph_t &get_ctx_graph()
-
inline void unset_current_place()
-
inline const exec_place &get_current_place() const
-
inline explicit operator bool() const
-
inline const ::std::string &get_symbol() const
Get the string attached to the task for debugging purposes.
-
inline void set_symbol(::std::string new_symbol)
Attach a string to this task, which can be useful for debugging purposes, or in tracing tools.
-
inline void add_dep(task_dep_untyped d)
Add one dependency.
-
inline void add_deps(task_dep_vector_untyped input_deps)
Add a set of dependencies.
-
template<typename ...Pack>
inline void add_deps(task_dep_untyped first, Pack&&... pack)
Add a set of dependencies.
-
inline const task_dep_vector_untyped &get_task_deps() const
Get the dependencies of the task.
-
inline task &on(exec_place p)
Specify where the task should run.
-
inline const exec_place &get_exec_place() const
Get and set the execution place of the task.
-
inline exec_place &get_exec_place()
-
inline void set_exec_place(const exec_place &place)
-
inline const data_place &get_affine_data_place() const
Get and set the affine data place of the task.
-
inline void set_affine_data_place(data_place affine_data_place)
-
inline const event_list &get_done_prereqs() const
Get the list of events which mean that the task was executed.
-
template<typename T>
inline void merge_event_list(T &&tail)
Add an event list to the list of events which mean that the task was executed.
-
inline instance_id_t find_data_instance_id(const logical_data_untyped &d) const
Get the identifier of a data instance used by a task.
This finds the instance id used by a given piece of data in a task. Note that it incurs some overhead because it searches through the list of logical data in the task.
-
template<typename T, typename logical_data_untyped = logical_data_untyped>
decltype(auto) get(size_t submitted_index) const
Generic method to retrieve the data instance associated with an index in a task.
If T is the exact type stored, this returns a reference to a valid data instance in the task. If T is constify<U>, where U is the type stored, this returns an rvalue of type T.
Calling this outside the start()/end() section will result in undefined behaviour.
Remark
One should not forget the “template” keyword when using this API with a task t:
T &res = t.template get<T>(index);
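The “template” keyword is required whenever the task type itself depends on a template parameter. The following self-contained sketch shows the rule with a hypothetical slot_task type whose get<T> only loosely mirrors the shape of the real method (it returns by value and stores plain ints, which is an assumption made for brevity).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy task: holds ints and hands them back through a member
// template, loosely mirroring the shape of get<T>(index).
struct slot_task {
  std::vector<int> slots;

  template <typename T>
  T get(std::size_t index) const {
    return static_cast<T>(slots[index]);
  }
};

// Task is a template parameter, so t.get is a dependent name: without the
// "template" keyword the "<" would be parsed as a less-than operator.
template <typename Task>
long first_slot(const Task& t) {
  return t.template get<long>(0);
}
```

When the task type is concrete rather than dependent, as in a plain function taking slot_task, writing t.get<long>(0) without the keyword is sufficient.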
-
inline void set_input_events(event_list _input_events)
-
inline const event_list &get_input_events() const
-
inline int get_unique_id() const
-
inline int get_mapping_id() const
-
inline size_t hash() const
-
inline void add_post_submission_hook(::std::vector<::std::function<void()>> &hooks)
-
inline event_list acquire(backend_ctx_untyped &ctx)
Start a task.
Acquires necessary resources and dependencies for a task to run.
SUBMIT = acquire + release at the same time …
This function prepares a task for execution by setting up its execution context, sorting its dependencies to avoid deadlocks, and ensuring all necessary data dependencies are fulfilled. It handles both small and large tasks by checking the task size and adjusting its behavior accordingly. Dependencies are processed to mark data usage, allocate necessary resources, and update data instances for task execution. This function also handles the task’s transition from the setup phase to the running phase.
Note
The function EXPECTs the task to be in the setup phase and the execution place not to be exec_place::device_auto.
Note
Dependencies are sorted by logical data addresses to prevent deadlocks.
Note
For tasks with multiple dependencies on the same logical data, only one instance of the data is used, and its access mode is determined by combining the access modes of all dependencies on that data.
- Parameters
ctx – The backend context in which the task is executed. This context contains the execution stack and other execution-related information.
tsk – The task to be prepared for execution. The task must be in the setup phase before calling this function.
- Returns
An event_list containing all the input events and any additional events generated during the acquisition of dependencies. This list represents the prerequisites for the task to start execution.
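The two notes about dependency ordering and duplicate dependencies can be sketched as follows. Sorting by address gives every task the same global acquisition order, so two tasks cannot each hold one piece of data while waiting for the other. The names toy_dep, normalize, and the mode_* bit flags are hypothetical, and combining modes with a bitwise OR is an assumption made for illustration, not a statement about the CUDASTF internals.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical access modes encoded as bit flags so they can be combined.
constexpr unsigned mode_read  = 1u;
constexpr unsigned mode_write = 2u;
constexpr unsigned mode_rw    = mode_read | mode_write;

// Hypothetical dependency: a pointer identifying the logical data plus a mode.
struct toy_dep {
  const void* data;
  unsigned mode;
};

// Orders dependencies by data address and merges entries that target the
// same data by OR-ing their access modes.
std::vector<toy_dep> normalize(std::vector<toy_dep> deps) {
  std::sort(deps.begin(), deps.end(), [](const toy_dep& a, const toy_dep& b) {
    // std::less yields a total order even for unrelated pointers.
    return std::less<const void*>{}(a.data, b.data);
  });
  std::vector<toy_dep> out;
  for (const toy_dep& d : deps) {
    if (!out.empty() && out.back().data == d.data) {
      out.back().mode |= d.mode;  // same data: combine the access modes
    } else {
      out.push_back(d);
    }
  }
  return out;
}
```

With this sketch, a task that lists the same data once as read and once as write ends up with a single read-write entry for it.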
-
inline void release(backend_ctx_untyped &ctx, event_list &done_prereqs)
Releases resources associated with a task and transitions it to the finished phase.
This function releases a task after it has completed its execution. It merges the list of prerequisites (events) that are marked as done, updates the dependencies for the task’s logical data, resets the execution context to its original configuration, and marks the task as finished.
After calling this function, the task is considered “over” and is transitioned from the running phase to the finished phase. All associated resources are unlocked and post-submission hooks (if any) are executed.
The function performs the following actions:
Merges the provided list of done_prereqs into the task’s list of prerequisites.
Updates logical data dependencies based on the access mode (read or write).
Ensures proper synchronization by setting reader/writer prerequisites on the logical data.
Updates internal structures to reflect that the task has become a new “leaf task.”
Resets the execution context (device, SM affinity, etc.) to its previous state.
Unlocks mutexes for the logical data that were locked during task execution.
Releases any references to logical data, preventing potential memory leaks.
Executes any post-submission hooks attached to the task.
The function also interacts with tracing and debugging tools, marking the task’s completion and declaring the task as a prerequisite for future tasks in the trace.
Note
After calling this function, the task is no longer in the running phase and cannot be modified.
Warning
The task must have completed all its work before calling this function. Failure to follow the task’s lifecycle correctly may lead to undefined behavior.
- Parameters
ctx – The context of the backend, which manages the execution environment.
done_prereqs – A list of events that must be marked as complete before the task can be released.
- Pre
The task must be in the running phase.
- Pre
The task’s list of prerequisites (dependencies) must be empty at the time of calling.
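A heavily simplified sketch of the release sequence described above: check the phase, merge the completion events, run the post-submission hooks, then mark the task finished. The class toy_release_task is hypothetical, events are modelled as plain integers, and the data-dependency, locking, and tracing steps are omitted entirely.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical miniature of the release sequence; events are plain ints here.
class toy_release_task {
public:
  enum class phase { setup, running, finished };

  void start() { phase_ = phase::running; }

  void add_post_submission_hook(std::function<void()> hook) {
    hooks_.push_back(std::move(hook));
  }

  // Merge the completion events, run the hooks, then mark the task finished.
  void release(const std::vector<int>& done_prereqs) {
    if (phase_ != phase::running) {
      throw std::logic_error("release requires the running phase");
    }
    done_.insert(done_.end(), done_prereqs.begin(), done_prereqs.end());
    for (auto& hook : hooks_) {
      hook();
    }
    phase_ = phase::finished;
  }

  phase get_phase() const { return phase_; }
  const std::vector<int>& get_done_prereqs() const { return done_; }

private:
  phase phase_ = phase::setup;
  std::vector<int> done_;
  std::vector<std::function<void()>> hooks_;
};
```

In this sketch a release attempted before start() throws and leaves the task untouched, which mirrors the precondition that the task must be in the running phase.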
-
inline void clear()
Protected Attributes
-
::std::shared_ptr<impl> pimpl