Template for reading and writing tiles of accumulators to shared memory.
#include <tile_iterator_tensor_op.h>
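The idea of reading and writing accumulator tiles through an iterator can be sketched on the host as follows. This is only an illustration under stated assumptions: the class name `AccumulatorTileIterator`, its template parameters, and its `store`/`load` members are hypothetical and are not the interface declared in `tile_iterator_tensor_op.h`. A plain array stands in for shared memory.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical illustration only: none of these names come from
// tile_iterator_tensor_op.h. An ordinary buffer stands in for shared memory.
template <typename Element, std::size_t kRows, std::size_t kColumns>
class AccumulatorTileIterator {
 public:
  // A fragment holds one tile of accumulators in row-major order.
  using Fragment = std::array<Element, kRows * kColumns>;

  // `base` points at the tile's top-left element; `stride` is the number
  // of elements between the starts of consecutive rows in the buffer.
  AccumulatorTileIterator(Element* base, std::size_t stride)
      : base_(base), stride_(stride) {
    assert(base != nullptr);
    assert(stride >= kColumns);
  }

  // Write one tile of accumulators into the buffer.
  void store(const Fragment& frag) {
    for (std::size_t r = 0; r < kRows; ++r) {
      for (std::size_t c = 0; c < kColumns; ++c) {
        base_[r * stride_ + c] = frag[r * kColumns + c];
      }
    }
  }

  // Read one tile of accumulators back from the buffer.
  void load(Fragment& frag) const {
    for (std::size_t r = 0; r < kRows; ++r) {
      for (std::size_t c = 0; c < kColumns; ++c) {
        frag[r * kColumns + c] = base_[r * stride_ + c];
      }
    }
  }

  // Advance to the adjacent tile on the right (kColumns elements over).
  AccumulatorTileIterator& operator++() {
    base_ += kColumns;
    return *this;
  }

 private:
  Element* base_;
  std::size_t stride_;
};
```

In this sketch the iterator owns only a pointer and a row stride, so storing a tile, advancing, and storing again places two tiles side by side in the same buffer.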