TT-Metal API 1.0 #13448
Replies: 6 comments 7 replies
-
A draft PR, #13440, was opened to highlight where we currently perceive the boundaries of our existing API.
-
I filed a related issue today.
-
We would like to invite early feedback on the direction of our proposed changes for Metal API 1.0.
-
All, due to resource constraints and prioritization, we have moved the schedule for "Tag stable main with Metal-API-V0" out to November 20th, along with the tasks that follow it. Please let me know if this impacts any dependent schedules.
-
All, we have recently made the last backwards-compatible change on Metal API V0 and have identified v0.53.0 as the last release. This means that commits after v0.53.0 risk breaking changes for any libraries external to this repository. Subsequent changes will be made against 1.0-beta, and this tag will be set up in the near future. The "beta" label indicates that Metalium may have breaking changes until we stabilize with a new release candidate.

At this time we would like to communicate some of the design decisions that will be under review in an upcoming PR in which the V0 functionality is intended to be removed. The design decisions below are subject to change but should give you an opportunity to review and comment. This post will be updated to reference the final decisions once we finalize the API.

Design Decisions for Metal API 1.0
Summary
This document outlines the proposed changes for the upcoming API for the tt-metal library. The goal is to establish a set of design principles and decisions that enhance the API's usability, flexibility, and maintainability.
Design Principles
1. Opaque Types
Decision Summary
Implement the API using opaque types (handles) instead of transparent types or object methods.
Motivation
Alternatives Considered
Rationale
Implications
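To make decision 1 concrete, here is a minimal sketch of a handle-based API. Every name in it (the sketch namespace, BufferHandle, GetSize, IsLive, the vector-backed registry) is hypothetical and chosen for illustration; it is not the tt-metal implementation. The caller only ever holds a small copyable handle, and all state stays behind free functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Opaque handle: callers can copy and store it, but it exposes no state.
struct BufferHandle {
    std::uint32_t id;
};

namespace detail {
// Internal representation, free to change without touching the public API.
struct BufferState {
    std::size_t size_bytes;
    bool live;
};

inline std::vector<BufferState>& registry() {
    static std::vector<BufferState> buffers;
    return buffers;
}
}  // namespace detail

// Free functions form the entire public surface (hypothetical signatures).
inline BufferHandle CreateBuffer(std::size_t size_bytes) {
    detail::registry().push_back({size_bytes, true});
    return BufferHandle{static_cast<std::uint32_t>(detail::registry().size() - 1)};
}

inline std::size_t GetSize(BufferHandle handle) {
    assert(handle.id < detail::registry().size());
    return detail::registry()[handle.id].size_bytes;
}

inline bool IsLive(BufferHandle handle) {
    assert(handle.id < detail::registry().size());
    return detail::registry()[handle.id].live;
}

inline void DeallocateBuffer(BufferHandle handle) {
    assert(handle.id < detail::registry().size());
    detail::registry()[handle.id].live = false;
}

}  // namespace sketch
```

Because BufferState never appears in a public signature, its layout can change without breaking code that was written against the handle.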
2. SlotMap for Resource Storage
Decision Summary
Use a SlotMap as a centralized, homogeneous container for storing opaque types.
Motivation
Alternatives Considered
Rationale
Implications
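For readers unfamiliar with the term in decision 2, a slot map is commonly built as a vector of slots plus a generation counter per slot, so that a key to an erased entry can be detected as stale. The sketch below assumes that common design; the Key layout and the insert/get/erase names are illustrative and may not match the SlotMap the proposal has in mind.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

namespace sketch {

// A key is an index plus the generation of the slot when the key was issued.
struct Key {
    std::uint32_t index;
    std::uint32_t generation;
};

template <typename T>
class SlotMap {
public:
    Key insert(T value) {
        if (!free_.empty()) {
            // Reuse a previously erased slot; its generation was already bumped.
            std::uint32_t index = free_.back();
            free_.pop_back();
            slots_[index].value = std::move(value);
            return Key{index, slots_[index].generation};
        }
        slots_.push_back(Slot{std::move(value), 0});
        return Key{static_cast<std::uint32_t>(slots_.size() - 1), 0};
    }

    // Returns nullptr for unknown or stale keys.
    T* get(Key key) {
        if (key.index >= slots_.size()) return nullptr;
        Slot& slot = slots_[key.index];
        if (slot.generation != key.generation || !slot.value.has_value()) return nullptr;
        return &*slot.value;
    }

    bool erase(Key key) {
        if (get(key) == nullptr) return false;
        Slot& slot = slots_[key.index];
        slot.value.reset();
        ++slot.generation;  // invalidates every key issued for the old value
        free_.push_back(key.index);
        return true;
    }

private:
    struct Slot {
        std::optional<T> value;
        std::uint32_t generation;
    };
    std::vector<Slot> slots_;
    std::vector<std::uint32_t> free_;
};

}  // namespace sketch
```

The generation check is what lets an opaque handle fail safely after its resource has been released and the slot reused.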
3. Use of std::variant for Polymorphism
Decision Summary
Utilize std::variant (e.g., SlotMap<Key, std::variant<FooA, FooB, ...>>) for value-based polymorphism.
Motivation
Alternatives Considered
Rationale
Implications
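As a small illustration of decision 3, the sketch below dispatches over a std::variant with std::visit instead of virtual functions. FooA and FooB are the placeholder names from the decision summary; their members, the Overloaded helper, and Describe are assumptions made for the example. It requires C++17.

```cpp
#include <cassert>
#include <string>
#include <variant>

namespace sketch {

// Placeholder alternatives; the members are invented for this example.
struct FooA { int value; };
struct FooB { std::string name; };

using Foo = std::variant<FooA, FooB>;

// Common C++17 helper that merges several lambdas into one visitor.
template <typename... Fs>
struct Overloaded : Fs... { using Fs::operator()...; };
template <typename... Fs>
Overloaded(Fs...) -> Overloaded<Fs...>;

// Dispatch happens on the stored alternative, with no base class or vtable.
inline std::string Describe(const Foo& foo) {
    return std::visit(
        Overloaded{
            [](const FooA& a) { return "FooA(" + std::to_string(a.value) + ")"; },
            [](const FooB& b) { return "FooB(" + b.name + ")"; }
        },
        foo);
}

}  // namespace sketch
```

If a new alternative is added to Foo and no lambda handles it, the std::visit call stops compiling, which is one of the usual arguments for this style.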
4. Built-in Thread Safety in DevicePool and Similar Components
Decision Summary
Implement built-in thread safety using atomics and wait-free mechanisms for read-only access.
Motivation
Alternatives Considered
Rationale
Implications
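One way to read decision 4 is that lookups should cost a single atomic load. The sketch below assumes a fixed-capacity pool of atomic pointers; the Device struct, the Publish/Get names, and the capacity parameter are hypothetical and are not taken from the actual DevicePool. Reads are wait-free only on platforms where std::atomic of a pointer is lock-free, which holds on mainstream targets.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

namespace sketch {

struct Device { int id; };  // stand-in for real device state

template <std::size_t Capacity>
class DevicePool {
public:
    // Release ordering makes the Device contents written before this call
    // visible to any reader that later observes the pointer.
    void Publish(std::size_t slot, const Device* device) {
        assert(slot < Capacity);
        slots_[slot].store(device, std::memory_order_release);
    }

    // Read path: one atomic load, no lock and no retry loop.
    const Device* Get(std::size_t slot) const {
        assert(slot < Capacity);
        return slots_[slot].load(std::memory_order_acquire);
    }

private:
    // Value-initialized, so every slot starts out as nullptr.
    std::array<std::atomic<const Device*>, Capacity> slots_{};
};

}  // namespace sketch
```

Any number of threads can call Get concurrently with a Publish on the same slot; each reader sees either nullptr or a fully constructed Device. Reclaiming a published Device safely is a separate problem that this sketch does not address.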
5. Use of tt::stl::Span for Contiguous Data Passing
Decision Summary
Utilize tt::stl::Span to pass contiguous data without additional allocations.
Motivation
Alternatives Considered
Rationale
Implications
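To show what decision 5 buys at a call site, the sketch below defines a minimal span-like view and a function that accepts it. This Span is a hand-written stand-in, and the real tt::stl::Span interface may differ; Sum is likewise a hypothetical consumer. The point is that one signature accepts a vector, an array, or a sub-range without copying the elements.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Minimal non-owning view over contiguous elements (illustrative stand-in).
template <typename T>
class Span {
public:
    constexpr Span(T* data, std::size_t size) : data_(data), size_(size) {}

    // Accepts any lvalue container exposing data() and size().
    // Taking Container& rejects temporaries, which would dangle.
    template <typename Container>
    constexpr Span(Container& c) : data_(c.data()), size_(c.size()) {}

    constexpr T* data() const { return data_; }
    constexpr std::size_t size() const { return size_; }
    constexpr T* begin() const { return data_; }
    constexpr T* end() const { return data_ + size_; }
    constexpr T& operator[](std::size_t i) const { return data_[i]; }

private:
    T* data_;
    std::size_t size_;
};

// The callee sees pointer + length only; no allocation or copy takes place.
inline std::uint64_t Sum(Span<const std::uint32_t> values) {
    std::uint64_t total = 0;
    for (std::uint32_t v : values) total += v;
    return total;
}

}  // namespace sketch
```

A caller can pass a std::vector or std::array directly, or build a view over part of a buffer with the pointer-and-length constructor.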
6. Use of tt::stl::AnyRange for Non-Contiguous Ranges
Decision Summary
Implement tt::stl::AnyRange to handle non-contiguous, type-erased ranges in the API, both owning and non-owning.
Motivation
Alternatives Considered
Rationale
Implications
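The idea behind decision 6 can be sketched with std::function-based type erasure. The class below is a simplified illustration, not the tt::stl::AnyRange design: the View/Own/ForEach names are hypothetical, and a production version would more likely expose iterators and avoid the heap allocation used here. It shows how one parameter type can accept a std::list or a std::set, either borrowed or owned.

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <memory>
#include <set>
#include <utility>

namespace sketch {

template <typename T>
class AnyRange {
public:
    using Visitor = std::function<void(const T&)>;

    // Non-owning: stores a pointer, so the caller must keep the range alive.
    template <typename Range>
    static AnyRange View(const Range& range) {
        const Range* ptr = &range;
        return AnyRange([ptr](const Visitor& visit) {
            for (const T& item : *ptr) visit(item);
        });
    }

    // Owning: moves the range into storage kept alive by the AnyRange itself.
    template <typename Range>
    static AnyRange Own(Range range) {
        auto owned = std::make_shared<const Range>(std::move(range));
        return AnyRange([owned](const Visitor& visit) {
            for (const T& item : *owned) visit(item);
        });
    }

    void ForEach(const Visitor& visit) const { iterate_(visit); }

private:
    explicit AnyRange(std::function<void(const Visitor&)> iterate)
        : iterate_(std::move(iterate)) {}

    // The concrete range type is erased behind this callable.
    std::function<void(const Visitor&)> iterate_;
};

// The API function no longer needs to be a template over the container type.
inline int Sum(const AnyRange<int>& range) {
    int total = 0;
    range.ForEach([&total](const int& v) { total += v; });
    return total;
}

}  // namespace sketch
```

Note that View also binds to temporaries, which would leave a dangling pointer; a real implementation would need to guard against that.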
7. Use of tt::tt_metal::Scoped for RAII handles
Decision Summary
Implement tt::tt_metal::Scoped to provide RAII move-only objects around resource handles that will call the corresponding de-initialization function during destruction.
Motivation
Alternatives Considered
Rationale
tt::tt_metal::Scoped implements a centralized and low-boilerplate mechanism for RAII around the Metal 1.0 API without precluding the ability to use the lowest-level abstraction directly, if desired.
Implications
To obtain unscoped resource handles from creation functions, use auto to deduce the type and manually call the de-initialization function to release:

    auto deviceHandle = tt::tt_metal::v1::CreateDevice(...);
    auto bufferHandle = tt::tt_metal::v1::CreateBuffer(...);
    ...
    tt::tt_metal::v1::CloseDevice(deviceHandle);
    tt::tt_metal::v1::DeallocateBuffer(bufferHandle);

To obtain scoped resource handles from creation functions, use tt::tt_metal::Scoped to deduce the type and allow the destructor to automatically call the de-initialization function:

    tt::tt_metal::v1::Scoped deviceHandle = tt::tt_metal::v1::CreateDevice(...);
    tt::tt_metal::v1::Scoped bufferHandle = tt::tt_metal::v1::CreateBuffer(...);
    ...
    // bufferHandle destructor will automatically call DeallocateBuffer
    // deviceHandle destructor will automatically call CloseDevice

A moved-from scoped handle will not call the de-initialization function when it is destructed, and no longer manages an underlying resource. A move-constructed or move-assigned scoped handle will call the de-initialization function to release the underlying resource unless it is later moved-from:

    tt::tt_metal::v1::Scoped deviceHandle = tt::tt_metal::v1::CreateDevice(...);
    auto otherDeviceHandle = std::move(deviceHandle);
    // deviceHandle no longer manages any underlying resource after this point
    ...
    // otherDeviceHandle destructor will automatically call CloseDevice
    // deviceHandle destructor does nothing

Move assignment to a scoped handle that is not yet moved-from will de-initialize its current underlying resource by calling the respective de-initialization function before assuming ownership of the other scoped handle's underlying resource:

    tt::tt_metal::v1::Scoped firstHandle = tt::tt_metal::v1::CreateDevice(...);
    tt::tt_metal::v1::Scoped secondHandle = tt::tt_metal::v1::CreateDevice(...);
    ...
    // secondHandle will call CloseDevice on its underlying resource
    secondHandle = std::move(firstHandle);
    // secondHandle now manages the underlying resource obtained from firstHandle

A scoped handle can be "released" to obtain its unscoped handle and opt out of automatic de-initialization, if desired. This aligns with the design of std::unique_ptr, which supports the same operation:

    tt::tt_metal::v1::Scoped scopedHandle = tt::tt_metal::v1::CreateDevice(...);
    auto unscopedHandle = scopedHandle.release();
    // equivalent to: auto unscopedHandle = tt::tt_metal::v1::CreateDevice(...);
    ...
    tt::tt_metal::v1::CloseDevice(unscopedHandle);
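The move, move-assignment, and release semantics described above can be reproduced in a small self-contained wrapper. The sketch below is one possible shape for such a type and is not the tt::tt_metal::Scoped implementation: DeviceHandle, the close_calls counter, and the Destroy overload used to pick the de-initialization function are all assumptions made so the example compiles on its own. It requires C++17.

```cpp
#include <cassert>
#include <optional>
#include <utility>

namespace sketch {

struct DeviceHandle { int id; };

inline int close_calls = 0;  // counts de-initializations for demonstration

inline DeviceHandle CreateDevice(int id) { return DeviceHandle{id}; }
inline void CloseDevice(DeviceHandle) { ++close_calls; }

// Hypothetical customization point: maps a handle type to its release function.
inline void Destroy(DeviceHandle handle) { CloseDevice(handle); }

template <typename Handle>
class Scoped {
public:
    // Non-explicit so that `Scoped h = CreateDevice(...)` deduces the type.
    Scoped(Handle handle) : handle_(handle) {}

    Scoped(const Scoped&) = delete;
    Scoped& operator=(const Scoped&) = delete;

    // Moving leaves the source empty, so its destructor does nothing.
    Scoped(Scoped&& other) noexcept
        : handle_(std::exchange(other.handle_, std::nullopt)) {}

    // Move assignment releases the current resource before taking the new one.
    Scoped& operator=(Scoped&& other) noexcept {
        if (this != &other) {
            reset();
            handle_ = std::exchange(other.handle_, std::nullopt);
        }
        return *this;
    }

    ~Scoped() { reset(); }

    // Gives up ownership; the caller becomes responsible for de-initialization.
    Handle release() {
        assert(handle_.has_value());
        Handle handle = *handle_;
        handle_.reset();
        return handle;
    }

private:
    void reset() {
        if (handle_.has_value()) {
            Destroy(*handle_);
            handle_.reset();
        }
    }

    std::optional<Handle> handle_;
};

}  // namespace sketch
```

Storing the handle in a std::optional keeps the "empty" state separate from any handle value, so no sentinel handle needs to be reserved.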
-
All, we are halting work on this effort until further notice.
-
TT-Metal Overview
TT-Metal is a low-level programming model with user-facing host APIs. This API has not evolved at the pace needed to support the changes within Metal. Therefore, it has become clear that an overhaul is required to support our long-term goals.
Current API Limitations
CoreCoord implementation
Impact
Schedule for Delivering Work
Overview
As we prepare for release 1.0, stabilizing the APIs is critical to ensure consistency and maintainability across the codebase.
The following changes are required:
The work will progress on an independent branch, metal-api-1.0, with periodic rebasing on the main branch.
Tentative Timeline
v0. Begin work on standardizing type names and making types opaque. Initiate new API structure specification.
main branch.
v0 functionality.
* Denotes impact to branches based on main rather than metal-api-1.0.
Work began October 2, 2024.