Why it matters
Safetensors, a secure format for storing and sharing machine learning model weights, has joined the PyTorch Foundation as a foundation-hosted project under the Linux Foundation. This move provides Safetensors with a vendor-neutral home, ensuring its governance and future development are community-driven rather than controlled by a single company.
Safetensors was created by Hugging Face to address the security risks of pickle-based formats, which can execute arbitrary code on load. Its simple design, a JSON header for metadata followed by raw tensor data, enables zero-copy and lazy loading, making it both efficient and secure. It has become the default format for model distribution on the Hugging Face Hub and beyond, used by tens of thousands of models.
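That layout is simple enough to demonstrate without any library: a file begins with an 8-byte little-endian header length, followed by a JSON header mapping tensor names to their dtype, shape, and byte offsets, then the raw tensor bytes. The sketch below (pure Python, standard library only; the file name and helper names are illustrative, not part of any official API) shows why lazy loading is cheap: reading one tensor only requires parsing the header and seeking to its byte range.

```python
import json
import struct

def save_safetensors(path, tensors):
    """Write a minimal Safetensors-style file: 8-byte little-endian
    header length, a JSON header, then the raw tensor bytes."""
    header = {}
    payload = bytearray()
    for name, (dtype, shape, raw) in tensors.items():
        begin = len(payload)
        payload.extend(raw)
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [begin, len(payload)]}
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        f.write(payload)

def load_tensor(path, name):
    """Lazily read a single tensor: parse the header, then seek
    straight to that tensor's byte range (no full-file read)."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
        begin, end = header[name]["data_offsets"]
        f.seek(8 + header_len + begin)
        return header[name], f.read(end - begin)

# Two float32 tensors, stored as raw little-endian bytes.
weights = {
    "linear.weight": ("F32", [2, 2], struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)),
    "linear.bias":   ("F32", [2],    struct.pack("<2f", 0.5, -0.5)),
}
save_safetensors("demo.safetensors", weights)
info, raw = load_tensor("demo.safetensors", "linear.bias")
print(info["shape"], struct.unpack("<2f", raw))  # [2] (0.5, -0.5)
```

Because nothing in the file is executable, parsing it can never run attacker-controlled code, which is the core security win over pickle.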
PyTorch Foundation's Role
- Vendor Neutrality: The PyTorch Foundation, under the Linux Foundation, now holds the trademark, repository, and governance of Safetensors.
- Community-Driven Progress: This structure ensures that the project's evolution reflects the broader ML community's needs, with more companies and contributors involved in its governance.
- Continued Leadership: Hugging Face's core maintainers, Luc and Daniel, will remain on the Technical Steering Committee, continuing to lead the project day-to-day.
Impact on Users and Contributors
- No Immediate Changes for Users: The format, APIs, and Hugging Face Hub integration remain the same. Existing models in Safetensors format will continue to function without disruption.
- Open Path for Contributors: The process for becoming a maintainer is now formally documented and open to anyone in the community.
- Stable Foundation for Organizations: Neutral governance under the Linux Foundation offers a stable, long-term, and community-driven foundation for organizations building on Safetensors.
Future Developments
Safetensors is poised for significant growth, with a roadmap that includes:
- PyTorch Core Integration: Collaboration with the PyTorch team to use Safetensors as a serialization system within PyTorch core.
- Device-Aware Loading: Enabling tensors to load directly onto accelerators (e.g., CUDA and ROCm devices), bypassing unnecessary CPU staging.
- Parallel Loading APIs: Building first-class APIs for Tensor Parallel and Pipeline Parallel loading, allowing each rank or pipeline stage to load only necessary weights.
- Quantization Support: Formalizing support for FP8, block-quantized formats (e.g., GPTQ, AWQ), and sub-byte integer types as the ecosystem evolves.
These advancements will be pursued in collaboration with other PyTorch Foundation projects.
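The parallel-loading item rests on the same property as lazy loading: because the header records each tensor's exact byte range, a rank can compute its shard's offsets and read only those bytes. The sketch below is a hypothetical illustration of that idea in pure Python (row-sharding a 2-D float32 tensor, even split assumed), not the planned API.

```python
import json
import struct

def read_header(f):
    """Parse a Safetensors-style header: 8-byte little-endian length,
    then JSON mapping tensor names to dtype/shape/data_offsets."""
    (n,) = struct.unpack("<Q", f.read(8))
    return json.loads(f.read(n)), 8 + n

def load_row_shard(path, name, rank, world_size, itemsize=4):
    """Read only this rank's contiguous row slice of a 2-D tensor,
    seeking straight to its byte range instead of loading the file."""
    with open(path, "rb") as f:
        header, data_start = read_header(f)
        info = header[name]
        rows, cols = info["shape"]
        rows_per_rank = rows // world_size  # assume rows divide evenly
        row_bytes = cols * itemsize
        begin = info["data_offsets"][0] + rank * rows_per_rank * row_bytes
        f.seek(data_start + begin)
        return f.read(rows_per_rank * row_bytes)

# Build a demo file holding one 4x2 float32 weight matrix (values 0..7).
raw = struct.pack("<8f", *range(8))
header = {"w": {"dtype": "F32", "shape": [4, 2], "data_offsets": [0, 32]}}
hb = json.dumps(header).encode("utf-8")
with open("shard_demo.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(hb)) + hb + raw)

# Rank 1 of 2 reads only rows 2-3, i.e. the values 4.0..7.0.
shard = load_row_shard("shard_demo.safetensors", "w", rank=1, world_size=2)
print(struct.unpack("<4f", shard))  # (4.0, 5.0, 6.0, 7.0)
```

In a real Tensor Parallel or Pipeline Parallel setup, each rank or stage would apply the same offset arithmetic to its own subset of tensors, so no process ever reads weights it will not use.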
Main takeaway
Safetensors joining the PyTorch Foundation marks a crucial step towards ensuring the long-term security, stability, and community-driven evolution of a widely adopted model weight format, fostering safer and more collaborative open-source machine learning development.