Transformers.js v3: WebGPU Support, New Models & Tasks, and More

Transformers.js v3 is a significant release that brings numerous improvements and additions to the library. In this article, we will explore the key features and changes in this release.

Highlights

The highlights of Transformers.js v3 include:

WebGPU support: Transformers.js v3 now supports WebGPU, a new web standard for accelerated graphics and compute. This enables developers to use the underlying system's GPU to carry out high-performance computations directly in the browser.
New quantization formats: The library now supports a wider range of quantization formats, including full-precision, half-precision, 8-bit, and 4-bit.
120 supported architectures: The total number of supported architectures has increased to 120, spanning a wide range of input modalities and tasks.
Over 1200 pre-converted models: The community has converted over 1200 models to be compatible with Transformers.js.
Node.js (ESM + CJS), Deno, and Bun compatibility: Transformers.js v3 is now compatible with the three most popular server-side JavaScript runtimes: Node.js, Deno, and Bun.
A new home on NPM and GitHub: Transformers.js will now be published under the official Hugging Face organization on NPM as @huggingface/transformers, and the repository has been moved to the official Hugging Face organization on GitHub.

WebGPU Support

WebGPU is a new web standard for accelerated graphics and compute. It enables developers to use the underlying system's GPU to carry out high-performance computations directly in the browser. This is particularly useful for machine learning workloads, which are dominated by highly parallel numerical operations.

To use WebGPU in Transformers.js, you can set the device parameter to 'webgpu' when loading a model. For example:

import { pipeline } from "@huggingface/transformers";

// Create a feature-extraction pipeline
const extractor = await pipeline(
  "feature-extraction",
  "mixedbread-ai/mxbai-embed-xsmall-v1",
  { device: "webgpu" },
);
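WebGPU is not available in every browser or runtime, so it can help to probe for it before requesting the "webgpu" device. The following is a minimal sketch, not part of the library: pickDevice is a hypothetical helper, and falling back to "wasm" assumes the WebAssembly backend is an acceptable substitute when no GPU adapter can be obtained.

```javascript
// Hypothetical helper (not part of Transformers.js): returns "webgpu" when a
// WebGPU adapter can be obtained, and "wasm" otherwise.
// `nav` defaults to the global navigator so the function can be tested with a stub.
async function pickDevice(nav = globalThis.navigator) {
  if (nav && nav.gpu) {
    try {
      // requestAdapter() resolves to null when no suitable GPU is available.
      const adapter = await nav.gpu.requestAdapter();
      if (adapter) return "webgpu";
    } catch {
      // Treat any failure while probing as "WebGPU unavailable".
    }
  }
  return "wasm";
}
```

The returned string can then be passed as the device option when creating a pipeline, for instance { device: await pickDevice() }.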

New Quantization Formats

Before Transformers.js v3, we used the quantized option to specify whether to use a quantized (q8) or full-precision (fp32) variant of the model. Now, we've added the ability to select from a much larger list with the dtype parameter.

The list of available quantizations depends on the model, but some common ones are:

Full-precision ("fp32")
Half-precision ("fp16")
8-bit ("q8", "int8", "uint8")
4-bit ("q4", "bnb4", "q4f16")
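To get a feel for what these bit widths mean in practice, the sketch below estimates the size of a model's weights from its parameter count. It is a rough approximation under stated assumptions: estimateWeightMB is a hypothetical helper, the 500-million-parameter figure is a round illustrative number, and real quantized files are somewhat larger because they also store scales and may keep some tensors at higher precision.

```javascript
// Nominal bits per weight for the dtype names listed above.
// Assumption: this ignores quantization scales, zero points, and any tensors
// that are kept at higher precision, so real files will be somewhat larger.
const BITS_PER_WEIGHT = {
  fp32: 32,
  fp16: 16,
  q8: 8,
  int8: 8,
  uint8: 8,
  q4: 4,
  bnb4: 4,
  q4f16: 4,
};

// Hypothetical helper: rough lower bound on weight size in megabytes (10^6 bytes).
function estimateWeightMB(numParams, dtype) {
  const bits = BITS_PER_WEIGHT[dtype];
  if (bits === undefined) throw new Error(`Unknown dtype: ${dtype}`);
  return (numParams * bits) / 8 / 1e6;
}

// Illustrative 500-million-parameter model (a round number, not a measurement).
for (const dtype of ["fp32", "fp16", "q8", "q4"]) {
  console.log(dtype, estimateWeightMB(500e6, dtype), "MB");
}
```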

For example:

import { pipeline } from "@huggingface/transformers";

// Create a text generation pipeline
const generator = await pipeline(
  "text-generation",
  "onnx-community/Qwen2.5-0.5B-Instruct",
  { dtype: "q4", device: "webgpu" },
);
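If you have existing code that passes the older quantized option, the mapping described above (quantized meaning q8, otherwise fp32) can be applied mechanically. The sketch below is one possible way to do that; toV3Options is a hypothetical helper, not a library function, and it assumes that an explicitly provided dtype should always take precedence.

```javascript
// Hypothetical migration helper (not part of Transformers.js): converts
// options that use the older `quantized` flag into options that use `dtype`.
function toV3Options(v2Options = {}) {
  const { quantized, ...rest } = v2Options;
  // Nothing to translate, or the caller already chose a dtype explicitly.
  if (quantized === undefined || "dtype" in rest) return rest;
  // Per the description above: quantized -> "q8", full precision -> "fp32".
  return { ...rest, dtype: quantized ? "q8" : "fp32" };
}

console.log(toV3Options({ quantized: true }));
console.log(toV3Options({ quantized: false, device: "webgpu" }));
```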

Per-Module Dtypes

Some encoder-decoder models, like Whisper or Florence-2, are extremely sensitive to quantization settings, especially those of the encoder. For this reason, we added the ability to select per-module dtypes, which can be done by providing a mapping from module name to dtype.

For example:

import { Florence2ForConditionalGeneration } from "@huggingface/transformers";

const model = await Florence2ForConditionalGeneration.from_pretrained(
  "onnx-community/Florence-2-base-ft",
  {
    dtype: {
      embed_tokens: "fp16",
      vision_encoder: "fp16",
      encoder_model: "q4",
      decoder_model_merged: "q4",
    },
    device: "webgpu",
  },
);
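When most modules share one dtype and only a few need higher precision, the mapping can be built from a default plus overrides. This is a small sketch under assumptions: buildDtypeMap is a hypothetical helper, and the module names are simply the ones used in the Florence-2 example above (other models will expose different names).

```javascript
// Hypothetical helper (not part of Transformers.js): builds a per-module dtype
// mapping from a default dtype plus overrides for selected modules.
function buildDtypeMap(moduleNames, defaultDtype, overrides = {}) {
  // Reject overrides for modules that are not in the list, to catch typos early.
  for (const name of Object.keys(overrides)) {
    if (!moduleNames.includes(name)) {
      throw new Error(`Unknown module name: ${name}`);
    }
  }
  return Object.fromEntries(
    moduleNames.map((name) => [name, overrides[name] ?? defaultDtype]),
  );
}

// Module names taken from the Florence-2 example above.
const florence2Modules = [
  "embed_tokens",
  "vision_encoder",
  "encoder_model",
  "decoder_model_merged",
];

// Keep the quantization-sensitive modules at fp16 and use q4 everywhere else.
const dtypeMap = buildDtypeMap(florence2Modules, "q4", {
  embed_tokens: "fp16",
  vision_encoder: "fp16",
});
console.log(dtypeMap);
```

The resulting object has the same shape as the dtype mapping passed to from_pretrained above.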

120 Supported Architectures

This release increases the total number of supported architectures to 120 (see full list), spanning a wide range of input modalities and tasks. Notable new names include:

Phi-3
Gemma & Gemma 2
LLaVa
Moondream
Florence-2
MusicGen
Sapiens
ViTMAE
ViTMSN

Example Projects and Templates

As part of the release, we've published 25 new example projects and templates, primarily focused on showcasing WebGPU support! This includes demos like Phi-3.5 WebGPU and Whisper WebGPU.

We're in the process of moving all our example projects and demos to https://github.com/huggingface/transformers.js-examples, so stay tuned for updates on this!

Over 1200 Pre-Converted Models

As of today's release, the community has converted over 1200 models to be compatible with Transformers.js! You can find the full list of available models on the Hugging Face Hub by filtering for the transformers.js library.

If you'd like to convert your own models or fine-tunes, you can use our conversion script as follows:

python -m scripts.convert --quantize --model_id <model_name_or_path>

After uploading the generated files to the Hugging Face Hub, remember to add the transformers.js tag so others can easily find and use your model!

Node.js (ESM + CJS), Deno, and Bun Compatibility

Transformers.js v3 is now compatible with the three most popular server-side JavaScript runtimes:

Node.js: A widely-used JavaScript runtime built on Chrome's V8. It has a large ecosystem and supports a wide range of libraries and frameworks.
Deno: A modern runtime for JavaScript and TypeScript that is secure by default. It uses ES modules and even features experimental WebGPU support.
Bun: A fast JavaScript runtime optimized for performance. It features a built-in bundler, transpiler, and package manager.
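Code that runs across all three runtimes sometimes needs to know which one it is in, for example to pick a default device. The sketch below shows one common way to tell them apart; detectRuntime is a hypothetical helper, and the check order is an assumption based on Bun and Deno each exposing their own global object alongside Node-compatible ones.

```javascript
// Hypothetical helper (not part of Transformers.js): identifies the current
// JavaScript runtime. Bun and Deno are checked first because both can expose
// a Node-compatible `process` global.
// `g` defaults to globalThis so the function can be tested with a stub.
function detectRuntime(g = globalThis) {
  if (typeof g.Bun !== "undefined") return "bun";
  if (typeof g.Deno !== "undefined") return "deno";
  if (g.process && g.process.versions && g.process.versions.node) return "node";
  return "unknown";
}

console.log(detectRuntime());
```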

A New Home on NPM and GitHub

Finally, we're delighted to announce that Transformers.js will now be published under the official Hugging Face organization on NPM as @huggingface/transformers (instead of @xenova/transformers, which was used for v1 and v2).
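For projects that still import the old package name, the specifier can be rewritten mechanically. The following is a rough sketch rather than an official migration tool: migrateSpecifier is a hypothetical helper that only touches quoted "@xenova/transformers" strings, and renaming the import alone may not be sufficient, since options such as quantized changed in v3.

```javascript
// Hypothetical codemod sketch: rewrites quoted "@xenova/transformers"
// specifiers to "@huggingface/transformers" in a string of source code.
// The backreference (\1) ensures the opening and closing quotes match, so
// longer names such as "@xenova/transformers-extra" are left untouched.
function migrateSpecifier(source) {
  return source.replace(
    /(["'`])@xenova\/transformers\1/g,
    "$1@huggingface/transformers$1",
  );
}

console.log(migrateSpecifier('import { pipeline } from "@xenova/transformers";'));
```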

We've also moved the repository to the official Hugging Face organization on GitHub (https://github.com/huggingface/transformers.js), which will be our new home — come say hi! We look forward to hearing your feedback, responding to your issues, and reviewing your PRs!

This is a significant milestone and we're extremely grateful to the community for helping us achieve this long-term goal! None of this would be possible without all of you… thank you!

Community

We're excited to see the community continue to grow and contribute to Transformers.js. If you have any questions or need help, please don't hesitate to reach out. We're always here to help.

Source: https://huggingface.co/blog/transformersjs-v3
