When our team adopted HuggingFace Inference Endpoints for ML model serving, we hit a familiar problem: everything else in our stack was managed with Terraform, but HuggingFace didn't have a provider. This meant manual deployments, configuration drift, and no version control for our endpoint configs.
So I built one. What started as a pragmatic solution turned into a deep exploration of how Terraform actually works—and resulted in an open-source provider that now manages all 29 of our production endpoints on GCP.
What Terraform Actually Does
At its core, Terraform is a state reconciliation engine. You describe the infrastructure you want (desired state), Terraform compares that to what exists (actual state), and figures out what changes are needed to make reality match your description.
The workflow has three phases: init downloads providers and modules, plan
computes a diff between desired and actual state, and apply executes the changes. A state
file tracks what Terraform has created so it can manage those resources going forward.
[Figure: Terraform's three-phase workflow]
The Plugin Architecture
Here's the key insight: Terraform itself doesn't know how to create a GCP instance or an AWS bucket or a HuggingFace endpoint. All that knowledge lives in providers—separate binaries that communicate with Terraform Core through gRPC.
[Figure: Providers are separate binaries that bridge Terraform and external APIs]
This architecture makes Terraform extensible by design: anyone can write a provider for any service. A provider has three responsibilities: defining a schema (which resources exist and what attributes they have), implementing CRUD operations (create, read, update, delete), and mapping between Terraform's state model and the external API.
Building the Provider
I didn't start from scratch. HashiCorp provides a scaffold repository that gives you a working provider skeleton with the correct project structure and build configuration. This was invaluable for getting started quickly.
Before touching any Terraform code, I created a standalone Go client for the HuggingFace API. This was a deliberate choice—a clean HTTP client is useful beyond Terraform, and I thought others might want to use it in their own projects. Separating concerns also made testing much easier.
The provider itself uses HashiCorp's terraform-plugin-framework, which handles the gRPC
protocol, state serialization, and plan diffing. Your job is to define resources and implement what
happens during each lifecycle operation.
Defining Resources
Each resource in a Terraform provider needs two things: a schema that describes its shape, and methods that implement its lifecycle. The schema tells Terraform what attributes exist, which are required vs optional, and which are computed by the API.
For a HuggingFace endpoint, the schema includes things like the endpoint name, the model repository,
compute configuration (instance type, accelerator, scaling settings), and cloud region. Some fields like
status and url are computed—they're returned by the API after creation, not
specified by the user.
[Figure: Schema defines the shape of a resource]
Implementing the Lifecycle
The framework requires you to implement four methods for each resource: Create, Read, Update, and Delete. These methods are where the actual API calls happen.
Create receives the planned configuration, calls the HuggingFace API to provision the endpoint, and stores the result (including computed fields like the endpoint URL) in Terraform's state.
Read is called before every plan to sync Terraform's state with reality. It fetches the current endpoint configuration from the API and updates the state. If the endpoint was deleted outside of Terraform (through the UI, for instance), Read removes it from state so Terraform knows to recreate it.
Update handles configuration changes. It compares the planned state to the current state, figures out what changed, and makes the appropriate API calls. Some changes might require replacing the resource entirely—the schema can declare which attributes force replacement.
Delete is straightforward: call the API to tear down the endpoint and remove it from state.
[Figure: Each lifecycle method maps to API operations]
The trickiest part was Terraform's type system. Fields aren't just "set or not"—they can be null (explicitly unset), unknown (will be computed after apply), or have an actual value. Getting this wrong produces confusing plan output where Terraform shows changes that don't actually exist.
Publishing to the Registry
Getting the provider into the official Terraform Registry was easier than I expected. The key tool is GoReleaser, which handles cross-compilation for all platforms and creates signed GitHub releases.
The process works like this: you set up a GitHub Action that triggers on version tags, GoReleaser builds binaries for Linux, macOS, and Windows (both AMD64 and ARM64), signs the checksums with your GPG key, and publishes a GitHub release. The Terraform Registry syncs automatically within minutes.
Once published, anyone can use the provider by adding it to their Terraform configuration and running
terraform init:
terraform {
  required_providers {
    huggingface = {
      source  = "issamemari/huggingface"
      version = "~> 1.0"
    }
  }
}

provider "huggingface" {
  namespace = var.huggingface_namespace
  token     = var.huggingface_token
}
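From there, declaring an endpoint looks like any other Terraform resource. The attribute names below are drawn from the schema described earlier and are illustrative; check the provider's registry documentation for the exact schema.

```hcl
# Illustrative sketch; attribute names may not match the published
# provider exactly.
resource "huggingface_endpoint" "example" {
  name       = "sentiment-prod"
  repository = "distilbert-base-uncased-finetuned-sst-2-english"
  region     = "us-east1"
}
```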
What I Learned
Building this provider taught me more about Terraform than years of using it. A few takeaways:
- Providers are just structured API clients. If you can write a REST client, you can write a provider. The framework handles state management, diffing, and the protocol.
- The scaffold is invaluable. Don't start from scratch. It gives you working build configs, correct project structure, and examples of every pattern.
- Separate your API client. Building the HuggingFace client as a standalone package simplified testing and enabled reuse beyond Terraform.
- Test against real infrastructure. Unit tests are great, but state management bugs only surface when reconciling against actual resources.
The best infrastructure investments make the boring stuff automatic so you can focus on interesting problems.
We've now eliminated an entire category of incidents—no more orphaned endpoints burning money, no more "works on my endpoint" debugging sessions, no more manual processes that only one person understands. Every endpoint change goes through code review like any other infrastructure change.
Both repositories, the provider and the standalone Go API client, are open source.
If you're managing HuggingFace endpoints and want to bring them into your Terraform workflow, give it a try.