Confidential AI

Confidential AI is the application of confidential computing to the AI lifecycle. It uses hardware-backed trusted execution environments (TEEs), on both CPUs and AI accelerators, to keep data, models, and prompts protected even while they are being processed. Unlike conventional encryption, which only covers data at rest and in transit, confidential AI extends protection to data in use and pairs it with remote attestation, so users can cryptographically verify what code is processing their data before they send it.

This combination makes it possible to run AI workloads on infrastructure you don't control, such as public cloud, a customer's premises, or a partner's environment, without granting that infrastructure access to the corresponding data or model weights.
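The attestation check described above can be sketched in a few lines. This is a simplified simulation, not a real TEE API: the `Report` type, `issue_report`, `verify_report`, and `VENDOR_KEY` are hypothetical names, and an HMAC stands in for the asymmetric, vendor-rooted signature that real hardware produces. It only illustrates the two questions a verifier asks: is the report genuine, and does it describe the code I expect?

```python
import hashlib
import hmac
from dataclasses import dataclass

# Assumption: stand-in for the hardware vendor's signing key. Real TEEs sign
# reports with an asymmetric key whose certificate chains to the vendor; HMAC
# is used here only to keep the sketch within the standard library.
VENDOR_KEY = b"simulated-vendor-key"


@dataclass(frozen=True)
class Report:
    """Hypothetical, heavily simplified attestation report."""
    measurement: str   # hex SHA-256 of the code loaded into the TEE
    signature: bytes   # authenticity proof over the measurement


def issue_report(code: bytes) -> Report:
    """Simulates what the hardware does when the workload starts."""
    measurement = hashlib.sha256(code).hexdigest()
    signature = hmac.new(VENDOR_KEY, measurement.encode(), hashlib.sha256).digest()
    return Report(measurement, signature)


def verify_report(report: Report, expected_measurement: str) -> bool:
    """Returns True only if the report is genuine AND matches the expected code."""
    expected_sig = hmac.new(
        VENDOR_KEY, report.measurement.encode(), hashlib.sha256
    ).digest()
    genuine = hmac.compare_digest(report.signature, expected_sig)
    matches = hmac.compare_digest(report.measurement, expected_measurement)
    return genuine and matches
```

A client would call `verify_report` with a measurement it has pinned in advance and send data only if the result is `True`. In a real deployment the data would additionally be encrypted to a key bound to the report, which this sketch omits.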

Use cases

Confidential AI has three main use cases, described below.

IP protection for AI models

Proprietary models represent significant investment, and exposing them creates risk of theft, tampering, or unlicensed redistribution. Confidential AI lets a model owner deploy a model to environments they don't trust, such as a customer's data center, a third-party SaaS, or an edge device, while keeping the weights inaccessible to anyone with access to the host system. The model can be invoked and can serve inference requests, but anyone with access to the host cannot copy, exfiltrate, or modify it. This unlocks on-premises deployment of commercial frontier-style models, embedded model distribution in third-party software, and licensed-model arrangements where verifiable enforcement replaces purely contractual controls.

Privacy-preserving AI training

Training data is often the bottleneck for high-quality models, and the most valuable data, including patient records, financial transactions, and telemetry from regulated devices, is also the data organizations are least willing to share. Confidential AI enables a "verifiable black box": training code runs inside an attested TEE, data sources confirm via remote attestation that they are talking to the expected, unmodified pipeline, and then encrypt their contributions so that only that pipeline can decrypt them. The training system can ingest data from multiple parties and produce a model without any party (including the operator of the training infrastructure) seeing the raw inputs of the others. This is a variation of the general multi-party computation (MPC) use case of confidential computing.
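The flow above can be illustrated with a small simulation. Everything here is an assumption made for illustration: `SimulatedTrainingTEE`, `contribute`, `seal`, and `unseal` are hypothetical names, the "model" is just the mean of the submitted numbers, and the cipher is a toy SHA-256 keystream with an HMAC tag standing in for real authenticated encryption such as AES-GCM. Session-key establishment, which in practice is a key exchange bound to the attestation report, is reduced to a single method call.

```python
import hashlib
import hmac
import json
import secrets

# Assumption: measurement of the approved pipeline, pinned by every data source.
EXPECTED_MEASUREMENT = hashlib.sha256(b"training-pipeline-v1").hexdigest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Toy encrypt-then-MAC. Illustration only; not for production use."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def unseal(key: bytes, blob: bytes) -> bytes:
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected_tag = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected_tag):
        raise ValueError("contribution was tampered with")
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))


class SimulatedTrainingTEE:
    """Stands in for training code running inside an attested TEE."""

    def __init__(self, code: bytes):
        self.measurement = hashlib.sha256(code).hexdigest()
        self._keys = {}   # party -> session key, visible only inside the TEE
        self._rows = []   # decrypted rows never leave this object

    def open_session(self, party: str):
        # Real systems: attestation report plus a key exchange bound to it.
        key = secrets.token_bytes(32)
        self._keys[party] = key
        return self.measurement, key

    def ingest(self, party: str, blob: bytes) -> None:
        self._rows.extend(json.loads(unseal(self._keys[party], blob)))

    def train(self) -> float:
        # Placeholder "model": only this aggregate leaves the TEE.
        return sum(self._rows) / len(self._rows)


def contribute(tee: SimulatedTrainingTEE, party: str, rows: list, expected: str) -> bytes:
    """Data-source side: verify the pipeline first, then send encrypted rows."""
    measurement, key = tee.open_session(party)
    if not hmac.compare_digest(measurement, expected):
        raise PermissionError("unexpected training pipeline; refusing to send data")
    blob = seal(key, json.dumps(rows).encode())
    tee.ingest(party, blob)
    return blob  # this opaque blob is all the infrastructure operator would see
```

Two parties calling `contribute` on the same `SimulatedTrainingTEE` each see only the result of `train()`, never the other party's rows. A pipeline built from different code produces a different measurement, so `contribute` raises before any data is sent.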

Privacy-preserving AI inference

The same principle applies to inference, and is increasingly the dominant use case as generative AI moves into production. With confidential AI, prompts, retrieved context (e.g., RAG documents), and responses remain encrypted end-to-end between the client and the TEE: they are decrypted and processed only inside the attested environment, so neither the operator of the inference service nor the cloud provider hosting it can read them. Remote attestation lets the client verify exactly which model, runtime, and configuration are processing the request before any data leaves the device. This makes generative AI viable for use cases like legal review, clinical decision support, internal-data assistants, and sovereign government workloads, where sending plaintext prompts to a conventional API is not acceptable.
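As a rough sketch of the client-side policy check, the snippet below compares the model, runtime, and configuration named in an attestation against pinned values and refuses to send the prompt on any mismatch. `DeploymentClaims`, `mismatched_claims`, and `send_prompt` are hypothetical names, not part of any real client library. The sketch assumes the report's signature has already been verified and that the `transport` callable handles encryption to the TEE.

```python
from dataclasses import dataclass, fields
from typing import Callable


@dataclass(frozen=True)
class DeploymentClaims:
    """Hypothetical claims extracted from an already-verified attestation report."""
    model_digest: str
    runtime_version: str
    config_digest: str


def mismatched_claims(attested: DeploymentClaims, expected: DeploymentClaims) -> list[str]:
    """Names of the claims that differ from what the client has pinned."""
    return [
        f.name
        for f in fields(expected)
        if getattr(attested, f.name) != getattr(expected, f.name)
    ]


def send_prompt(
    prompt: str,
    attested: DeploymentClaims,
    expected: DeploymentClaims,
    transport: Callable[[str], str],
) -> str:
    """Sends the prompt only if every attested claim matches the pinned policy."""
    bad = mismatched_claims(attested, expected)
    if bad:
        raise PermissionError("refusing to send prompt; unexpected " + ", ".join(bad))
    return transport(prompt)
```

Listing the mismatched claims, rather than returning a bare boolean, lets the client tell a routine runtime upgrade apart from an unexpected model swap before deciding whether to update its pinned policy.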

Privatemode AI is Edgeless Systems' productized implementation of privacy-preserving AI inference. It offers an OpenAI- and Anthropic-compatible API in which prompts and responses are end-to-end encrypted and verifiable via remote attestation, so that no party (including Edgeless Systems and the underlying infrastructure provider) can access them. It is among the first publicly available generative AI services to rely on hardware-enforced confidential computing rather than contractual data-protection promises alone.