The AI API with zero data exposure

With Privatemode, you get on-prem-like privacy combined with cloud-like convenience and flexibility.

Your data never leaves your control.

Privatemode is the first AI API service to offer true end-to-end privacy using hardware-based confidential computing. Simply run the Privatemode Encryption Proxy, which provides an OpenAI-compatible API while seamlessly encrypting your data.

Your data remains encrypted at all times—only the AI can process it within its secure, confidential computing environment.

[Diagram: Concept of the Privatemode API]

As easy as 1-2-3.
As secure as it gets.

1. Run the Privatemode Encryption Proxy

docker run -p 8080:8080 \
  ghcr.io/edgelesssys/continuum/continuum-proxy:latest \
  --apiKey <your-api-token>

The proxy verifies the integrity of the Privatemode service using confidential-computing-based remote attestation. It also encrypts all data before sending it and decrypts all data it receives.
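Since attestation happens when the proxy starts, it may take a moment before the proxy serves requests. The following is a minimal readiness sketch in Python, not an official health check: it simply polls the chat endpoint used in the next step until the proxy accepts connections (the retry count and interval are arbitrary assumptions):

import time
import requests

def wait_for_proxy(base="http://localhost:8080", attempts=30):
    """Poll the proxy until it accepts connections."""
    for _ in range(attempts):
        try:
            # Any HTTP response means the proxy is reachable; an error status
            # for the empty request body is fine for this check.
            requests.post(f"{base}/v1/chat/completions", timeout=2)
            return True
        except requests.exceptions.RequestException:
            time.sleep(1)  # proxy may still be starting up and attesting
    return False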

2. Send your prompts to the proxy

curl localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ibnzterrell/Meta-Llama-3.3-70B-Instruct-AWQ-INT4",
    "messages": [
      {
        "role": "system",
        "content": "Hello Privatemode!"
      }
    ]
  }'

Or, using Python:

import requests

def run_privatemode(url, port, prompt):
    """Send a chat prompt to the local Privatemode proxy and return the response."""
    endpoint = f"http://{url}:{port}/v1/chat/completions"
    payload = {
        "model": "ibnzterrell/Meta-Llama-3.3-70B-Instruct-AWQ-INT4",
        "messages": [{"role": "user", "content": prompt}],
    }
    response = requests.post(endpoint, json=payload)  # json= sets the Content-Type header
    response.raise_for_status()  # raise an error for bad responses
    return response.json()

if __name__ == "__main__":
    try:
        response = run_privatemode("localhost", 8080, "Hello Privatemode!")
        print(response["choices"][0]["message"]["content"])
    except Exception as e:
        print(f"Error: {e}")

Or, using Node.js:

import fetch from "node-fetch";

async function runPrivatemode(url, port, prompt) {
  const endpoint = `http://${url}:${port}/v1/chat/completions`;
  const payload = {
    model: "ibnzterrell/Meta-Llama-3.3-70B-Instruct-AWQ-INT4",
    messages: [{ role: "user", content: prompt }],
  };

  try {
    const response = await fetch(endpoint, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    });

    if (!response.ok) {
      throw new Error(`Error ${response.status}: ${response.statusText}`);
    }

    return await response.json();
  } catch (error) {
    throw new Error(`Request failed: ${error.message}`);
  }
}

// Example usage
(async () => {
  try {
    const response = await runPrivatemode("localhost", 8080, "Hello Privatemode");
    console.log(response.choices[0].message.content);
  } catch (error) {
    console.log(`Error: ${error.message}`);
  }
})();

The proxy is compatible with the OpenAI Chat Completions API, so existing OpenAI client code works against it with minimal changes.
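This means you can also point existing OpenAI client libraries at the proxy. A minimal sketch using the official openai Python package; the api_key value here is a placeholder, since the proxy itself holds the real token passed via --apiKey:

from openai import OpenAI

# Point the official OpenAI client at the local encryption proxy.
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="unused",  # placeholder; the proxy handles authentication
)

completion = client.chat.completions.create(
    model="ibnzterrell/Meta-Llama-3.3-70B-Instruct-AWQ-INT4",
    messages=[{"role": "user", "content": "Hello Privatemode!"}],
)
print(completion.choices[0].message.content)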

3. Done!

You now have programmatic access to an end-to-end secure AI. Process your own sensitive data or provide trustworthy services to others—the possibilities are endless.

Features you'll love.

End-to-end confidential computing

Privatemode enforces hardware-based runtime encryption and remote attestation at every stage, leveraging the confidential-computing features of AMD SEV-SNP and NVIDIA H100 GPUs.

State-of-the-art AI models

Access Llama 3.3 70B (quantized) or other high-performance models like DeepSeek R1 (coming soon).

High performance

Process more than 1,000 tokens per second with consistently low-latency responses.

Drop-in OpenAI compatibility

Seamlessly switch from OpenAI to our compatible chat API.

Cost transparency

Monitor token usage in real time and pay only for what you use, as sketched below.
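If the proxy follows the OpenAI response schema, each completion carries a usage object you can log for per-request cost accounting. A minimal sketch, assuming the usage field is populated as in the OpenAI Chat API:

import requests

response = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "ibnzterrell/Meta-Llama-3.3-70B-Instruct-AWQ-INT4",
        "messages": [{"role": "user", "content": "Hello Privatemode!"}],
    },
    timeout=60,
)
response.raise_for_status()
usage = response.json().get("usage", {})
# Standard OpenAI-style token accounting (assumed to be present).
print("prompt tokens:    ", usage.get("prompt_tokens"))
print("completion tokens:", usage.get("completion_tokens"))
print("total tokens:     ", usage.get("total_tokens"))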

The scalability of the cloud. The security of on-prem.

Feature              | Privatemode                        | ChatGPT        | Self-hosted AI
---------------------|------------------------------------|----------------|-----------------
Data privacy         | Enforced by confidential computing | Contractual    | Full control
Compliance           | By design                          | Limited        | Full control
Setup time           | Minutes                            | Minutes        | Weeks to months
Infrastructure costs | None                               | None           | High upfront
Maintenance          | Fully managed                      | Fully managed  | Self-managed
Scalability          | Automatic                          | Automatic      | Manual

Why use Privatemode?

Get on-prem level privacy without the overhead

With Privatemode, you get a dynamically scalable API that offers the same confidentiality as an on-premises deployment, minus the infrastructure costs and complexity.

Streamline compliance discussions

Privatemode provides a turn-key technical answer for your internal and external discussions on data privacy, data security, and compliance in connection with AI.

Focus on innovation, not infrastructure

With Privatemode, you can focus on building your product instead of building out your own AI model hosting capabilities.

Automate with confidence

With Privatemode, you can confidently bring even your most sensitive data into automated AI workflows.