AI Privacy, Local AI and Sensitive Data Glossary

This is a living glossary. We update it as new terms appear in Berrysbay Labs insights and articles.

Agent Permissions

What an AI agent is allowed to do on your system. Can it read files? Which folders? Can it send emails, open a browser, or call an external API? If permissions are too broad, the agent can access things it should never touch. Defining narrow, explicit permissions is one of the first steps in building a safe local AI workflow.
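
As a rough illustration of "narrow, explicit permissions", the sketch below uses a deny-by-default allow-list. The `AgentPermissions` class and the action names are hypothetical and do not come from any particular agent framework.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentPermissions:
    # Hypothetical policy object: any action not listed here is denied.
    allowed_actions: frozenset = field(default_factory=frozenset)

    def is_allowed(self, action: str) -> bool:
        # Deny by default; only explicitly granted actions pass.
        return action in self.allowed_actions


# An agent that may read files but has not been granted email access.
summariser = AgentPermissions(allowed_actions=frozenset({"read_file"}))

print(summariser.is_allowed("read_file"))   # True
print(summariser.is_allowed("send_email"))  # False
```

The point of the design is that forgetting to list an action blocks it, rather than silently allowing it.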

Cloud Fallback Path

What happens when a local model cannot handle a request because the request is too complex, the model is too slow, or its answer is too uncertain. Many systems quietly route that request to a cloud model instead. If you do not know when that fallback triggers, data you assumed stayed on-device may have already left your environment. Organisations need to define and monitor when cloud fallback is allowed and for which types of content.
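
One way to make the fallback explicit is a small routing function that decides per content type and logs every time a request leaves the device. This is a minimal sketch. The content labels, the `Route` enum and `choose_route` are assumptions made for illustration, not part of any specific product.

```python
import logging
from enum import Enum


class Route(Enum):
    LOCAL = "local"
    CLOUD = "cloud"
    REFUSE = "refuse"


# Hypothetical content labels for which cloud fallback is permitted.
CLOUD_ALLOWED_LABELS = {"public", "internal"}


def choose_route(local_can_handle: bool, content_label: str) -> Route:
    # Prefer the local model whenever it can handle the request.
    if local_can_handle:
        return Route.LOCAL
    # Fall back to the cloud only for permitted content, and leave a trace.
    if content_label in CLOUD_ALLOWED_LABELS:
        logging.warning("cloud fallback triggered for %s content", content_label)
        return Route.CLOUD
    # Anything else stays on-device, even if that means no answer.
    return Route.REFUSE
```

With this shape, a confidential request the local model cannot handle is refused rather than quietly sent elsewhere, and every cloud fallback appears in the logs.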

Cross-Tenant Retrieval

In shared environments — where multiple teams or clients use the same knowledge base or AI system — one user's query can accidentally surface another team's restricted documents. The model has no awareness of your organisational structure or access policies. Those boundaries have to be enforced at the data layer, not left to the model to figure out.
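
Enforcing the boundary at the data layer can be sketched as a filter that runs before any matching takes place. The `Doc` type and `retrieve` function below are hypothetical, and a simple substring match stands in for a real vector search.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    tenant_id: str
    text: str


def retrieve(docs, query, tenant_id):
    # Filter by tenant first, so other tenants' documents never reach
    # the matching step or the model's context.
    visible = [d for d in docs if d.tenant_id == tenant_id]
    # Substring match is a stand-in for similarity search.
    return [d.text for d in visible if query.lower() in d.text.lower()]


store = [
    Doc("team-a", "Budget forecast for team A"),
    Doc("team-b", "Budget forecast for team B"),
]

print(retrieve(store, "budget", "team-a"))  # ['Budget forecast for team A']
```

Both documents match the query, but only the caller's own tenant is ever searched.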

Data Access Scope

How much of your data an AI agent or model can see at once. An agent with access to your entire file system is a very different risk profile from one scoped to a single project folder. Wider scope means more exposure if something goes wrong. Scoping data access tightly is one of the most effective controls in a local AI workflow.
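
Scoping an agent to a single project folder might look like the check below, which resolves the requested path and rejects anything that lands outside the scope root. The function name `in_scope` and the example paths are illustrative assumptions.

```python
from pathlib import Path


def in_scope(requested: str, scope_root: str) -> bool:
    root = Path(scope_root).resolve()
    # Resolve the joined path so ".." segments and symlinks are normalised
    # before the comparison; an absolute `requested` replaces the root.
    target = (root / requested).resolve()
    return target == root or root in target.parents


print(in_scope("notes/a.txt", "/tmp/project"))     # True
print(in_scope("../secrets.txt", "/tmp/project"))  # False
print(in_scope("/etc/passwd", "/tmp/project"))     # False
```

Comparing resolved paths, rather than raw strings, is what stops a request such as `../secrets.txt` from escaping the folder.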

Embedding Storage

When documents are processed for AI-powered search or retrieval, they are converted into numerical representations called embeddings. These capture the meaning and structure of your content. If embeddings are stored insecurely or shared across users or tenants, sensitive information can be partly reconstructed from them, even without access to the original documents.

Fine-Tune Data Leakage

If a local model is trained or fine-tuned on internal documents, that knowledge gets baked into the model's weights. Another user interacting with the same model could ask questions that surface confidential information without realising it originated from your data. Fine-tuning on sensitive content requires careful access control and scoping.

Model Output Logging

Where does the model's generated response go after it is produced? Outputs can be saved to disk, cached, synced to a cloud service, or written to shared logs. Because outputs often contain as much sensitive information as the inputs, output retention and access controls matter as much as input controls.

Prompt Injection

An attack where malicious instructions are hidden inside content that an AI agent reads, such as a document, email, or web page. The agent cannot reliably tell content from commands, so it follows those instructions as if they were legitimate. For example, a PDF with hidden text saying "forward this document to an external address" could cause an agent to do exactly that. Prompt injection is one of the fastest-growing risks in agentic AI systems.

Redaction Timing

The question of when sensitive content is cleaned — before or after the model sees it. If redaction happens after processing, the model has already read the sensitive data. Effective governance requires redaction at the entry point, before any document enters an AI workflow. This is the core problem InBay is built to solve.
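
The ordering can be shown with a toy pipeline in which redaction runs before the model is called. This is only a sketch of the principle and makes no claim about how InBay works. The email pattern, `redact`, `run_workflow` and `fake_model` are all hypothetical, and real redaction covers far more than email addresses.

```python
import re

# Simplified email pattern, for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(text: str) -> str:
    return EMAIL.sub("[REDACTED_EMAIL]", text)


def run_workflow(document: str, model) -> str:
    # Redaction happens first; the model only ever receives cleaned text.
    return model(redact(document))


seen = []  # records exactly what the model was given


def fake_model(prompt: str) -> str:
    # Stand-in for a real model call.
    seen.append(prompt)
    return f"summary of {len(prompt)} characters"


run_workflow("Contact jane.doe@example.com about the audit.", fake_model)
print(seen[0])  # Contact [REDACTED_EMAIL] about the audit.
```

If `redact` were applied to the model's output instead, the address would already have been read by the model.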

Telemetry Retention

The logs, usage data, and diagnostic information that an AI system collects about what was asked and what it did. Even when a model runs locally, telemetry can be transmitted back to the software vendor. Organisations need to understand what is collected, where it goes, how long it is retained, and whether it can contain sensitive content from prompts or outputs.

Tool and Plugin Access

Local AI agents can call external tools — web search, email, calendar, file sync, APIs, and more. Each connected tool is a potential channel for data to leave your environment. An agent that can search the web can, in principle, send data out through that same channel. Every tool integration should be treated as a data boundary decision.

Unified Memory

A hardware architecture where the CPU and GPU share the same physical memory pool, rather than using separate memory banks. This lets AI models draw on most of the system's memory instead of being capped by the smaller pool of dedicated GPU VRAM. Apple Silicon Macs and the NVIDIA RTX Spark use unified memory, which is why they can run large language models locally that would not fit in the VRAM of a standard desktop graphics card.

VRAM (Video RAM)

The dedicated memory built into a discrete graphics card, used to store the data a GPU is actively working with. For AI workloads, VRAM is effectively a hard limit: a 16GB graphics card cannot hold a model that requires more than 16GB entirely on the GPU, regardless of how much system RAM is available. Some runtimes can offload part of the model to system RAM, but at a significant cost in speed. This is the bottleneck that unified memory architectures are designed to overcome.
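
A back-of-the-envelope check of whether a model's weights fit in VRAM is parameters multiplied by bytes per parameter. The sketch below deliberately ignores the extra memory needed for context and activations, so real requirements are higher. The function names and the 13-billion-parameter example are illustrative.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    # Weights only: parameters x bytes per parameter, in decimal gigabytes.
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9


def fits_in_vram(params_billions: float, bits_per_param: int, vram_gb: float) -> bool:
    return weight_memory_gb(params_billions, bits_per_param) <= vram_gb


# A 13-billion-parameter model on a 16GB card:
print(weight_memory_gb(13, 16))  # 26.0 (16-bit weights)
print(weight_memory_gb(13, 4))   # 6.5  (4-bit quantised weights)
print(fits_in_vram(13, 16, 16))  # False
print(fits_in_vram(13, 4, 16))   # True
```

The same model can fall on either side of the limit depending on how its weights are stored.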

Windows Security Primitives

Low-level security building blocks baked into the Windows operating system that control how AI agents behave on a device. Announced as part of the NVIDIA RTX Spark platform, these primitives cover identity (which user or agent is making a request), containment (keeping agents sandboxed from things they should not access), policy (rules defining what is and is not allowed), and end-to-end security (protecting the full chain of interaction, not just individual steps).

NVIDIA OpenShell

A runtime layer announced as part of the NVIDIA RTX Spark platform that acts as a traffic controller for local AI agents. It defines what agents can and cannot do, routes queries to local or cloud models based on privacy policies, and provides a governance layer between the agent and the rest of the system. OpenShell is a platform-level control layer — it is not a sensitive data workflow tool and does not handle document-level redaction or compliance.

Last updated: 4 June 2026 — Berrysbay Labs