Implementing authorization in RAG-based AI systems

As companies implement AI applications, a Retrieval Augmented Generation (RAG) architecture is often used to give an LLM context from internal data. The challenge that consequently arises is how to provide the LLM with sufficient context without violating privacy and authorization policies. Companies need to ensure that AI agents can’t inappropriately access sensitive data or expose it to unauthorized users.

How the Cerbos query plan works with RAG

A RAG architecture can leverage large data sets of internal knowledge, such as documents, meeting notes, and other resources, to provide business-specific context to the LLM. This business data first has to be extracted from the system of record (ERP, CRM, HRIS, etc.), go through an embedding process, and then be loaded into a vector store, a specialized database that can find similar items based on their vector embeddings. Vector stores also support storing metadata, such as the source system, the department the data belongs to, and the region it is associated with. All of this metadata can be used to make authorization decisions.

With the business data stored along with its associated metadata, a typical workflow would be to put a chatbot-style interface in front. The workflow that processes user input and produces results can use Cerbos to apply authorization logic to data retrieval.

  1. The access filters applicable to the user in the current context are generated by the Cerbos Policy Decision Point (PDP).

  2. The user input is vectorized using an embedding model and used to query the vector data store. A metadata filter is applied to the query using the Cerbos query plan to restrict the results.

  3. The retrieved documents are injected into the prompt and fed to the LLM to generate the answer.
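The three steps above can be sketched as a single function. This is a minimal illustration rather than a reference implementation: `answer_question` and the four callables it takes (`plan_filter`, `embed`, `search`, `generate`) are hypothetical names standing in for the Cerbos PDP call, the embedding model, the vector store query, and the LLM call respectively.

```python
from typing import Callable, Sequence


def answer_question(
    question: str,
    principal: dict,
    plan_filter: Callable[[dict], dict],
    embed: Callable[[str], Sequence[float]],
    search: Callable[[Sequence[float], dict], list],
    generate: Callable[[str], str],
) -> str:
    """Run one authorized RAG round trip (all collaborators are injected)."""
    # Step 1: ask the PDP which documents this principal may read.
    where = plan_filter(principal)

    # Step 2: vectorize the question and query the store with the filter applied.
    docs = search(embed(question), where)

    # Step 3: inject only the authorized documents into the prompt.
    context = "\n".join(doc["text"] for doc in docs)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)
```

The important design point is ordering: the filter is computed before retrieval, so unauthorized documents never reach the prompt in the first place.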

Leveraging Cerbos Query Plan for authorization

Cerbos can be used to enforce authorization policies on the data retrieval step in the RAG architecture. The Cerbos Query Plan can be used to generate filters based on the user’s identity and the metadata associated with the data. This ensures that the LLM only uses data that the user is authorized to access.

As an example, consider a scenario where an LLM is used to provide information about work projects. A project is associated with a department and a region, and only users in the same department and region should be able to access it. The Cerbos policy for this could be defined as follows:

apiVersion: api.cerbos.dev/v1
resourcePolicy:
  version: "default"
  resource: "project"

  rules:
    - actions: ["read"]
      effect: EFFECT_ALLOW
      condition:
        all:
          of:
            - expr: R.attr.department == P.attr.department
            - expr: R.attr.region == P.attr.region

    # ... other rules

When a user issues a query using the chat interface, their authorization context is passed to the Cerbos PDP in a PlanResources request to produce a plan for performing the read action on the project resource kind.

{
  "principal": {
    "id": "alice",
    "roles": [
      "USER",
      "MANAGER"
    ],
    "attr": {
      "department": "FINANCE",
      "region": "EMEA"
    }
  },
  "resource": {
    "kind": "project"
  },
  "action": "read"
}
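As a rough sketch, the request above could be assembled and sent from Python using only the standard library. The helper names `build_plan_request` and `fetch_query_plan` are hypothetical, and the PDP address and the `/api/plan/resources` path are assumptions about a locally running PDP; check them against your own deployment before relying on them.

```python
import json
import urllib.request

# Assumption: a Cerbos PDP is listening here; adjust for your deployment.
PDP_URL = "http://localhost:3592/api/plan/resources"


def build_plan_request(principal_id, roles, attrs, resource_kind, action):
    """Assemble a PlanResources request body in the shape shown above."""
    return {
        "principal": {
            "id": principal_id,
            "roles": list(roles),
            "attr": dict(attrs),
        },
        "resource": {"kind": resource_kind},
        "action": action,
    }


def fetch_query_plan(body, url=PDP_URL):
    """POST the request body to the PDP and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


body = build_plan_request(
    "alice", ["USER", "MANAGER"],
    {"department": "FINANCE", "region": "EMEA"},
    "project", "read",
)
```

Building the body with `json.dumps` rather than by hand avoids syntax slips such as trailing commas, which strict JSON parsers reject.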

The query plan generated by Cerbos would then include the following conditions. Note that the actual response is dynamic: it depends on the user’s identity and the policy defined in Cerbos.

{
  "action": "read",
  "resourceKind": "project",
  "filter": {
    "kind": "KIND_CONDITIONAL",
    "condition": {
      "expression": {
        "operator": "and",
        "operands": [
          {
            "expression": {
              "operator": "eq",
              "operands": [
                {
                  "variable": "request.resource.attr.department"
                },
                {
                  "value": "FINANCE"
                }
              ]
            }
          },
          {
            "expression": {
              "operator": "eq",
              "operands": [
                {
                  "variable": "request.resource.attr.region"
                },
                {
                  "value": "EMEA"
                }
              ]
            }
          }
        ]
      }
    }
  }
}

This set of conditions can then be converted into a metadata filter and applied to the vector store query, so that only data the user is authorized to access is retrieved. Each vector store has its own syntax for defining filters. In the popular Chroma database, for example, the equivalent filter looks as follows:

{
  "$and": [
    {"department": "FINANCE"},
    {"region": "EMEA"}
  ]
}

With this filter in place, any documents retrieved from the vector store are limited to those that match the user’s department and region, ensuring that the LLM only receives data that the user is authorized to access.
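For illustration, the retrieval call with the filter applied might look like the following. The helper `retrieve_authorized` is a hypothetical name, and `collection` is assumed to be a Chroma collection object (or anything exposing a compatible `query` method); the sketch relies on Chroma returning one list of documents per query embedding.

```python
def retrieve_authorized(collection, query_embedding, where, n_results=5):
    """Query a Chroma-style collection, restricted by the authorization filter."""
    results = collection.query(
        query_embeddings=[query_embedding],
        n_results=n_results,
        where=where,  # metadata filter derived from the Cerbos query plan
    )
    # One list of documents is returned per query embedding; we sent one.
    return results["documents"][0]
```

Because the filter is enforced by the vector store itself, documents outside the user's department and region are never candidates for the similarity search.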

Implementing authorization in RAG-based AI systems is crucial to ensure that sensitive data is not exposed to unauthorized users. By leveraging the Cerbos query plan, companies can ensure that their LLM use cases don’t inadvertently violate privacy and data security requirements.