Privileged Automation Security

The argument

I think the default trust model for automation is wrong. On RHEL, a lot of shipped software already benefits from targeted SELinux confinement, but logged-in users are still unconfined by default and typically map to unconfined_u. That may be workable for a person at a shell. I do not think it is a good default for orchestration.

I do not want automation to inherit the same assumptions as a human operator and then move faster. I want automation to start constrained, and movement out of that constraint to be narrow, deliberate, and visible.

The endpoint carries the actual SELinux policy.
FreeIPA maps named identities to the SELinux users that policy defines; the roles follow from the SELinux user definition on the endpoint.
SSSD and pam_selinux apply that mapping at login.
Local root-equivalent powers do not imply network pivot rights.
Bypassing the boundary should be noisy enough to trigger a response.

What FreeIPA actually gives

I do not think FreeIPA is the place to author SELinux policy. I think it is the right place to map identity into policy that already exists on the endpoints. FreeIPA answers the question, “On this host, which SELinux user should this identity get?” It does not answer, “What is the SELinux policy for this host?”

That is enough centralization. FreeIPA does not need to become a policy engine. I need it to be the place where identity lands in the right confinement at scale.
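
As a rough illustration of how small that mapping layer is, the sketch below builds the command lines for one SELinux user map without running them. It assumes the `ipa selinuxusermap-add`, `selinuxusermap-add-user`, and `selinuxusermap-add-host` subcommands and their `--selinuxuser`, `--users`, and `--hostgroups` options as I understand them; check `ipa help selinuxusermap` before relying on it. The rule name, service identity, and host group are hypothetical, and the SELinux user string (with its MLS range) would also have to appear in FreeIPA's configured SELinux user map order.

```python
import shlex


def selinuxusermap_commands(rule_name, selinux_user, users, hostgroups):
    """Return argv lists that would create one FreeIPA SELinux user map.

    Nothing is executed here; the caller decides whether to run the commands.
    """
    commands = [
        ["ipa", "selinuxusermap-add", rule_name, f"--selinuxuser={selinux_user}"],
    ]
    for user in users:
        commands.append(
            ["ipa", "selinuxusermap-add-user", rule_name, f"--users={user}"]
        )
    for hostgroup in hostgroups:
        commands.append(
            ["ipa", "selinuxusermap-add-host", rule_name, f"--hostgroups={hostgroup}"]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical names: a deploy identity confined on build hosts only.
    for argv in selinuxusermap_commands(
        "deploy-on-build", "ops_deploy_u:s0", ["svc-deploy"], ["build-hosts"]
    ):
        print(shlex.join(argv))
```

The point of the sketch is the shape of the data: one rule names an identity, a host scope, and an SELinux user that must already exist in endpoint policy. No policy content appears anywhere in it.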

flowchart LR
    A["Policy pipeline"] --> B["Managed endpoint"]
    C["FreeIPA mapping"] --> D["SSSD + pam_selinux"]
    B --> D
    D --> E["Confined session"]
    classDef brand fill:#fce3e3,stroke:#ee0000,color:#151515,stroke-width:1.5px;
    classDef subtle fill:#f2f2f2,stroke:#c7c7c7,color:#151515,stroke-width:1.5px;
    classDef neutral fill:#ffffff,stroke:#c7c7c7,color:#151515,stroke-width:1.5px;
    class A,C brand;
    class D subtle;
    class B,E neutral;

Why this matters more now

This is also a response to a different class of attacker than the one a lot of operational models were built around. The VoidLink reporting from Check Point Research is a useful marker for that shift. They describe a cloud-first Linux malware framework designed for modern infrastructure, with explicit awareness of major cloud environments, Kubernetes, and Docker, plus credential harvesting for cloud and version-control contexts.

Their report does not literally say “this is an orchestration attack framework,” and I do not want to overstate it. But I think it is a fair inference that cloud-native, container-aware attacker tooling changes how I should think about automation paths, control-plane hosts, and privileged identities.

The model I want

I want a small set of organization-defined SELinux users and domains for automation. I want those shipped to the endpoints through a policy pipeline. I want FreeIPA to map identities into them. I want the baseline role to be function-specific and narrow.

ops_inventory_u
ops_backup_u
ops_patch_u
ops_deploy_u
ops_root_local_u
ops_root_networked_u
ops_breakglass_u

I do not want one SELinux user per tool. I want a few roles that reflect function and risk. I also want host scope to matter. A deployment identity should not necessarily land in the same SELinux user on a build host, an application host, and a bastion.
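
To make the host-scope point concrete, here is a minimal Python model of the lookup I have in mind. It is not how SSSD or FreeIPA resolve maps internally; FreeIPA uses an ordered list of SELinux users to settle overlaps, and this sketch simply refuses anything that is not exactly one match. The identities (`svc-deploy`, `svc-inventory`) and host groups are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapRule:
    """One identity-to-SELinux-user rule, scoped to a host group."""
    identity: str
    hostgroup: str
    selinux_user: str


# Hypothetical rules: the same identity lands differently per host scope,
# and has no rule at all on bastion hosts.
RULES = (
    MapRule("svc-deploy", "build-hosts", "ops_deploy_u"),
    MapRule("svc-deploy", "app-hosts", "ops_root_local_u"),
    MapRule("svc-inventory", "all-hosts", "ops_inventory_u"),
)


class NoMappingError(LookupError):
    """Raised instead of falling back to an unconfined SELinux user."""


def resolve_selinux_user(identity, host_groups, rules=RULES):
    """Return the single SELinux user for this identity on this host.

    Fails closed: zero matches or conflicting matches raise NoMappingError.
    """
    matches = {
        rule.selinux_user
        for rule in rules
        if rule.identity == identity and rule.hostgroup in host_groups
    }
    if len(matches) != 1:
        raise NoMappingError(
            f"{identity}: expected exactly one mapping, found {sorted(matches)}"
        )
    return matches.pop()


if __name__ == "__main__":
    print(resolve_selinux_user("svc-deploy", {"build-hosts"}))
    print(resolve_selinux_user("svc-deploy", {"app-hosts"}))
```

Asking for `svc-deploy` on a host that is only in a bastion group raises `NoMappingError`, which is the behavior I want from the real system: no mapping means no session, not an unconfined one.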

sequenceDiagram
    participant I as Automation identity
    participant H as Host
    participant S as SSSD + pam_selinux
    participant F as FreeIPA or SSSD cache
    participant P as SELinux policy

    I->>H: authenticate
    H->>S: start session
    S->>F: resolve map
    S->>P: select context
    S-->>I: launch confined session

Root re-assembly

I also do not think of root as a single thing anymore. I think of it as a bundle that can be split apart and reassembled under SELinux policy. What matters to me is not EUID 0 in the abstract. What matters is which powers are actually present in the SELinux domain wrapped around that process.

That means I want to be able to say that an automation role has local root-equivalent power without automatically saying that it has arbitrary outbound networking, SSH pivot rights, or access to every trust primitive on the node.

flowchart TB
    R["Root-equivalent authority"] --> L["Local files + processes"]
    R --> S["Service control"]
    R --> P["Package + label work"]
    R --> N["Outbound network"]
    R --> X["SSH pivoting"]
    R --> K["Remote trust material"]

    subgraph A[What I would allow in ops_root_local_t]
        L
        S
        P
    end

    subgraph B[What I would grant separately or deny]
        N
        X
        K
    end
    classDef brand fill:#fce3e3,stroke:#ee0000,color:#151515,stroke-width:1.5px;
    classDef subtle fill:#f2f2f2,stroke:#c7c7c7,color:#151515,stroke-width:1.5px;
    classDef neutral fill:#ffffff,stroke:#c7c7c7,color:#151515,stroke-width:1.5px;
    class R brand;
    class L,S,P neutral;
    class N,X,K subtle;
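
The split in the diagram can also be written down as data. The sketch below is a toy model in Python, not SELinux policy: it treats root as a set of flags and a domain as the subset it has been granted. The power names mirror the diagram; `ops_root_networked_t` is a hypothetical domain name I derived from `ops_root_networked_u`, and what it grants here is only an assumption for illustration.

```python
from enum import Flag, auto


class Power(Flag):
    """The pieces of root-equivalent authority from the diagram."""
    LOCAL_FILES = auto()
    SERVICE_CONTROL = auto()
    PACKAGE_LABEL = auto()
    OUTBOUND_NETWORK = auto()
    SSH_PIVOT = auto()
    REMOTE_TRUST = auto()


LOCAL_ROOT = Power.LOCAL_FILES | Power.SERVICE_CONTROL | Power.PACKAGE_LABEL

# Illustrative grants only; the real boundary lives in SELinux policy.
DOMAIN_POWERS = {
    "ops_root_local_t": LOCAL_ROOT,
    "ops_root_networked_t": LOCAL_ROOT | Power.OUTBOUND_NETWORK,
}


def is_allowed(domain, power):
    """True only if every requested power is granted to the domain.

    Unknown domains get nothing.
    """
    granted = DOMAIN_POWERS.get(domain, Power(0))
    return (granted & power) == power


if __name__ == "__main__":
    print(is_allowed("ops_root_local_t", Power.SERVICE_CONTROL))
    print(is_allowed("ops_root_local_t", Power.SSH_PIVOT))
```

EUID never appears in the model. Both domains could be running as EUID 0, and the answer to "can this session pivot over SSH?" would still come from the domain alone.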

Why I would not build this around shared direct root

FreeIPA maps IdM identities. A literal shared local root account does not fit that model cleanly. The cleaner pattern is to authenticate as a named automation identity, map that identity through FreeIPA into a confined SELinux user, and transition to EUID 0 only where required while preserving the SELinux confinement boundary.

Why this matters for Ansible

I want to be able to log in to a host for automation, perform local administrative work, and still deny that session the ability to become a general-purpose jump point. In the model I have in mind, a host could accept a named automation identity that ends up running tasks with EUID 0, but the session would still land in something like ops_root_local_u:ops_root_local_r:ops_root_local_t. That session should still be able to do approved local work:

Write approved configuration.
Restore or apply labels where policy allows it.
Restart approved services.
Perform approved package work.

What I do not want that same session to do by default is become a networked operator from the box.

Sealed policy and reboot as tripwire

SELinux alone does not give me a sealed anti-tamper story. On a stock RHEL system, an unconfined root can still switch to permissive mode with setenforce, change loaded policy with semodule, or set boot-time parameters such as enforcing=0 and selinux=0 and reboot to weaken the boundary.

The model I am describing depends on something stronger: runtime confinement comes from SELinux policy, anti-tamper value comes from treating production policy as sealed state, and bypassing that seal should require a higher-friction action that is visible outside the host.
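
A small check makes the "sealed state" idea testable. The Python sketch below looks for the two weakening boot parameters named above and reads the current enforcing flag. It assumes the usual Linux locations `/proc/cmdline` and `/sys/fs/selinux/enforce`; the function names are mine. On its own it is only a host-side probe, so its result would still have to be shipped off the host to count as a tripwire.

```python
from __future__ import annotations

from pathlib import Path

# Kernel command-line parameters that weaken or disable SELinux at boot.
WEAKENING_PARAMS = frozenset({"enforcing=0", "selinux=0"})


def weakening_boot_params(cmdline: str) -> set[str]:
    """Return any weakening parameters present on the kernel command line."""
    return set(cmdline.split()) & WEAKENING_PARAMS


def seal_intact(cmdline: str, enforce_flag: str | None) -> bool:
    """True only if no weakening parameter is set and SELinux is enforcing.

    A missing enforce flag (for example, selinuxfs not mounted) counts as broken.
    """
    enforcing = (enforce_flag or "").strip() == "1"
    return enforcing and not weakening_boot_params(cmdline)


def read_text_or_none(path: str) -> str | None:
    """Read a file, returning None if it is absent or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


if __name__ == "__main__":
    cmdline = read_text_or_none("/proc/cmdline") or ""
    enforce = read_text_or_none("/sys/fs/selinux/enforce")
    print("weakening params:", sorted(weakening_boot_params(cmdline)))
    print("seal intact:", seal_intact(cmdline, enforce))
```

The check fails closed: if the enforcing flag cannot be read at all, the seal is reported as broken instead of assumed healthy.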

flowchart TD
    A["Confined EUID 0 session"] --> B{"Requested action"}
    B -->|approved local work| C["Allowed"]
    B -->|pivot or bypass attempt| D["Denied"]
    D --> E{"Can it break the seal quietly?"}
    E -->|no| F["Weaken boot state or reboot"]
    F --> G["Off-host telemetry"]
    E -->|yes| H["The design failed"]
    classDef brand fill:#fce3e3,stroke:#ee0000,color:#151515,stroke-width:1.5px;
    classDef subtle fill:#f2f2f2,stroke:#c7c7c7,color:#151515,stroke-width:1.5px;
    classDef neutral fill:#ffffff,stroke:#c7c7c7,color:#151515,stroke-width:1.5px;
    class A,F,H brand;
    class D,G subtle;
    class B,C,E neutral;

Why event-driven response matters

I do not think the tripwire idea is complete if it ends at detection. If the seal-break event is visible outside the host, I want the rest of the system to react to it automatically. This is where Event-Driven Ansible fits.

Reboot events.
Boot-state changes.
Shipped audit records.
Other off-host telemetry about policy weakening.

Those should become event sources for response automation.
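
An actual rulebook would be written for Event-Driven Ansible, but the matching logic is simple enough to sketch in Python first. The version below assumes shipped audit records arrive as raw lines beginning with `type=...`, and it treats the Linux audit record types `MAC_STATUS`, `MAC_POLICY_LOAD`, `MAC_CONFIG_CHANGE`, and `SYSTEM_BOOT` as tripwires. Which types belong on that list, and the response step names, are my assumptions; a real deployment would also need to tell planned reboots and policy pushes from unplanned ones.

```python
import re

# Audit record types treated as seal-break signals in this sketch.
TRIPWIRE_TYPES = frozenset(
    {"MAC_STATUS", "MAC_POLICY_LOAD", "MAC_CONFIG_CHANGE", "SYSTEM_BOOT"}
)

# Hypothetical response steps, mirroring the workflow in the diagram.
RESPONSE_STEPS = ("quarantine_host", "disable_identity", "collect_evidence")

_TYPE_RE = re.compile(r"type=([A-Z0-9_]+)")


def record_type(line):
    """Extract the record type from the start of an audit line, or None."""
    match = _TYPE_RE.match(line.strip())
    return match.group(1) if match else None


def plan_response(line):
    """Return the response steps for a tripwire record, or an empty list."""
    if record_type(line) in TRIPWIRE_TYPES:
        return list(RESPONSE_STEPS)
    return []


if __name__ == "__main__":
    # Illustrative record, not captured from a real host.
    sample = (
        "type=MAC_STATUS msg=audit(1700000000.123:456): "
        "enforcing=0 old_enforcing=1 auid=1000 ses=3"
    )
    print(record_type(sample), plan_response(sample))
```

The matcher decides only whether a record is a tripwire. Quarantine, identity disablement, and evidence collection stay in the response workflow, where they can be reviewed and tested separately.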

flowchart LR
    A["Host event"] --> B["Off-host telemetry"]
    B --> C["EDA rulebook"]
    C --> D{"Tripwire match?"}
    D -->|yes| E["Response workflow"]
    D -->|no| F["Keep monitoring"]
    E --> G["Quarantine host"]
    E --> H["Disable identity"]
    E --> I["Collect evidence"]
    classDef brand fill:#fce3e3,stroke:#ee0000,color:#151515,stroke-width:1.5px;
    classDef subtle fill:#f2f2f2,stroke:#c7c7c7,color:#151515,stroke-width:1.5px;
    classDef neutral fill:#ffffff,stroke:#c7c7c7,color:#151515,stroke-width:1.5px;
    class C,E brand;
    class B,D,F subtle;
    class A,G,H,I neutral;

Guardrails and pilot

If I were trying to make this real, I would hold the line on a few things.

No silent fallback to unconfined_u for automation.
No custom policy published without enforcing-mode validation.
No new SELinux user for every tool.
No blanket transition into sysadm_r:sysadm_t.
No break-glass path used as normal plumbing.
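
Two of these guardrails can be checked mechanically from a session's SELinux context, for example the string printed by `id -Z`. The Python sketch below parses a context and reports violations of the "no unconfined fallback" and "no blanket sysadm transition" rules. The function names are mine, and the check is only as good as the context string it is given.

```python
from typing import NamedTuple


class Context(NamedTuple):
    user: str
    role: str
    domain: str
    level: str


def parse_context(raw):
    """Parse 'user:role:type[:level]'; the level may itself contain colons."""
    parts = raw.strip().rstrip("\x00").split(":", 3)
    if len(parts) < 3 or not all(parts[:3]):
        raise ValueError(f"not an SELinux context: {raw!r}")
    level = parts[3] if len(parts) == 4 else ""
    return Context(parts[0], parts[1], parts[2], level)


def guardrail_violations(raw):
    """Return a list of guardrail violations for an automation session."""
    ctx = parse_context(raw)
    problems = []
    if ctx.user == "unconfined_u" or ctx.domain == "unconfined_t":
        problems.append("session fell back to unconfined")
    if ctx.role == "sysadm_r" and ctx.domain == "sysadm_t":
        problems.append("blanket transition into sysadm_r:sysadm_t")
    return problems


if __name__ == "__main__":
    for raw in (
        "ops_root_local_u:ops_root_local_r:ops_root_local_t:s0",
        "unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023",
    ):
        print(raw, "->", guardrail_violations(raw) or "ok")
```

A check like this could run at the start of every automation job and refuse to continue when the list is not empty.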

I would pilot the design in three steps: prove identity-to-context mapping, prove a confined role that can still do useful work, then prove root re-assembly without network sprawl.

SELinux-confined automation is a better trust model than unconfined orchestration.