Manual analog for the on-prem branch before the normal Calabi flow resumes.

On-Prem Manual Process

Warning

This on-prem target is experimental. Treat the docs and playbooks in this subtree as an emerging alternate installation path that is not yet at the same confidence level as the validated AWS-target flow.

This page covers only the on-prem-specific portion of the operator workflow. Once the host is prepared, guest storage exists, and bastion staging is done, return to the stock AWS MANUAL PROCESS.

1. Prepare The Operator Workstation
2. Verify The On-Prem virt-01 Host Contract
3. Prepare The Guest Storage Volume Group
4. Configure The On-Prem Inventory And Group Vars
5. Bootstrap The Host And Provision Guest LVs
6. Build The Bastion And Stage The Project
7. Hand Back To The Stock Runbook
8. Current External-Ceph Day-2 Continuation
9. Clean OpenShift-Only Teardown

1. Prepare The Operator Workstation

The on-prem path uses the same controller-side secret and content inputs as the AWS path, except there is no public-cloud CLI or stack deployment step.

Required local inputs:

repo checkout
SSH keypair
pull secret
RHSM credentials
optional local lab credentials
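
A quick preflight can catch a missing input before the first playbook run. The sketch below is only an illustration: the function name and the paths you pass in are hypothetical, and it does not check RHSM or lab credentials because this page does not say where those are stored.

```shell
#!/usr/bin/env bash
# Preflight sketch: report which required local inputs are missing.
# All paths passed in are hypothetical; substitute your own locations.

check_local_inputs() {
  # Usage: check_local_inputs <ssh-private-key> <pull-secret-file> <repo-dir>
  local key="$1" pull_secret="$2" repo="$3" rc=0

  [ -r "$key" ]         || { echo "missing SSH private key: $key"; rc=1; }
  [ -r "${key}.pub" ]   || { echo "missing SSH public key: ${key}.pub"; rc=1; }
  [ -s "$pull_secret" ] || { echo "missing or empty pull secret: $pull_secret"; rc=1; }
  [ -d "$repo" ]        || { echo "missing repo checkout: $repo"; rc=1; }
  return "$rc"
}
```

The function returns non-zero when anything is missing, so it can gate a wrapper script.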

Keep these stock pages nearby:

Install the collection dependencies and syntax-check the on-prem entrypoints:

    cd <project-root>/aws-metal-openshift-demo
    ansible-galaxy collection install -r requirements.yml
    cd <project-root>/on-prem-openshift-demo
    ansible-playbook --syntax-check playbooks/site-bootstrap.yml
    ansible-playbook --syntax-check playbooks/site-precluster.yml
    ansible-playbook --syntax-check playbooks/site-lab.yml

2. Verify The On-Prem `virt-01` Host Contract

The on-prem target starts after the host already exists. Before bootstrap, confirm that the host can stand in for virt-01:

SSH reachable from the operator workstation
RHEL installed and updated to the desired baseline
nested KVM available
an uplink interface is present for OVS integration
local storage is visible and the guest VG exists or can be created
you know what bastion-side host and user you want to publish through:
- on_prem_bastion_hypervisor_host
- on_prem_bastion_hypervisor_user

Minimal verification:

    ssh <hypervisor-admin-user>@<hypervisor-management-ip> <<'EOF'
    hostnamectl --static
    sudo virt-host-validate
    sudo lsblk
    sudo vgs
    EOF
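
When the verification output is long, it can help to count the failed checks instead of reading every line. The helper below is a sketch: the function name is hypothetical, and it assumes virt-host-validate marks each failed check with the word FAIL on that check's line.

```shell
#!/usr/bin/env bash
# Sketch: count failed checks in virt-host-validate output read from stdin.
# Assumes each failed check line contains the token FAIL.

count_validate_failures() {
  # grep -c prints 0 and exits 1 when nothing matches, so mask the exit code.
  grep -c 'FAIL' || true
}

# Example, run from the operator workstation (not executed here):
#   ssh <hypervisor-admin-user>@<hypervisor-management-ip> \
#     sudo virt-host-validate | count_validate_failures
```

WARN lines are deliberately not counted; review them by hand, since some warnings may be acceptable for a lab host.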

If CPU, RAM, or NUMA shape differs from the validated m5.metal baseline, read the host-sizing guidance before bootstrap:

HOST SIZING

3. Prepare The Guest Storage Volume Group

On-prem guest disks are created as logical volumes inside an operator-provided volume group.

The current lab footprint expects roughly:

5950 GiB of guest LV capacity for the full current design

That is only the raw guest-disk sum. Leave additional headroom for:

host root growth
image cache
mirror-content staging
rebuild hygiene

Example check:

    ssh <hypervisor-admin-user>@<hypervisor-management-ip> <<'EOF'
    sudo vgs
    sudo lvs
    EOF
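
The capacity arithmetic can be checked mechanically before a first run, when no guest LVs exist yet. This is a sketch under stated assumptions: the function name is hypothetical, and the headroom figure is yours to choose because this page gives no number for it.

```shell
#!/usr/bin/env bash
# Sketch: compare volume-group free space against the guest LV footprint.

GUEST_LV_GIB=5950   # raw guest-disk sum quoted on this page

vg_capacity_ok() {
  # Usage: vg_capacity_ok <vg-free-gib> [headroom-gib]
  # Succeeds when free space covers the guest LVs plus the chosen headroom.
  # Any decimal part of the free-space figure is truncated.
  local free_gib="${1%%.*}" headroom_gib="${2:-0}"
  [ "$free_gib" -ge $(( GUEST_LV_GIB + headroom_gib )) ]
}

# One way to obtain the first argument on the hypervisor (not executed here):
#   sudo vgs --noheadings --nosuffix --units g -o vg_free <volume-group> | tr -d ' '
```

For example, `vg_capacity_ok 6200 200` succeeds, while `vg_capacity_ok 6000 200` fails because 6000 GiB is below 5950 + 200.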

If the volume group does not exist yet, create it before bootstrap using your site-local storage procedure. This repo does not own PV creation.

What the repo does own:

validating the volume group exists
validating free space before provisioning
creating the missing guest LVs
publishing the /dev/ebs/* compatibility symlinks the stock guest roles use

If you want the on-prem subtree to seed a dedicated guest VG from one explicit lab disk, use the optional override inputs:

on_prem_lvm_seed_enabled: true
on_prem_lvm_seed_device: /dev/nvme0n1
on_prem_lvm_seed_force: false

That path is opt-in and additive. It does not change the stock on-prem defaults, and it fails closed unless you explicitly enable it. When forced, it uses the same destructive whole-device wipe profile the project uses for ODF backing-disk recovery before creating the guest VG.

4. Configure The On-Prem Inventory And Group Vars

The current on-prem target keeps the stock aws_metal inventory-group name on purpose so the existing support/day-2 playbooks do not need to fork.

Edit these files before the first run:

For hosts that should stop before cluster build, start from one of:

For the current cluster-capable external Ceph profile, start from:

inventory/overrides/core-services-ad-plus-openshift-3node-external-ceph.yml.example

Read OVERRIDE MECHANISM before copying or publishing that file. It explains the phase toggles, external ODF payload, storage class indirection, and resource sizing assumptions.

Use core-services-ad-128g.yml.example for the current ~128 GiB host class when you want the core-services+AD footprint plus a managed 32G zram writeback LV in calabi_lab_vg.

That override also enables a conservative periodic zram writeback policy:

backing LV: calabi_lab_vg/zram-writeback
backing LV size: 32G
policy mode: huge
timer interval: 30m
per-run budget: 256 MiB

Use that profile as the current reference for the on-prem writeback-capable host-memory policy.
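
To see why that policy counts as conservative, the listed numbers can be worked through directly. The sketch assumes the worst case, in which every timer run spends its whole budget and no writeback space is ever freed, and it reads 32G as 32 GiB.

```shell
#!/usr/bin/env bash
# Worked arithmetic for the writeback policy values listed above.

budget_mib=256     # per-run budget
interval_min=30    # timer interval
lv_gib=32          # backing LV size, read as GiB

runs_per_day=$(( 24 * 60 / interval_min ))
max_mib_per_day=$(( budget_mib * runs_per_day ))
runs_to_fill=$(( lv_gib * 1024 / budget_mib ))
hours_to_fill=$(( runs_to_fill * interval_min / 60 ))

echo "runs per day: ${runs_per_day}"
echo "max writeback per day: $(( max_mib_per_day / 1024 )) GiB"
echo "earliest the LV could fill: ${hours_to_fill} h"
```

Under those assumptions the timer fires 48 times a day, writes back at most 12 GiB a day, and could not fill the backing LV in less than 64 hours.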

What must be correct:

ansible_host
ansible_user
ansible_ssh_private_key_file
on_prem_lvm_volume_group
on_prem_bastion_hypervisor_host
on_prem_bastion_hypervisor_user
any optional on_prem_lvm_lv_name_prefix
any project-local credential overrides
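
A presence check for the mandatory variables is easy to script. The sketch below assumes the values live in YAML files as `name:` keys; the function name is hypothetical, and you supply the file paths yourself. It confirms only that each key is defined somewhere, not that its value is correct.

```shell
#!/usr/bin/env bash
# Sketch: list required inventory variables that are not defined in any
# of the given YAML files.

REQUIRED_VARS="ansible_host ansible_user ansible_ssh_private_key_file
on_prem_lvm_volume_group on_prem_bastion_hypervisor_host
on_prem_bastion_hypervisor_user"

missing_inventory_vars() {
  # Usage: missing_inventory_vars <file>...
  # Prints each required variable that never appears as a "name:" key.
  local var
  for var in $REQUIRED_VARS; do
    grep -qE "^[[:space:]]*${var}:" "$@" 2>/dev/null || echo "$var"
  done
}
```

Empty output means every required key was found in at least one of the files.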

If you are using the reduced precluster-64g profile, copy the example override and edit the actual device path before the first run:

    cd <project-root>/on-prem-openshift-demo
    cp inventory/overrides/precluster-64g.yml.example \
      inventory/overrides/precluster-64g.yml

At this stage, the on-prem subtree reuses the stock guest and day-2 vars and playbooks from aws-metal-openshift-demo through local wrappers. It does not modify the AWS-target codepath.

5. Bootstrap The Host And Provision Guest LVs

This is the main on-prem divergence from the AWS path.

Run:

    cd <project-root>/on-prem-openshift-demo
    ./scripts/run_local_playbook.sh playbooks/bootstrap/site.yml

For the reduced pre-cluster profile:

    cd <project-root>/on-prem-openshift-demo
    ./scripts/run_local_playbook.sh playbooks/bootstrap/site.yml \
      -e @inventory/overrides/precluster-64g.yml

For the support-services-only AD profile:

    cd <project-root>/on-prem-openshift-demo
    ./scripts/run_local_playbook.sh playbooks/bootstrap/site.yml \
      -e @inventory/overrides/core-services-ad.yml.example

For the current ~128 GiB host class with managed zram writeback:

    cd <project-root>/on-prem-openshift-demo
    ./scripts/run_local_playbook.sh playbooks/bootstrap/site.yml \
      -e @inventory/overrides/core-services-ad-128g.yml.example

This is the on-prem equivalent of the early AWS host steps:

host base configuration
host CPU and memory policy
OVS / libvirt host setup
guest base-image staging
LVM guest LV validation and creation
/dev/ebs/* compatibility symlink publication

Note

The shared host bootstrap now updates redhat-release before the full system update. This ensures the current Red Hat Post-Quantum Cryptography public keys are present before DNF validates newer packages. See: https://access.redhat.com/solutions/3449341

When it succeeds, the host should satisfy the same effective guest-disk contract the stock guest roles already expect.

Useful verification:

    ssh <hypervisor-admin-user>@<hypervisor-management-ip> <<'EOF'
    sudo lvs
    sudo ls -l /dev/ebs
    EOF
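
Beyond listing the directory, it may be worth confirming that no compatibility symlink is dangling. This is a sketch with a hypothetical function name; it assumes the entries under /dev/ebs are plain symlinks, as the listing above suggests.

```shell
#!/usr/bin/env bash
# Sketch: print symlinks in a directory whose targets do not exist.

dangling_links() {
  # Usage: dangling_links [dir]   (defaults to /dev/ebs)
  local dir="${1:-/dev/ebs}" link
  for link in "$dir"/*; do
    [ -L "$link" ] || continue          # skip non-symlinks and an unmatched glob
    [ -e "$link" ] || echo "$link"      # -e follows the link; false if target is gone
  done
}
```

Run it on the hypervisor itself; empty output means every symlink resolves.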

6. Build The Bastion And Stage The Project

The current on-prem site-bootstrap.yml:

runs the on-prem bootstrap host prep
applies the baseline host memory oversubscription policy (zram, THP madvise, KSM) during the host bootstrap path
can optionally provision a dedicated zram writeback LV and policy timer when an override such as core-services-ad-128g.yml.example enables it
reuses the stock bastion build
stages both the on-prem subtree and the stock AWS-target subtree onto bastion through the local on-prem bastion-stage wrapper
rewrites the bastion-side runtime inventory so the bastion can SSH back to the hypervisor without requiring ec2-user

Run:

    cd <project-root>/on-prem-openshift-demo
    ./scripts/run_local_playbook.sh playbooks/site-bootstrap.yml

For the reduced pre-cluster profile:

    cd <project-root>/on-prem-openshift-demo
    ./scripts/run_local_playbook.sh playbooks/site-bootstrap.yml \
      -e @inventory/overrides/precluster-64g.yml

For the support-services-only AD profile:

    cd <project-root>/on-prem-openshift-demo
    ./scripts/run_local_playbook.sh playbooks/site-bootstrap.yml \
      -e @inventory/overrides/core-services-ad.yml.example

For the current ~128 GiB host class with managed zram writeback:

    cd <project-root>/on-prem-openshift-demo
    ./scripts/run_local_playbook.sh playbooks/site-bootstrap.yml \
      -e @inventory/overrides/core-services-ad-128g.yml.example

After this, the bastion should exist and the project should be staged.

Writeback caveats on the on-prem path:

the managed-LVM writeback path assumes the local calabi_lab_vg volume group
the writeback LV must be dedicated to zram and is not counted as planned RAM
the role fails fast if the configured writeback LV already exists at a different size
the shipped policy uses huge, which writes back pages that did not compress well. For broader cold-page relief, huge_idle also sweeps idle compressible pages but requires kernel age-tracking support
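
To confirm on the host that a writeback device is actually attached, the kernel's zram sysfs attribute can be read directly. The sketch below is not part of the shipped role: the function name is hypothetical, the device name zram0 is an assumption, and it relies on the kernel exposing a backing_dev attribute that reads none when no backing device is configured.

```shell
#!/usr/bin/env bash
# Sketch: print the backing device configured for a zram device.
# The second argument exists only so the lookup root can be overridden.

zram_backing_dev() {
  # Usage: zram_backing_dev [device] [sysfs-block-root]
  local dev="${1:-zram0}" root="${2:-/sys/block}"
  local attr="$root/$dev/backing_dev"
  [ -r "$attr" ] || { echo "unknown"; return 1; }
  cat "$attr"
}
```

A result of unknown suggests the device is absent or the kernel was built without zram writeback support.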

7. Hand Back To The Stock Runbook

At this point, the on-prem-specific portion is over.

Choose the next stock runbook entry based on what you already completed:

if you want the manual bastion-forward process:
- Resume at AWS manual step 13A
if you are still walking the support-services flow by hand from the bastion build:
- Resume at AWS manual step 12

For automation rather than the hand-run sequence on a cluster-capable host, use:

    cd <project-root>/on-prem-openshift-demo
    ./scripts/run_remote_bastion_playbook.sh playbooks/site-lab.yml \
      -e @inventory/overrides/core-services-ad-plus-openshift-3node-external-ceph.yml.example

For support-services-only profiles such as core-services or core-services-ad, stop after the support-service path instead of continuing into cluster build:

    cd <project-root>/on-prem-openshift-demo
    ./scripts/run_remote_bastion_playbook.sh playbooks/site-precluster.yml \
      -e @inventory/overrides/core-services-ad.yml.example

Or, from the staged on-prem tree on bastion:

    cd <staged-on-prem-project-root>
    ./scripts/run_bastion_playbook.sh playbooks/site-precluster.yml \
      -e @inventory/overrides/core-services-ad.yml.example

That path stops after:

optional ad-server
idm
optional idm-ad-trust
bastion-join
mirror-registry

For the reduced precluster-64g profile, also stop at mirror-registry instead of continuing into cluster build:

    cd <staged-on-prem-project-root>
    ./scripts/run_bastion_playbook.sh playbooks/site-precluster.yml \
      -e @inventory/overrides/precluster-64g.yml
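
The profile-to-playbook choice in this section can be written down as a small lookup so a wrapper cannot continue into cluster build by accident. The function is a hypothetical sketch that covers only the profiles this section names; any other name is rejected instead of guessed.

```shell
#!/usr/bin/env bash
# Sketch: map an override profile name to the playbook this section
# pairs it with.

playbook_for_profile() {
  # Usage: playbook_for_profile <override-profile-name>
  case "$1" in
    core-services-ad-plus-openshift-3node-external-ceph)
      echo "playbooks/site-lab.yml" ;;          # cluster-capable host
    core-services|core-services-ad|precluster-64g)
      echo "playbooks/site-precluster.yml" ;;   # stops at mirror-registry
    *)
      echo "unknown profile: $1" >&2
      return 1 ;;
  esac
}
```

For example, `playbook_for_profile precluster-64g` prints `playbooks/site-precluster.yml`.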

8. Current External-Ceph Day-2 Continuation

If the OpenShift cluster exists and the support services are already healthy, continue at the shared day-2 orchestration instead of replaying the full site-lab.yml chain:

    cd <project-root>/on-prem-openshift-demo
    ./scripts/run_remote_bastion_playbook.sh \
      ../aws-metal-openshift-demo/playbooks/day2/openshift-post-install.yml \
      -e @inventory/overrides/core-services-ad-plus-openshift-3node-external-ceph.yml.example

The current external-Ceph profile intentionally enables disconnected OperatorHub, IdM ingress certs, breakglass auth, NMState, external ODF, Keycloak, OIDC auth, Web Terminal, AAP, NetObserv, and validation. It disables infra conversion, internal ODF, LDAP auth, OpenShift Virtualization, and Pipelines.

Leave the force_* variables false for normal continuation. Set a force flag only when you are deliberately repairing or replacing that phase.
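
Before a normal continuation run, a quick scan of the override file can confirm that no force flag was left enabled from an earlier repair. The sketch assumes the flags are YAML keys whose names start with force_; the function name is hypothetical, and it cannot see values passed separately with -e on the command line.

```shell
#!/usr/bin/env bash
# Sketch: print every force_* key set to a truthy value in an override file.

forced_phases() {
  # Usage: forced_phases <override-file>
  grep -iE '^[[:space:]]*force_[a-z0-9_]+:[[:space:]]*(true|yes)[[:space:]]*(#.*)?$' "$1" \
    | sed -E 's/^[[:space:]]*([^:]+):.*/\1/'
}
```

Empty output means the file leaves every force flag off.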

9. Clean OpenShift-Only Teardown

To tear down a broken OpenShift cluster while preserving healthy support services:

    cd <project-root>/on-prem-openshift-demo
    ./scripts/run_local_playbook.sh \
      ../aws-metal-openshift-demo/playbooks/maintenance/cleanup.yml \
      -e cleanup_destroy_openshift_cluster=true \
      -e cleanup_wipe_openshift_cluster_block_devices=true \
      -e @inventory/overrides/core-services-ad-plus-openshift-3node-external-ceph.yml.example

That path removes the OpenShift cluster domains, wipes the cluster block devices when requested, and preserves AD, IdM, bastion, and mirror-registry. After cleanup, rerun site-lab.yml with the same override to rebuild the cluster and continue through the mirror, install, and day-2 flow.
