Backup and Restore¶
This guide provides comprehensive information on backing up and restoring IBM Maximo Application Suite (MAS) instances. The backup process captures critical configuration data, MongoDB databases, Suite License Service (SLS) data, and certificate manager configurations to enable disaster recovery scenarios.
Tip
This guide covers both backup and restore operations for IBM Maximo Application Suite instances.
Warning
Before you begin
Be aware of the following versioning considerations for the MAS CLI releases:
The MAS backup and restore process in CLI release v19.0.0 and later introduces changes to the process, backup archive files, and directory layout that are not backward compatible with earlier backup and restore versions.
Run the backup process using v19.0.0 or later to ensure that you can successfully run a restore. You cannot run a restore process using v19.0.0 or later from backups created on an older version.
Supported MAS versions:
- MAS 9.1.x
- MAS 9.0.x (in testing)
User Permissions Required
- oc CLI with cluster admin permissions
- mas CLI with appropriate permissions
- Access to Tekton pipeline resources
Quick Navigation:
- Backup Overview - Information about backing up MAS instances
- Restore Overview - Information about restoring MAS instances
Backup Overview¶
The MAS backup process uses Tekton pipelines to orchestrate the backup of multiple components. The Tekton pipeline executes Ansible DevOps Collection roles to perform the actual backup operations.
Backup Components¶
- IBM Operator Catalogs - Catalog source definitions
- Certificate Manager - Certificate configurations (RedHat only)
- MongoDB - MAS configuration database (Community Edition only)
- Suite License Service (SLS) - License server data (optional)
- MAS Suite Configuration - Core MAS instance configuration and custom resources
- MAS Applications - Application-specific resources and persistent volume data (optional)
- Db2 Database - Db2 instance resources and database backups (optional)
The backup creates a compressed archive for each supported component that can be stored locally or uploaded to cloud storage (S3 or Artifactory).
Backup Limitations¶
Warning
Be aware of the following limitations before performing a backup:
- MongoDB Community Edition only - The backup process supports only in-cluster MongoDB Community Edition. External or enterprise MongoDB deployments are not backed up.
- Db2 standalone operator only - The backup process supports only the in-cluster standalone Db2 operator. Other Db2 operator implementations are not included.
- Certificate Manager (RedHat only) - Certificate Manager backup is supported only for RedHat Certificate Manager. Other certificate manager implementations are not included.
- No support for some apps - Only the Manage application is currently supported. Other MAS applications (Facilities, Monitor, IoT, Predict, etc.) are not yet supported but will be added in later releases.
- No OpenShift cluster state - The backup does not capture the full OpenShift cluster state, node configurations, or cluster-level resources outside of MAS namespaces.
- No IBM Cloud Pak for Data backups - The backup process does not support backing up CP4D itself.
- No Incremental backups - Each backup is a full backup; incremental or differential backups are not supported.
- Single MAS instance per backup - Each backup operation targets a single MAS instance. Multi-instance environments require separate backup runs per instance.
- Tekton pipeline dependency - The backup process requires Tekton pipelines to be available and functional on the cluster.
- Storage class dependency - Backup of Manage application's persistent volumes depends on the storage class supporting volume snapshots or the relevant backup mechanism.
- S3/Artifactory upload is optional - Without configuring cloud storage upload, backups are stored locally in the cluster and may be lost if the cluster is decommissioned.
- Download backup archives to local machine manually - Backup archives are stored in the cluster's PVC or uploaded to S3/Artifactory and must be downloaded to a local machine manually.
Tip
We are working on reducing the limitations of the backup process and will be adding new capabilities and support for other MAS applications in future releases.
Ansible DevOps Integration¶
The mas backup command launches a Tekton pipeline that executes the following Ansible roles from the IBM MAS DevOps Collection:
- `ibm.mas_devops.ibm_catalogs` - Backs up IBM Operator Catalog definitions
- `ibm.mas_devops.cert_manager` - Backs up Certificate Manager configurations
- `ibm.mas_devops.mongodb` - Backs up the MongoDB Community Edition instance and database
- `ibm.mas_devops.sls` - Backs up Suite License Service data
- `ibm.mas_devops.suite_backup` - Backs up MAS Core configuration
- `ibm.mas_devops.db2` - Backs up Db2 resources and persistent volume data
- `ibm.mas_devops.suite_app_backup` - Backs up MAS application resources and persistent volume data
For detailed information about the underlying Ansible automation, see the Backup and Restore Playbook Documentation.
Tip
Advanced users can use the Ansible roles directly for custom backup workflows. The CLI provides a managed, simplified interface to these roles with additional features like automatic pipeline setup and cloud upload capabilities.
Backup Artifacts¶
Backups are stored in the pipeline namespace PVC at:
- Backup Directory: `/workspace/backups`
When S3/Artifactory upload is enabled, the backup archives are uploaded to the bucket or Artifactory repository under the `mas-<instanceid>-backups` directory.
S3 Backup Archive Directory Structure:
s3://bucket-name/ (or Artifactory - https://na.artifactory.swg-devops.com/artifactory/repo-name/)
└── mas-<instanceid>-backups/
    ├── mas-<instanceid>-backup-<backupversion>-catalog.tar.gz
    ├── mas-<instanceid>-backup-<backupversion>-certmanager.tar.gz
    ├── mas-<instanceid>-backup-<backupversion>-db2u-manage.tar.gz
    ├── mas-<instanceid>-backup-<backupversion>-mongoce.tar.gz
    ├── mas-<instanceid>-backup-<backupversion>-sls.tar.gz
    ├── mas-<instanceid>-backup-<backupversion>-suite.tar.gz
    └── mas-<instanceid>-backup-<backupversion>-app-manage.tar.gz
Each backup archive follows the naming convention: `mas-<instance-id>-backup-<backup-version>-<component>.tar.gz`, where the backup version defaults to a timestamp unless a custom version is specified.
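For example, the archive name for the suite component can be assembled from its parts; the instance ID, backup version, and S3 bucket name below are illustrative assumptions, not values from a real deployment:

```shell
# Assemble a backup archive name following the convention above.
instance_id="inst1"                # hypothetical instance ID
backup_version="20240315-120000"   # hypothetical (timestamp-style) backup version
component="suite"
archive="mas-${instance_id}-backup-${backup_version}-${component}.tar.gz"
echo "${archive}"

# If the backup was uploaded to S3, it could be downloaded like this
# (requires AWS credentials; left commented as a sketch):
# aws s3 cp "s3://mas-backups-prod/mas-${instance_id}-backups/${archive}" .
```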
Archive Components:
| Archive | Description |
|---|---|
| `catalog.tar.gz` | IBM Operator Catalog configurations |
| `certmanager.tar.gz` | Certificate Manager configurations |
| `mongoce.tar.gz` | MongoDB Community Edition database backup |
| `sls.tar.gz` | Suite License Service data (if included) |
| `suite.tar.gz` | MAS Core configuration and data |
| `db2u-manage.tar.gz` | Manage Db2 database backup (if included) |
| `app-manage.tar.gz` | Manage application configuration (if included) |
When to Backup¶
Regular Backup Schedule¶
Establish a regular backup schedule based on your organization's requirements:
- Before major upgrades - Always backup before upgrading MAS or its dependencies
- After configuration changes - Backup after significant configuration modifications
- Regular intervals - Weekly or monthly backups for disaster recovery
- Before cluster maintenance - Backup before OpenShift cluster maintenance windows
Migration Scenarios¶
Backups are essential for:
- Cluster migration - Moving MAS from one OpenShift cluster to another
- Disaster recovery - Recovering from cluster failures or data corruption
- Environment cloning - Creating test/dev environments from production backups
- Version rollback - Reverting to a previous configuration state
Component Selection¶
Including SLS in Backups¶
Include SLS (--include-sls or default behavior) when:
- SLS is deployed in-cluster in the same OpenShift environment as MAS
- You are using the standard MAS installation with bundled SLS
- The SLS namespace is accessible from your backup environment
- You want a complete, self-contained backup for disaster recovery
Exclude SLS (--exclude-sls) when:
- SLS is deployed externally in a separate cluster or environment
- You are using a shared SLS instance across multiple MAS installations
- SLS is managed by a different team or organization
- The SLS namespace is not accessible from your backup environment
- You only need to backup MAS-specific configuration
Note
The default behavior is to include SLS in backups. You must explicitly use --exclude-sls to skip SLS backup.
Data Reporter Operator (DRO)¶
The Data Reporter Operator (DRO) is not included in backup operations as it is typically configured during restore or installation. DRO configuration is handled separately and can be:
- Installed during restore - DRO is installed when restoring from a backup when `--include-dro` is specified
- Configured externally - If using an external DRO instance, it should be configured independently
- Skipped - DRO installation can be skipped during restore if not required; use `--exclude-dro` to skip DRO installation
Info
DRO backup and restore behavior is managed by the underlying Ansible DevOps roles. The CLI backup command focuses on capturing MAS configuration and data, while DRO is handled during the restore process.
MongoDB Configuration¶
The backup process supports MongoDB Community Edition only. Ensure you specify the correct MongoDB configuration:
- Namespace - Where MongoDB is deployed (default: `mongoce`)
- Instance Name - MongoDB instance identifier (default: `mas-mongo-ce`)
- Provider - Must be `community` (the only supported provider for backup)
Warning
IBM Cloud Databases for MongoDB and other external MongoDB providers are not supported by the backup process. You must use their native backup mechanisms.
Certificate Manager¶
Specify the certificate manager provider used in your environment:
- Red Hat Certificate Manager (`--cert-manager-provider redhat`) - Default option, and the only supported provider.
The backup captures certificate configurations but not the actual certificates, which are regenerated during restore.
MAS Application Backup¶
The backup process supports backing up MAS application resources and persistent volume data. Currently supported:
- Manage Application - Backs up Manage namespace resources and persistent volume data
When backing up a Manage application, the following resources are included:
Namespace Resources:
- ManageApp custom resource
- ManageWorkspace custom resource
- Encryption secrets (dynamically determined from ManageWorkspace CR)
- Certificates with mas.ibm.com/instanceId label
- Subscription and OperatorGroup
- IBM entitlement secret
- All referenced secrets (auto-discovered)
Persistent Volume Data (if configured in ManageWorkspace CR):
- All persistent volumes defined in spec.settings.deployment.persistentVolumes
- Data backed up as compressed tar.gz archives
- Each PVC's mount path archived separately
- Common PVCs include JMS server data, custom fonts, and attachments
Note
Application backup is optional and configured during the interactive backup process or via command-line parameters (--backup-manage-app, --manage-workspace-id).
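To see in advance which persistent volumes a Manage backup would archive, the ManageWorkspace CR can be queried for its `spec.settings.deployment.persistentVolumes` list. The namespace, workspace name, and sample JSON below are illustrative assumptions:

```shell
# Query the ManageWorkspace CR for its configured persistent volumes
# (requires cluster access; shown as a comment because names are hypothetical):
# oc get manageworkspace inst1-masdev -n mas-inst1-manage \
#   -o jsonpath='{.spec.settings.deployment.persistentVolumes}'

# The query returns a JSON array; each entry's mount path becomes one
# tar.gz archive. Counting mount paths in a sample result:
pvs='[{"mountPath":"/jms"},{"mountPath":"/fonts"}]'
count=$(echo "$pvs" | grep -o 'mountPath' | wc -l | tr -d ' ')
echo "PVC archives expected: ${count}"
```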
Db2 Database Backup¶
The backup process supports backing up Db2 databases used by MAS applications. When backing up a Db2 database, the following are included:
Db2 Instance Resources:
- Db2uCluster custom resource
- Secrets (instance password, certificates, LDAP credentials)
- ConfigMaps
- Services and routes
- Operator subscription
Database Data:
- Complete database backup (full backup)
- Stored in the backup archive alongside other components
- Supports both online and offline backup modes
Backup Types:
- Online Backup - Database remains accessible during backup; requires archive logging enabled
- Offline Backup - Database unavailable during backup; works with circular logging (default configuration)
Warning
If your Db2 instance uses circular logging (the default configuration), you must use offline backup type. Online backups require archive logging to be enabled via LOGARCHMETH1 and LOGARCHMETH2 configuration.
Note
Db2 backup is optional and configured during the interactive backup process or via command-line parameters (--backup-manage-db, --manage-db2-namespace, --manage-db2-instance-name, --manage-db2-backup-type).
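A quick way to pick the right `--manage-db2-backup-type` value is to check the Db2 logging method first. The pod name, database name, and the sample `LOGARCHMETH1` value below are illustrative assumptions:

```shell
# Inspect LOGARCHMETH1 inside the Db2 pod (requires cluster access; names
# are hypothetical, so the command is left commented as a sketch):
# oc exec -n db2u c-mas-inst1-masdev-manage-db2u-0 -- su - db2inst1 -c \
#   "db2 get db cfg for BLUDB | grep -i LOGARCHMETH1"

# "OFF" means circular logging, which only supports offline backups:
logarchmeth1="OFF"   # replace with the value Db2 actually reports
if [ "$logarchmeth1" = "OFF" ]; then
  backup_type="offline"
else
  backup_type="online"
fi
echo "${backup_type}"
```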
Backup Modes¶
Interactive Mode¶
Interactive mode guides you through the backup process with prompts for all required configuration. This is the recommended approach for manual backups.
docker run -ti --rm quay.io/ibmmas/cli mas backup
The interactive session will:
- Prompt for OpenShift cluster connection
- Display detected MAS instances
- Request backup storage size
- Offer auto-generated or custom backup version
- Configure optional upload to S3 or Artifactory
Non-Interactive Mode¶
Non-interactive mode is ideal for automation, scheduled backups, and CI/CD pipelines. All required parameters must be provided via command-line arguments.
docker run -ti --rm quay.io/ibmmas/cli mas backup \
--instance-id inst1 \
--no-confirm
Backup Scenarios - Non-Interactive Mode¶
Scenario 1: Standard In-Cluster Deployment¶
Environment: - MAS with all dependencies in a single OpenShift cluster - MongoDB Community Edition - In-cluster SLS - Red Hat Certificate Manager
Backup Command:
mas backup \
--instance-id inst1 \
--backup-storage-size 50Gi \
--no-confirm
This uses all default values and includes SLS in the backup.
Scenario 2: External SLS Deployment¶
Environment: - MAS in OpenShift cluster - MongoDB Community Edition in-cluster - SLS deployed in separate cluster or external environment - Red Hat Certificate Manager
Backup Command:
mas backup \
--instance-id inst1 \
--backup-storage-size 30Gi \
--exclude-sls \
--no-confirm
Use --exclude-sls to skip backing up SLS when it's managed externally.
Scenario 3: Custom MongoDB Configuration and backup version¶
Environment: - MAS with custom MongoDB namespace - Custom backup version desired - Custom MongoDB instance name - In-cluster SLS - Red Hat Certificate Manager
Backup Command:
mas backup \
--instance-id inst1 \
--backup-version prod-backup-$(date +%Y%m%d) \
--backup-storage-size 50Gi \
--mongodb-namespace my-mongodb \
--mongodb-instance-name custom-mongo-instance \
--mongodb-provider community \
--no-confirm
Scenario 4: Backup with S3 Upload¶
Environment: - Standard MAS deployment - Custom backup version desired - Automatic upload to AWS S3 for off-site storage
Backup Command:
mas backup \
--instance-id inst1 \
--backup-version prod-$(date +%Y%m%d-%H%M%S) \
--backup-storage-size 50Gi \
--upload-backup \
--aws-access-key-id AKIAIOSFODNN7EXAMPLE \ #pragma: allowlist secret
--aws-secret-access-key wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY \ #pragma: allowlist secret
--s3-bucket-name mas-backups-prod \
--s3-region us-east-1 \
--no-confirm
Tip
Store AWS credentials securely using environment variables or secrets management systems rather than hardcoding them in scripts.
Scenario 5: Backup with Manage Application and Db2 Database¶
Environment: - Standard MAS deployment with Manage application - Manage workspace with persistent volumes configured - In-cluster Db2 database for Manage - Need to backup application resources, PV data, and database
Backup Command:
mas backup \
--instance-id inst1 \
--backup-storage-size 100Gi \
--backup-manage-app \
--manage-workspace-id masdev \
--backup-manage-db \
--manage-db2-namespace db2u \
--manage-db2-instance-name mas-inst1-masdev-manage \
--manage-db2-backup-type offline \
--no-confirm
Tip
When backing up Manage with Db2, ensure sufficient backup storage (100Gi+ recommended) to accommodate application PV data and database backups. Use offline backup type if your Db2 instance uses the default circular logging configuration.
Scenario 6: Backup with Manage Application Only (External Db2)¶
Environment: - MAS deployment with Manage application - External Db2 database (managed separately) - Only need to backup application resources and PV data
Backup Command:
mas backup \
--instance-id inst1 \
--backup-storage-size 50Gi \
--backup-manage-app \
--manage-workspace-id masdev \
--no-confirm
Note
When using an external Db2 database, omit the --backup-manage-db flag. The database should be backed up separately using your organization's database backup procedures.
Scenario 7: Backup for Troubleshooting (No Cleanup)¶
Environment: - Backup for troubleshooting purposes - Custom backup version desired - Need to inspect workspace contents after backup - Workspace cleanup disabled
Backup Command:
mas backup \
--instance-id inst1 \
--backup-version debug-$(date +%Y%m%d-%H%M%S) \
--backup-storage-size 50Gi \
--no-clean-backup \
--no-confirm
Note
Use --no-clean-backup when you need to inspect the backup workspace contents for troubleshooting. Remember to manually clean up the workspaces later to free up storage.
Scenario 8: Minimal Backup (Skip Pre-Check)¶
Environment: - Emergency backup scenario - Custom backup version desired - Skip pre-backup validation for speed, and when the cluster is not 100% healthy
Backup Command:
mas backup \
--instance-id inst1 \
--backup-version emergency-$(date +%Y%m%d-%H%M%S) \
--backup-storage-size 50Gi \
--skip-pre-check \
--no-confirm
Warning
Use --skip-pre-check only in emergency situations. Pre-backup checks validate cluster health and can prevent incomplete backups.
Storage Requirements¶
Backup Storage Sizing¶
The backup storage size depends on several factors:
| Component | Typical Size | Notes |
|---|---|---|
| MAS Configuration | < 1 MB | Core MAS custom resources and configurations |
| MongoDB Database | 0.05-20 GB | Varies based on MAS app count and data volume |
| SLS Data | < 1 MB | License server database and configuration |
| IBM Catalogs | < 1 MB | Operator catalog definitions |
| Certificate Manager | < 1 MB | Certificate configurations |
| Manage App Resources | < 10 MB | Manage namespace Kubernetes resources |
| Manage PV Data | 1-100 GB | JMS server, fonts, attachments (if configured) |
| Db2 Instance Resources | < 10 MB | Db2 Kubernetes resources and metadata |
| Db2 Database Backup | Varies | 0.5-2x database size when compressed; depends on data volume |
Tip
Monitor your first backup to determine actual storage requirements, then adjust the --backup-storage-size parameter for future backups. When backing up Manage with Db2, plan for significantly larger storage requirements (100GB+ recommended).
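As a rough sizing sketch (all numbers below are illustrative estimates, not measurements from a real environment), a storage size can be derived from the larger components plus headroom:

```shell
# Illustrative component estimates in GiB, per the sizing table above:
mongodb_gb=5
manage_pv_gb=40
db2_backup_gb=30
total=$((mongodb_gb + manage_pv_gb + db2_backup_gb))
# Add roughly 25% headroom for archives and temporary files:
recommended=$((total + total / 4))
echo "--backup-storage-size ${recommended}Gi"
```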
Storage Class Considerations¶
The backup process automatically selects appropriate storage:
- Single Node OpenShift (SNO): Uses ReadWriteOnce (RWO) storage
- Multi-node clusters: Prefers ReadWriteMany (RWX) storage when available
- Falls back to RWO if RWX is not available
The storage class is determined from your cluster's default storage classes.
Backup Process Details¶
Pipeline Execution¶
When you run mas backup, the following occurs:
- Validation - Verifies cluster connectivity and MAS instance existence
- Namespace Preparation - Creates/updates the `mas-<instance-id>-pipelines` namespace
- OpenShift Pipelines - Validates or installs the OpenShift Pipelines Operator
- PVC Creation - Provisions persistent volume for backup storage
- Tekton Pipeline Launch - Submits PipelineRun with configured parameters
- Component Backup - Executes backup tasks in parallel where possible:
- IBM Catalogs backup
- Certificate Manager backup
- MongoDB backup
- SLS backup (if included)
- Suite Backup - Backs up MAS core configuration
- Database Backup (optional) - Backs up Db2 instance and database:
- Db2 instance resources backup
- Db2 database backup (online or offline)
- Application Backup (optional) - Backs up MAS application resources and persistent volumes:
- Manage namespace resources backup
- Manage persistent volume data backup
- Archive Creation - Compresses backup into tar.gz archives for each component
- Upload (optional) - Uploads archives to S3 or Artifactory
- Workspace Cleanup (optional, default: enabled) - Cleans backup and config workspaces to free up storage
Monitoring Progress¶
After launching the backup, a URL to the Tekton PipelineRun is displayed:
View progress:
https://console-openshift-console.apps.cluster.example.com/k8s/ns/mas-inst1-pipelines/tekton.dev~v1beta1~PipelineRun/mas-backup-20240315-120000
Use this URL to:
- Monitor real-time backup progress
- View logs from individual backup tasks
- Troubleshoot any failures
- Verify successful completion
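The same PipelineRun can also be followed from a terminal instead of the console. The namespace and run name below are illustrative assumptions; actually executing the commands requires the `tkn` CLI (or `oc`) and cluster access:

```shell
# Build the Tekton CLI command for following the run's logs:
ns="mas-inst1-pipelines"            # hypothetical pipeline namespace
run="mas-backup-20240315-120000"    # hypothetical PipelineRun name
cmd="tkn pipelinerun logs ${run} -n ${ns} -f"
echo "${cmd}"
# Alternatively, watch run status with oc:
# oc get pipelinerun -n "${ns}" -w
```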
Workspace Cleanup¶
By default, the backup pipeline automatically cleans the workspace directories after backup completion to free up storage space. This cleanup occurs in the pipeline's finally block, ensuring it runs regardless of backup success or failure.
To disable workspace cleanup:
- Interactive mode: Answer "No" when prompted about cleaning workspaces
- Non-interactive mode: Use the `--no-clean-backup` flag
When to disable cleanup:
- Troubleshooting backup issues and need to inspect workspace contents
- Running multiple backups in sequence and want to preserve intermediate files
- Custom post-backup processing that requires access to workspace files
Tip
Workspace cleanup is recommended for production backups to prevent PVC storage exhaustion. Only disable it when you have a specific need to inspect or process the workspace contents.
Best Practices¶
Backup Strategy¶
- Regular Schedule - Implement automated backups on a regular schedule
- Version Naming - Use descriptive backup versions (e.g., `prod-20240315-pre-upgrade`)
- Retention Policy - Define how long to keep backups based on compliance requirements
- Off-site Storage - Upload backups to S3 or Artifactory for disaster recovery
- Test Restores - Periodically test restore procedures in non-production environments
- Document Configuration - Keep records of custom configurations and dependencies
- Application Backups - Include Manage application and Db2 database in regular backup schedule
- Coordinate Backups - When backing up Manage, always include the Db2 database for consistency
- Storage Planning - Allocate sufficient backup storage when including applications and databases (100Gi+ recommended)
Security Considerations¶
- Credentials - Never hardcode credentials in scripts; use environment variables or secrets
- Access Control - Restrict access to backup storage and archives
- Encryption - Consider encrypting backup archives for sensitive environments
- Audit Trail - Maintain logs of backup operations and access
Automation¶
For automated backups, you have several options depending on your infrastructure and requirements:
Option 1: Shell Script with MAS CLI¶
Create a simple shell script or CI/CD pipeline using the MAS CLI:
#!/bin/bash
# Automated MAS Backup Script
INSTANCE_ID="inst1"
BACKUP_VERSION="auto-$(date +%Y%m%d-%H%M%S)"
S3_BUCKET="mas-backups-prod"
# Login to OpenShift
oc login --token=${OCP_TOKEN} --server=${OCP_SERVER}
# Run backup with S3 upload
docker run --rm \
-v ~/.kube:/root/.kube:z \
-v ~:/mnt/home \
quay.io/ibmmas/cli mas backup \
--instance-id ${INSTANCE_ID} \
--backup-version ${BACKUP_VERSION} \
--backup-storage-size 50Gi \
--upload-backup \
--aws-access-key-id ${AWS_ACCESS_KEY_ID} \
--aws-secret-access-key ${AWS_SECRET_ACCESS_KEY} \
--s3-bucket-name ${S3_BUCKET} \
--s3-region us-east-1 \
--no-confirm
# Check exit code
if [ $? -eq 0 ]; then
echo "Backup completed successfully: ${BACKUP_VERSION}"
else
echo "Backup failed!"
exit 1
fi
Option 2: Red Hat Ansible Automation Platform¶
For enterprise-grade automation with advanced features, use Red Hat Ansible Automation Platform (AAP) to execute the backup playbooks and roles directly. The MAS DevOps Execution Environment provides a pre-built container image (quay.io/ibmmas/ansible-devops-ee) that includes the ibm.mas_devops collection and all required dependencies.
Benefits of using AAP:
- Centralized Management - Single control plane for all automation
- Role-Based Access Control (RBAC) - Fine-grained permissions for backup operations
- Scheduling - Built-in job scheduling for regular backups
- Audit Logging - Complete audit trail of all backup operations
- Credential Management - Secure storage and injection of credentials
- Notifications - Integration with email, Slack, PagerDuty, and other systems
- Job Templates - Reusable backup configurations
- Workflow Automation - Chain backup with other operations (e.g., validation, upload)
To use AAP for MAS backups:
- Configure the Execution Environment - Set up AAP to use the `quay.io/ibmmas/ansible-devops-ee` image (see the Execution Environment setup guide)
- Create a Project - Point to your playbook repository (or use the sample playbooks as a starting point)
- Create Job Templates - Configure job templates for backup operations using the `ibm.mas_devops.br_core` playbook
- Configure Credentials - Set up OpenShift credentials and any cloud storage credentials
- Schedule Backups - Set up recurring schedules for automated backups
- Configure Notifications - Set up alerts for backup success/failure
Example AAP Job Template Variables:
mas_instance_id: inst1
br_action: backup
mas_backup_dir: /backup/mas
backup_version: "{{ ansible_date_time.date }}-{{ ansible_date_time.hour }}{{ ansible_date_time.minute }}"
include_sls: true
mongodb_namespace: mongoce
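Outside AAP, the same playbook can in principle be invoked directly with the execution environment image. The kubeconfig mount path and variable values below are illustrative assumptions, not a verified invocation:

```shell
# Assemble a direct ansible-playbook invocation using the EE image
# (sketch only; running it requires a kubeconfig and cluster access):
ee_image="quay.io/ibmmas/ansible-devops-ee"
playbook="ibm.mas_devops.br_core"
cmd="docker run --rm -v ~/.kube:/opt/app-root/src/.kube:z ${ee_image} ansible-playbook ${playbook} -e mas_instance_id=inst1 -e br_action=backup"
echo "${cmd}"
```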
For detailed information on setting up and using Ansible Automation Platform with MAS DevOps, see: - MAS DevOps Execution Environment - Complete AAP setup guide - Backup and Restore Playbook - Playbook documentation and examples
Tip
AAP is recommended for production environments where you need enterprise features like RBAC, audit logging, and centralized management. For simpler use cases, the MAS CLI with shell scripts may be sufficient.
Troubleshooting¶
Common Issues¶
Issue: "No MAS instances were detected on the cluster"
- Verify you're connected to the correct OpenShift cluster
- Ensure MAS is installed and the Suite CR exists
- Check that you have permissions to view Suite resources
Issue: "OpenShift Pipelines Operator installation failed"
- Verify cluster admin permissions
- Check cluster connectivity and operator hub availability
- Review operator installation logs
Issue: "Insufficient storage for backup PVC"
- Increase the `--backup-storage-size` parameter
- Verify storage class has available capacity
- Check cluster storage quotas
Issue: "MongoDB backup failed"
- Verify MongoDB namespace and instance name are correct
- Ensure MongoDB is running and accessible
- Check that the MongoDB provider is set to `community`
Issue: "SLS backup failed"
- Verify SLS namespace is correct
- Ensure SLS is running and accessible
- Consider using `--exclude-sls` if SLS is external
Issue: "Upload to S3 failed"
- Verify AWS credentials are correct
- Check S3 bucket exists and is accessible
- Verify network connectivity to AWS
- Ensure IAM permissions allow PutObject operations
Issue: "Manage application backup failed"
- Verify Manage workspace ID is correct
- Ensure ManageWorkspace CR exists in the cluster
- Check that Manage pods are running and healthy
- Verify persistent volumes are properly configured in ManageWorkspace CR
- Ensure sufficient storage space in backup PVC
Issue: "Db2 backup failed"
- Verify Db2 namespace and instance name are correct
- Ensure Db2 instance is running and accessible
- Check backup type matches Db2 logging configuration (use offline for circular logging)
- Verify sufficient storage space in Db2 backup PVC
- Review Db2 pod logs for database-specific errors
Issue: "Manage persistent volume backup is slow"
- PV backup duration depends on data volume
- Large JMS server or attachment PVCs can take significant time
- Monitor backup progress in Tekton pipeline logs
- Consider scheduling backups during maintenance windows
- Ensure network bandwidth is sufficient for data transfer
Restore Overview¶
The MAS restore process uses Tekton pipelines to orchestrate the restoration of MAS instances from backup archives. The restore operation can recover a complete MAS environment or selectively restore components based on your requirements. The restore process provides extensive configuration flexibility, allowing you to modify key settings during restoration such as domain names, SLS/DRO URLs, and storage classes.
Restore Components¶
The restore process handles the following components:
- IBM Operator Catalogs - Restores catalog source definitions
- Certificate Manager - Restores certificate configurations (RedHat only)
- MongoDB - Restores MongoDB instance with SLS & MAS databases (Community Edition only)
- Suite License Service (SLS) - Restores SLS instance with license server data (optional, can use external SLS)
- MAS Suite Configuration - Restores core MAS instance configuration and custom resources
- Suite-level SLSCfg - Restores or provides custom Suite-level SLS configuration with optional URL override
- Suite-level BASCfg/DROCfg - Restores or provides custom Suite-level DRO/BAS configuration with optional URL override
- Manage Database - Optionally restores the in-cluster Db2 database associated with the Manage workspace
- Manage Application - Optionally restores Manage application namespace resources and persistent volume data
- Grafana - Optionally installs Grafana for monitoring (not part of backup)
- Data Reporter Operator (DRO) - Optionally installs DRO (not part of backup); when DRO is installed, an auto-generated Suite-level BASCfg CR is applied automatically.
Restore Limitations¶
Warning
Be aware of the following limitations before performing a restore:
- Restoring from S3 or Artifactory Only - When using the pipeline, the restore process is limited to restoring from S3 or Artifactory. Restoring from a local backup file is not supported yet.
- MongoDB Community Edition only - Restore supports only in-cluster MongoDB Community Edition. Restoring to an external or enterprise MongoDB deployment is not supported.
- Db2 standalone operator only - The restore process supports only the in-cluster standalone Db2 operator. Other Db2 operator implementations are not included.
- Db2uInstance not supported, only Db2uCluster - The restore process does not currently support Db2uInstance; support will be added in a future release.
- Certificate Manager (RedHat only) - Certificate Manager restore is supported only for RedHat Certificate Manager. Other implementations are not handled during restore.
- Same MAS version required - Restoring a backup to a cluster running a different MAS version may result in incompatibilities. It is strongly recommended to restore to the same MAS version as the backup source.
- Same MAS Instance ID required - It is strongly recommended to restore to the same MAS instance ID as the backup source.
- Manage application only for app restore - Only the Manage application is supported. Other MAS applications will be supported in future releases.
- Tekton pipeline dependency - The restore process requires Tekton pipelines to be available and functional on the target cluster.
- Target cluster must be pre-provisioned - The restore process does not provision a new OpenShift cluster. A running, accessible cluster with sufficient resources must already exist.
- Storage class compatibility - The target cluster must have compatible storage classes. If storage classes differ from the source cluster, overrides must be explicitly configured.
- No partial component restore - Individual components cannot be selectively restored in isolation without running the full pipeline; component selection is configured at pipeline launch time.
- Manual certificate management restriction - Certificates and secrets from backups are restored. However, changing the domain during the restore process causes issues with manually created certificates/secrets, which must then be updated manually.
- Domain changes require DNS updates - If restoring with a domain change, DNS records and TLS certificates must be updated manually outside of the restore process.
- Single MAS instance per restore - Each restore operation targets a single MAS instance. Restoring multiple instances requires separate restore runs.
- Grafana and DRO are not restored from backup - Grafana and DRO are optionally installed fresh during restore; their previous configurations are not recovered from the backup archive. However, the Suite-level BASCfg CR is backed up and can be restored.
- No support for CP4D - The restore process does not support restoring CP4D environments.
Tip
We are working on reducing the limitations of the restore process and will add new capabilities and support for additional MAS applications in future releases.
Configuration Flexibility¶
The restore process supports several configuration overrides to adapt the restored environment to new infrastructure:
- Domain Configuration - Change the MAS domain in the Suite CR during restore
- SLS Configuration - Restore Suite-level SLSCfg from backup or provide custom configuration file, with optional SLS URL override
- DRO/BAS Configuration - Restore Suite-level BASCfg from backup or provide custom configuration file, with optional DRO URL override
- Storage Class Override - Override storage classes for all components (MongoDB, Manage app, Manage DB) when restoring to clusters with different storage providers
- SLS Domain Override - Change the SLS domain used in the License Service CR
- Backup Download - Download backup archives from S3 or Artifactory before restore (useful for cross-cluster restores)
Backup Archive Management¶
The restore process can work with backup archives in multiple ways:
- Local Backup - Restore from backup archives already present in the cluster
- S3 Download - Download backup archives from S3-compatible storage before restore
- Artifactory Download - Download backup archives from Artifactory (development mode only)
- Custom Archive Names - Support for custom backup archive naming conventions
- Automatic Cleanup - Optional cleanup of downloaded archives after successful restore
When downloading from S3 or Artifactory, the `download_backup_archive` role selectively downloads only the archives required for the restore operation. The following archive selection parameters control which archives are downloaded:

| Parameter | Default | Description |
|---|---|---|
| `include_sls_archive` | `false` | Download the SLS backup archive |
| `include_manage_db_archive` | `false` | Download the Manage Db2 database backup archive |
| `include_manage_app_archive` | `false` | Download the Manage application backup archive |
These parameters are automatically set by the restore pipeline based on the restore configuration (e.g. `--restore-manage-app`, `--restore-manage-db`, `--include-sls`), so you do not need to set them manually when using the `mas restore` command.
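As an illustration of that mapping, the sketch below translates the relevant restore flags into the role's archive-selection parameters. This is an assumption-based sketch for clarity, not the actual CLI implementation: the real pipeline sets these parameters internally.

```shell
#!/bin/sh
# Illustrative sketch: map restore CLI flags onto the download_backup_archive
# role's archive-selection parameters (parameter names from the table above;
# the mapping logic itself is a hypothetical reconstruction).
select_archives() {
  include_sls_archive=false
  include_manage_db_archive=false
  include_manage_app_archive=false
  for flag in "$@"; do
    case "$flag" in
      --include-sls)        include_sls_archive=true ;;
      --restore-manage-db)  include_manage_db_archive=true ;;
      --restore-manage-app) include_manage_app_archive=true ;;
    esac
  done
  echo "include_sls_archive=$include_sls_archive"
  echo "include_manage_db_archive=$include_manage_db_archive"
  echo "include_manage_app_archive=$include_manage_app_archive"
}

# Example: a restore that includes SLS and the Manage app, but not its database
select_archives --include-sls --restore-manage-app
```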
Ansible DevOps Integration¶
The mas restore command launches a Tekton pipeline that executes the following Ansible roles from the IBM MAS DevOps Collection:
- `ibm.mas_devops.ibm_catalogs` - Restores IBM Operator Catalog definitions
- `ibm.mas_devops.cert_manager` - Restores Certificate Manager configurations
- `ibm.mas_devops.mongodb` - Restores MongoDB Community Edition instance and database
- `ibm.mas_devops.sls` - Restores Suite License Service data
- `ibm.mas_devops.suite_restore` - Restores MAS Core configuration
- `ibm.mas_devops.db2` - Restores Db2u instance and database
- `ibm.mas_devops.suite_app_restore` - Restores supported MAS application configuration
- `ibm.mas_devops.grafana` - Installs Grafana (optional)
- `ibm.mas_devops.dro` - Installs Data Reporter Operator (optional)
Restore Modes¶
Interactive Mode¶
Interactive mode guides you through the restore process with prompts for all required configuration. This is the recommended approach for manual restores.
```bash
docker run -ti --rm quay.io/ibmmas/cli mas restore
```
The interactive session will:
- Prompt for OpenShift cluster connection
- Request MAS instance ID (must match backup)
- Request backup version to restore
- Configure MongoDB storage class override
- Configure Grafana installation
- Configure SLS restoration
- Configure DRO installation
- Configure MAS domain settings
- Configure SLS and DRO configuration options
- Configure Manage application restore
- Configure Manage Db2 restore
- Request backup storage size
- Offer optional download from S3 or Artifactory
Non-Interactive Mode¶
Non-interactive mode is ideal for automation, scheduled restores, and CI/CD pipelines. All required parameters must be provided via command-line arguments.
```bash
docker run -ti --rm quay.io/ibmmas/cli mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --no-confirm
```
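For CI/CD wrappers it can help to validate the required parameters before invoking the CLI. The helper below is hypothetical (not part of the MAS CLI), and the restore-version format (`YYYYMMDD-HHMMSS`) is inferred from the examples in this guide:

```shell
#!/bin/sh
# Hypothetical pre-flight helper for automation around `mas restore`.
# Assumption: restore versions follow the YYYYMMDD-HHMMSS pattern seen
# in this guide (e.g. 20260117-191701).
validate_restore_args() {
  instance_id="$1"
  restore_version="$2"
  if [ -z "$instance_id" ]; then
    echo "error: --instance-id is required" >&2
    return 1
  fi
  case "$restore_version" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9]) ;;
    *) echo "error: --restore-version must look like YYYYMMDD-HHMMSS" >&2
       return 1 ;;
  esac
}

validate_restore_args inst1 20260117-191701 && echo "arguments look valid"
```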
Restore Process Details¶
Pipeline Execution¶
When you run mas restore, the following occurs:
- Validation - Verifies cluster connectivity and prerequisites
- Namespace Preparation - Creates/updates the `mas-{instance-id}-pipelines` namespace
- OpenShift Pipelines - Validates or installs the OpenShift Pipelines Operator
- PVC Creation - Provisions persistent volume for backup storage
- Tekton Pipeline Launch - Submits PipelineRun with configured parameters
- Pre-Restore Check - Validates cluster readiness
- Download (optional) - Downloads backup archive from S3 or Artifactory
- Component Restore - Executes restore tasks in sequence:
  - IBM Catalogs restore
  - Certificate Manager restore
  - Grafana installation (if enabled)
  - MongoDB restore (with optional storage class override)
  - SLS restore (if included)
  - DRO installation (if enabled)
- Suite Restore - Restores MAS core configuration with optional domain/URL overrides
- Manage Application Restore (if enabled) - Restores Manage application and database
- Post-Restore Verification - Validates restored MAS instance
- Workspace Cleanup (optional, default: enabled) - Cleans backup and config workspaces
Monitoring Progress¶
After launching the restore, a URL to the Tekton PipelineRun is displayed:
```
View progress:
https://console-openshift-console.apps.cluster.example.com/k8s/ns/mas-inst1-pipelines/tekton.dev~v1beta1~PipelineRun/mas-restore-20260117-191701-YYMMDD-HHMM
```
Use this URL to:
- Monitor real-time restore progress
- View logs from individual restore tasks
- Troubleshoot any failures
- Verify successful completion
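The URL follows a predictable pattern, which the sketch below reconstructs from its parts. The component names are taken from the example URL above (not from the CLI source), and the trailing `YYMMDD-HHMM` segment stands in for the pipeline launch timestamp:

```shell
#!/bin/sh
# Sketch: compose the PipelineRun console URL from its parts.
# All names are inferred from the example URL in this guide.
cluster_apps_domain="apps.cluster.example.com"
instance_id="inst1"
restore_version="20260117-191701"
launch_stamp="YYMMDD-HHMM"   # replaced by the actual launch time at runtime

url="https://console-openshift-console.${cluster_apps_domain}/k8s/ns/mas-${instance_id}-pipelines/tekton.dev~v1beta1~PipelineRun/mas-restore-${restore_version}-${launch_stamp}"
echo "$url"
```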
Configuration Flexibility¶
The restore process provides several options for handling configurations:
MAS Domain Configuration¶
- From Backup (default) - Uses the domain stored in the Suite backup
- Override - Specify `--mas-domain-restore` to change the domain during restore
SLS Configuration¶
- From Backup (default) - Restores SLSCfg from backup with `--include-slscfg-from-backup`
- Custom File - Use `--exclude-slscfg-from-backup` and provide `--sls-cfg-file`
- Change URL - Use `--sls-url-restore` to modify the SLS URL while keeping other configuration
DRO Configuration¶
- From Backup (default) - Restores BASCfg from backup with `--include-drocfg-from-backup`
- Custom File - Use `--exclude-drocfg-from-backup` and provide `--dro-cfg-file`
- Change URL - Use `--dro-url-restore` to modify the DRO URL while keeping other configuration
Restore Scenarios - Non-Interactive Mode¶
Scenario 1: Basic Restore from Local Backup¶
Environment: - Backup archive already present in the cluster PVC - Standard restore with all defaults
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --no-confirm
```
Scenario 2: Restore with S3 Download¶
Environment: - Backup stored in AWS S3 - Need to download before restore
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --download-backup \
  --aws-access-key-id AKIAIOSFODNN7EXAMPLE \ #pragma: allowlist secret
  --aws-secret-access-key wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY \ #pragma: allowlist secret
  --s3-bucket-name mas-backups-prod \
  --s3-region us-east-1 \
  --no-confirm
```
Scenario 3: Restore with Domain Change¶
Environment: - Restoring to a different cluster with new domain - Need to update MAS domain
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --mas-domain-restore new-cluster.example.com \
  --no-confirm
```
Scenario 4: Restore with External SLS¶
Environment: - Using external SLS instance - Skip SLS restore but provide custom SLS configuration
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --exclude-sls \
  --exclude-slscfg-from-backup \
  --sls-cfg-file /path/to/custom-sls-config.yaml \
  --no-confirm
```
Scenario 5: Restore with SLS URL Override¶
Environment: - Restore SLS from backup but change the URL - SLS moved to different endpoint
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --include-sls \
  --include-slscfg-from-backup \
  --sls-url-restore https://new-sls.example.com \
  --no-confirm
```
Scenario 6: Restore with DRO Installation¶
Environment: - Install new DRO instance during restore - Provide DRO configuration details
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --include-dro \
  --ibm-entitlement-key YOUR_ENTITLEMENT_KEY \ #pragma: allowlist secret
  --contact-email admin@example.com \
  --contact-firstname John \
  --contact-lastname Doe \
  --dro-namespace redhat-marketplace \
  --no-confirm
```
Scenario 7: Restore Without Grafana¶
Environment: - Skip Grafana installation - Monitoring not required
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --exclude-grafana \
  --no-confirm
```
Scenario 8: Complete Restore with All Options¶
Environment: - Download from S3 - Change domain and SLS URL - Install DRO and Grafana - Custom storage size
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --backup-storage-size 100Gi \
  --mas-domain-restore new-cluster.example.com \
  --include-sls \
  --include-slscfg-from-backup \
  --sls-url-restore https://new-sls.example.com \
  --include-drocfg-from-backup \
  --dro-url-restore https://new-dro.example.com \
  --include-grafana \
  --include-dro \
  --ibm-entitlement-key YOUR_ENTITLEMENT_KEY \ #pragma: allowlist secret
  --contact-email admin@example.com \
  --contact-firstname John \
  --contact-lastname Doe \
  --dro-namespace redhat-marketplace \
  --download-backup \
  --aws-access-key-id AKIAIOSFODNN7EXAMPLE \ #pragma: allowlist secret
  --aws-secret-access-key wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY \ #pragma: allowlist secret
  --s3-bucket-name mas-backups-prod \
  --s3-region us-east-1 \
  --no-confirm
```
Scenario 9: Restore for Troubleshooting (No Cleanup)¶
Environment: - Need to inspect workspace contents after restore - Workspace cleanup disabled
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --no-clean-backup \
  --no-confirm
```
Note
Use `--no-clean-backup` when you need to inspect the restore workspace contents for troubleshooting. Remember to manually clean up the workspaces later to free up storage.
Scenario 10: Emergency Restore (Skip Pre-Check)¶
Environment: - Emergency restore scenario - Skip pre-restore validation for speed
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --skip-pre-check \
  --no-confirm
```
Warning
Use `--skip-pre-check` only in emergency situations. Pre-restore checks validate cluster readiness and can prevent restore failures.
Scenario 11: Restore with MongoDB Storage Class Override¶
Environment: - Restoring to a cluster with different storage classes - Need to override MongoDB storage class
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --override-mongodb-storageclass \
  --mongodb-storageclass-name custom-rwo-storage \
  --no-confirm
```
Scenario 12: Restore with Manage Application¶
Environment: - Need to restore Manage application in addition to MAS Suite - Restore Manage namespace resources and persistent volume data
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --restore-manage-app \
  --no-confirm
```
Scenario 13: Restore with Manage Application and Database¶
Environment: - Restore both the Manage application and its in-cluster Db2 database - Complete Manage workspace restoration
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --restore-manage-app \
  --restore-manage-db \
  --no-confirm
```
Warning
Manage database restore is an offline operation. The Manage application will be unavailable during the restore process.
Scenario 14: Restore Manage with Custom Storage Classes¶
Environment: - Restoring to a cluster with different storage infrastructure - Need to override storage classes for both Manage app and Db2
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --restore-manage-app \
  --restore-manage-db \
  --override-manage-app-storageclass \
  --manage-app-storage-class-rwx custom-rwx-storage \
  --manage-app-storage-class-rwo custom-rwo-storage \
  --override-manage-db-storageclass \
  --manage-db-storage-class-rwx custom-rwx-storage \
  --manage-db-storage-class-rwo custom-rwo-storage \
  --no-confirm
```
Note
The Manage Db2 storage class override now uses a single ReadWriteMany (`--manage-db-storage-class-rwx`) and a single ReadWriteOnce (`--manage-db-storage-class-rwo`) storage class, applied across all Db2 persistent volumes based on their access modes. The previous per-volume flags (`--manage-db-meta-storage-class`, `--manage-db-data-storage-class`, `--manage-db-backup-storage-class`, `--manage-db-logs-storage-class`, `--manage-db-temp-storage-class`) have been removed.
Scenario 15: Complete Restore with MongoDB Override and Manage¶
Environment: - Comprehensive restore with all new features - Override MongoDB storage class - Restore Manage application and database - Download from S3
Restore Command:
```bash
mas restore \
  --instance-id inst1 \
  --restore-version 20260117-191701 \
  --backup-storage-size 100Gi \
  --override-mongodb-storageclass \
  --mongodb-storageclass-name custom-rwo-storage \
  --restore-manage-app \
  --restore-manage-db \
  --override-manage-db-storageclass \
  --manage-db-storage-class-rwx custom-rwx-storage \
  --manage-db-storage-class-rwo custom-rwo-storage \
  --download-backup \
  --aws-access-key-id AKIAIOSFODNN7EXAMPLE \ #pragma: allowlist secret
  --aws-secret-access-key wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY \ #pragma: allowlist secret
  --s3-bucket-name mas-backups-prod \
  --s3-region us-east-1 \
  --no-confirm
```
Restore Best Practices¶
Pre-Restore Checklist¶
- Verify Backup Integrity - Ensure backup archives are complete and accessible
- Check Cluster Resources - Verify sufficient CPU, memory, and storage
- Review Target Environment - Confirm cluster version and configuration compatibility
- Plan Domain Changes - Determine if domain or URL changes are needed
- Prepare External Services - Ensure external SLS/DRO are accessible if used
- Review Storage Classes - Identify if MongoDB or Manage storage class overrides are needed
- Plan Manage Restore - Determine if Manage application and database should be restored
- Document Configuration - Record any custom configurations or overrides
During Restore¶
- Monitor Pipeline - Watch the Tekton PipelineRun for any issues
- Check Logs - Review task logs if any failures occur
- Verify Components - Ensure each component restores successfully
- Note Timing - Track restore duration for future planning
Post-Restore Verification¶
- Validate Suite Status - Confirm MAS Suite CR is ready
- Check Application Access - Verify MAS applications are accessible
- Test Integrations - Validate connections to databases and external services
- Verify MongoDB - Confirm MongoDB is running with correct storage class if overridden
- Validate Manage Application - If restored, verify Manage application is accessible and functional
- Check Manage Database - If restored, confirm Db2 database is running with correct storage classes
- Review Configurations - Confirm all configurations are correct
- Update DNS - Update DNS records if domain changed
- Test Functionality - Perform smoke tests on critical functions
Common Restore Scenarios¶
Disaster Recovery¶
- Use latest backup from off-site storage
- May require domain and URL changes
- Verify all external dependencies are available
- Consider MongoDB storage class override if infrastructure changed
- Include Manage application and database restore if needed
Cluster Migration¶
- Download backup from source cluster storage
- Change domain to match new cluster
- Update SLS and DRO URLs if needed
- Override MongoDB and Manage storage classes for different infrastructure
- Verify network connectivity and routes
- Plan for Manage database downtime during restore
Environment Cloning¶
- Use production backup for dev/test
- Change domain to avoid conflicts
- Consider using external SLS to share licenses
- May exclude DRO for non-production environments
- Override storage classes to use lower-cost storage in non-production
- Optionally restore Manage application for testing
Restore Troubleshooting¶
Common Restore Issues¶
Issue: "Backup archive not found"
- Verify backup archive exists in PVC or download location
- Check backup version matches the archive name
- Ensure download credentials are correct if downloading from S3/Artifactory
Issue: "Pre-restore check failed"
- Review cluster resource availability
- Check OpenShift version compatibility
- Verify required operators are available
- Use `--skip-pre-check` only if necessary
Issue: "MongoDB restore failed"
- Verify MongoDB namespace and instance name match backup
- Ensure sufficient storage for MongoDB data
- Check MongoDB operator is installed and ready
- If using storage class override, verify the storage class exists and is accessible
- Ensure the specified storage class supports ReadWriteOnce access mode
Issue: "SLS restore failed"
- Verify SLS namespace is correct
- Check that `--include-sls` or `--exclude-sls` is used appropriately
- Ensure the SLS configuration file is valid if using a custom config
Issue: "Suite restore failed with domain mismatch"
- Use `--mas-domain-restore` to override the domain from the backup
- Verify DNS records are updated for the new domain
- Check certificate configurations match new domain
Issue: "DRO installation failed"
- Verify IBM entitlement key is valid
- Check DRO namespace has sufficient permissions
- Ensure contact information is provided correctly
Issue: "Download from S3 failed"
- Verify AWS credentials are correct
- Check S3 bucket exists and is accessible
- Verify network connectivity to AWS
- Ensure IAM permissions allow GetObject operations
Issue: "Manage application restore failed"
- Verify the Manage workspace exists in the backup
- Ensure sufficient storage for Manage application persistent volumes
- Check that storage class overrides (if specified) are valid and accessible
- Verify both ReadWriteMany and ReadWriteOnce storage classes are available if using overrides
- Review the Manage namespace for any conflicting resources
Issue: "Manage Db2 database restore failed"
- Verify the Db2 instance exists in the backup
- Ensure sufficient storage for all Db2 persistent volumes (meta, data, backup, logs, temp)
- Check that all specified storage classes exist and support the required access modes
- Verify the Db2 operator is installed and ready
- Review Db2 pod logs for specific error messages
- Note: Db2 restore is an offline operation - ensure no active connections during restore
Issue: "Storage class not found during restore"
- Verify the specified storage class exists in the target cluster: `oc get storageclass`
- Check the storage class supports the required access mode (RWO or RWX)
- If using cluster defaults, ensure default storage classes are configured
- Review storage class provisioner compatibility with the cluster infrastructure
Issue: "Configuration file not found"
- Verify custom config file paths are correct
- Ensure files are accessible from the CLI container
- Check file format is valid YAML
Additional Resources¶
MAS CLI Documentation¶
- Backup Command Reference - Complete backup command-line options and usage
- Restore Command Reference - Complete restore command-line options and usage
Ansible DevOps Collection¶
- Backup and Restore Playbook - Detailed Ansible playbook documentation
- Execution Environment - Ansible Automation Platform setup guide
- IBM Catalogs Role - IBM Operator Catalog backup/restore
- Certificate Manager Role - Certificate Manager backup/restore
- MongoDB Role - MongoDB backup/restore
- SLS Role - Suite License Service backup/restore
- Suite Backup Role - MAS Core backup
- Suite Restore Role - MAS Core restore
- Suite App Backup Role - MAS application backup (generic)
- Db2 Role - Db2 database backup/restore
- Grafana Role - Grafana installation
- DRO Role - Data Reporter Operator installation
External Documentation¶
- MAS Documentation - Official IBM Maximo Application Suite documentation
- OpenShift Pipelines - Tekton pipeline documentation
- Ansible DevOps Collection - Complete Ansible automation documentation
- Red Hat Ansible Automation Platform - Enterprise automation platform