MAS DevOps Ansible Collection Ansible CLI
Home Ansible Automation Platform OCP Install Cloud Pak For Data Install Core Add AIBroker Add IoT Add Manage Add Monitor Add Optimizer Add Predict Add Visual Inspection Update Upgrade Uninstall Core Backup & Restore ocp_cluster_monitoring ocp_config ocp_deprovision ocp_efs ocp_github_oauth ocp_login ocp_node_config ocp_provision ocp_roks_upgrade_registry_storage ocp_upgrade ocp_verify appconnect aws_bucket_access_point aws_documentdb_user aws_policy aws_route53 aws_user_creation aws_vpc cert_manager cis common-services configure_manage_eventstreams cos cos_bucket cp4d_admin_pwd_update cp4d cp4d_service db2 dro eck grafana ibm_catalogs kafka nvidia_gpu mongodb ocs sls turbonomic uds mirror_case_prepare mirror_extras_prepare mirror_images mirror_ocp ocp_idms ocp_simulate_disconnected_network registry suite_app_config suite_app_install suite_app_uninstall suite_app_upgrade suite_app_rollback suite_app_backup_restore suite_certs suite_config suite_db2_setup_for_manage suite_dns suite_install suite_manage_attachments_config suite_manage_birt_report_config suite_manage_bim_config suite_manage_customer_files_config suite_manage_imagestitching_config suite_manage_import_certs_config suite_manage_load_dbc_scripts suite_manage_logging_config suite_manage_pvc_config suite_uninstall suite_upgrade suite_rollback suite_verify suite_backup_restore ansible_version_check entitlement_key_rotation gencfg_jdbc gencfg_watsonstudio gencfg_workspace gencfg_mongo

nvidia_gpu¤

This role installs the NVIDIA Graphical Processing Unit (GPU) operator and its prerequisite Node Feature Discovery (NFD) operator in an IBM Cloud Openshift cluster console. The role first installs the NFD operator and continues with the final step to install the NVIDIA GPU Operator. The NFD Operator is installed using the Red Hat Operators catalog source and the GPU operator is installed using the Certified Operators catalog source.

Role Variables¤

nfd_namespace¤

The namespace where the node feature discovery operator will be deployed.

nfd_channel¤

The channel to subscribe to for the nfd operator installation and updates. Available channels may be found in the package manifest of nfd operator in openshift.

gpu_namespace¤

The namespace where the NVIDIA GPU operator will be deployed. For version 1.8.x, use of single namespace is not supported, therefore use openshift-operators.

gpu_channel¤

The channel to subscribe to for the gpu operator installation and updates. Available channels may be found in the package manifest of gpu-operator-certified operator in openshift.

gpu_driver_version¤

By default, it will pull the latest version and the environment variable is not needed. If a specific version is needed (due to OS version compatibilities), specify the following environment variable.

See the attached links for more information and to decide which driver version to use. https://catalog.ngc.nvidia.com/orgs/nvidia/containers/driver/tags

gpu_driver_repository_path¤

The gpu driver repository. If using a different repository, you can set the value for this repo. We only support public repositories at the moment.

For more information on the NVIDIA GPU and NFD operators, visit https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/openshift/install-gpu-ocp.html

Example Playbook¤

- hosts: localhost
  any_errors_fatal: true
  roles:
    - ibm.mas_devops.nvidia_gpu

License¤

EPL-2.0