Add Troubleshooting Guide and Improve Installer Script Variable Validation #234

Open
taliandre49 wants to merge 16 commits into ocp-power-automation:devel from taliandre49:update_readme_fix

Conversation

@taliandre49

This PR enhances the installer script to improve usability and reduce common setup issues.

Key Updates:

  • Added clearer validation and guidance for required environment variables (IBMCLOUD_API_KEY, RELEASE_VER, etc.)
  • Introduced detailed echo messages explaining defaults and how to override them (e.g., RHCOS version).
  • Clarified the default RHCOS image version (8.3) and provided explicit guidance on changing image versions.
  • Added docs/troubleShooting.md, which:
    • Documents common OpenShift on PowerVS installation issues, their causes, and step-by-step resolutions.
    • Covers Terraform state issues, Bastion OS compatibility, reinstallation conflicts, remote-exec failures, LPAR health issues, and missing image errors.
    • References verified solutions and relevant IBM/PowerVS documentation for easier user debugging.
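
For readers who want a concrete picture of the validation and default-override messages described above, here is a minimal sketch. It is not the PR's actual code: the helper names `require_env` and `default_with_notice`, the message wording, and the example variable `RHCOS_IMAGE_VERSION` are all assumptions made for illustration. Only `IBMCLOUD_API_KEY`, `RELEASE_VER`, and the 8.3 default come from the description.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: names and message text are assumptions, not the PR's code.

# Fail with actionable guidance when a required variable is unset or empty.
require_env() {
  local name="$1"
  if [[ -z "${!name:-}" ]]; then
    echo "ERROR: ${name} is not set. Export it first, e.g.: export ${name}=<value>" >&2
    return 1
  fi
}

# Apply a default when the variable is unset, and say how to override it.
default_with_notice() {
  local name="$1" default="$2"
  if [[ -z "${!name:-}" ]]; then
    printf -v "$name" '%s' "$default"
    echo "INFO: ${name} not set; defaulting to ${default}. Override with: export ${name}=<value>"
  fi
}

# Example calls (RHCOS_IMAGE_VERSION is a made-up variable name):
#   require_env IBMCLOUD_API_KEY || exit 1
#   require_env RELEASE_VER || exit 1
#   default_with_notice RHCOS_IMAGE_VERSION 8.3
```

Returning a non-zero status instead of calling `exit` inside the helper leaves the decision to abort with the caller, which also makes the helpers easy to exercise in isolation.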

Testing:

  • Verified successful execution by running the installer script end-to-end after these updates.

Context:

These changes address issues observed when users and/or customers encounter missing or outdated image references during installation. The improved checks and messages help users identify configuration gaps earlier and adjust environment variables accordingly.

@ppc64le-cloud-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: taliandre49
Once this PR has been reviewed and has the lgtm label, please assign yussufsh for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ppc64le-cloud-bot
Contributor

ppc64le-cloud-bot commented Oct 15, 2025

@taliandre49: PR is not mergeable.

Details

The PR state is: blocked

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ppc64le-cloud-bot
Contributor

Welcome @taliandre49! It looks like this is your first PR to ocp-power-automation/openshift-install-power 🎉

Comment thread docs/troubleShooting.md
Comment thread docs/troubleShooting.md
Comment thread docs/troubleShooting.md Outdated
Comment thread docs/troubleShooting.md Outdated
Comment thread docs/troubleShooting.md Outdated
Comment thread docs/troubleShooting.md Outdated
Comment thread docs/troubleShooting.md Outdated
Comment thread openshift-install-powervs Outdated
Comment thread openshift-install-powervs Outdated
@taliandre49 taliandre49 force-pushed the update_readme_fix branch 2 times, most recently from 2de912e to 06665e3 Compare October 30, 2025 15:40
Comment thread openshift-install-powervs Outdated
Collaborator

@yussufsh yussufsh left a comment

I have not gone through all of the points yet, but have some comments already, hence posting to get the gist of it.

Comment thread README.md Outdated
Comment thread docs/troubleShooting.md Outdated
Comment thread openshift-install-powervs Outdated
Comment thread docs/troubleShooting.md Outdated
Comment thread docs/troubleShooting.md Outdated
@yussufsh
Collaborator

yussufsh commented Jan 6, 2026

Also @taliandre49, I see this specifically addresses issues with the terraform apply flow, so the best place for it is the known issues list in https://github.com/ocp-power-automation/ocp4-upi-powervs itself. The https://github.com/ocp-power-automation/openshift-install-power project is a wrapper script that gives the end user no exposure to terraform, ibmcloud, etc., so the issues documented here should be specific to the ./openshift-install-powervs script.

@taliandre49
Author

@yussufsh

Thanks for the great feedback!

I’ve updated the docs so that all Terraform-specific commands are now clearly under an “Advanced Recovery / Developers Only” section with a warning that they should only be run if you fully understand Terraform.
The main resolution steps for users now rely solely on the wrapper (./openshift-install-powervs create), keeping the supported workflow safe and simple.

This keeps the doc user-friendly while still documenting advanced options for developers who need them. Happy to hear your thoughts or suggestions on this approach, or if you’d like me to adjust how this is presented further!

@yussufsh
Collaborator

yussufsh commented Jan 9, 2026

Instead of an Advanced section for Terraform, how about adding that to https://github.com/ocp-power-automation/ocp4-upi-powervs/blob/main/docs/known_issues.md list? This way you can refer to that link and keep this repository's known issues specific to openshift-install-power issues.

Signed-off-by: Natalia Jordan <natalia.jordan@ibm.com>
…ding more clarity behind RHOCS version and switching to desired release version

Signed-off-by: Natalia Jordan <natalia.jordan@ibm.com>
…dating order for terraform commands to be placed inside advanced section

Signed-off-by: Natalia Jordan <natalia.jordan@ibm.com>
…issues moving terraform issues to terrrafrom repo

Signed-off-by: Natalia Jordan <natalia.jordan@ibm.com>
…t enhancements

Signed-off-by: Natalia Jordan <natalia.jordan@ibm.com>
@taliandre49
Author

@yussufsh I've completed the suggested changes, separating out the Terraform sections.

Comment thread README.md Outdated
Comment thread openshift-install-powervs Outdated
Comment thread openshift-install-powervs Outdated
Comment thread openshift-install-powervs Outdated
Comment thread openshift-install-powervs Outdated
Comment thread openshift-install-powervs Outdated
…ing redundant function

Signed-off-by: Natalia Jordan <natalia.jordan@ibm.com>
@taliandre49
Author

taliandre49 commented Mar 19, 2026

Thanks for the feedback @yussufsh. I removed the check_required_env_vars function. However, to preserve the original intent of giving users clearer feedback, I've instead improved the error messages at the two existing validation points you referenced (L675, L681) so that they include actionable guidance on how to resolve the issue, either via an environment variable or via the var file.

The RELEASE_VER informational log has been moved into setup_occli where the version is actually used, which felt like the most natural place for it.

Summary of changes:

  • Updated the link to the troubleShooting doc in the README

  • Removed check_required_env_vars function entirely

  • Removed redundant if [[ "$ACTION" != "help" ]] guard in main

  • Improved error messages at existing RHEL and API key validation points with clear resolution steps

  • Moved RELEASE_VER info log into setup_occli
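
To make the last two bullets more concrete, the following is a rough sketch of what an error message offering both resolution paths, and a version log placed where the version is consumed, might look like. It is illustrative only and not taken from the script: `missing_value_error`, the message text, the `ibmcloud_api_key` var-file key shown in the comment, and the body of `setup_occli` are assumptions. Only the names `setup_occli`, `RELEASE_VER`, and `IBMCLOUD_API_KEY` appear in the discussion above.

```shell
#!/usr/bin/env bash
# Illustrative only: function bodies and message text are assumptions.

# Print an error that names both ways to supply a missing value:
# an environment variable or an entry in the var file.
missing_value_error() {
  local env_name="$1" var_file_key="$2"
  {
    echo "ERROR: ${env_name} is required but was not provided."
    echo "  Option 1: export ${env_name}=<value>"
    echo "  Option 2: set ${var_file_key} = \"<value>\" in your var file"
  } >&2
  return 1
}

# Log the release version at the point where it is actually used.
setup_occli() {
  echo "INFO: using RELEASE_VER=${RELEASE_VER}"
  # ... the real function would fetch the oc client for ${RELEASE_VER} here ...
}

# Example call (the var-file key name is an assumption):
#   [[ -n "${IBMCLOUD_API_KEY:-}" ]] || missing_value_error IBMCLOUD_API_KEY ibmcloud_api_key
```

Writing the guidance to stderr keeps it out of any output that callers capture, while the informational log goes to stdout alongside the rest of the installer's progress messages.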

@taliandre49
Author

@yussufsh @Prajyot-Parab Could you please add the labels needed to move the PR forward? Thank you!
