RHOSP13 (Queens) Deployment Troubleshooting
Mixing things up a little, let’s take a look at how we can troubleshoot Red Hat OpenStack Platform 13 deployment errors.
Red Hat OpenStack Platform 13, and indeed TripleO Queens, uses os-collect-config to pull configuration from the Director node and apply it:
https://github.com/openstack/os-collect-config/tree/stable/queens
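To make the "pull and apply" idea concrete, here is a toy model of a single poll cycle. It is not os-collect-config's actual code: the function names, the metadata dict and the digest-based change check are all assumptions made for illustration. The point is only that the agent fetches metadata, notices whether it changed, and runs the apply step when it did (or when forced).

```python
import hashlib
import json


def metadata_digest(metadata):
    """Return a stable digest of a metadata dict (hypothetical helper)."""
    blob = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def collect_once(fetch, apply, last_digest=None, force=False):
    """Run one poll cycle of the toy model.

    fetch: callable returning the current metadata as a dict.
    apply: callable invoked with the metadata when it changed or force is set.
    Returns (digest, applied) so the caller can remember the digest.
    """
    metadata = fetch()
    digest = metadata_digest(metadata)
    applied = force or digest != last_digest
    if applied:
        apply(metadata)
    return digest, applied
```

In this sketch, calling `collect_once` again with the digest from the previous run does nothing unless the metadata changed, which is why a forced run is useful when you want to re-apply the same configuration while troubleshooting.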
This is the point at which my particular environment was failing:
https://github.com/openstack/tripleo-heat-templates/blob/stable/queens/overcloud.j2.yaml#L906-L942
It was running docker-puppet.py, and ultimately failing when Puppet tried to get a list of Cinder Volumes:
https://github.com/openstack/puppet-cinder/blob/stable/queens/spec/unit/provider/cinder_type/openstack_spec.rb#L64
We’ll look at how to troubleshoot such an issue both from the Director side and by logging into the failing node. We’ll also discuss how to determine which node is failing.
Ideally, we want to reproduce the failure on the node and resolve the issue before re-running an overcloud deployment, which saves a lot of time and guesswork.
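As a rough sketch of that workflow, the helper below only builds the command lines you might run; it does not execute anything. The commands are ones commonly used for this on a Queens-era Director and overcloud node, and they may differ from the exact ones shown in the video. The function names are hypothetical, and the deployment ID is a value you would copy from the failure output.

```python
import shlex


def director_commands(deployment_id, stack="overcloud"):
    """Commands to run on the Director to locate and inspect the failure."""
    return [
        # List failed resources in the stack, with full error output.
        ["openstack", "stack", "failures", "list", stack, "--long"],
        # Show the details of the failing software deployment.
        ["openstack", "software", "deployment", "show", deployment_id, "--long"],
    ]


def node_commands():
    """Commands to run on the failing overcloud node."""
    return [
        # Read the os-collect-config service log for the failing task.
        ["sudo", "journalctl", "-u", "os-collect-config", "--no-pager"],
        # Force a single re-run of the configuration without waiting for a poll.
        ["sudo", "os-collect-config", "--force", "--one-time"],
    ]


def render(commands):
    """Turn argument lists into shell-safe strings for copy and paste."""
    return [" ".join(shlex.quote(part) for part in cmd) for cmd in commands]
```

For example, `render(node_commands())` returns the two node-side commands as plain strings that can be pasted into a shell on the failing node.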
0:00 Intro
0:22 The error output
1:07 Get more details
1:41 Find the failing task on the node
2:40 Re-run the task that failed
4:33 Re-run after fixing the issue
4:56 Overview of os-collect-config
8:00 Looking at it from the Heat side on Director
9:25 Software Config
11:40 Re-run the deployment
12:15 Example of Ansible run from Director
13:25 Recap
by TripleWho?
redhat openstack