Klusterlet registration pod crashlooping during clusteradm join for Oracle Kubernetes Engine cluster #273

Open
mikeshng opened this issue Oct 6, 2022 · 4 comments

@mikeshng
Member

mikeshng commented Oct 6, 2022

When a user tries to join a managed cluster running on Oracle Kubernetes Engine, the klusterlet-registration-agent goes into CrashLoopBackOff.

When looking at the logs:

builder.go:230] unable to get owner reference (falling back to namespace): pods is forbidden: User "system:serviceaccount:open-cluster-management-agent:klusterlet-registration-sa" cannot list resource "pods" in API group "" in the namespace "open-cluster-management-agent"

It seems like the error might be related to the usage of the code here
https://github.com/openshift/cluster-etcd-operator/blob/release-4.13/vendor/github.com/openshift/library-go/pkg/operator/events/recorder.go#L52-L56

It assumes that the POD_NAME env var is populated; if it isn't, it tries to inspect the pod itself to find the information it needs. Because the klusterlet agent RBAC does not include pod-related access, this error occurs and blocks the cluster from joining. The current workaround is to add a new Role/RoleBinding, which bypasses the error.

kubectl version: v1.23.7+1
clusteradm version:
client          version :v0.3.1
server release  version :v1.23.4
default bundle  version :0.8.0
Kubernetes version: v1.23.4
Distribution: Oracle Kubernetes Engine

Reported by @hyder

@qiujian16
Member

That will not cause the CrashLoopBackOff. If the pod name cannot be found, namespace-based events are used instead.

@hyder

hyder commented Oct 10, 2022

Hi @qiujian16

The CrashLoopBackOff is actually happening, and it is only resolved when I add the RBAC recommended by @mikeshng.

@mikeshng
Member Author

@hyder if it's not too difficult to reproduce the issue, could you please recreate it and paste the log here?

The error might not be coming from that library but from somewhere else. I forget how I found that library in the first place.

@hyder

hyder commented Oct 11, 2022

@mikeshng sure, I'll do that tomorrow.
