[Bug] failed to verify certificate when using internal grafana #1712

Closed
amimof opened this issue Oct 10, 2024 · 1 comment · Fixed by #1713
Labels: bug, needs triage

Comments

amimof (Contributor) commented Oct 10, 2024

Describe the bug

Since v5.13.0, grafana-operator is unable to connect to Grafana internally. The operator seems to be using the wrong URL: it should be grafana-service.grafana.svc, not grafana-service.grafana. The logs show the following error:

Get \"https://grafana-service.grafana:3000/api/folders?limit=1000&page=1\": tls: failed to verify certificate: x509: certificate is valid for grafana-service.grafana.svc, grafana-service.grafana.svc.cluster.local, not grafana-service.grafana
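The name mismatch reported in this error can be reproduced in isolation. The following is a minimal sketch, not the operator's code: it builds an in-memory certificate carrying the two SANs from the error message and checks both hostnames with `VerifyHostname` from Go's standard `crypto/x509` package. The helper name `checkHost` is hypothetical, and only name matching is exercised (no chain or signature validation).

```go
package main

import (
	"crypto/x509"
	"fmt"
)

// checkHost reports whether a certificate carrying the given DNS SANs
// would be accepted for host. Hypothetical helper for illustration:
// it performs name matching only, not chain or signature validation.
func checkHost(sans []string, host string) error {
	cert := &x509.Certificate{DNSNames: sans}
	return cert.VerifyHostname(host)
}

func main() {
	// SANs taken from the error message in this issue.
	sans := []string{
		"grafana-service.grafana.svc",
		"grafana-service.grafana.svc.cluster.local",
	}
	hosts := []string{"grafana-service.grafana", "grafana-service.grafana.svc"}
	for _, host := range hosts {
		if err := checkHost(sans, host); err != nil {
			fmt.Printf("%s: rejected: %v\n", host, err)
		} else {
			fmt.Printf("%s: accepted\n", host)
		}
	}
}
```

The short name is rejected because it appears in neither SAN, while the `.svc` name is accepted.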

We downgraded to 5.12.0, which resolved the issue; however, the URL is still wrong. The error appears in 5.13 and not in 5.12 because the HTTP client no longer skips TLS verification after a recent patch. PR #1628 introduced the TLS block, which no longer sets InsecureSkipVerify to true, so connections to the wrong URL now fail.
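To illustrate why the same wrong name went unnoticed before, here is a rough sketch using only the Go standard library. It is not the operator's client code. It starts a local `httptest` TLS server and probes it three times with different `tls.Config` settings. The helper names `probe` and `demo` are hypothetical, and `example.com` stands in for a name that matches the certificate, on the assumption that the built-in `httptest` certificate covers that name.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// probe sends one GET to url. The certificate is verified against
// serverName unless skipVerify is true. Hypothetical helper.
func probe(url string, roots *x509.CertPool, serverName string, skipVerify bool) error {
	client := &http.Client{
		Transport: &http.Transport{
			DisableKeepAlives: true,
			TLSClientConfig: &tls.Config{
				RootCAs:            roots,
				ServerName:         serverName,
				InsecureSkipVerify: skipVerify,
			},
		},
	}
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// demo returns the results of three probes against a local TLS server:
// a matching name with verification, a non-matching name with
// verification, and a non-matching name with verification skipped.
func demo() (matching, mismatched, skipped error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) { fmt.Fprint(w, "ok") }))
	defer srv.Close()

	// Trust the test server's own certificate.
	roots := x509.NewCertPool()
	roots.AddCert(srv.Certificate())

	matching = probe(srv.URL, roots, "example.com", false)
	mismatched = probe(srv.URL, roots, "grafana-service.grafana", false)
	skipped = probe(srv.URL, roots, "grafana-service.grafana", true)
	return
}

func main() {
	matching, mismatched, skipped := demo()
	fmt.Println("matching name, verified:  ", matching)
	fmt.Println("mismatched name, verified:", mismatched)
	fmt.Println("mismatched name, skipped: ", skipped)
}
```

Only the second probe should fail. The third succeeds despite the wrong name, which matches the behaviour seen with 5.12.0.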

Version
v5.13.0

To Reproduce
Steps to reproduce the behavior:

  1. Deploy grafana-operator 5.13.0
  2. Deploy an instance of Grafana using internal connection through a service
  3. Check the logs of the operator
  4. See the status field of the Grafana instance.
  5. See logs for error

Expected behavior
status.adminUrl should be https://grafana-service.grafana.svc:3000, not https://grafana-service.grafana:3000
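As a sketch of the expected result, a URL builder that appends the `.svc` suffix could look like the following. The function `internalURL` and its parameters are hypothetical and are not taken from the operator's source; the point is only that the host part should be `<service>.<namespace>.svc`, so that it matches a SAN on the serving certificate.

```go
package main

import "fmt"

// internalURL builds an in-cluster URL of the form
// <scheme>://<service>.<namespace>.svc:<port>.
// Hypothetical helper for illustration, not the operator's actual code.
func internalURL(scheme, service, namespace string, port int) string {
	return fmt.Sprintf("%s://%s.%s.svc:%d", scheme, service, namespace, port)
}

func main() {
	// Values taken from the Grafana resource in this issue.
	fmt.Println(internalURL("https", "grafana-service", "grafana", 3000))
	// prints "https://grafana-service.grafana.svc:3000"
}
```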

Suspect component/Location where the bug might be occurring
https://github.com/grafana/grafana-operator/pull/1628/files#diff-6cdc29acf17bf0a9db52d78749fa9a849649cff43cfd3db067e1e4868d8c40caL22


Runtime (please complete the following information):

  • OS: Linux
  • Grafana Operator Version: 5.13.0
  • Environment: OpenShift 4.16
  • Deployment type: Openshift OLM
  • Other:

Additional context

operator logs

2024-10-10T12:38:13Z    ERROR   GrafanaDashboardReconciler      error reconciling dashboard     {"controller": "grafanadashboard", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaDashboard", "GrafanaDashboard": {"name":"mimir-top-tenants","namespace":"grafana"}, "namespace": "grafana", "name": "mimir-top-tenants", "reconcileID": "15e56867-8e69-4b8c-b150-f2468a643c1a", "dashboard": "mimir-top-tenants", "grafana": "grafana", "error": "Get \"https://grafana-service.grafana:3000/api/folders?limit=1000&page=1\": tls: failed to verify certificate: x509: certificate is valid for grafana-service.grafana.svc, grafana-service.grafana.svc.cluster.local, not grafana-service.grafana"}
github.com/grafana/grafana-operator/v5/controllers.(*GrafanaDashboardReconciler).Reconcile
        github.com/grafana/grafana-operator/v5/controllers/dashboard_controller.go:268
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
        sigs.k8s.io/controller-runtime@v0.18.4/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler

Grafana.yaml

kind: Grafana
metadata:
  labels:
    app: grafana
    dashboards: grafana
  name: grafana
  namespace: grafana
spec:
  config:
    analytics:
      check_for_updates: "false"
      reporting_enabled: "false"
    auth:
      disable_login_form: "false"
      oauth_allow_insecure_email_lookup: "true"
    auth.basic:
      enabled: "true"
    auth.generic_oauth:
      allow_assign_grafana_admin: "true"
      allow_sign_up: "true"
      api_url: https://kubernetes.default.svc/apis/user.openshift.io/v1/users/~
      auth_url: https://oauth-openshift.apps.${CLUSTER_ROUTES_BASE}/oauth/authorize
      client_id: system:serviceaccount:grafana:grafana
      client_secret: ${OAUTH_CLIENT_SECRET}
      email_attribute_path: metadata.name
      empty_scopes: "false"
      enabled: "true"
      icon: signin
      login_attribute_path: metadata.name
      name: OpenShift
      role_attribute_path: contains(groups[*], 'system:cluster-admins') && 'GrafanaAdmin'
        || contains(groups[*], 'cluster-admins') && 'GrafanaAdmin' || contains(groups[*],
        'cluster-admin') && 'GrafanaAdmin'  || contains(groups[*], 'dedicated-admin')
        && 'GrafanaAdmin' || 'Viewer'
      scopes: user:info user:check-access user:list-projects role:logging-grafana-alertmanager-access:grafana
      tls_client_ca: /run/secrets/kubernetes.io/serviceaccount/ca.crt
      tls_client_cert: /etc/tls/private/tls.crt
      tls_client_key: /etc/tls/private/tls.key
      token_url: https://oauth-openshift.apps.${CLUSTER_ROUTES_BASE}/oauth/token
      use_pkce: "true"
    dataproxy:
      logging: "true"
    grafana_net:
      enabled: "true"
      url: https://grafana.net
    log:
      level: info
      mode: console
    paths:
      data: /var/lib/grafana
      logs: /var/log/grafana
      plugins: /var/lib/grafana/plugins
      provisioning: /etc/grafana/provisioning
    security:
      cookie_secure: "true"
    server:
      cert_file: /etc/tls/private/tls.crt
      cert_key: /etc/tls/private/tls.key
      protocol: https
      root_url: https://grafana-route-grafana.apps.${CLUSTER_ROUTES_BASE}/
    users:
      default_theme: dark
      viewers_can_edit: "true"
  deployment:
    metadata:
      labels:
        app: grafana
    spec:
      template:
        metadata: {}
        spec:
          containers:
          - env:
            - name: GF_SECURITY_ADMIN_USER
              valueFrom:
                secretKeyRef:
                  key: admin-user
                  name: grafana-credentials
            - name: GF_SECURITY_ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: admin-password
                  name: grafana-credentials
            - name: LDAP_ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: ldap-bind-password
                  name: grafana-credentials
            - name: OAUTH_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: token
                  name: grafana-token
            - name: CLUSTER_ROUTES_BASE
              value: "<REDACTED>"
            imagePullPolicy: Always
            name: grafana
            ports:
            - containerPort: 3000
              name: http-grafana
              protocol: TCP
            readinessProbe:
              failureThreshold: 3
              httpGet:
                path: /robots.txt
                port: 3000
                scheme: HTTPS
              initialDelaySeconds: 5
              periodSeconds: 10
              successThreshold: 1
              timeoutSeconds: 2
            resources:
              limits:
                memory: 1Gi
              requests:
                cpu: 250m
                memory: 1Gi
            volumeMounts:
            - mountPath: /etc/tls/private
              name: secret-grafana-tls
            - mountPath: /etc/grafana/provisioning/dashboards
              name: grafana-dashboards
          serviceAccountName: grafana
          volumes:
          - name: secret-grafana-tls
            secret:
              defaultMode: 420
              secretName: grafana-tls
          - configMap:
              name: grafana-dashboards
            name: grafana-dashboards
          - name: grafana-data
            persistentVolumeClaim:
              claimName: grafana-pvc
  persistentVolumeClaim:
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
      storageClassName: thin-csi
  route:
    metadata: {}
    spec:
      port:
        targetPort: http-grafana
      tls:
        insecureEdgeTerminationPolicy: Redirect
        termination: reencrypt
      to:
        kind: Service
        name: grafana-service
        weight: 100
      wildcardPolicy: None
  service:
    metadata:
      annotations:
        service.beta.openshift.io/serving-cert-secret-name: grafana-tls
      labels:
        app: grafana
    spec:
      ports:
      - name: http-grafana
        port: 3000
        protocol: TCP
        targetPort: http-grafana
      selector:
        app: grafana
      sessionAffinity: None
      type: ClusterIP
  version: 10.4.3
status:
  adminUrl: https://grafana-service.grafana:3000
  dashboards:
  - REDACTED
  datasources:
  - REDACTED
  stage: complete
  stageStatus: success
amimof added the bug and needs triage labels Oct 10, 2024
amimof added a commit to amimof/grafana-operator that referenced this issue Oct 10, 2024
theSuess (Member)

Thanks for the report! I'll close this issue in favor of #1675 as we're tracking TLS incompatibilities there.

We'll discuss your PR in our next maintainer meeting, but I think this is a good call!

theSuess closed this as not planned Oct 11, 2024