docker: add healthcheck #1559

Open
wants to merge 2 commits into
base: main

MNThomson

Context

Added a Docker HEALTHCHECK to both the production & dev Dockerfiles.

Now ok, this bash is absolutely cursed and could be removed by just running apt install curl. I didn't install curl since no other apt packages are installed in the runtime image and I'm assuming we want to keep the production image small. If that's acceptable though, adding curl is probably a better route.
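For context, a curl-free probe of this kind usually leans on bash's built-in /dev/tcp redirection. The following is only a sketch of the general technique, not the script from this PR: the function names, and the host, port and path in the usage note, are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch of an HTTP health probe without curl (hypothetical helper names).

# Succeeds only if an HTTP status line reports 200.
is_ok_status() {
  case "$1" in
    HTTP/*" 200 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# check_health HOST PORT PATH
# The parenthesised body runs in a subshell, so the socket on fd 3 is
# closed automatically when the function finishes.
check_health() (
  # Silence connection errors, then open a TCP socket via bash's /dev/tcp.
  exec 2>/dev/null 3<>"/dev/tcp/$1/$2" || exit 1
  # A minimal HTTP/1.0 request is enough for a health probe.
  printf 'GET %s HTTP/1.0\r\nHost: %s\r\n\r\n' "$3" "$1" >&3
  # Only the first response line (the status line) matters here.
  IFS= read -r status_line <&3 || exit 1
  is_ok_status "$status_line"
)
```

A healthcheck script built on this would end with something like `check_health 127.0.0.1 8080 /health` (the port is an assumption). The tradeoff is the one described above: no extra package in the image, at the cost of less readable shell.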

Test

$ docker build -t libsql . && docker run --name libsql libsql

# another terminal
$ docker ps
CONTAINER ID   IMAGE    COMMAND                  CREATED              STATUS
6ac6656767ec   libsql   "/usr/local/bin/dock…"   About a minute ago   Up About a minute (healthy)

$ docker inspect --format='{{json .State.Health}}' libsql | jq
{
  "Status": "healthy",
  "FailingStreak": 0,
  "Log": [
    {
      "Start": "2024-07-15T19:43:48.64977628-07:00",
      "End": "2024-07-15T19:43:48.703180793-07:00",
      "ExitCode": 0,
      "Output": ""
    }
  ]
}

Modifying docker-healthcheck.sh to point to a port that clearly isn't in use, or to a path such as /doesnotexist, causes the healthcheck to fail (simulating the /health endpoint not responding):

$ docker ps
CONTAINER ID   IMAGE    COMMAND                  CREATED              STATUS
9c889dd1ef0d   libsql   "/usr/local/bin/dock…"   About a minute ago   Up About a minute (unhealthy)

@haaawk
Contributor

haaawk commented Jul 25, 2024

What's the benefit of having those health checks, @MNThomson?

@haaawk
Contributor

haaawk commented Jul 25, 2024

Could you please rebase the PR and resolve the conflicts, @MNThomson?

@flexchar

I can chime in regarding the benefits of a health check: it lets modern reverse proxies such as Træfik hold off routing traffic to unhealthy containers, and lets monitoring services listen for and send notifications when a container becomes unhealthy.

In a Kubernetes environment, it is how Kubernetes knows when to restart a container or which one a Service should route traffic to.

It's actually a good thing. :)

@haaawk
Contributor

haaawk commented Jul 30, 2024

Thanks, @flexchar. It seems that we could merge this once @MNThomson resolves the conflicts.

@MNThomson
Author

Thanks @flexchar! Healthchecks help the container scheduler know when to recreate containers (and especially help in container dependency graphs).

Rebased @haaawk, should be good to go!


COPY --from=gosu /usr/local/bin/gosu /usr/local/bin/gosu
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY --from=builder /target/release/sqld /bin/sqld

USER root

HEALTHCHECK --interval=2s CMD /usr/local/bin/docker-healthcheck.sh
Contributor

@athoscouto I would like to merge this PR. I just want to make sure this won't cause us any trouble on Fly. Please confirm.

Contributor

@haaawk haaawk left a comment

LGTM. @sivukhin you recently did some work around Dockerfiles. Could you please have a look as well?

@haaawk haaawk requested a review from sivukhin July 31, 2024 10:12

COPY --from=gosu /usr/local/bin/gosu /usr/local/bin/gosu
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY --from=builder /target/release/sqld /bin/sqld

USER root

HEALTHCHECK --interval=2s CMD /usr/local/bin/docker-healthcheck.sh
Contributor

@MNThomson, sorry for the late response. Let's make the HEALTHCHECK parameters configurable, to simplify building an image with tuned parameters, like this:

ARG HEALTHCHECK_INTERVAL=2s
ARG HEALTHCHECK_TIMEOUT=5s
ARG HEALTHCHECK_START_PERIOD=0s
ARG HEALTHCHECK_RETRIES=3
...
HEALTHCHECK --interval=$HEALTHCHECK_INTERVAL --timeout=$HEALTHCHECK_TIMEOUT --start-period=$HEALTHCHECK_START_PERIOD --retries=$HEALTHCHECK_RETRIES CMD ...

Contributor

It would be even better to have a conditional check and add the healthcheck only if those parameters are set.

A healthcheck should always run; please do not provide a footgun.

However, this is best managed by tuning it correctly by default.

Perhaps worth mentioning that a user can easily disable a healthcheck using the docker CLI, docker compose, or any other container execution framework. But if the healthcheck is correct in the first place, needing to do so is highly unlikely, and it provides a lot of value. :)
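To make that concrete, `docker run` accepts `--no-healthcheck` and a set of `--health-*` flags that override whatever the image declares. A small sketch follows; the wrapper function names and the specific timing values are illustrative assumptions, not recommendations from this thread.

```shell
#!/usr/bin/env bash
# Hypothetical wrappers around real `docker run` flags.
# Usage: <function> CONTAINER_NAME IMAGE

# Start the image with the image-defined healthcheck turned off.
run_without_healthcheck() {
  docker run --detach --name "$1" --no-healthcheck "$2"
}

# Start the image with a slower, more forgiving healthcheck schedule.
# The values below are placeholders, not tuned numbers.
run_with_relaxed_healthcheck() {
  docker run --detach --name "$1" \
    --health-interval=10s --health-timeout=5s \
    --health-start-period=30s --health-retries=5 "$2"
}
```

For example, `run_with_relaxed_healthcheck libsql libsql` keeps the check but gives a slow host more time before the container is reported unhealthy.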

Contributor

Not every execution framework gives that option, so it would be great to be able to control it with env vars.
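One way this could look, purely as a sketch: the healthcheck script itself checks an environment variable and reports healthy when the check is switched off. The variable name `SQLD_HEALTHCHECK` and both function names are hypothetical and do not exist in this PR; the default of "on" is likewise just an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical env-var gate for a healthcheck script.

# Returns 0 (enabled) unless SQLD_HEALTHCHECK is set to off, 0 or false.
# SQLD_HEALTHCHECK is an invented name used for illustration only.
healthcheck_enabled() {
  case "${SQLD_HEALTHCHECK:-on}" in
    off|0|false) return 1 ;;
    *) return 0 ;;
  esac
}

# run_healthcheck PROBE_COMMAND [ARGS...]
# When the gate is off, report success so the container is never marked
# unhealthy; otherwise run the real probe and pass its exit code through.
run_healthcheck() {
  if ! healthcheck_enabled; then
    return 0
  fi
  "$@"
}
```

Because the gate lives inside the script, it works on any platform that can set environment variables, even one that offers no way to override the image's HEALTHCHECK.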

Contributor

TL;DR: ship it without the healthcheck enabled by default, and update the Docker Hub README with an example of how to enable it using docker compose.

Agreed with @haaawk on conditionally enabling it, via cmd and/or env. I'm fighting GitHub Actions right now at 2 AM to get the go-gorm db wrapper shipped, all because there was no way to configure a healthcheck and the service was taking too long to come up.

For now I'm putting the pipeline to sleep to give the service enough time to come up, but it would be a nightmare if the health check always killed the container before the server had fully started just because the host was slow. While I understand @Everspace's concern, neither the MySQL nor the Postgres Docker image has a default healthcheck, though both offer facilities/CLI commands to enable one if needed. I suspect this is to prevent containers from spontaneously ejecting or self-destructing even though the user hasn't set a health check.
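As an alternative to a fixed sleep in a pipeline, a job can poll the container's health status until it reports healthy. This is a sketch that assumes the container has a healthcheck configured (from the image or via `docker run --health-cmd`); the function name, retry count and delay are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait for a container to report "healthy".

# wait_healthy CONTAINER [MAX_TRIES] [DELAY_SECONDS]
# Returns 0 as soon as the status is "healthy", 1 if it never is.
wait_healthy() {
  wh_name="$1"
  wh_max="${2:-30}"
  wh_delay="${3:-1}"
  wh_try=0
  while [ "$wh_try" -lt "$wh_max" ]; do
    # Ask Docker for the current health status of the container.
    wh_status="$(docker inspect --format '{{.State.Health.Status}}' "$wh_name" 2>/dev/null)"
    if [ "$wh_status" = "healthy" ]; then
      return 0
    fi
    # Keep polling through "starting" and "unhealthy": a slow host may
    # still recover on a later check.
    wh_try=$((wh_try + 1))
    sleep "$wh_delay"
  done
  return 1
}
```

A CI step could then run `wait_healthy libsql 60 2 || exit 1` before starting the tests, which bounds the wait without guessing a single sleep duration.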

6 participants