This repository has been archived by the owner on Aug 27, 2024. It is now read-only.

Created docker image and compose configuration for MeMaS (#39)
maxyu1115 authored Sep 7, 2023
1 parent f02355e commit 8ccf77a
Showing 13 changed files with 342 additions and 24 deletions.
1 change: 1 addition & 0 deletions .dockerignore
@@ -0,0 +1 @@
memas/memas-config.yml
2 changes: 1 addition & 1 deletion .github/workflows/python-app.yml
@@ -34,7 +34,7 @@ jobs:
    - name: Run integration tests
      run: |
        source setup-env.sh
        docker compose up --detach --wait --wait-timeout 60
        docker compose up --build --detach --wait --wait-timeout 60
        python3 -m pytest integration-tests
        docker compose down --volumes
32 changes: 19 additions & 13 deletions CONTRIBUTING.md
@@ -15,9 +15,17 @@ If you are working using WSL, follow this guide to [configure Docker](https://do

Run `source setup-env.sh`; this installs all the needed development tools and sets up the needed environment variables.

**And please run `source format.sh` before each commit!**

**NOTE that this command needs to be run for each new shell instance, since it sets up environment variables.**
### Using Docker
In the top level of this repo, run `docker compose up`, and it will spin up 1 es node, 1 scylla node, 1 milvus node, and a few more. This is a very basic development setup.
In the top level of this repo, run

```bash
docker compose --profile dev up --build
```

This will spin up 1 MeMaS instance running in gunicorn, 1 es node, 1 scylla node, 1 milvus node, and a few more. This is a very basic development setup.

To stop the containers, press Control+C in the terminal where `docker compose up` is running, or run `docker compose down`.

@@ -28,30 +36,28 @@ docker compose down --volumes

FYI you may need to run `sysctl -w vm.max_map_count=262144` if you get an error when trying to start elasticsearch.
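
As a rough illustration of the note above, the following sketch checks whether the host already meets the `vm.max_map_count` value before starting Elasticsearch. It assumes a Linux host that exposes the setting at `/proc/sys/vm/max_map_count` and treats 262144 (the value quoted above) as a minimum; the function names are hypothetical and not part of MeMaS.

```python
from pathlib import Path
from typing import Optional

REQUIRED_MAX_MAP_COUNT = 262144  # value from the sysctl command quoted above
PROC_PATH = "/proc/sys/vm/max_map_count"  # assumption: Linux-only location of the setting


def is_sufficient(raw_value: str, required: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """Return True when the textual kernel value is at least `required`."""
    return int(raw_value.strip()) >= required


def check_host(path: str = PROC_PATH) -> Optional[bool]:
    """Return None if the setting is not exposed, otherwise whether it is high enough."""
    proc_file = Path(path)
    if not proc_file.exists():
        return None
    return is_sufficient(proc_file.read_text())


if __name__ == "__main__":
    result = check_host()
    if result is None:
        print("vm.max_map_count is not exposed on this system")
    elif result:
        print("vm.max_map_count is high enough")
    else:
        print(f"run: sysctl -w vm.max_map_count={REQUIRED_MAX_MAP_COUNT}")
```

Running the check first avoids changing a kernel setting that is already adequate.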

### First time initializing the MeMaS server
**NOTE: Only run this phase when you are working with a clean set of docker dependencies, aka a fresh start or after `docker compose down --volumes`.**
### Developing with local MeMaS outside of Docker
If you only need the MeMaS dependencies and want to run flask/gunicorn locally outside of docker, run this instead to bring up the dependencies in docker:

Due to the service dependencies, the first time running MeMaS, we need to use a special command to initialize and configure the dependencies.
```bash
docker compose up
```

After `source setup-env.sh` and `docker compose up`, wait till the services are fully started.
If this is your first time initializing the MeMaS server, run `docker compose up`, wait until the dependencies are fully started, run `source setup-env.sh`, and then run:

Then run
```bash
flask --app 'memas.app:create_app(config_filename="memas-config.yml", first_init=True)' run
```

This will run for a while then exit. Upon exit, your MeMaS is properly setup.
This will run for a while then exit. Upon exit, your MeMaS is properly setup. **NOTE: Only run this phase when you are working with a clean set of docker dependencies, aka a fresh start or after `docker compose down --volumes`.**
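
The "run only once against clean dependencies" rule can be enforced mechanically. This commit's `memas-docker/init.sh` does so with a lock file at `/memas/first-init.lock`; the sketch below shows the same idea in Python. The function name and the callback are hypothetical, and this is an illustration of the pattern rather than code from MeMaS.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_first_init_once(lock_path: Path, initialize: Callable[[], None], version: str) -> bool:
    """Run `initialize` only if no lock file exists; return True if it ran."""
    if lock_path.exists():
        return False  # already initialized against this set of volumes
    initialize()  # an exception here propagates, so no lock file is written
    lock_path.write_text(version + "\n")  # record which init version succeeded
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        lock = Path(tmp) / "first-init.lock"
        print(run_first_init_once(lock, lambda: None, "2023-09-06"))  # first call runs
        print(run_first_init_once(lock, lambda: None, "2023-09-06"))  # second call is skipped
```

Because the lock lives on the `memas_data` volume in the compose file, `docker compose down --volumes` removes it along with the rest of the state, so the next start initializes again.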

### Running the MeMaS server
After `source setup-env.sh` and `docker compose up`, wait till the services are fully started.

Then run
After MeMaS is properly initialized, run `source setup-env.sh`, then:
```bash
flask --app 'memas.app:create_app(config_filename="memas-config.yml")' run
```
to start the memas server
to start the memas server.

To run the app with a WSGI server, run
And to run the app with a WSGI server, run
```bash
gunicorn -w 1 -k eventlet 'memas.app:create_app(config_filename="memas-config.yml")'
```
37 changes: 37 additions & 0 deletions Dockerfile
@@ -0,0 +1,37 @@
FROM python:3.10

# Create app directory
WORKDIR /memas

# Install Universal Sentence Encoder
RUN wget "https://tfhub.dev/google/universal-sentence-encoder/4?tf-hub-format=compressed" -O use4.tar \
    && mkdir -p encoder/universal-sentence-encoder_4 \
    && tar -xf use4.tar -C encoder/universal-sentence-encoder_4 \
    && rm use4.tar

# Install app dependencies
COPY requirements.txt ./requirements.txt
RUN pip install -r requirements.txt
RUN python3 -c "import nltk; nltk.download('punkt')"


# Bundle app source
COPY logging.ini ./logging.ini
COPY memas ./memas
COPY --chmod=0755 memas-docker/wait-for-it.sh ./wait-for-it.sh
COPY --chmod=0755 memas-docker/init.sh ./init.sh


# Copy in the default config
ARG conf_file=memas-config.yml
ENV conf_file=${conf_file}
COPY memas-docker/${conf_file} ./memas/${conf_file}
# TODO: provide way to use custom configs in docker compose


# Set the python path to include memas, since memas isn't technically a python package
ENV PYTHONPATH="${PYTHONPATH}:memas"


EXPOSE 8010
CMD gunicorn -b :8010 -w 1 -k eventlet "memas.app:create_app(config_filename=\"${conf_file}\")"
56 changes: 51 additions & 5 deletions docker-compose.yml
@@ -1,14 +1,56 @@
version: '3'

services:

  memas-init:
    build:
      context: .
    image: memas:latest
    container_name: memas-init
    depends_on:
      scylla:
        condition: service_healthy
      milvus:
        condition: service_started
      es01:
        condition: service_started
    env_file:
      - .env
    volumes:
      - memas_data:/memas
    command: /memas/init.sh 30
    profiles: ["dev"]

  memas:
    build:
      context: .
    image: memas:latest
    container_name: memas
    depends_on:
      memas-init:
        condition: service_completed_successfully
    env_file:
      - .env
    volumes:
      - memas_data:/memas
    ports:
      - 8010:8010
    # command: ./wait-for-it.sh milvus-standalone:19530 -t 300 -- gunicorn -w 1 -k eventlet 'memas.app:create_app(config_filename="memas-config.yml")'
    profiles: ["dev"]

  scylla:
    image: scylladb/scylla
    container_name: scylla
    command: --smp=2
    ports:
      - "9042:9042"
      - 9042:9042
    volumes:
      - scylla_data:/var/lib/scylla
    healthcheck:
      test: ["CMD-SHELL", "[ $$(nodetool statusgossip) = running ]"]
      interval: 10s
      timeout: 5s
      retries: 10

  etcd:
    container_name: milvus-etcd
@@ -37,7 +79,7 @@ services:
      timeout: 20s
      retries: 3

  standalone:
  milvus:
    container_name: milvus-standalone
    image: milvusdb/milvus:v2.2.8
    command: ["milvus", "run", "standalone"]
@@ -47,13 +89,14 @@ services:
    volumes:
      - milvus_data:/var/lib/milvus
    ports:
      - "19530:19530"
      - "9091:9091"
      - 19530:19530
      - 9091:9091
    depends_on:
      - "etcd"
      - "minio"

  es01:
    container_name: memas-es01
    image: elasticsearch:${ES_VERSION}
    volumes:
      - esdata01:/usr/share/elasticsearch/data
@@ -71,6 +114,9 @@ services:


volumes:
  memas_data:
    driver: local

  esdata01:
    driver: local

@@ -88,4 +134,4 @@ volumes:

networks:
  default:
    name: milvus_dev
    name: memas_dev
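
The `depends_on` entries in this compose file imply a startup order. As a rough sketch, that order can be derived with a topological sort using Python's standard `graphlib` module (Python 3.9+). The dependency map below is copied by hand from the `depends_on` sections shown above; it ignores the `condition:` values (`service_healthy`, `service_completed_successfully`), which Compose additionally waits on, so treat it as an approximation rather than a model of Compose itself.

```python
from graphlib import TopologicalSorter

# service -> services it depends on, transcribed from the depends_on entries above
DEPENDS_ON = {
    "memas": {"memas-init"},
    "memas-init": {"scylla", "milvus", "es01"},
    "milvus": {"etcd", "minio"},
}


def startup_order(depends_on):
    """Return one valid order in which every service starts after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    # Several orders are valid; only the relative constraints are guaranteed.
    print(startup_order(DEPENDS_ON))
```

Whatever order is printed, `memas` comes last: it depends on `memas-init`, which in turn depends on every backing store.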
4 changes: 3 additions & 1 deletion integration-tests/corpus/test_basic_corpus.py
@@ -7,8 +7,10 @@

corpus_name = "test corpus1"


def test_save_then_search_one_corpus(es_client):
    test_corpus = basic_corpus.BasicCorpus(uuid.uuid4(), corpus_name, ctx.corpus_metadata, ctx.corpus_doc, ctx.corpus_vec)
    test_corpus = basic_corpus.BasicCorpus(
        uuid.uuid4(), corpus_name, ctx.corpus_metadata, ctx.corpus_doc, ctx.corpus_vec)

    text1 = "The sun is high. California sunshine is great. "
    text2 = "I picked up my phone and then dropped it again. I cant seem to get a good grip on things these days. It persists into my everyday tasks"
29 changes: 29 additions & 0 deletions memas-docker/init.sh
@@ -0,0 +1,29 @@
#!/bin/bash
# Init script for MeMaS. Sleeps for x seconds to wait for service initialization

# Check if an argument is provided
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 <number-of-secs-to-sleep>"
    exit 1
fi

num=$1
version="2023-09-06"

# Check if the entered value is a valid number
if ! [[ "$num" =~ ^[0-9]+$ ]]; then
    echo "Please enter a valid number."
    exit 1
fi

# TODO: introduce actual way of waiting for the dependencies reliably, instead of sleeping. (Note even after the current health checks path, scylla is still not ready)
echo "sleeping $num"
sleep $num


if [ ! -e /memas/first-init.lock ]
then
    # Only if initialization succeeds, create the lock file and write our current version to it.
    # Chaining with && (rather than ;) keeps a failed init from leaving a lock file behind.
    # FIXME: is running flask instead of gunicorn a security concern? Gunicorn keeps trying to restart the worker even though we are exiting intentionally
    flask --app "memas.app:create_app(config_filename=\"$conf_file\", first_init=True)" run && echo "$version" > /memas/first-init.lock
fi
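
The TODO in this script asks for a reliable way to wait for dependencies instead of sleeping. One common approach, sketched below, is to poll a readiness check until it passes or a deadline expires. Both helper names are hypothetical and not part of MeMaS. Note that an open TCP port is only a weak signal: as the TODO itself observes, Scylla may accept connections before it is actually ready, so a real check would likely need a service-level probe.

```python
import socket
import time
from typing import Callable


def tcp_open(host: str, port: int, connect_timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return True
    except OSError:
        return False


def wait_until(check: Callable[[], bool], timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Call `check` repeatedly until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With the hostnames and ports from `memas-docker/memas-config.yml`, a caller could write `wait_until(lambda: tcp_open("scylla", 9042), timeout=300)` and repeat the call for `memas-es01:9200` and `milvus-standalone:19530` before running the first-time initialization.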
13 changes: 13 additions & 0 deletions memas-docker/memas-config.yml
@@ -0,0 +1,13 @@
CASSANDRA:
  ip: "scylla"
  port: 9042
  keyspace: "memas"
  replication_factor: 1

ELASTICSEARCH:
  ip: "memas-es01"
  port: 9200

MILVUS:
  ip: "milvus-standalone"
  port: 19530