Ci cd build #85

Merged
merged 294 commits into from
Jan 31, 2021
f8b8610
adding flush
callahantiff Jan 14, 2021
6f911dd
trying sysout
callahantiff Jan 14, 2021
fc48776
adding logginfg
callahantiff Jan 14, 2021
ef9dadf
fixing directory structure
callahantiff Jan 14, 2021
31ab6f7
repairing file_loc default
callahantiff Jan 14, 2021
58388c8
fixing bad filepath
callahantiff Jan 14, 2021
a9d4c7a
adding owltools
callahantiff Jan 14, 2021
cc53ce4
sub subprocess
callahantiff Jan 14, 2021
ee04bea
fixing owltoolsdownload link
callahantiff Jan 14, 2021
01a11f6
adding memory
callahantiff Jan 14, 2021
3184a72
adding cpu
callahantiff Jan 14, 2021
822f56a
fixing memory spec
callahantiff Jan 14, 2021
ee3cf62
trying 7.99G
callahantiff Jan 14, 2021
ff230dd
adding gib
callahantiff Jan 14, 2021
47c73f8
adding backslash
callahantiff Jan 14, 2021
63806a2
increasing timeout
callahantiff Jan 14, 2021
2c86da0
updated all build scripts
callahantiff Jan 14, 2021
26bb3cf
commenting gce workflow
callahantiff Jan 14, 2021
7e070da
switching to beta for longer timeout
callahantiff Jan 14, 2021
1b5b020
adding quiet
callahantiff Jan 14, 2021
4c1a1fd
adding line wrap
callahantiff Jan 14, 2021
df0e1b4
preparing for pypi release
callahantiff Jan 19, 2021
ff144b3
updating docs for pypi pub prac
callahantiff Jan 19, 2021
d24f5ec
silencing build workflow
callahantiff Jan 19, 2021
35fb048
updating workflow
callahantiff Jan 19, 2021
0a040a7
removing unused workflow line
callahantiff Jan 19, 2021
ad7833a
refreshing docker container
callahantiff Jan 19, 2021
5ab2c0b
removing tagline
callahantiff Jan 19, 2021
18a0af2
testing build
callahantiff Jan 19, 2021
f56d47f
fixing owltools link
callahantiff Jan 19, 2021
3c04d10
fixing filepath error
callahantiff Jan 20, 2021
6da515f
updating github actions
callahantiff Jan 20, 2021
3a31302
removing dash in job name
callahantiff Jan 20, 2021
894b3af
removing colon from job name
callahantiff Jan 20, 2021
0f422f4
adding another step
callahantiff Jan 20, 2021
dd42181
testing wait
callahantiff Jan 20, 2021
bed3bf1
adding job monitoring
callahantiff Jan 20, 2021
21e3a81
take 2
callahantiff Jan 20, 2021
b941c40
adding file directory
callahantiff Jan 20, 2021
ca19828
adding credential
callahantiff Jan 20, 2021
a8a4379
fixing parameter name
callahantiff Jan 20, 2021
eceec03
writing file
callahantiff Jan 20, 2021
3e2c5b1
write key
callahantiff Jan 20, 2021
8ce2663
wrapping secret
callahantiff Jan 20, 2021
fe63601
passing file
callahantiff Jan 20, 2021
148cb0e
setting python env var
callahantiff Jan 20, 2021
603efd9
removed google dep
callahantiff Jan 20, 2021
03ce9ad
adding python back
callahantiff Jan 20, 2021
954c661
combining monitoring and run steps
callahantiff Jan 20, 2021
9a3f6b1
updating builds
callahantiff Jan 20, 2021
ca8fe54
updating workflow
callahantiff Jan 20, 2021
a171fb9
removing uneeded file
callahantiff Jan 20, 2021
eb9b10f
adding pypi push
callahantiff Jan 20, 2021
a0a68dd
fixing import
callahantiff Jan 20, 2021
3e5939c
updating package publish
callahantiff Jan 20, 2021
79a49ae
fixing import error
callahantiff Jan 20, 2021
c6d6a2a
testing phase 2
callahantiff Jan 20, 2021
84da26e
finalized workflow 1
callahantiff Jan 20, 2021
be0de8c
adding print statements
callahantiff Jan 20, 2021
6e207a3
fixing yml error
callahantiff Jan 20, 2021
cf4766c
removing typo in build name
callahantiff Jan 20, 2021
4bdd79e
fixing directory issue
callahantiff Jan 20, 2021
36d2efe
adding missing library
callahantiff Jan 20, 2021
15f4c3e
only testing remaining methods
callahantiff Jan 20, 2021
973f7f8
extending owltools memory
callahantiff Jan 20, 2021
f1e4184
bumping memory
callahantiff Jan 20, 2021
4c30b36
memory bump
callahantiff Jan 20, 2021
ca6d444
updating download logic
callahantiff Jan 20, 2021
49dacfc
updating bucket search criteria
callahantiff Jan 20, 2021
c29ea80
fixing file reading error
callahantiff Jan 21, 2021
537cf46
updating download logic
callahantiff Jan 21, 2021
65c802f
updated metdata logic
callahantiff Jan 21, 2021
7df32d2
adding missing return statement
callahantiff Jan 21, 2021
99cff5a
adding owltools path
callahantiff Jan 21, 2021
8d6eb08
removing directory
callahantiff Jan 21, 2021
7cda24a
updating merge func with owl_tools
callahantiff Jan 21, 2021
b7af7db
adding merge method
callahantiff Jan 21, 2021
d9d2b63
updating merge func
callahantiff Jan 21, 2021
41ef57c
fixing typo
callahantiff Jan 21, 2021
46e052f
adding missing gene key
callahantiff Jan 21, 2021
d3f1ef5
catching errors
callahantiff Jan 21, 2021
7a3aacd
cleaning up work flow
callahantiff Jan 21, 2021
6d5ed82
testing final part of phase 2
callahantiff Jan 21, 2021
3515b65
updating logic
callahantiff Jan 21, 2021
745ddea
added missing argument
callahantiff Jan 21, 2021
bc2a787
updating workflow
callahantiff Jan 21, 2021
a0a97a0
renaming endpoint script
callahantiff Jan 21, 2021
16cc530
finalizing file
callahantiff Jan 21, 2021
3ac9699
adding phase 3
callahantiff Jan 21, 2021
1d3cb4c
addingmissing argument
callahantiff Jan 21, 2021
ca8693f
adding startup script
callahantiff Jan 21, 2021
238babd
specifying zone
callahantiff Jan 21, 2021
7d7106d
not stopping
callahantiff Jan 21, 2021
1203ce4
testing ssh
callahantiff Jan 21, 2021
fadf703
calling startup from ssh command
callahantiff Jan 21, 2021
6a44301
adding curly braces
callahantiff Jan 22, 2021
4ec7b8f
changing container name
callahantiff Jan 22, 2021
f048f81
switch to image
callahantiff Jan 22, 2021
005db3c
fixing end of line error
callahantiff Jan 22, 2021
8f9e0e2
adding auth to bash script
callahantiff Jan 22, 2021
8f298be
trying different strategy
callahantiff Jan 22, 2021
7a7bf9e
fixinf typo
callahantiff Jan 22, 2021
693b7ea
changing instance name
callahantiff Jan 22, 2021
331b86d
adding image
callahantiff Jan 22, 2021
aac88f7
fixinf syntax
callahantiff Jan 22, 2021
ce4f9bf
syntax again...
callahantiff Jan 22, 2021
797b165
chaining arguments
callahantiff Jan 22, 2021
6ee6da8
trying another approach
callahantiff Jan 22, 2021
c3aeb8f
changing approach
callahantiff Jan 22, 2021
842c82d
adding missing file
callahantiff Jan 22, 2021
2627547
adding missing files
callahantiff Jan 22, 2021
cd6a572
fixing errors
callahantiff Jan 22, 2021
3455d84
adding scope arg
callahantiff Jan 22, 2021
8afa9bc
changing job name
callahantiff Jan 22, 2021
73515aa
renamed and moved files
callahantiff Jan 22, 2021
fac2260
adding logging
callahantiff Jan 22, 2021
632ada1
updated file
callahantiff Jan 22, 2021
4af1274
testing phases 1-2
callahantiff Jan 22, 2021
4b2ba60
fixing logging error
callahantiff Jan 22, 2021
b6d1453
dup code removed
callahantiff Jan 22, 2021
261c036
uncommenting code
callahantiff Jan 22, 2021
720c4f3
logging error
callahantiff Jan 22, 2021
9f22acb
fixing write path
callahantiff Jan 22, 2021
4012f78
adding logging library
callahantiff Jan 22, 2021
15c31e3
adding formal logging
callahantiff Jan 22, 2021
08cd608
adding missing module
callahantiff Jan 22, 2021
45360ec
fixing filepath
callahantiff Jan 22, 2021
8411fd1
updating filepath
callahantiff Jan 22, 2021
f7ceba8
removing duplicate resource
callahantiff Jan 22, 2021
ade8d75
moved entry point
callahantiff Jan 23, 2021
1f29225
updated documentation
callahantiff Jan 23, 2021
4539a0f
testing job monitoring
callahantiff Jan 23, 2021
e209079
moving import
callahantiff Jan 23, 2021
11f36eb
installing pkt_kg
callahantiff Jan 23, 2021
268194a
updating logic
callahantiff Jan 23, 2021
5dc7b37
test phase 3 monitoring
callahantiff Jan 23, 2021
c3aaa4c
adding missing module
callahantiff Jan 23, 2021
5881fec
adding ga workflow jobs
callahantiff Jan 23, 2021
d362334
updating code
callahantiff Jan 23, 2021
8beb53d
Merge branch 'ci_cd_build' of github.com:callahantiff/PheKnowLator in…
callahantiff Jan 23, 2021
a148763
testing phase 1 console log
callahantiff Jan 23, 2021
0cf44ca
updating yml
callahantiff Jan 23, 2021
65943c5
adding missing module
callahantiff Jan 23, 2021
42ac482
phase 3 test
callahantiff Jan 23, 2021
e8572e3
adding note about finding build logs
callahantiff Jan 23, 2021
c0fc738
updating logging
callahantiff Jan 23, 2021
c0d3285
Merge branch 'ci_cd_build' of github.com:callahantiff/PheKnowLator in…
callahantiff Jan 23, 2021
3e532ee
fixing logger call
callahantiff Jan 23, 2021
48b9117
take 2
callahantiff Jan 23, 2021
4e78229
testing step 3
callahantiff Jan 23, 2021
954fdb3
test both steps
callahantiff Jan 23, 2021
ce60756
updating logging
callahantiff Jan 23, 2021
bfa1d8c
moving timestamp
callahantiff Jan 23, 2021
d0ed684
fixing log naming issue
callahantiff Jan 23, 2021
2bee40c
adding container restart policy
callahantiff Jan 23, 2021
0257018
testing env vars
callahantiff Jan 24, 2021
7252e0f
removing needs
callahantiff Jan 24, 2021
a9de7ac
changing syntax
callahantiff Jan 24, 2021
110d140
exp 2
callahantiff Jan 24, 2021
cf486ef
fixing value error
callahantiff Jan 24, 2021
4388634
exp3
callahantiff Jan 24, 2021
947ccf3
testing build with env vars
callahantiff Jan 24, 2021
22f1134
remove uneeded scripts
callahantiff Jan 24, 2021
8b15c80
updating env
callahantiff Jan 24, 2021
2ee9f7a
exp5
callahantiff Jan 24, 2021
602ae1f
undoing formatting
callahantiff Jan 24, 2021
4f5f121
fixing EOF error
callahantiff Jan 24, 2021
03343cf
editing job name var
callahantiff Jan 24, 2021
28e97d7
again ...
callahantiff Jan 24, 2021
f169960
again .... 2
callahantiff Jan 24, 2021
aa18a70
again ... 3
callahantiff Jan 24, 2021
1e6582e
rolling back job_name env
callahantiff Jan 24, 2021
11bd68e
changing defaults
callahantiff Jan 24, 2021
9415020
fixing missing filename
callahantiff Jan 24, 2021
a08d947
fixing missing log reference
callahantiff Jan 24, 2021
4f39c34
fixing queueing
callahantiff Jan 24, 2021
9757e21
fixing logic
callahantiff Jan 24, 2021
3745370
adding addiitonal logging
callahantiff Jan 24, 2021
396b534
adding log pushing method
callahantiff Jan 24, 2021
97b84c4
uncommenting methods
callahantiff Jan 24, 2021
0e01fa9
removing untracked files
callahantiff Jan 24, 2021
0bb3d41
testing overhaul for directory re-org
callahantiff Jan 24, 2021
f65ab2e
updating build directory structure
callahantiff Jan 24, 2021
c9b91d7
updating flawed logic
callahantiff Jan 24, 2021
a9d546b
Merge branch 'ci_cd_build' of github.com:callahantiff/PheKnowLator in…
callahantiff Jan 24, 2021
75baa4d
cleaning logs dir before run
callahantiff Jan 24, 2021
44b0763
updated file paths
callahantiff Jan 25, 2021
40d2dbd
adding build utilities
callahantiff Jan 25, 2021
2fcb5d2
testing p1/p2
callahantiff Jan 25, 2021
221ce44
updating env path
callahantiff Jan 25, 2021
092c03a
adding missing library
callahantiff Jan 25, 2021
4e66138
changing filepath
callahantiff Jan 25, 2021
cc8a210
updating dockerfile
callahantiff Jan 25, 2021
65c3169
adding log files
callahantiff Jan 25, 2021
c96bac5
fixing path issues
callahantiff Jan 25, 2021
0bdda38
fixing pythonpath
callahantiff Jan 25, 2021
1488538
removed uneeded script
callahantiff Jan 25, 2021
850572d
updating owltools loc
callahantiff Jan 25, 2021
cd40c93
re-running job
callahantiff Jan 25, 2021
f516024
fixing removed file
callahantiff Jan 25, 2021
fa140cd
fixing logic
callahantiff Jan 25, 2021
981ff35
verified workflow
callahantiff Jan 25, 2021
1510fae
adding google api exceptions
callahantiff Jan 25, 2021
7775f0a
simlifying downloader
callahantiff Jan 25, 2021
c6acedf
removing redundant func
callahantiff Jan 25, 2021
439d7fa
reordering method
callahantiff Jan 25, 2021
ad8ae6e
adding logging steps
callahantiff Jan 25, 2021
4027e33
adding missing info
callahantiff Jan 25, 2021
3f8fb9a
updating curated data path
callahantiff Jan 25, 2021
c03bb46
adding command to delete existing logs
callahantiff Jan 25, 2021
6faa7dc
evaluation args
callahantiff Jan 25, 2021
abf0d11
forcce std out to message
callahantiff Jan 25, 2021
aa7e8f8
adding phase 1 back
callahantiff Jan 25, 2021
8af54a8
making exit more explicit
callahantiff Jan 25, 2021
fe9b613
fixing broken method
callahantiff Jan 25, 2021
5a8f326
fixing broken links
callahantiff Jan 25, 2021
74183f1
patching dl
callahantiff Jan 25, 2021
ab1b810
dedup github action stdout
callahantiff Jan 25, 2021
12258da
fixing owltools ref
callahantiff Jan 26, 2021
c51f4ff
fixing owltools
callahantiff Jan 26, 2021
6d94aef
removing comments
callahantiff Jan 26, 2021
639685e
testing exit method
callahantiff Jan 26, 2021
c0ddfec
testing quit method
callahantiff Jan 26, 2021
53a7f87
reverse commenting
callahantiff Jan 26, 2021
80d901c
update yml to test step 3
callahantiff Jan 26, 2021
a3d1192
adding async process runner
callahantiff Jan 26, 2021
2eddb70
adding timer
callahantiff Jan 26, 2021
4676112
fixing docker spec and script name
callahantiff Jan 26, 2021
466496b
adding logging
callahantiff Jan 26, 2021
c11baee
test kg build
callahantiff Jan 26, 2021
c41ef46
repairing line wrap
callahantiff Jan 26, 2021
df77487
adding missing module
callahantiff Jan 26, 2021
013271b
specifying boot disc for instance
callahantiff Jan 27, 2021
fc936b0
fixing container specs
callahantiff Jan 27, 2021
f23eda8
fixing docker curl calls
callahantiff Jan 27, 2021
3ce2ab1
removed extra carriage return
callahantiff Jan 27, 2021
937af7d
adding kg stats to logging
callahantiff Jan 27, 2021
d9eb3ce
fixing docker dl
callahantiff Jan 27, 2021
89d1234
extending functionality
callahantiff Jan 27, 2021
ff24f2e
Jan kg build stats
callahantiff Jan 27, 2021
801c567
blocked yml for all kg builds
callahantiff Jan 27, 2021
e8c9e19
fixing download assumption
callahantiff Jan 27, 2021
e1eb3a3
updating stats function
callahantiff Jan 27, 2021
a7fd7d1
re-arranged funcs
callahantiff Jan 27, 2021
85cd868
adding additional logging
callahantiff Jan 27, 2021
c7a792c
updating github workflow
callahantiff Jan 28, 2021
cdde933
adding note to documentation for clarification
callahantiff Jan 28, 2021
b5d7e03
adding logo
callahantiff Jan 29, 2021
bd4a1d1
finalizing draft build
callahantiff Jan 30, 2021
ff6a8bc
Merge branch 'master' into ci_cd_build
callahantiff Jan 31, 2021
18 changes: 14 additions & 4 deletions .github/workflows/build-qa.yml
@@ -54,7 +54,7 @@ jobs:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

publish_docker_container:
if: github.event_name == 'release' && startsWith(github.ref, 'refs/tags')
if: github.event_name == 'push'
needs: build
name: Push Docker Image to Docker Hub
runs-on: ubuntu-latest
@@ -78,13 +78,23 @@ jobs:
run: echo ${{ steps.docker_build.outputs.digest }}

publish_pypi_library:
if: github.event_name == 'release' && startsWith(github.ref, 'refs/tags')
if: github.event_name == 'release'
needs: build
name: Publishes pkt_kg to PyPI
runs-on: ubuntu-latest
steps:
- name: Publish package
if: github.event_name == 'push' && startsWith(github.ref, 'refs/tags')
- uses: actions/checkout@v2
with:
fetch-depth: 0
- name: Setup Python
uses: actions/setup-python@v2
with:
python-version: 3.6
- name: Install Requirements and Dependencies
run: pip install --upgrade pip setuptools wheel
- name: Create Package Distribution
run: python setup.py sdist bdist_wheel
- name: Publish Package tp PyPi
uses: pypa/gh-action-pypi-publish@master
with:
user: __token__
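Review note: the job gates in this hunk can be read as simple predicates. The sketch below is only an illustration in Python, assuming the second `if:` line of each pair above is the added one (the page has lost its `+`/`-` markers). The function names are hypothetical and not part of the workflow.

```python
def old_release_gate(event_name: str, ref: str) -> bool:
    """Mirror of `github.event_name == 'release' && startsWith(github.ref, 'refs/tags')`."""
    # GitHub documents startsWith as case-insensitive, hence the lower() calls.
    return event_name == "release" and ref.lower().startswith("refs/tags")


def docker_gate(event_name: str) -> bool:
    """Mirror of `if: github.event_name == 'push'` on publish_docker_container."""
    return event_name == "push"


def pypi_gate(event_name: str) -> bool:
    """Mirror of `if: github.event_name == 'release'` on publish_pypi_library."""
    return event_name == "release"


# A plain push to a branch: only the Docker job would run under the new gates.
print(docker_gate("push"), pypi_gate("push"))
# A tagged release: only the PyPI job would run under the new gates.
print(docker_gate("release"), pypi_gate("release"))
```

Under that reading, the Docker image is rebuilt on every push, while the PyPI upload remains tied to releases.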
509 changes: 482 additions & 27 deletions .github/workflows/kg-build.yml

Large diffs are not rendered by default.

13 changes: 8 additions & 5 deletions .gitignore
@@ -8,9 +8,9 @@
__pycache__
__pycache__/
**/__pycache__/*
build/**
pkt_kg.egg*
build/*
dist/*
pkt_kg.egg-info/*

#### Testing
.single_run
@@ -20,10 +20,13 @@ coverage.xml
\.vscode/
/tests/archive/

#### Logs
logs/*.log
**/logs/*.log

#### CI/CD Builds
builds/temp/*
builds/GitHub Action Workflow Build Jobs.xlsx

/builds/GitHub Action Workflow Build Jobs.xlsx

#### API Keys/Project Passwords
/resources/project_keys/*
@@ -63,4 +66,4 @@ scratch*.py
!/resources/relations_data/README.md

## Releases
/releases/*
/releases/*
6 changes: 3 additions & 3 deletions Data_Preparation.ipynb
@@ -944,7 +944,7 @@
"outputs": [],
"source": [
"# download data\n",
"url = 'https://storage.googleapis.com/pheknowlator/release_v2.0.0/curated_data/genomic_typing_dict.pkl'\n",
"url = 'https://storage.googleapis.com/pheknowlator/curated_data/genomic_typing_dict.pkl'\n",
"if not os.path.exists(unprocessed_data_location + 'genomic_typing_dict.pkl'):\n",
" data_downloader(url, unprocessed_data_location)\n",
"\n",
@@ -2547,7 +2547,7 @@
"outputs": [],
"source": [
"# download data\n",
"url='https://storage.googleapis.com/pheknowlator/release_v2.0.0/curated_data/zooma_tissue_cell_mapping_04JAN2020.xlsx'\n",
"url='https://storage.googleapis.com/pheknowlator/curated_data/zooma_tissue_cell_mapping_04JAN2020.xlsx'\n",
"if not os.path.exists(unprocessed_data_location + 'zooma_tissue_cell_mapping_04JAN2020.xlsx'):\n",
" data_downloader(url, unprocessed_data_location)\n",
" \n",
@@ -3055,7 +3055,7 @@
"outputs": [],
"source": [
"# download data\n",
"url='https://storage.googleapis.com/pheknowlator/release_v2.0.0/curated_data/genomic_sequence_ontology_mappings.xlsx'\n",
"url='https://storage.googleapis.com/pheknowlator/curated_data/genomic_sequence_ontology_mappings.xlsx'\n",
"if not os.path.exists(unprocessed_data_location + 'genomic_sequence_ontology_mappings.xlsx'):\n",
" data_downloader(url, unprocessed_data_location)\n",
"\n",
41 changes: 30 additions & 11 deletions Dockerfile
@@ -1,5 +1,5 @@
#!/usr/local/bin/docker
# -*- version: 19.03.8, build afacb8b -*-
# -*- version: 20.10.2 -*-

############################################
## MULTI-STAGE CONTAINER CONFIGURATION ##
@@ -19,25 +19,44 @@ RUN wget -O- https://apt.corretto.aws/corretto.key | apt-key add - && \
## PHEKNOWLATOR (PKT_KG) PROJECT SETTINGS ##
# create needed project directories
WORKDIR /PheKnowLator
# I THINK WE NEED LINE 23 -- which I had to delete to get it to run past this point
#RUN mkdir /PheKnowLator
RUN mkdir /PheKnowLator/resources
RUN mkdir /PheKnowLator/resources/edge_data
RUN mkdir /PheKnowLator/resources/knowledge_graphs
RUN mkdir /PheKnowLator/resources/ontologies

# copy pkt_kg scripts
COPY pkt_kg /PheKnowLator/pkt_kg
RUN mkdir -p /PheKnowLator
RUN mkdir -p /PheKnowLator/resources
RUN mkdir -p /PheKnowLator/resources/construction_approach
RUN mkdir -p /PheKnowLator/resources/edge_data
RUN mkdir -p /PheKnowLator/resources/knowledge_graphs
RUN mkdir -p /PheKnowLator/resources/node_data
RUN mkdir -p /PheKnowLator/resources/ontologies
RUN mkdir -p /PheKnowLator/resources/processed_data
RUN mkdir -p /PheKnowLator/resources/relations_data

# copy scripts/files needed to run pkt_kg
COPY pkt_kg /PheKnowLator/pkt_kg
COPY Main.py /PheKnowLator
COPY setup.py /PheKnowLator
COPY README.rst /PheKnowLator
COPY resources /PheKnowLator/resources

# download and copy needed data
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/dependencies/edge_source_list.txt && mv edge_source_list.txt resources/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/dependencies/ontology_source_list.txt && mv ontology_source_list.txt resources/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/dependencies/resource_info.txt && mv resource_info.txt resources/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/dependencies/subclass_construction_map.pkl && mv subclass_construction_map.pkl resources/construction_approach/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/dependencies/PheKnowLator_MergedOntologies.owl && mv PheKnowLator_MergedOntologies.owl resources/knowledge_graphs/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/dependencies/node_metadata_dict.pkl && mv node_metadata_dict.pkl resources/node_data/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/data/processed_data/DISEASE_MONDO_MAP.txt && mv DISEASE_MONDO_MAP.txt resources/processed_data/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/data/processed_data/ENSEMBL_GENE_ENTREZ_GENE_MAP.txt && mv ENSEMBL_GENE_ENTREZ_GENE_MAP.txt resources/processed_data/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/data/processed_data/ENTREZ_GENE_PRO_ONTOLOGY_MAP.txt && mv ENTREZ_GENE_PRO_ONTOLOGY_MAP.txt resources/processed_data/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/data/processed_data/GENE_SYMBOL_ENSEMBL_TRANSCRIPT_MAP.txt && mv GENE_SYMBOL_ENSEMBL_TRANSCRIPT_MAP.txt resources/processed_data/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/data/processed_data/HPA_GTEx_TISSUE_CELL_MAP.txt && mv HPA_GTEx_TISSUE_CELL_MAP.txt resources/processed_data/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/data/processed_data/MESH_CHEBI_MAP.txt && mv MESH_CHEBI_MAP.txt resources/processed_data/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/data/processed_data/PHENOTYPE_HPO_MAP.txt && mv PHENOTYPE_HPO_MAP.txt resources/processed_data/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/data/processed_data/STRING_PRO_ONTOLOGY_MAP.txt && mv STRING_PRO_ONTOLOGY_MAP.txt resources/processed_data/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/data/processed_data/UNIPROT_ACCESSION_PRO_ONTOLOGY_MAP.txt && mv UNIPROT_ACCESSION_PRO_ONTOLOGY_MAP.txt resources/processed_data/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/dependencies/INVERSE_RELATIONS.txt && mv INVERSE_RELATIONS.txt resources/relations_data/
RUN curl -O https://storage.googleapis.com/pheknowlator/current_build/dependencies/RELATIONS_LABELS.txt && mv RELATIONS_LABELS.txt resources/relations_data/

# install needed python libraries
RUN pip install --upgrade pip setuptools
# BILL -- CAN WE REMOVE LINE 41 if we had to add line 21 to get the container to run
WORKDIR /PheKnowLator
RUN pip install .
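Review note: the seventeen `curl -O ... && mv ...` lines above all follow one pattern (a path under `current_build`, then a destination under `resources/`). A minimal sketch of that pattern as a lookup table, in Python, assuming nothing beyond the URLs and directories shown in the Dockerfile; `DOWNLOADS` and `plan_downloads` are hypothetical names, only a subset of the files is listed, and no network call is made.

```python
from pathlib import PurePosixPath

BASE_URL = "https://storage.googleapis.com/pheknowlator/current_build"

# (path under BASE_URL, sub-directory of resources/); a subset of the Dockerfile's files.
DOWNLOADS = [
    ("dependencies/edge_source_list.txt", ""),
    ("dependencies/subclass_construction_map.pkl", "construction_approach"),
    ("dependencies/PheKnowLator_MergedOntologies.owl", "knowledge_graphs"),
    ("dependencies/node_metadata_dict.pkl", "node_data"),
    ("data/processed_data/DISEASE_MONDO_MAP.txt", "processed_data"),
    ("dependencies/INVERSE_RELATIONS.txt", "relations_data"),
]


def plan_downloads(downloads=DOWNLOADS, root="resources"):
    """Return (url, destination path) pairs without touching the network."""
    plan = []
    for remote, subdir in downloads:
        filename = PurePosixPath(remote).name
        # An empty subdir is ignored by the path join, so the file lands in root.
        destination = PurePosixPath(root) / subdir / filename
        plan.append(("{}/{}".format(BASE_URL, remote), str(destination)))
    return plan


for url, destination in plan_downloads():
    print("{} -> {}".format(url, destination))
```

Keeping the list in one place would make it easier to check that every destination has a matching `mkdir -p` line.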

1 change: 1 addition & 0 deletions MANIFEST.in
@@ -0,0 +1 @@
include pkt_kg/libs/owltools
19 changes: 9 additions & 10 deletions Main.py
@@ -21,10 +21,9 @@ def main():
parser.add_argument('-a', '--app', help='construction approach to use (i.e. instance or subclass)', required=True)
parser.add_argument('-t', '--res', help='name/path to text file containing resource_info', required=True)
parser.add_argument('-b', '--kg', help='build type: "partial", "full", or "post-closure"', required=True)
parser.add_argument('-n', '--nde', help='yes/no - non-ontology node metadata directory location', required=True)
parser.add_argument('-r', '--rel', help='yes/no - adding inverse relations to knowledge graph', required=True)
parser.add_argument('-s', '--owl', help='yes/no - removing OWL Semantics from knowledge graph', required=True)
parser.add_argument('-m', '--kgm', help='yes/no - adding node metadata to knowledge graph', required=True)
parser.add_argument('-m', '--nde', help='yes/no - adding node metadata to knowledge graph', required=True)
parser.add_argument('-o', '--out', help='name/path to directory where to write knowledge graph', required=True)

args = parser.parse_args()
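Review note: the `-r`, `-s`, and `-m` flags in this hunk all take a yes/no string. The sketch below shows one way such flags could be validated at parse time. The `choices` argument is an assumption of this sketch and does not appear in Main.py, and `build_parser` is a hypothetical helper that covers only these three flags.

```python
import argparse


def build_parser():
    """Parser covering only the three yes/no flags from the hunk above."""
    parser = argparse.ArgumentParser(description='yes/no flag sketch')
    yes_no = ['yes', 'no']  # assumption: restrict values up front instead of checking later
    parser.add_argument('-r', '--rel', choices=yes_no, required=True,
                        help='yes/no - adding inverse relations to knowledge graph')
    parser.add_argument('-s', '--owl', choices=yes_no, required=True,
                        help='yes/no - removing OWL Semantics from knowledge graph')
    parser.add_argument('-m', '--nde', choices=yes_no, required=True,
                        help='yes/no - adding node metadata to knowledge graph')
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(['-r', 'no', '-s', 'no', '-m', 'yes'])
print(args.rel, args.owl, args.nde)
```

With `choices` set, argparse exits with a usage error for any value other than `yes` or `no`.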
@@ -40,44 +39,44 @@ def main():
# see the 'Data_Preparation.ipynb' and 'Ontology_Cleaning.ipynb' file for examples and guidelines

# STEP 3: DOWNLOAD ONTOLOGIES
print('\n' + '=' * 33 + '\nDOWNLOADING DATA: ONTOLOGY DATA\n' + '=' * 33 + '\n')
print('\n' + '=' * 33 + '\nPKT: DOWNLOADING DATA: ONTOLOGY DATA\n' + '=' * 33 + '\n')
start = time.time()
ont = OntData(data_path=args.onts, resource_data=args.res)
# ont = OntData(data_path='resources/ontology_source_list.txt', resource_data='resources/resource_info.txt')
ont.downloads_data_from_url()
end = time.time()
timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
print('\nTOTAL SECONDS TO DOWNLOAD ONTOLOGIES: {} @ {}'.format(end - start, timestamp))
print('\nPKT: TOTAL SECONDS TO DOWNLOAD ONTOLOGIES: {} @ {}'.format(end - start, timestamp))

# STEP 4: DOWNLOAD EDGE DATA SOURCES
print('\n' + '=' * 33 + '\nDOWNLOADING DATA: CLASS DATA\n' + '=' * 33 + '\n')
print('\n' + '=' * 33 + '\nPKT: DOWNLOADING DATA: CLASS DATA\n' + '=' * 33 + '\n')
start = time.time()
ent = LinkedData(data_path=args.edg, resource_data=args.res)
# ent = LinkedData(data_path='resources/edge_source_list.txt', resource_data='resources/resource_info.txt')
ent.downloads_data_from_url()
end = time.time()
timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
print('\nTOTAL SECONDS TO DOWNLOAD NON-ONTOLOGY DATA: {} @ {}'.format(end - start, timestamp))
print('\nPKT: TOTAL SECONDS TO DOWNLOAD NON-ONTOLOGY DATA: {} @ {}'.format(end - start, timestamp))

#####################
# CREATE EDGE LISTS #
#####################

print('\n' + '=' * 33 + '\nPROCESSING EDGE DATA\n' + '=' * 33 + '\n')
print('\n' + '=' * 33 + '\nPKT: PROCESSING EDGE DATA\n' + '=' * 33 + '\n')
start = time.time()
combined_edges = dict(ent.data_files, **ont.data_files)
master_edges = CreatesEdgeList(data_files=combined_edges, source_file=args.res)
# master_edges = CreatesEdgeList(data_files=combined_edges, source_file='resources/resource_info.txt')
master_edges.creates_knowledge_graph_edges()
end = time.time()
timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
print('\nTOTAL SECONDS TO BUILD THE MASTER EDGE LIST: {} @ {}'.format(end - start, timestamp))
print('\nPKT: TOTAL SECONDS TO BUILD THE MASTER EDGE LIST: {} @ {}'.format(end - start, timestamp))

#########################
# BUILD KNOWLEDGE GRAPH #
#########################

print('\n' + '=' * 33 + '\nBUILDING KNOWLEDGE GRAPH\n' + '=' * 33 + '\n')
print('\n' + '=' * 33 + '\nPKT: BUILDING KNOWLEDGE GRAPH\n' + '=' * 33 + '\n')
start = time.time()

if args.kg == 'partial':
@@ -102,7 +101,7 @@ def main():

end = time.time()
timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
print('\nTOTAL SECONDS TO CONSTRUCT A KG: {} @ {}'.format(end - start, timestamp))
print('\nPKT: TOTAL SECONDS TO CONSTRUCT A KG: {} @ {}'.format(end - start, timestamp))


if __name__ == '__main__':
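Review note: every phase in Main.py repeats the same banner, `time.time()` pair, and `PKT:`-prefixed summary. A minimal sketch of that pattern as a context manager follows. The helper names `banner`, `summary_line`, and `timed_phase` are hypothetical and not part of pkt_kg; the string layouts are copied from the prints in the hunks above.

```python
import datetime
import time
from contextlib import contextmanager


def banner(title, width=33):
    """Build the section banner in the same layout as the prints in Main.py."""
    rule = '=' * width
    return '\n' + rule + '\nPKT: {}\n'.format(title) + rule + '\n'


def summary_line(task, seconds, timestamp):
    """Build the closing line, e.g. 'PKT: TOTAL SECONDS TO CONSTRUCT A KG: ...'."""
    return '\nPKT: TOTAL SECONDS TO {}: {} @ {}'.format(task, seconds, timestamp)


@contextmanager
def timed_phase(title, task):
    """Print the banner, run the wrapped block, then print the elapsed seconds."""
    print(banner(title))
    start = time.time()
    yield
    end = time.time()
    timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    print(summary_line(task, end - start, timestamp))


with timed_phase('PROCESSING EDGE DATA', 'BUILD THE MASTER EDGE LIST'):
    total = sum(range(1000))  # stand-in for master_edges.creates_knowledge_graph_edges()
```

As in the original code, the summary line is not printed if the wrapped block raises.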
28 changes: 16 additions & 12 deletions Ontology_Cleaning.ipynb
@@ -28,7 +28,7 @@
"**Dependencies:** \n",
"- <u>Scripts</u>: This notebook utilizes several helper functions, which are stored in [utility scripts](https://github.com/callahantiff/PheKnowLator/blob/master/pkt_kg/utils) and the main code to perform the error checks in the [`builds/temp/ontology_cleaning.py`](https://github.com/callahantiff/PheKnowLator/blob/master/builds/ontology_cleaning.py) script.\n",
"- <u>Software</u>:[`OWLTools`](https://github.com/owlcollab/owltools) \n",
"- <u>Data</u>: [`Merged_gene_rna_protein_identifiers.pkl`](https://storage.googleapis.com/pheknowlator/release_v2.0.0/current_build/data/processed_data/Merged_gene_rna_protein_identifiers.pkl), which is automatically downloaded to the `./resources/ontologies` directory \n",
"- <u>Data</u>: [`Merged_gene_rna_protein_identifiers.pkl`](https://storage.googleapis.com/pheknowlator/current_build/data/processed_data/Merged_gene_rna_protein_identifiers.pkl), which is automatically downloaded to the `./resources/ontologies` directory \n",
"\n",
"<br>\n",
"\n",
@@ -194,6 +194,7 @@
"import datetime\n",
"import glob\n",
"import pickle\n",
"import shutil\n",
"\n",
"from rdflib import Graph\n",
"from tqdm import tqdm\n",
@@ -299,7 +300,10 @@
"ont_data = OntologyCleaner('', '', '', write_location)\n",
"\n",
"# updating ontology info dictionary\n",
"ont_data.ontology_info = {k.split('/')[-1]: {} for k, v in ont_data.ontology_info.items()}"
"ont_data.ontology_info = {k.split('/')[-1]: {} for k, v in ont_data.ontology_info.items()}\n",
"\n",
"# set owl tools location\n",
"ont_data.owltools_location = './pkt_kg/libs/owltools'"
]
},
{
@@ -426,15 +430,6 @@
"ontology_file_formatter(knowledge_graphs_location, '/' + ont_data.ont_file_location, ont_data.owltools_location)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ontology_file_formatter(knowledge_graphs_location, '/' + ont_data.ont_file_location, ont_data.owltools_location)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -480,6 +475,11 @@
" else: o.write('\\t\\t- {}\\n'.format(x['Normalized - Duplicates']))\n",
" o.write('\\t\\t- Other Classes that May Need Normalization: {}\\n'.format(x['Normalized - NonOnt']))\n",
" o.write('\\t\\t- Normalized HGNC IDs: {}\\n'.format(x['Normalized - Gene IDs']))\n",
" o.write('\\t- Deprecated Ontology HGNC Identifiers Needing Alignment:\\n')\n",
" if x['Normalized - Dep'] != 'None':\n",
" for i in x['Normalized - Dep']: o.write('\\t\\t- {}\\n'.format(i))\n",
" else: o.write('\\t\\t- {}\\n'.format(x['Normalized - Dep']))\n",
" \n",
"o.close()"
]
},
@@ -498,7 +498,11 @@
"source": [
"# remove temp file in resources/ontologies\n",
"os.remove(write_location + '/' + ont_data.ont_file_location)\n",
"os.remove(write_location + '/Merged_gene_rna_protein_identifiers.pkl')"
"os.remove(write_location + '/Merged_gene_rna_protein_identifiers.pkl')\n",
"\n",
"# # remove logs directory\n",
"# logs = glob.glob('*/logs/*.log')\n",
"# shutil.rmtree('/'.join(logs[0].split('/')[:-1]))"
]
},
{
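Review note: the commented-out cleanup in the last cell indexes `logs[0]`, which would raise an `IndexError` when no log file matches. A minimal sketch of a more defensive variant follows. `remove_log_dirs` is a hypothetical helper and not part of the notebook, and the temporary directory exists only to make the sketch runnable.

```python
import glob
import os
import shutil
import tempfile


def remove_log_dirs(pattern='*/logs/*.log'):
    """Delete every directory holding a file that matches pattern; return the removed paths."""
    log_dirs = sorted({os.path.dirname(path) for path in glob.glob(pattern)})
    for log_dir in log_dirs:
        shutil.rmtree(log_dir, ignore_errors=True)
    return log_dirs


# Demonstration in a throwaway directory so no real logs are touched.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'builds', 'logs'))
    open(os.path.join(tmp, 'builds', 'logs', 'run.log'), 'w').close()
    removed = remove_log_dirs(os.path.join(tmp, '*', 'logs', '*.log'))
    still_there = os.path.exists(os.path.join(tmp, 'builds', 'logs'))

# With no matching files the helper returns an empty list instead of raising.
empty_result = remove_log_dirs(os.path.join('no_such_dir_for_sketch', 'logs', '*.log'))
```

Unlike the commented-out code, this removes every matching logs directory, not only the first one found.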
9 changes: 5 additions & 4 deletions README.rst
@@ -1,7 +1,7 @@
|logo|
|logo|


|github_action| |mypy|
|github_action| |mypy|

|sonar_quality| |code_climate_maintainability| |codacy| |code_climate_coverage| |coveralls|

@@ -325,11 +325,12 @@ Citing this Work
url = {https://doi.org/10.5281/zenodo.3401437}}




.. |logo| image:: https://user-images.githubusercontent.com/8030363/106306246-01df9100-621b-11eb-81c3-d1f2c2e124a6.png
:target: https://github.com/callahantiff/PheKnowLator

.. |logo| image:: https://user-images.githubusercontent.com/8030363/106306246-01df9100-621b-11eb-81c3-d1f2c2e124a6.png
:target: https://github.com/callahantiff/PheKnowLator

.. |ABRA| image:: https://img.shields.io/badge/ReproducibleResearch-AbraCollaboratory-magenta.svg
:target: https://github.com/callahantiff/Abra-Collaboratory

2 changes: 0 additions & 2 deletions builds/.dockerignore

This file was deleted.

13 changes: 0 additions & 13 deletions builds/Dockerfile

This file was deleted.
