Releases: hackoregon/data-science-pet-containers

More bug fixes

03 May 05:35

Closes issues #76, #79 and #81.

Bug fixes

05 Apr 06:24
  1. I introduced a bug in the refactoring that prevented automatic restores of .backup files (issue #75). That's fixed in this release.
  2. The script pull-postgis-raw.bash was missing the execute bits (issue #76). That's fixed too.
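
If you hit the same kind of permissions problem in your own checkout, here is a minimal sketch of detecting and restoring a missing execute bit. It works on a throwaway file in a temporary directory; the name demo-script.bash is a hypothetical stand-in, and this is not necessarily how the fix was applied in the repository.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work on a throwaway file; demo-script.bash is a hypothetical stand-in.
workdir="$(mktemp -d)"
script="$workdir/demo-script.bash"
printf '#!/usr/bin/env bash\necho "hello"\n' > "$script"
chmod a-x "$script"   # simulate the missing execute bits

# Detect the problem and repair it.
if [ ! -x "$script" ]; then
  echo "execute bit missing"
  chmod +x "$script"
fi

# The script can now be run directly.
result="$("$script")"
echo "$result"
```

In a Git working tree, `git update-index --chmod=+x <file>` records the bit in the index so that it survives a fresh clone.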

Refactored Jupyter image

03 Apr 03:11
  1. The Jupyter image was over 4 gigabytes. This is unworkable, so I've dropped back to just the Jupyter notebook server on the image. The data science stack can be installed by executing a script the first time you start the service.
  2. All of the examples have been refactored to be runnable from a Linux host command line, including WSL Ubuntu on Windows 10 Pro.
  3. @nam20485 enhanced the documentation for the Windows 10 Pro WSL operations. Many thanks!!
  4. Miscellaneous bug fixes.

Tooling for database backup wrangling

26 Mar 20:22
  1. I've "upgraded" the users in the PostGIS container defined by DB_USERS_TO_CREATE. They are now Linux users and they are now database superusers with "trust" database authentication. This is the traditional modus operandi for a Debian / Ubuntu desktop.
  2. I've added an example, reowning_a_database, that demonstrates how to automate some common database operations with Docker's host-container interface tools, docker cp and docker exec.
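
As a rough illustration of that docker cp / docker exec pattern (not the reowning_a_database example itself), the sketch below copies a backup into a container and restores it under a different owner. The container, file, database and role names are all hypothetical placeholders, and by default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
set -euo pipefail

# All names below are hypothetical placeholders, not values from the repository.
CONTAINER="${CONTAINER:-postgis_container}"
BACKUP_FILE="${BACKUP_FILE:-example.backup}"
DB_NAME="${DB_NAME:-example_db}"
NEW_OWNER="${NEW_OWNER:-example_user}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = also execute them

run() {
  # Print the command; execute it only when DRY_RUN=0.
  printf '%s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

reown_steps() {
  local in_container
  in_container="/tmp/$(basename "$BACKUP_FILE")"
  # 1. Copy the backup from the host into the container.
  run docker cp "$BACKUP_FILE" "$CONTAINER:$in_container"
  # 2. Create an empty database owned by the new owner.
  run docker exec -u postgres "$CONTAINER" createdb -O "$NEW_OWNER" "$DB_NAME"
  # 3. Restore, ignoring the ownership recorded in the dump and creating
  #    the objects as the new owner instead.
  run docker exec -u postgres "$CONTAINER" \
    pg_restore --no-owner --role="$NEW_OWNER" -d "$DB_NAME" "$in_container"
}

plan="$(reown_steps)"
echo "$plan"
```

Set DRY_RUN=0 to execute the commands against a running container; the dry-run default makes it easy to review the plan first.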

Bug fixes, documentation cleanup

21 Mar 21:04
9d47b8f
  1. Emacs Speaks Statistics (ESS) is now installed from source on both the rstats and postgis images (issue #54).
  2. The user creation step of the automatic restores is cleaned up and the documentation in the README enhanced (issue #53).
  3. Linux Mint scripts are now available for both PostgreSQL and Docker hosting, and there are scripts for both Linux Mint 18 and 17 (issues #51 and #52).
  4. There is now a .Rprofile for the dbsuper Linux user. This fixes a bug in the ODOT crash data migration example (issue #50)
  5. Issue #49 is fixed.

Many new features!

18 Mar 08:51

New features:

  1. PostGIS:
    • For compatibility with our AWS deployment, I've dropped PostgreSQL back from 10.3 to 9.6. The other components, PostGIS and pgRouting, are still the latest stable versions.
    • The PostGIS command line now includes R and Emacs Speaks Statistics (ESS). And, of course, Emacs.
    • In addition to pg_dump "custom" format files (.backup), automatic restores now work for .sql and .sql.gz files (plain text and compressed plain text). In fact, plain text is now the preferred format for database dumps, given that we've been unable to restore a .backup to the AWS server.
    • The Hack Oregon database users defined in our AWS server are now available in the postgis server.
    • Raw data can be loaded onto the postgis image at build time.
  2. Jupyter (formerly Miniconda and before that Jupyter)
    • This is the name that makes sense. Jupyter is the tool; Miniconda is how it gets there.
    • Cookiecutter is now installed by default.
    • Analysis packages now include:
      • geopandas,
      • jupyter,
      • matplotlib,
      • pandas,
      • psycopg2,
      • requests,
      • seaborn,
      • statsmodels,
      • cookiecutter, and
      • osmnx.
  3. Rstats (formerly RStudio)
    • Name change: we're planning to push these images to a Docker repository to cut down on how long it takes to bring up the containers. If this is a public repository, there's a potential issue with the RStudio trademark. To avoid that I've renamed the image.
    • I've added Emacs Speaks Statistics (ESS). And, of course, Emacs.
  4. Amazon: I've put this image in the collection primarily for backup file restore testing. It looks mostly like the postgis image but not much of the infrastructure is duplicated. There's just enough to bring up a database with the Hack Oregon database users and verify that databases can be restored.
  5. Windows 10 Pro Windows Subsystem for Linux Docker interface: It works! It really does! There were a couple of bugs, but I've fixed them, and for all practical purposes it looks just like running Docker Community Edition from an Ubuntu 16.04 LTS "Xenial Xerus" terminal.
  6. Examples: I've added the raw-data-to-database-backup scripts for the Transportation Systems passenger census and ODOT crash data as examples. They run inside a postgis container.
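
The three dump formats that the automatic restores accept (see the PostGIS notes above) are handled by different PostgreSQL client tools. The sketch below shows one way to pick the tool from the file extension. It is an illustrative dispatcher under assumed names, not the restore script shipped in the images: restore_command, the file names and example_db are all hypothetical, and the function only prints the command rather than running it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the command that would restore a dump file, chosen by its extension.
# Hypothetical helper for illustration; not the images' actual restore logic.
restore_command() {
  local file="$1" db="$2"
  case "$file" in
    *.sql.gz) echo "gunzip -c $file | psql -d $db" ;;   # compressed plain text
    *.sql)    echo "psql -d $db -f $file" ;;            # plain text
    *.backup) echo "pg_restore -d $db $file" ;;         # pg_dump "custom" format
    *)        echo "unsupported dump format: $file" >&2; return 1 ;;
  esac
}

restore_command passenger_census.sql.gz example_db
restore_command odot_crash_data.backup example_db
```

Plain-text dumps are replayed through psql, while "custom" format files need pg_restore, which is the tool that has to be compatible with the archive.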

More Miniconda packages, more TriMet data

13 Mar 09:06
  • Closed issues: #26 through #29.
  • The Miniconda image now includes pandas, geopandas, requests, psycopg2, statsmodels and seaborn.
  • The PostGIS image has a new database superuser, Linux and PostgreSQL user dbsuper. Use this instead of logging in as postgres.
  • The PostGIS image now includes virtualenvwrapper.
  • The OpenStreetMap example now includes all TriMet public GIS data, not just the service area boundary.
  • All the images now have a clone-me.bash script that clones this repository! So you can build a Docker network, clone this repository and run the examples. I'm open to suggestions for more examples as long as they use publicly-downloadable data only.

v1.1.0 - OpenStreetMap example and WSL usage notes

11 Mar 05:47

New features:

v1.0.0 - Initial release

09 Mar 09:41

A Docker environment for data science, including GIS

  1. PostgreSQL 10, PostGIS 2.4 and pgRouting 2.5, plus numerous command-line ETL and GIS utilities,
  2. A Jupyter notebook server, and
  3. An RStudio server.

Found a bug? Documentation unclear? Want a new feature? File an issue at https://github.com/hackoregon/data-science-pet-containers/issues/new