Lots of tables #134

Merged
merged 26 commits into from
Mar 24, 2024
Changes from all commits
26 commits
29c4663
Table DDL
njouanin Mar 23, 2024
f87365c
Ruff
njouanin Mar 23, 2024
57cfde0
Add ORM and domain classes
njouanin Mar 23, 2024
89dee0f
Task to load the AMP data into the dim_zone table
njouanin Mar 23, 2024
b59fd1e
ruff
njouanin Mar 23, 2024
545d529
Add index on fct_excursion.arrival_at
njouanin Mar 23, 2024
f6b6828
Add ExcursionRepository
njouanin Mar 23, 2024
e625841
Externalize session management
njouanin Mar 23, 2024
bf15466
Optimization (bulk update)
njouanin Mar 23, 2024
54a6634
Review of query syntax
njouanin Mar 24, 2024
fc8d511
Change parameter order
njouanin Mar 24, 2024
da1830a
Add get_all_data_by_mmsi()
njouanin Mar 24, 2024
1d1e462
Init the clean_positions script
njouanin Mar 24, 2024
7dc3fc7
Documentation
njouanin Mar 24, 2024
aa79de3
#119: disable the display of all queries. It is still possibl…
njouanin Mar 24, 2024
2f9eb36
Syntax fix
njouanin Mar 24, 2024
3f14292
Python compatibility + default values
njouanin Mar 24, 2024
592c643
Add a function to find a port whose buffer contains a…
njouanin Mar 24, 2024
3eb86a5
Add VesselPositionRepository
njouanin Mar 24, 2024
7cc7a43
Temporary version of loading data spire_ais_data -> vessel…
njouanin Mar 24, 2024
d4d9bf2
Store task execution dates
njouanin Mar 24, 2024
1180ff2
Add created/modified date criteria to the MMSI search
njouanin Mar 24, 2024
93caf3a
New methods to find the ports to update
njouanin Mar 24, 2024
f88b330
#119
njouanin Mar 24, 2024
d7e32e9
Manage execution dates
njouanin Mar 24, 2024
5907794
Merge changes from main
njouanin Mar 24, 2024
135 changes: 102 additions & 33 deletions README.md
@@ -1,19 +1,34 @@


![banner](src/images/banner.png)

## What is Trawl Watch



**[Trawl Watch](https://twitter.com/TrawlWatch)** is an initiative launched by the
**[Bloom Association](https://www.bloomassociation.org/en/)** to track and expose the most destructive fishing vessels.
Inspired by L’[Avion de Bernard](https://www.instagram.com/laviondebernard/), which monitors the movements of private
jets, **Trawl Watch** aims to make visible the impact of these massive trawlers on our oceans. These vessels, often
referred to as _mégachalutiers_, deploy gigantic nets that can engulf marine life from the surface down to the ocean
floor. The consequences are both ecological—as they devastate crucial nursery and breeding areas for marine animals—and
social, as they deprive artisanal fishermen of a healthy marine ecosystem. The solution proposed by **Bloom** is to
dismantle these industrial fishing ships and redistribute their quotas to small-scale fishers. A petition has been
launched, and **Bloom** continues to track these megatrawlers while awaiting action from European institutions.

**Did you know that, in Europe, the largest fishing vessels, which represent 1% of the fleet, catch half of the fish?**
These factory-vessels can measure up to 144 meters in length and catch 400,000 kilos of fish per day! This is as much as
1,000 small-scale fishing vessels in one day at sea.

**These veritable sea monsters are devastating Europe’s biodiversity and coastlines.** It is important to measure the
scale of the damage: about 20 of these factory-vessels can obliterate hundreds of thousands of marine animals and
biodiversity treasures in one day, including in the so-called ‘Marine Protected Areas’ of French territorial waters,
which are not protected at all.

## What is Bloom Association


**BLOOM** is a non-profit organization founded in 2005 that works to preserve the marine environment and species from
unnecessary destruction and to increase social benefits in the fishing sector. **BLOOM** wages awareness and advocacy
campaigns in order to accelerate the adoption of concrete solutions for the ocean, humans and the climate. **BLOOM**
carries out scientific research projects, independent studies and evaluations that highlight crucial and unaddressed
issues such as the financing mechanisms of the fishing sector. **BLOOM**’s actions are meant for the general public as
well as policy-makers and economic stakeholders.

**Table of contents**

@@ -22,6 +37,7 @@
- [Getting started](#getting-started)
- [Installation with Docker/Docker Compose stack (Recommended)](#installation-with-docker-docker-compose-stack-recommended)
- [Installation on local machine](#installation-on-local-machine)
- [Database migration](#database-migration)
- [Official source code](#official-source-code)
- [Contributing](#contributing)
- [Who uses Trawl Watch?](#who-uses-trawl-watch)
@@ -38,14 +54,15 @@

Bloom is tested with:

| | Main version (dev) | Stable version (1.0.0) |
|------------|----------------------|------------------------|
| Python | 3.8, 3.9, 3.10, 3.11 | 3.8, 3.9, 3.10, 3.11 |
| Platform | AMD64/ARM64(\*) | AMD64/ARM64(\*) |
| Docker | 24 | 24 |
| PostgreSQL | 14 | 14 |

## Getting started

### Clone the Bloom application repository

```bash
@@ -54,7 +71,9 @@ Bloom is tested with:
```

### Installation with Docker/Docker Compose stack (Recommended)

#### Prerequisites

* **Docker Engine** (version >= **18.06.0**) with **Compose** plugin

#### Building image
@@ -63,36 +82,45 @@ Bloom is tested with:
docker compose build
```

> When the official Docker image becomes available, the build step may be optional, as `docker compose up` will
> pull the official image from the repository

#### Starting the application

```bash
docker compose up
```

#### Load demonstration data

To use the Trawl Watch application, some data must first be loaded for demonstration. As these data are protected and
can't be published publicly, please contact the Trawl Watch application team. See
[Who maintains Trawl Watch?](#who-maintains-traw-watch)

After filling the 12_bloom/data folder with the data files obtained from the project team, rename the files as follows:

* data/chalutiers_pelagiques.csv
* data/spire_positions_subset.csv
* data/vessels_subset.csv
* data/zones_subset.csv
* data/ports.csv
* data/ports_rad3000_res10.csv

Then launch the Docker Compose stack, using a Compose file extension to add the data-loading service:

```bash
docker compose -f docker-compose.yaml -f docker-compose-load-data.yaml up
```

You can now jump to [Use the Bloom Application](#use-the-bloom-application)

### Installation on local machine

#### Prerequisites

* **Python**: 3.9, 3.10, 3.11
* **Python-pip**: >=20
* **Postgresql**: 14, 15, 16

You must have a working PostgreSQL instance and its connection information (database server hostname or IP, user,
password, database name, port)
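These connection details are typically combined into a single database URL. The sketch below shows one common way to do that in Python; the variable names are illustrative, not the project's actual settings keys:

```python
# Minimal sketch: assembling a PostgreSQL connection URL from the
# connection information listed above. Names are illustrative only.
from urllib.parse import quote_plus


def build_db_url(host: str, user: str, password: str, database: str, port: int = 5432) -> str:
    # quote_plus protects special characters (e.g. "/", "@") in the password
    return f"postgresql://{user}:{quote_plus(password)}@{host}:{port}/{database}"


url = build_db_url("localhost", "bloom", "s3cret/pass", "bloom_db")
print(url)  # postgresql://bloom:s3cret%2Fpass@localhost:5432/bloom_db
```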

#### Install with Poetry

@@ -110,8 +138,9 @@ You must have a functionnal PostgreSQL instance with connexion informations (dat
# Enable virtual poetry project environment
poetry shell
```

#### Initial configuration

```bash
# Create the initial configuration
cp .env.template .env
@@ -127,13 +156,14 @@ You must have a functionnal PostgreSQL instance with connexion informations (dat
# * data/spire_positions_subset.csv
# * data/vessels_subset.csv
# * data/zones_subset.csv
python3 src/bloom/tasks/load_dim_vessel_from_csv.py
python3 src/bloom/tasks/load_dim_port_from_csv.py
python3 src/bloom/tasks/load_dim_zone_amp_from_csv.py
python3 src/bloom/tasks/compute_port_geometry_buffer.py
```

#### Starting the application

```bash
# Enable virtual poetry project environment
poetry shell
@@ -143,28 +173,68 @@ You must have a functionnal PostgreSQL instance with connexion informations (dat

You can now jump to [Use the Bloom Application](#use-the-bloom-application)

### Database migration

The Trawl Watch DB model was refactored during DataForGood season 12. If you run a version of Trawl Watch using the old
model, follow the steps below to upgrade.

- Upgrade DB model:

```bash
alembic upgrade head
```

- Run the data conversion from the old model to the new one (it actually copies data from `spire_vessel_positions`
  to `spire_ais_data`). This may take a long time if you have a long position history:

```bash
python src/bloom/tasks/convert_spire_vessels_to_spire_ais_data.py
```
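For a long position history, a copy like this is usually done in fixed-size chunks so the whole table never has to fit in memory. The snippet below is only a sketch of that chunking idea, not the project's actual implementation; the row contents are made up:

```python
# Sketch of batched copying: pull rows from the source in fixed-size
# chunks, so each chunk can be bulk-inserted into the target table.
from itertools import islice
from typing import Iterable, Iterator, List


def batched(rows: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# With a real session, each chunk would feed one bulk INSERT into
# spire_ais_data instead of being collected here.
old_rows = ({"mmsi": 261084090, "seq": i} for i in range(10))
chunks = list(batched(old_rows, 4))
print([len(c) for c in chunks])  # [4, 4, 2]
```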

- Load the new reference data (AMP zones, ports, vessels):

```bash
/venv/bin/python3 src/bloom/tasks/load_dim_vessel_from_csv.py
/venv/bin/python3 src/bloom/tasks/load_dim_port_from_csv.py
/venv/bin/python3 src/bloom/tasks/load_dim_zone_amp_from_csv.py
/venv/bin/python3 src/bloom/tasks/compute_port_geometry_buffer.py
```

- If you want, drop the old tables:

```sql
DROP TABLE mpa_fr_with_mn;
DROP TABLE spire_vessel_positions;
DROP TABLE vessels;
```

### Use the Bloom Application

#### Access Web Interface

After succeeding with the [With Docker/Docker Compose stack](#with-docker) or [On local machine](#on-local-machine)
installation and managing to [Load demonstration data](#load-demonstration-data), you can access the Bloom
application with your favorite web browser:

* Access to http://localhost:8501
![Home](docs/images/trawlwatch_home.png)
* Navigate to "Vessel Exploration"
* Enter MMSI 261084090 as an example
* Click on "Load"
* You can select a voyage_id and view the vessel's track
![Loaded](docs/images/trawlwatch_loaded_voyage.png)

## Official source code

You can find the official source code in the [GitHub repository](https://github.com/dataforgoodfr/12_bloom/)

## Contributing

Want to help build the Bloom Application? Check out
our [contributing documentation](https://github.com/dataforgoodfr/12_bloom/tree/main/docs/contributing/README.md).

Official Docker (container) images for the Bloom Application are described
in [images](https://github.com/dataforgoodfr/12_bloom/tree/main/docker/).

## Who uses Trawl Watch?

@@ -180,7 +250,6 @@ Official Docker (container) images for Bloom Application are described in [image

#TODO


## More information can be found here

1. [Database initialisation](./docs/notes/database.initialisation.md)
27 changes: 20 additions & 7 deletions docs/notes/tasks.md
@@ -1,9 +1,22 @@
# List and description of batch tasks

List and description of the tasks available in the `src/bloom/tasks/` directory:

* `load_spire_data_from_api.py`: Calls the Spire API and stores the collected AIS data in the
  `spire_ais_data` table
    * the `-d <dump_path>` parameter stores the SPIRE response in a JSON file. In that case the response is not
      saved to the database. Use case: retrieving a SPIRE dataset from production.
* `load_spire_data_from_json.py`: Reads a JSON file and stores the collected AIS data in the
  `spire_ais_data` table
* `load_dim_port_from_csv.py`: Loads into the `dim_port` table the data from a CSV file in the
  format `url;country;port;locode;latitude;longitude;geometry_point`
* `compute_port_geometry_buffer.py`: Computes the buffer zone around the ports stored in the database for which
  this zone has not yet been computed
* `convert_spire_vessels_to_spire_ais_data.py`: Converts the data from the old `spire_vessel_positions` table and
  loads it into the `spire_ais_data` table.
* `load_dim_vessel_from_csv.py`: Loads the contents of a CSV file into the `dim_vessel` table. The expected CSV
  format
  is `country_iso3;cfr;IMO;registration_number;external_marking;ship_name;ircs;mmsi;loa;type;mt_activated;comment`
* `load_dim_zone_amp_from_csv.py`: Loads the contents of an AMP CSV file into the `dim_zone` table. The expected
  format
  is `"index","WDPAID","name","DESIG_ENG","DESIG_TYPE","IUCN_CAT","PARENT_ISO","ISO3","geometry","Benificiaries"`.
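As an illustration of the semicolon-separated `dim_port` format above, the standard `csv` module can parse it directly. The sample row here is made up:

```python
# Illustration only: parsing the dim_port CSV format with the stdlib
# csv module. The data row is fabricated for the example.
import csv
import io

HEADER = "url;country;port;locode;latitude;longitude;geometry_point"
sample = HEADER + "\nhttp://example.org;FR;Lorient;FRLRT;47.7482;-3.3702;POINT(-3.3702 47.7482)\n"

# DictReader maps each row to the header fields; note delimiter=";"
reader = csv.DictReader(io.StringIO(sample), delimiter=";")
rows = list(reader)
print(rows[0]["locode"], rows[0]["latitude"])  # FRLRT 47.7482
```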
102 changes: 102 additions & 0 deletions src/alembic/versions/2d54ea045ff3_create_vessel_data_tables.py
@@ -0,0 +1,102 @@
"""create_vessel_data_tables

Revision ID: 2d54ea045ff3
Revises: 41c412cf8175
Create Date: 2024-03-22 21:48:06.473750

"""

from alembic import op
import sqlalchemy as sa
from bloom.config import settings
from geoalchemy2 import Geometry
from sqlalchemy.sql import func


# revision identifiers, used by Alembic.
revision = "2d54ea045ff3"
down_revision = "41c412cf8175"
branch_labels = None
depends_on = None


def upgrade() -> None:
    op.create_table(
        "vessel_positions",
        sa.Column("id", sa.Integer, sa.Identity(), primary_key=True),
        sa.Column("timestamp", sa.DateTime(timezone=True), nullable=False),
        sa.Column("accuracy", sa.String),
        sa.Column("collection_type", sa.String),
        sa.Column("course", sa.Double),
        sa.Column("heading", sa.Double),
        sa.Column(
            "position", Geometry(geometry_type="POINT", srid=settings.srid), nullable=False
        ),
        sa.Column("latitude", sa.Double),
        sa.Column("longitude", sa.Double),
        sa.Column("maneuver", sa.String),
        sa.Column("navigational_status", sa.String),
        sa.Column("rot", sa.Double),
        sa.Column("speed", sa.Double),
        sa.Column("vessel_id", sa.Integer, sa.ForeignKey("dim_vessel.id"), nullable=False),
        sa.Column(
            "created_at",
            sa.DateTime(timezone=True),
            nullable=False,
            server_default=func.now(),
        ),
    )
    op.create_index("i_vessel_positions_timestamp", "vessel_positions", ["timestamp"])
    op.create_index("i_vessel_positions_vessel_id", "vessel_positions", ["vessel_id"])
    op.create_index("i_vessel_positions_created_updated", "vessel_positions", ["created_at"])

    op.create_table(
        "vessel_data",
        sa.Column("id", sa.Integer, sa.Identity(), primary_key=True),
        sa.Column("timestamp", sa.DateTime(timezone=True), nullable=False),
        sa.Column("ais_class", sa.String),
        sa.Column("flag", sa.String),
        sa.Column("name", sa.String),
        sa.Column("callsign", sa.String),
        sa.Column("ship_type", sa.String),
        sa.Column("sub_ship_type", sa.String),
        sa.Column("mmsi", sa.Integer),
        sa.Column("imo", sa.Integer),
        sa.Column("width", sa.Integer),
        sa.Column("length", sa.Integer),
        sa.Column("vessel_id", sa.Integer, sa.ForeignKey("dim_vessel.id"), nullable=False),
        sa.Column(
            "created_at",
            sa.DateTime(timezone=True),
            nullable=False,
            server_default=func.now(),
        ),
    )
    op.create_index("i_vessel_data_timestamp", "vessel_data", ["timestamp"])
    op.create_index("i_vessel_data_vessel_id", "vessel_data", ["vessel_id"])
    op.create_index("i_vessel_data_created_updated", "vessel_data", ["created_at"])

    op.create_table(
        "vessel_voyage",
        sa.Column("id", sa.Integer, sa.Identity(), primary_key=True),
        sa.Column("timestamp", sa.DateTime(timezone=True), nullable=False),
        sa.Column("destination", sa.String),
        sa.Column("draught", sa.Double),
        sa.Column("eta", sa.DateTime(timezone=True)),
        sa.Column("vessel_id", sa.Integer, sa.ForeignKey("dim_vessel.id"), nullable=False),
        sa.Column(
            "created_at",
            sa.DateTime(timezone=True),
            nullable=False,
            server_default=func.now(),
        ),
    )
    op.create_index("i_vessel_voyage_timestamp", "vessel_voyage", ["timestamp"])
    op.create_index("i_vessel_voyage_vessel_id", "vessel_voyage", ["vessel_id"])
    op.create_index("i_vessel_voyage_created_updated", "vessel_voyage", ["created_at"])


def downgrade() -> None:
    op.drop_table("vessel_positions")
    op.drop_table("vessel_data")
    op.drop_table("vessel_voyage")