Releases: Netflix/metaflow

2.2.11 (Apr 30th, 2021)

30 Apr 21:25
8fac145

Metaflow 2.2.11 Release Notes

The Metaflow 2.2.11 release is a minor patch release.

Bug Fixes

Fix regression that broke compatibility with Python 2.7

shlex.quote, introduced in #493, is not compatible with Python 2.7. pipes.quote is now used for Python 2.7.
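A version-tolerant import along these lines is one common way to express such a fix. This is a minimal sketch of the pattern, not necessarily the code Metaflow itself uses; the quote_arg helper name is hypothetical.

```python
# Prefer shlex.quote (available on Python 3.3+) and fall back to
# pipes.quote, which provides the same behavior on Python 2.7.
try:
    from shlex import quote
except ImportError:  # Python 2.7 has no shlex.quote
    from pipes import quote


def quote_arg(value):
    """Hypothetical helper: escape `value` for safe use in a shell command."""
    return quote(str(value))


print(quote_arg("hello world"))
```

On Python 3 the first import succeeds, so the pipes fallback is only ever reached on Python 2.7.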

Fix a corner case when converting options to CL arguments

Some plugins may need to escape shell variables when using them in command lines. This patch allows this to work.

Fix a bug in case of a hard crash in a step

In some cases, a hard crash in a step would cause the status of the step to not be properly reported.

The Conda environment now delegates to the default environment properly for get_environment_info

The Conda environment now delegates get_environment_info to the DEFAULT_ENVIRONMENT as opposed to the MetaflowEnvironment. This does not change the current default behavior.

2.2.10 (Apr 22nd, 2021)

22 Apr 20:56
015e1c9

Metaflow 2.2.10 Release Notes

The Metaflow 2.2.10 release is a minor patch release.

Features

AWS Logs Group, Region and Stream are now available in metadata for tasks executed on AWS Batch

For tasks that execute on AWS Batch, Metaflow now records the location where the AWS Batch instance writes the container logs in AWS Logs. This makes it easy to locate the logs through the client API:

Step('Flow/42/a').task.metadata_dict['aws-batch-awslogs-group']
Step('Flow/42/a').task.metadata_dict['aws-batch-awslogs-region']
Step('Flow/42/a').task.metadata_dict['aws-batch-awslogs-stream']
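
The three keys can be gathered in one place with a small helper. The sketch below is illustrative only: the awslogs_location function is hypothetical, it works on any dict-like object such as a task's metadata_dict, and it assumes the keys are simply absent for tasks that did not run on AWS Batch.

```python
AWSLOGS_KEYS = (
    "aws-batch-awslogs-group",
    "aws-batch-awslogs-region",
    "aws-batch-awslogs-stream",
)


def awslogs_location(metadata):
    """Hypothetical helper: collect the AWS Logs location from task metadata.

    `metadata` is any dict-like object, for example
    Step('Flow/42/a').task.metadata_dict. Returns None if any key is missing
    (assumed here to be the case for tasks that did not run on AWS Batch).
    """
    if not all(key in metadata for key in AWSLOGS_KEYS):
        return None
    group, region, stream = (metadata[key] for key in AWSLOGS_KEYS)
    return {"group": group, "region": region, "stream": stream}


# Stand-in for a real metadata_dict; all values are made up for illustration.
example = {
    "aws-batch-awslogs-group": "/aws/batch/job",
    "aws-batch-awslogs-region": "us-west-2",
    "aws-batch-awslogs-stream": "example-stream",
}
print(awslogs_location(example))
```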

PR: #478

Execution logs are now available for all tasks in the Metaflow universe

All Metaflow runtime and task logs are now published to the datastore via a sidecar process. The user-visible logs on the console are streamed directly from the datastore. For Metaflow's cloud integrations (AWS at the moment), the logs of compute tasks (AWS Batch) are written by Metaflow directly into the datastore (Amazon S3), independent of where the flow is launched from (the user's laptop or AWS Step Functions). This has multiple benefits:

  • Metaflow no longer relies on Amazon CloudWatch to fetch the AWS Batch execution logs for the console. Amazon CloudWatch has rather low global API limits, which have caused multiple issues for our users in the past.
  • Logs for AWS Step Functions executions are now also available in Amazon S3 and can be fetched with python flow.py logs 42/start or Step('Flow/42/start').task.stdout.

PR: #449

Bug Fixes

Fix regression with ping/ endpoint for Metadata service

Fix a regression introduced in v2.2.9 where the endpoint responsible for ascertaining the version of the deployed Metadata service was erroneously moved from ping to ping/. PR: #484

Fix the behaviour of --namespace= CLI args when executing a flow

python flow.py run --namespace= now correctly makes the global namespace visible within the flow execution. PR: #461

Metaflow 2.2.9 (April 19th, 2021)

19 Apr 22:28

Metaflow 2.2.9 Release Notes

The Metaflow 2.2.9 release is a minor patch release.

Bugs

Remove pinned pylint dependency

Pylint dependency was unpinned and made floating. See PR #462.

Improve handling of / in image parameter for batch

You are now able to specify docker images of the form foo/bar/baz:tag in the batch decorator. See PR #466.
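
To illustrate why slashes need care, the sketch below splits an image reference into a name and a tag while leaving every slash in the name intact. It is not Metaflow's implementation; the split_image function is hypothetical, and digest references (name@sha256:...) are deliberately not handled.

```python
def split_image(image):
    """Hypothetical helper: split a Docker image reference into (name, tag).

    Only a colon that appears after the last slash is treated as the tag
    separator, so slashes in the name (and a registry port such as
    localhost:5000/foo) are preserved. Digest references are not handled.
    """
    last_segment = image.rsplit("/", 1)[-1]
    if ":" in last_segment:
        name, tag = image.rsplit(":", 1)
        return name, tag
    return image, None


print(split_image("foo/bar/baz:tag"))
```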

List custom FlowSpec parameters in the intended order

The order in which parameters are specified by the user in the FlowSpec is now preserved when displaying them with --help. See PR #456.

2.2.8 (Mar 15th, 2021)

15 Mar 19:56
dac1301

Metaflow 2.2.8 Release Notes

The Metaflow 2.2.8 release is a minor patch release.

Bugs

Fix @environment behavior for conflicting attribute values

Metaflow was incorrectly handling environment variables passed through the @environment decorator in some specific instances. When the @environment decorator was specified on multiple steps, the environment available to any step was the union of the attributes of all the @environment decorators, which is incorrect behavior. For example, the following workflow:

from metaflow import FlowSpec, step, environment
import os


class LinearFlow(FlowSpec):
    # Each step requests its own value for 'var'.
    @environment(vars={'var': os.getenv('var_1')})
    @step
    def start(self):
        print(os.getenv('var'))  # expected: the value of var_1
        self.next(self.a)

    @environment(vars={'var': os.getenv('var_2')})
    @step
    def a(self):
        print(os.getenv('var'))  # expected: the value of var_2
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == '__main__':
    LinearFlow()

var_1=foo var_2=bar python flow.py run

will result in

Metaflow 2.2.7.post10+gitb7d4c48 executing LinearFlow for user:savin
Validating your flow...
    The graph looks good!
Running pylint...
    Pylint is happy!
2021-03-12 20:46:04.161 Workflow starting (run-id 6810):
2021-03-12 20:46:04.614 [6810/start/86638 (pid 10997)] Task is starting.
2021-03-12 20:46:06.783 [6810/start/86638 (pid 10997)] foo
2021-03-12 20:46:07.815 [6810/start/86638 (pid 10997)] Task finished successfully.
2021-03-12 20:46:08.390 [6810/a/86639 (pid 11003)] Task is starting.
2021-03-12 20:46:10.649 [6810/a/86639 (pid 11003)] foo
2021-03-12 20:46:11.550 [6810/a/86639 (pid 11003)] Task finished successfully.
2021-03-12 20:46:12.145 [6810/end/86640 (pid 11009)] Task is starting.
2021-03-12 20:46:15.382 [6810/end/86640 (pid 11009)] Task finished successfully.
2021-03-12 20:46:15.563 Done!

Note the output for the step a which should have been bar. PR #452 fixes the issue.
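
The difference between the two behaviors can be modeled without Metaflow. The sketch below is a toy model only, not the actual implementation or the change made in PR #452: both function names are hypothetical, and the "first value wins" tie-break in the buggy variant is an assumption chosen so that it reproduces the foo printed by step a above.

```python
# Variables requested by each step's @environment decorator in the example.
STEP_VARS = {
    "start": {"var": "foo"},
    "a": {"var": "bar"},
    "end": {},
}


def env_before_fix(step, step_vars):
    """Toy model of the bug: every step sees the union of all decorators' vars.

    The step name is ignored on purpose. On a conflict the first value wins
    (an assumption made for illustration).
    """
    merged = {}
    for requested in step_vars.values():
        for key, value in requested.items():
            merged.setdefault(key, value)
    return merged


def env_after_fix(step, step_vars):
    """Toy model of the intended behavior: a step sees only its own vars."""
    return dict(step_vars.get(step, {}))


print(env_before_fix("a", STEP_VARS))
print(env_after_fix("a", STEP_VARS))
```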

Fix environment is not callable error when using @environment

Using @environment would often result in an error from pylint - E1102: environment is not callable (not-callable). Users were getting around this issue by launching their flows with --no-pylint. PR #451 fixes this issue.

2.2.7 (Feb 8th, 2021)

09 Feb 00:45
c395e26

Metaflow 2.2.7 Release Notes

The Metaflow 2.2.7 release is a minor patch release.

Bugs

Handle for-eaches properly for AWS Step Functions workflows running on AWS Fargate

Workflows orchestrated by AWS Step Functions were failing to properly execute for-each steps on AWS Fargate. The culprit was the lack of access to instance metadata on ECS. Metaflow instantiates a connection to Amazon DynamoDB to keep track of for-each cardinality. This connection requires knowledge of the region that the job executes in, which is made available via instance metadata on EC2 but is unfortunately not available on ECS (for AWS Fargate). This fix introduces the necessary checks for inferring the region correctly for tasks executing on AWS Fargate. Note that after the recent changes to Amazon S3's consistency model, the Amazon DynamoDB dependency is no longer needed and will be removed in a subsequent release. PR: #436

2.2.6 (Jan 26th, 2021)

26 Jan 21:51
20f584f

Metaflow 2.2.6 Release Notes

The Metaflow 2.2.6 release is a minor patch release.

Features

Support AWS Fargate as compute backend for Metaflow tasks launched on AWS Batch

At AWS re:Invent 2020, AWS announced support for AWS Fargate as a compute backend (in addition to EC2) for AWS Batch. With this feature, Metaflow users can now also submit their Metaflow jobs to AWS Batch Job Queues that are connected to AWS Fargate Compute Environments. By setting the environment variable METAFLOW_ECS_FARGATE_EXECUTION_ROLE, users can configure the ecsTaskExecutionRole for the AWS Batch container and AWS Fargate agent. PR: #402

Support shared_memory, max_swap, swappiness attributes for Metaflow tasks launched on AWS Batch

The @batch decorator now supports the shared_memory, max_swap, and swappiness attributes for Metaflow tasks launched on AWS Batch, providing a greater degree of control over memory management. PR: #408

Support wider very-wide workflows on top of AWS Step Functions

The tags metaflow_version: and runtime: are now available for all packaged executions and remote executions as well. This ensures that every run logged by Metaflow will have the metaflow_version and runtime system tags available. PR: #403

Bug Fixes

Assign tags to Run objects generated through AWS Step Functions executions

Run objects generated by flows executed on top of AWS Step Functions were missing the tags assigned to the flow, even though the tags were correctly persisted to tasks. This release fixes the issue and brings the tagging behavior in line with that of local flow executions. PR: #386

Pipe all workflow set-up logs to stderr

Execution set-up logs for @conda and IncludeFile were being piped to stdout which made manipulating the output of commands like python flow.py step-functions create --only-json a bit difficult. This release moves the workflow set-up logs to stderr. PR: #379
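
The benefit of this separation is that stdout carries only machine-readable output. The sketch below illustrates the general pattern rather than Metaflow's code: the emit_definition function, the log message, and the payload are all hypothetical.

```python
import json
import sys


def emit_definition(definition):
    """Hypothetical helper: log progress to stderr, write the JSON to stdout."""
    # Set-up chatter goes to stderr so it never mixes with the payload.
    print("Setting up workflow...", file=sys.stderr)
    # Only the JSON document is written to stdout.
    print(json.dumps(definition))


# Illustrative payload; not a complete state machine definition.
emit_definition({"StartAt": "start"})
```

With this split, a command's stdout can be redirected to a file or piped into a JSON tool while the set-up messages still appear on the terminal.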

Handle null assignment to IncludeFile properly

A workflow executed without a required IncludeFile parameter would fail when the parameter was referenced inside the flow. This release fixes the issue by assigning a null value to the parameter in such cases. PR: #421

2.2.5 (Nov 11th, 2020)

11 Nov 10:43
6977633

Metaflow 2.2.5 Release Notes

The Metaflow 2.2.5 release is a minor patch release.

  • Features
    • Log metaflow_version: and runtime: tag for all executions
  • Bug Fixes
    • Handle inconsistently cased file system issue when creating @conda environments on macOS for linux-64

Features

Log metaflow_version: and runtime: tag for all executions

The tags metaflow_version: and runtime: are now available for all packaged executions and remote executions as well. This ensures that every run logged by Metaflow will have the metaflow_version and runtime system tags available. PR: #376, #375

Bug Fixes

Handle inconsistently cased file system issue when creating @conda environments on macOS for linux-64

Conda fails to correctly set up environments for linux-64 packages on macOS at times due to inconsistently cased filesystems. Environment creation is needed to collect the necessary metadata for correctly setting up the conda environment on AWS Batch. This fix simply ignores the error-checks that conda throws while setting up the environments on macOS when the intended destination is AWS Batch. PR: #377

2.2.4 (Oct 28th, 2020)

28 Oct 23:30
c3aab7e

Metaflow 2.2.4 Release Notes

The Metaflow 2.2.4 release is a minor patch release.

  • Features
    • Metaflow is now compliant with AWS GovCloud & AWS CN regions
  • Bug Fixes
    • Address a bug with overriding the default value for IncludeFile
    • Port AWS region check for AWS DynamoDb from curl to requests

Features

Metaflow is now compliant with AWS GovCloud & AWS CN regions

AWS GovCloud & AWS CN users can now enjoy all the features of Metaflow within their region partition with no change on their end. PR: #364

Bug Fixes

Address a bug with overriding the default value for IncludeFile

Metaflow v2.1.0 introduced a bug in IncludeFile functionality which prevented users from overriding the default value specified. PR: #346

Port AWS region check for AWS DynamoDb from curl to requests

Metaflow's AWS Step Functions integration relies on Amazon DynamoDB to manage foreach constructs. Metaflow was leveraging curl at runtime to detect the region for Amazon DynamoDB. Some Docker images don't have curl installed by default; moving to requests (a Metaflow dependency) fixes the issue. PR: #343

2.2.3 (Sep 8th, 2020)

08 Sep 18:22
6b87fdc

Metaflow 2.2.3 Release Notes

The Metaflow 2.2.3 release is a minor patch release.

  • Bug Fixes
    • Fix #305 : Default 'help' for parameters was not handled properly
    • Pin the conda library versions for metaflow default dependencies based on the Python version
    • Add conda bin path to the PATH environment variable during Metaflow step execution
    • Fix a typo in metaflow/debug.py

Bug Fixes

Fix #305 : Default 'help' for parameters was not handled properly

Fix the issue where the default help for parameters was not handled properly. As reported in #305, a flow fails because IncludeFile's default value for the help argument is None. PR: #318

Pin the conda library versions for metaflow default dependencies based on the Python version.

The previously pinned library versions do not work with Python 3.8. There are now two sets of version combinations, which should work for Python 2.7, 3.5, 3.6, 3.7, and 3.8. PR: #308

Add conda bin path to the PATH environment variable during Metaflow step execution

Previously, executables installed in the conda environment were not visible inside Metaflow steps. This is fixed by appending the conda bin path to the PATH environment variable. PR: #307
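
Appending a directory to PATH can be sketched as follows. This is a minimal illustration of the idea, not the change made in PR #307; the path_with_conda_bin function and the example directories are hypothetical.

```python
import os


def path_with_conda_bin(current_path, conda_bin):
    """Hypothetical helper: return `current_path` with `conda_bin` appended.

    Existing entries keep their precedence because the new directory is added
    at the end; it is not added twice if it is already present.
    """
    parts = [p for p in current_path.split(os.pathsep) if p]
    if conda_bin not in parts:
        parts.append(conda_bin)
    return os.pathsep.join(parts)


# Example directories are made up for illustration.
print(path_with_conda_bin("/usr/bin", "/opt/conda/envs/example/bin"))
```

Because the directory is appended rather than prepended, executables that already resolve through earlier PATH entries are not shadowed.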

Fix a typo in metaflow/debug.py

A typo fix. PR: #304

2.2.2 (Aug 20th, 2020)

20 Aug 08:00
bfe41df

Metaflow 2.2.2 Release Notes

The Metaflow 2.2.2 release is a minor patch release.

  • Bug Fixes
    • Fix a regression introduced in 2.2.1 related to Conda environments
    • Clarify Pandas requirements for Tutorial Episode 04
    • Fix an issue with the metadata service

Bug Fixes

Fix a regression with Conda

Metaflow 2.2.1 included a commit that was merged too early and broke the use of Conda. This release reverts that commit.

Clarify Pandas version needed for Episode 04

Recent versions of Pandas are not backward compatible with the one used in the tutorial; a small comment was added to warn of this fact.

Fix an issue with the metadata service

In some cases, the metadata service would not properly create runs or tasks.

PRs #296, #297, #298