2.4.0 (Oct 4th 2021)

@romain-intel romain-intel released this 04 Oct 22:31
· 742 commits to master since this release
0d3ef33

Metaflow 2.4.0 Release Notes

The Metaflow 2.4.0 release is a minor release and includes a breaking change.

Breaking Changes

Change return type of created_at/finished_at in the client (#692)

Prior to this release, the return type for created_at and finished_at properties in the Client API was a timestamp
string. This release changes this to a datetime object, as the old behavior is considered an unintentional mis-feature
(see below for details).

How to retain the old behavior

To keep the old behavior, append an explicit string conversion, .strftime('%Y-%m-%dT%H:%M:%SZ'), to
the created_at and finished_at calls, e.g.

run.created_at.strftime('%Y-%m-%dT%H:%M:%SZ')
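If the conversion is needed in several places, it may be convenient to wrap it in a small helper. The sketch below is illustrative only: `to_legacy_timestamp` and `LEGACY_FORMAT` are hypothetical names, and a plain `datetime` value stands in for `run.created_at` so the snippet runs without a Metaflow deployment.

```python
from datetime import datetime

# Format string taken from the release notes above.
LEGACY_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def to_legacy_timestamp(value: datetime) -> str:
    """Render a datetime as the pre-2.4.0 timestamp string (hypothetical helper)."""
    return value.strftime(LEGACY_FORMAT)


# Stand-in for run.created_at, which returns a datetime as of 2.4.0.
created_at = datetime(2021, 10, 4, 22, 31, 0)
legacy = to_legacy_timestamp(created_at)
print(legacy)  # 2021-10-04T22:31:00Z
```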

Background

The first versions of Metaflow (internal to Netflix) returned a datetime object in all calls dealing with timestamps in
the Client API to make it easier to perform operations between timestamps. Unintentionally, the return type was changed
to string in the initial open-source release. This release introduces a number of internal changes, removing all
remaining discrepancies between the legacy version of Metaflow that was used inside Netflix and the open-source version.
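As an illustration of the kind of operation that datetime objects make easier, the sketch below computes a duration by subtracting two timestamps. The two values are made-up stand-ins for `run.created_at` and `run.finished_at`; they are not taken from a real run.

```python
from datetime import datetime, timedelta

# Stand-ins for run.created_at and run.finished_at (illustrative values).
created_at = datetime(2021, 10, 4, 22, 31, 0)
finished_at = datetime(2021, 10, 4, 22, 46, 30)

# With datetime objects, subtraction yields a timedelta directly;
# with timestamp strings, both values would have to be parsed first.
duration = finished_at - created_at
print(duration.total_seconds())  # 930.0

# Comparisons work directly as well.
took_longer_than_ten_minutes = duration > timedelta(minutes=10)
```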

The timestamp change is the only change affecting the user-facing API. While Metaflow continues to make a strong promise
of backwards compatibility of user-facing features and APIs, the benefits of one-time unification outweigh the cost of this
relatively minor breaking change.

Bug Fixes

Better error messages in case of a Conda issue (#706)

Conda errors printed to stderr were not surfaced to the user; this release addresses this issue.

Fix error message in Metadata service (#690)

The code that prints error messages from the metadata service had a bug: it could fail to print the correct message and instead raise a second error that obscured the original one. This release fixes the bug, and errors from the metadata service are now printed properly.

New Features

S3 retry counts are now configurable (#700)

This release allows you to set the number of times S3 accesses are retried (the default is 7). The relevant environment variable is METAFLOW_S3_RETRY_COUNT.
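One way to apply the setting is to place the variable in the environment of the process that runs the flow. The sketch below builds such an environment; `env_with_s3_retry_count` is a hypothetical helper, `my_flow.py` is a placeholder file name, and the sketch assumes the variable must be set before Metaflow loads its configuration.

```python
import os


def env_with_s3_retry_count(count, base=None):
    """Return a copy of an environment with METAFLOW_S3_RETRY_COUNT set.

    Hypothetical helper; `base` defaults to the current process environment.
    """
    if count < 0:
        raise ValueError("retry count must be non-negative")
    env = dict(os.environ if base is None else base)
    # Environment variable values are always strings.
    env["METAFLOW_S3_RETRY_COUNT"] = str(count)
    return env


env = env_with_s3_retry_count(10)

# Hypothetical usage: launch a flow with the modified environment.
# import subprocess
# subprocess.run(["python", "my_flow.py", "run"], env=env, check=True)
```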

New datastore implementation resulting in improved performance (#580)

The datastore implementation was reworked to make it easier to extend in the future. It also now uploads artifacts to S3 in parallel (as opposed to sequentially), which can lead to better performance. The changes also notably improve the speed of resume, which can now start resuming a flow twice as fast as before. Documentation can be found here.
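To illustrate the general difference between sequential and parallel uploads, here is a minimal sketch using a thread pool. It is not Metaflow's datastore code: `upload_all` and `fake_upload` are hypothetical names, and the upload function writes to an in-memory dictionary instead of S3 so the example runs anywhere.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_all(artifacts, upload_one, max_workers=4):
    """Upload every artifact concurrently and return each upload's result.

    `artifacts` maps a name to its bytes; `upload_one(name, blob)` performs
    a single upload. Both are assumptions made for this sketch.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all uploads up front so they can overlap in time.
        futures = {
            name: pool.submit(upload_one, name, blob)
            for name, blob in artifacts.items()
        }
        # result() re-raises any exception from the worker thread.
        return {name: future.result() for name, future in futures.items()}


# In-memory stand-in for an S3 bucket.
store = {}


def fake_upload(name, blob):
    store[name] = blob
    return len(blob)


sizes = upload_all({"x": b"abc", "y": b"hello"}, fake_upload)
```

Threads suit this kind of work because uploads are dominated by network waits rather than CPU time.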

S3 datatools performance improvements (#697)

The S3 datatools now handle small and large files differently, using the download_file command for larger files and get_object for smaller files, to minimize the number of calls made to S3.
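The size-based choice described above can be sketched as follows. This is an illustration rather than the datatools implementation: the threshold value, `choose_s3_call`, and `fetch` are assumptions, and `client` is expected to behave like a boto3 S3 client, whose `get_object` and `download_file` methods are the two calls named above.

```python
# Assumed cutoff for this sketch; the release notes do not state the real one.
SMALL_FILE_THRESHOLD = 1024 * 1024  # 1 MiB


def choose_s3_call(size, threshold=SMALL_FILE_THRESHOLD):
    """Pick which S3 client method to use for an object of `size` bytes."""
    return "get_object" if size < threshold else "download_file"


def fetch(client, bucket, key, size, dest_path, threshold=SMALL_FILE_THRESHOLD):
    """Download one object to dest_path using the call chosen by its size."""
    method = choose_s3_call(size, threshold)
    if method == "get_object":
        # A single request returns the whole body of a small object.
        body = client.get_object(Bucket=bucket, Key=key)["Body"].read()
        with open(dest_path, "wb") as out:
            out.write(body)
    else:
        # The managed transfer may split a large object into several requests.
        client.download_file(bucket, key, dest_path)
    return method
```

`fetch` takes the object size as an argument; how that size is obtained is left out of the sketch.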