Experiments in dispatching Metaflow flows to Flyte.
- Clone this repo.
- Create a new virtualenv (I recommend using pyenv and pyenv-virtualenv).
- Run `poetry install`.
Metaflow has specific “extension points” built in. Any third-party package, such as ours, that conforms to their expected conventions and protocols will automatically get “injected” into `metaflow` top-level imports.
Some more details here: Netflix/metaflow-extensions-template.
This lets a package like ours define a custom decorator `FooDecorator` with name `@foo` and have it importable via `from metaflow import foo`.
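To make the naming convention concrete, here is a toy model of name-based injection using only the standard library. It is not Metaflow's actual loader: `StepDecoratorBase`, `inject`, and `toy_framework` are hypothetical stand-ins, and only `FooDecorator` and `foo` come from the text above.

```python
# Toy model of name-based extension injection. This is NOT Metaflow's
# actual loader; it only illustrates the convention described above.
import types


class StepDecoratorBase:
    """Hypothetical stand-in for a framework's decorator base class."""

    name = None

    def __init__(self, **attributes):
        self.attributes = attributes

    def __call__(self, func):
        # Record which decorators were applied, then return the function as is.
        func.applied = getattr(func, "applied", []) + [self.name]
        return func


class FooDecorator(StepDecoratorBase):
    # The `name` attribute decides the public import name: `foo`.
    name = "foo"


def inject(namespace, decorator_classes):
    """Expose each decorator class under its declared `name`."""
    for cls in decorator_classes:
        setattr(namespace, cls.name, cls)


# Hypothetical stand-in for the framework's top-level module.
toy_framework = types.ModuleType("toy_framework")
inject(toy_framework, [FooDecorator])


@toy_framework.foo()
def start():
    return "ok"
```

The point of the sketch is that the extension author only declares `name = "foo"`; the host package decides where that name becomes visible, so the user never imports the extension package directly.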
The same goes for CLI sub-commands, allowing us to define custom CLI trees such as `my_subcmd` and have them accessible from the typical `python my/metaflow/flow.py` command tree **with no user intervention**.
In general, the following observation holds particular value:
Metaflow allows for seamless “injection” of extensions without an associated need for wrapper CLIs, companion SDKs, or sub-classing of specialized types. These extensions range from custom types (exceptions, decorators, etc.) to entire conventions, e.g. company- or platform-specific default values.
The user stories provided at Netflix/metaflow-extensions-template are particularly worth reading.
First, install `flytectl`:
brew install flyteorg/homebrew-tap/flytectl
flytectl sandbox start --source .
To register the workflow (WIP):
python flows/00-skeleton.py flyte register
To execute the workflow (WIP):
python flows/00-skeleton.py flyte compile
This doesn’t do much currently; it:
- Converts the Metaflow flow to an imperatively defined Flyte workflow (though currently “convert” just means creating an empty Flyte workflow);
- Gets the default launch plan for that Flyte workflow.
We plan to have it do the following:
- Execute on a remote Flyte cluster, mapping the project and branch from the `@project` decorator to the Flyte project and domain respectively.
  Could we use the Metaflow default `@schedule` decorator here? Or is that coupled to AWS Step Functions?
- Manage launch plans associated with Metaflow flows.
Domains are, from what I can tell, intended to be defined as a finite set at the control-plane layer. In other words, a user can’t arbitrarily create domains such as `test.foo`.
This design decision stands somewhat at odds with Metaflow’s approach to namespacing, centered around the `--branch` option (not to mention user-specific defaults such as `user.jjin`).
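The mismatch can be illustrated with a small sketch. It assumes the domain set of a default Flyte deployment (`development`, `staging`, `production`); the helper names are hypothetical, and the `production` entry in the sample list is included only for contrast, not because Metaflow generates a branch with that name.

```python
# Sketch of the branch-to-domain mismatch; all names here are illustrative.

# Assumption: the three domains a default Flyte deployment is configured with.
FLYTE_DOMAINS = ("development", "staging", "production")


def has_matching_domain(branch, domains=FLYTE_DOMAINS):
    """Return True only if the branch name is itself a configured domain."""
    return branch in domains


def unmapped_branches(branches, domains=FLYTE_DOMAINS):
    """List the branches that have no same-named Flyte domain."""
    return [b for b in branches if not has_matching_domain(b, domains)]


# Branch names in the style discussed above, plus one that happens to match.
metaflow_branches = ["user.jjin", "test.foo", "production"]
print(unmapped_branches(metaflow_branches))  # ['user.jjin', 'test.foo']
```

Because the set of branches is open-ended while the set of domains is fixed, any direct branch-to-domain mapping has to either reject such branches or collapse many of them into one domain.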
Related: flyteorg/flyte#1813.
This issue might impede: taking the Metaflow graph; converting it to a Flyte workflow `flyte_wf` via the imperative API; and registering `flyte_wf` in the same function/command.