
Support for different LoRA formats #47

Open · filipstrand opened this issue Sep 13, 2024 · 6 comments

@filipstrand (Owner) commented Sep 13, 2024

As discussed in #40, there are different types of LoRA formats. To keep track of what works and what doesn't, I have added a table to the README called Supported LoRA formats. I am not sure what the best way to categorize the formats is, but this issue can serve as a place for anyone to post what they have tried and whether it worked, along with any other information. It would also be really interesting to hear which sources for fine-tuned weights people use most (e.g. civitai.com or fal.ai), so we can prioritize support for those.
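As a starting point for categorizing a file, the tensor key names usually give the format away. Here is a minimal sketch (not part of mflux; the key prefixes are common conventions such as Diffusers-style `transformer.` keys and kohya-style `lora_unet_` keys, and are by no means exhaustive):

```python
# Sketch: peek at a LoRA file's tensor keys to guess which format it uses.
# safe_open only reads metadata, so this is cheap even for large files.
# The prefixes checked below are common conventions (Diffusers, kohya-ss),
# not a complete list of the formats found in the wild.
from safetensors import safe_open

def guess_lora_format(path: str) -> str:
    with safe_open(path, framework="numpy") as f:
        keys = list(f.keys())
    if any(k.startswith("transformer.") for k in keys):
        return "diffusers-style"
    if any(k.startswith("lora_unet_") for k in keys):
        return "kohya-style"
    return f"unknown (sample key: {keys[0]})"

print(guess_lora_format("lora.safetensors"))
```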

@CharafChnioune commented

Did anyone get LoRAs to work on the 4- and 8-bit models? They worked with the dev and schnell models, but all quantized models give format errors.

@kaimerklein commented

I'm using a lora.safetensors created on replicate.com with dev and schnell; works like a charm.

@CharafChnioune commented

> I'm using a lora.safetensors created on replicate.com with dev and schnell; works like a charm.

Yes, but it doesn't work on the quantized models.

@cocktailpeanut commented

Just my two cents: without support for quantized models, this is meaningless. On the other hand, once quantized-model LoRA support is added, it will be a game changer, and I'm eagerly waiting for mflux to add it, because there is currently no efficient way to run Flux with LoRA support on Macs.

I work on pinokio.computer (which lets people run AI tools locally easily), and I can tell you there are TONS of people waiting for exactly this (flux1-schnell/dev-fp8 + full LoRA support), because there are no good alternatives.

You can run fp8 with ComfyUI, but I have personally been looking forward to native MLX support that also includes LoRA support. When that happens, and flux1-dev-fp8 + LoRA runs faster here than flux1-dev-fp8 + LoRA on ComfyUI, that will be a huge reason to start using MLX.

I really hope this is prioritized over everything else. Thank you.

@filipstrand (Owner) commented

@CharafChnioune What kind of error did you get? Was it perhaps similar to #49?

Like @kaimerklein said, the LoRA feature should typically work with quantized models (assuming we support the given LoRA format). However, at the moment it does not support loading a pre-quantized model together with non-quantized LoRA weights. But if you load the original weights and simply pass the -q 8 flag, the model is quantized on the fly, for example as shown here. The obvious downside is that this requires keeping the full non-quantized weights on disk. This is the workflow I typically use.
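For reference, the on-the-fly workflow looks roughly like this. This is a minimal sketch assuming the Python API from the mflux README (`Flux1.from_alias`, `generate_image`); the LoRA path, scale, prompt, and seed are placeholders:

```python
# Sketch: load the original (non-quantized) dev weights, quantize them
# on the fly (the Python-API equivalent of the -q 8 flag), and apply a
# LoRA at load time. Paths, prompt, and seed below are placeholders.
from mflux import Flux1, Config

flux = Flux1.from_alias(
    alias="dev",                              # full, non-quantized weights
    quantize=8,                               # quantized on the fly
    lora_paths=["path/to/lora.safetensors"],  # placeholder LoRA file
    lora_scales=[1.0],
)

image = flux.generate_image(
    seed=42,
    prompt="A photo of a cat",
    config=Config(num_inference_steps=20, height=1024, width=1024),
)
image.save(path="image.png")
```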

This can probably be fixed, but it might require some restructuring/rethinking on the backend side. I will keep it in mind as something to update, since more people have been requesting it.

@cocktailpeanut Very interesting to hear how you are using the project; I will check it out! For now, I think you are stuck with the solution described above, but at least it should work. And even though you have to store the full weights, you should still see the speedup that quantization brings.

@CharafChnioune commented Oct 5, 2024

@filipstrand Yes, exactly the same error. The LoRAs work great on the normal schnell and dev models, but with the 4- and 8-bit models I always get that shape error.

I will test it when I get home and let you know if that fixes it.

Update: it worked, thanks!
