Ask HN: I built an ML pipeline automation framework – how can I improve it?
Hi HN,
I’m working on an open-source ML automation framework called MLFCrafter.
It’s designed to help build modular, reusable machine learning pipelines using components like CleanerCrafter, ScalerCrafter, and ModelCrafter, all connected via an MLFChain.
Here’s the repo: https://github.com/brkcvlk/MLFCrafter
I’d love feedback on:
What features would make this more useful for real-world ML projects?
How can I improve usability and adoption?
Tips on growing an open-source project from solo development to a widely used framework?
Thanks in advance for any advice or thoughts!
Not an ML guy, but getting people to use a new framework is going to be super hard in a field like ML that is churning at such an insane rate. I’m not an insider but I just can’t imagine anyone with real skin in the game and deadlines has the time to jump in.
It’s tough, but would you be willing to consider adding value to the Metaflow ecosystem instead?
Hey, really appreciate your thoughtful comment and you're absolutely right.
The ML space is evolving at a crazy pace, and I totally get that people who are deep in the field with real deadlines don’t have the bandwidth to adopt yet another framework. That’s why I’m not aiming to build a framework from scratch to compete, but rather to build a productivity and automation layer on top of existing tools like Metaflow.
Yeah, I really didn’t want to discourage you, but I feel like it’s totally greenfield in Metaflow land (they just open sourced, right?) and if you improved stuff there you’d have superb market fit as teams tried it out.
Cargo culting Netflix’s tech is usually a bit questionable, but maybe not this time!
> The ML space is evolving at a crazy pace, and I totally get that people who are deep in the field with real deadlines don’t have the bandwidth to adopt yet another framework. That’s why I’m not aiming to build a framework from scratch to compete, but rather to build a productivity and automation layer on top of existing tools like Metaflow.
Sorry, looks like this was my lack of proper reading!
More concrete feedback:
- Docs link in readme 404s
- get to code/demonstration or your unique advantage as fast as possible in README (“complete pipeline code” on this page: https://brkcvlk.github.io/MLFCrafter/getting-started/first-p.... ?)
- if it’s built on Metaflow, be upfront about it? The value prop would be saving time/enforcing a golden path?
- “<x>Crafter” is probably noise that should be deleted
- can you collapse some of the setup lines?
- it might be nice if you could feed inputs to the pipeline as a whole, then you could use the context management APIs in Python (“with” IIRC?)
[EDIT] yeah, Metaflow has definitely been around for a while!
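To make the "feed inputs to the pipeline as a whole" + "with" idea a bit more concrete, here's a rough sketch of what I mean. Everything in it (Pipeline, add, drop_missing, min_max_scale) is a made-up name for illustration, not MLFCrafter's or Metaflow's actual API, and running the steps when the block exits is just one possible design choice:

```python
from typing import Any, Callable, Dict, List

# Hypothetical names throughout: this is NOT MLFCrafter's or Metaflow's API.
Step = Callable[[Dict[str, Any]], Dict[str, Any]]


class Pipeline:
    """Toy pipeline that receives all of its inputs once, up front."""

    def __init__(self, **inputs: Any) -> None:
        self.context: Dict[str, Any] = dict(inputs)
        self.steps: List[Step] = []

    def __enter__(self) -> "Pipeline":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Run the queued steps only if the with-block itself did not raise.
        if exc_type is None:
            for step in self.steps:
                self.context = step(self.context)
        return False  # never swallow exceptions

    def add(self, step: Step) -> "Pipeline":
        # Queue a step and return self so calls can be chained.
        self.steps.append(step)
        return self


def drop_missing(ctx: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for a cleaning step: drop None values.
    rows = [r for r in ctx["rows"] if r is not None]
    return {**ctx, "rows": rows}


def min_max_scale(ctx: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for a scaling step: rescale values to the range [0, 1].
    lo, hi = min(ctx["rows"]), max(ctx["rows"])
    span = (hi - lo) or 1.0  # avoid dividing by zero on constant input
    return {**ctx, "rows": [(r - lo) / span for r in ctx["rows"]]}


# Inputs go in once at the top; steps are declared inside the block.
with Pipeline(rows=[2.0, None, 4.0, 6.0]) as pipe:
    pipe.add(drop_missing).add(min_max_scale)

result = pipe.context["rows"]
print(result)
```

The point is just that the data goes in once at the top and each step reads from and writes to a shared context, so the per-step setup lines mostly disappear.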