Introduced a model store #805
Conversation
Reviewer's Guide by Sourcery
This pull request introduces a model store.
Updated class diagram for ModelStore
classDiagram
class ModelStore {
- _store_base_path: Path
- _model_name: str
- _model_organization: str
- _model_registry: ModelRegistry
+ store_path: str
+ model_name: str
+ model_organization: str
+ model_registry: ModelRegistry
+ model_base_directory: str
+ blob_directory: str
+ ref_directory: str
+ snapshot_directory: str
+ get_ref_file_path(model_tag: str) : str
+ get_snapshot_directory(hash: str) : str
+ get_blob_file_path(hash: str) : str
+ get_snapshot_file_path(hash: str, filename: str) : str
+ resolve_model_directory(model_tag: str) : str
+ ensure_directory_setup() : None
+ exists(model_tag: str) : bool
+ get_cached_files(model_tag: str) : Tuple[str, list[str], bool]
+ prepare_new_snapshot(model_tag: str, snapshot_hash: str, snapshot_files: list[SnapshotFile]) : None
+ new_snapshot(model_tag: str, snapshot_hash: str, snapshot_files: list[SnapshotFile]) : None
}
class SnapshotFile {
+ url: str
+ header: Dict
+ hash: str
+ name: str
+ should_show_progress: bool
+ should_verify_checksum: bool
+ required: bool
}
class ModelRegistry {
+ HUGGINGFACE
+ OLLAMA
+ OCI
+ URL
}
ModelStore -- ModelRegistry
ModelStore -- SnapshotFile
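For orientation, here is a hedged usage sketch of this API based only on the signatures in the diagram; the import path, constructor arguments, and placeholder values are assumptions, not code from the PR:

```python
# Hypothetical usage of the ModelStore API sketched from the diagram above.
from ramalama.model_store import ModelStore, SnapshotFile, ModelRegistry  # assumed module path

store = ModelStore(
    store_path="~/.local/share/ramalama/store",  # assumed store location
    model_name="tinyllama",
    model_organization="library",
    model_registry=ModelRegistry.OLLAMA,
)
store.ensure_directory_setup()  # create the blob/ref/snapshot directories if missing

if not store.exists("latest"):
    files = [
        SnapshotFile(
            url="https://example.com/tinyllama.gguf",  # placeholder URL
            header={},
            hash="sha256:0123abcd",                    # placeholder digest
            name="tinyllama.gguf",
            should_show_progress=True,
            should_verify_checksum=True,
            required=True,
        )
    ]
    # Download the listed files and record them under the snapshot hash.
    store.new_snapshot("latest", "sha256:0123abcd", files)

model_dir = store.resolve_model_directory("latest")
```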
Updated class diagram for Ollama Model
classDiagram
class Ollama {
- model: str
- model_tag: str
- directory: str
- filename: str
- store: ModelStore
+ __init__(model: str, store_path: str = "")
+ pull(debug: bool = False) : str
}
class Model {
<<Abstract>>
- model: str
- type: str
+ login(args: any)
+ logout(args: any)
+ pull(args: any)
+ push(source: str, args: any)
+ is_symlink_to(file_path: str, target_path: str) : bool
+ garbage_collection(args: any)
+ setup_container(args: any)
+ exec_model_in_container(model_path: str, cmd_args: list[str], args: any)
+ build_exec_args_perplexity(args: any, model_path: str) : list[str]
+ check_name_and_container(args: any)
+ build_prompt(args: any) : str
+ execute_model(model_path: str, exec_args: list[str], args: any)
+ validate_args(args: any)
+ build_exec_args_serve(args: any, exec_model_path: str) : list[str]
+ execute_command(model_path: str, exec_args: list[str], args: any)
+ serve(args: any)
+ inspect(args: any)
+ get_model_registry(args: any) : str
}
Ollama -- ModelStore
Ollama --|> Model
Updated class diagram for Huggingface Model
classDiagram
class Huggingface {
- model: str
- model_tag: str
- directory: str
- filename: str
- store: ModelStore
- hf_cli_available: bool
+ __init__(model: str, store_path: str = "")
+ login(args: any)
+ logout(args: any)
+ pull(debug: bool = False) : str
+ push(source: str, args: any)
}
class Model {
<<Abstract>>
- model: str
- type: str
+ login(args: any)
+ logout(args: any)
+ pull(args: any)
+ push(source: str, args: any)
+ is_symlink_to(file_path: str, target_path: str) : bool
+ garbage_collection(args: any)
+ setup_container(args: any)
+ exec_model_in_container(model_path: str, cmd_args: list[str], args: any)
+ build_exec_args_perplexity(args: any, model_path: str) : list[str]
+ check_name_and_container(args: any)
+ build_prompt(args: any) : str
+ execute_model(model_path: str, exec_args: list[str], args: any)
+ validate_args(args: any)
+ build_exec_args_serve(args: any, exec_model_path: str) : list[str]
+ execute_command(model_path: str, exec_args: list[str], args: any)
+ serve(args: any)
+ inspect(args: any)
+ get_model_registry(args: any) : str
}
Huggingface -- ModelStore
Huggingface --|> Model
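A similarly hedged sketch of how the two backends might be driven with a shared store, based on the constructors and pull signatures shown in the diagrams; the module paths and placeholder store path are assumptions:

```python
# Hypothetical example based on __init__(model, store_path="") and pull(debug=False) -> str.
from ramalama.ollama import Ollama            # assumed module path
from ramalama.huggingface import Huggingface  # assumed module path

store_path = "~/.local/share/ramalama/store"  # placeholder store location

# Both backends take the model reference plus an optional store path.
ollama_model = Ollama("tinyllama", store_path=store_path)
ollama_file = ollama_model.pull(debug=False)  # returns the resolved model path

hf_model = Huggingface(
    "ibm-granite/granite-3b-code-base-2k-GGUF/granite-3b-code-base.Q4_K_M.gguf",
    store_path=store_path,
)
hf_file = hf_model.pull(debug=False)
```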
@rhatdan @ericcurtin Could you have a brief look at this? It's still in draft, but I wanted to get your opinion on whether this is going in a good direction or not.
Force-pushed e2cd7ac to c3fc4fd
""" | ||
h = hashlib.new("sha256") | ||
h.update(to_hash.encode("utf-8")) | ||
return f"sha256:{h.hexdigest()}" |
I like this function. I'm actually thinking about making a breaking change soon and switching the ':' character to '-' on the filesystem, like Ollama does, just not for this PR. For one, on some filesystems ':' is an illegal character.
Replacing : with - makes much sense. Since this PR would introduce a breaking change regarding the file storage anyway, I can include this as well.
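To make the suggestion concrete, a minimal sketch of what a filesystem-safe variant of the helper could look like; the function name is hypothetical and the '-' separator follows the suggestion above rather than anything in this PR:

```python
import hashlib

# Illustrative only: a digest helper that uses '-' instead of ':' so the
# result can be used as a file name on filesystems where ':' is illegal.
def sha256_filename(to_hash: str) -> str:
    h = hashlib.new("sha256")
    h.update(to_hash.encode("utf-8"))
    return f"sha256-{h.hexdigest()}"

# Example: sha256_filename("ollama://tinyllama") returns "sha256-" followed by 64 hex digits.
```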
if e.code == HTTP_RANGE_NOT_SATISFIABLE: # "Range Not Satisfiable" error (file already downloaded)
return # No need to retry
# "Range Not Satisfiable" error (file already downloaded)
if e.code in [HTTP_RANGE_NOT_SATISFIABLE, HTTP_NOT_FOUND]:
+1
Done in #818
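For readers skimming the thread, here is a simplified, self-contained sketch of the resume-download pattern this hunk touches; the function name and surrounding framing are assumptions, only the status-code check mirrors the diff above:

```python
import urllib.error
import urllib.request

HTTP_RANGE_NOT_SATISFIABLE = 416  # requested byte range cannot be served (file already complete)
HTTP_NOT_FOUND = 404

def resume_download(url: str, dest: str, existing_bytes: int) -> None:
    """Illustrative resume logic: request only the missing tail of the file."""
    request = urllib.request.Request(url, headers={"Range": f"bytes={existing_bytes}-"})
    try:
        with urllib.request.urlopen(request) as response, open(dest, "ab") as out:
            out.write(response.read())
    except urllib.error.HTTPError as e:
        # Mirror the check from the diff: both codes are treated as non-retryable.
        if e.code in [HTTP_RANGE_NOT_SATISFIABLE, HTTP_NOT_FOUND]:
            return
        raise
```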
return True
else:
return False
return available("huggingface-cli")
+1
HUGGINGFACE = "huggingface" | ||
OLLAMA = "ollama" | ||
OCI = "oci" | ||
URL = "url" |
I could see the community adding even more here, such as s3://, just something to keep in mind. I wonder whether we should continue with https, etc. We do have to code up any protocol we support.
Good point! It's already different with the default https download and the huggingface-/ollama-cli and oci; although they use https as well, the "library" is different. Maybe I can inject a class/callable, which specifies HOW to download, into the model store's new_snapshot function. Then this would be easily extensible.
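As a rough illustration of that injection idea (every name below is hypothetical and not taken from this PR), the store would only decide where files land, while each registry backend decides how bytes are fetched:

```python
from typing import Callable
import urllib.request

# Hypothetical downloader contract: (url, headers, destination_path) -> None.
Downloader = Callable[[str, dict, str], None]

def https_downloader(url: str, headers: dict, destination: str) -> None:
    """Plain HTTPS download sketched with the standard library."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response, open(destination, "wb") as out:
        out.write(response.read())

# A new_snapshot variant could then take the strategy as a parameter, e.g.
#   store.new_snapshot(model_tag, snapshot_hash, snapshot_files, downloader=https_downloader)
# and an hf:// or s3:// backend would simply pass its own callable instead.
```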
The code looks great to me so far; I don't see any significant issues at the moment at a high level.
Added a model store to standardize pulling, storing and using models across the different repositories. Signed-off-by: Michael Engel <[email protected]>
Force-pushed c3fc4fd to 5f826a6
Since this is a breaking change, I guess that's the main question: do we want to break all users now on upgrade? I just want to be sure the benefits are worth it. If we gain some functional benefits I think it's a good idea. What I'm less of a fan of is non-functional breaking changes like renaming "repos/models" -> "store". But if we need to break everything for functional reasons anyway, now is certainly the time to do renames, and the rename is fine.

I also think it's possible to do this in a non-breaking way in the existing ~/.local/share/ramalama/repos directory. ~/.local/share/ramalama/models/ is just a directory intended to be a more presentable way for humans and for "ramalama ls" to parse, etc. The metadata and messy implementation details were intended to go to the ~/.local/share/ramalama/repos/ directory. It's worth passing this by @swarajpande5 also; he wrote a fair bit of this.

If we decide we are doing breaking changes, another one we should do is add the .gguf extension to the Ollama .gguf files; some tools refused to load them in the past without that file extension. Example that we ended up fixing vllm-side: vllm-project/vllm#7993

But just fair warning @engelmi, there are many tests and assumptions built up around the existing technique. This could be tough to get through CI. You will likely spend a lot of time massaging the tests here, etc. If you do this in a non-breaking way it will be less effort to code and less impact on the users when multi-file models such as the safetensors one are enabled.
The "store" directory is just an intermediate for local development at the moment. It would probably be best to switch to the model directory, i.e. having
WDYT? @ericcurtin @swarajpande5
Yes, it should be possible to support multiple files with the current approach. I am wondering, though, if it makes things harder to understand and maintain - code-wise as well as from the directories and files. The messy implementation details should be only in code where we can encapsulate them (i.e. in the Ollama/HF/URL/etc. Model) and the file storage should be as standardized as possible, I think.
Sounds good to me!
Yes, that is true and will be quite cumbersome to change.
I really like this PR, but it needs a rebase. We need to get this in ASAP, and we probably need a mechanism to upgrade people rather than break their stores.
Added a model store to standardize pulling, storing and using models across the different repositories.
A key goal of this store is to support downloading multiple files, e.g. the chat template from ollama models or additional metadata from non-GGUF models. In addition, this is probably a first step towards enabling the usage of safetensors (#642) where a model consists of multiple files.
The proposed structure is inspired by how the huggingface-cli stores its files and extends it for the multi-source usage in ramalama.
The proposed storage structure looks like this after running:
ramalama pull ollama://tinyllama
ramalama pull hf://ibm-granite/granite-3b-code-base-2k-GGUF/granite-3b-code-base.Q4_K_M.gguf
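The resulting tree isn't reproduced in this excerpt, but a hypothetical sketch of the blob/ref/snapshot split implied by the ModelStore diagram can be expressed with pathlib; every directory name below is an assumption, and the PR's actual layout may differ:

```python
from pathlib import Path

# Hypothetical layout helper mirroring the blob/ref/snapshot split described above.
def model_base_directory(store: Path, registry: str, organization: str, name: str) -> Path:
    return store / registry / organization / name

base = model_base_directory(
    Path.home() / ".local/share/ramalama/store",  # assumed store root
    "ollama", "library", "tinyllama",
)
blob_dir = base / "blobs"          # content-addressed files, named by digest
ref_dir = base / "refs"            # tag files (e.g. "latest") pointing at a snapshot hash
snapshot_dir = base / "snapshots"  # per-hash directories of files or links into blobs/

print(blob_dir, ref_dir, snapshot_dir, sep="\n")
```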
Summary by Sourcery
New Features: