Commit

Model: Fix generate window fallback
Use max_seq_len as the numerator, not the max_tokens. Mismatched
parameter.

Signed-off-by: kingbri <[email protected]>
kingbri1 committed Feb 6, 2024
1 parent 543a9b6 commit fedebad
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion backends/exllamav2/model.py
@@ -515,7 +515,7 @@ def generate_gen(self, prompt: str, **kwargs):
         max_tokens = unwrap(kwargs.get("max_tokens"), 150)
         stream_interval = unwrap(kwargs.get("stream_interval"), 0)
         generate_window = max(
-            unwrap(kwargs.get("generate_window"), 512), max_tokens // 8
+            unwrap(kwargs.get("generate_window"), 512), self.config.max_seq_len // 8
         )
 
         # Sampler settings
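
To make the effect of the one-line change concrete, here is a minimal standalone sketch of the fallback before and after the fix. It is not the project's code: the `unwrap` stand-in is assumed to return its default when the value is `None`, the function names `window_before` and `window_after` are hypothetical, and the `max_seq_len` value of 32768 is chosen purely for illustration.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def unwrap(value: Optional[T], default: T) -> T:
    """Stand-in for the project's helper (assumed behavior): use default when value is None."""
    return default if value is None else value


def window_before(kwargs: dict, max_tokens: int) -> int:
    # Old fallback: the floor is derived from the per-request token budget.
    return max(unwrap(kwargs.get("generate_window"), 512), max_tokens // 8)


def window_after(kwargs: dict, max_seq_len: int) -> int:
    # New fallback: the floor is derived from the model's context length.
    return max(unwrap(kwargs.get("generate_window"), 512), max_seq_len // 8)


# Illustrative values only: 150 is the default max_tokens shown in the diff,
# 32768 is a hypothetical context length.
print(window_before({}, 150))    # 150 // 8 = 18, so the 512 default wins
print(window_after({}, 32768))   # 32768 // 8 = 4096, so the window scales up
```

With the old expression, `max_tokens // 8` only exceeds the 512 default once `max_tokens` is above 4096, so the window rarely tracked the context size. An explicit `generate_window` larger than the computed floor is still honored in both versions.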