[Windows-MTL-NPU]: OSError: [WinError -529697949] Windows Error 0xe06d7363 #12762

raj-ritu17 opened this issue Jan 29, 2025 · 2 comments
@raj-ritu17

Device: MTL Core Ultra 165U
OS: Windows 11

Objective: tried to run an LLM on the NPU using ipex-llm, and got the following error:

python llama3.py --repo-id-or-model-path "meta-llama/Llama-3.2-1B-Instruct" --save-directory "llama-3.2-1B-Inst"
2025-01-29 16:58:29,747 - INFO - Converting model, it may takes up to several minutes ...
2025-01-29 16:58:39,059 - INFO - Finish to convert model
start compiling
Model saved to llama-3.2-1B-Inst\lm_head.xml
start compiling
start compiling
Model saved to llama-3.2-1B-Inst\embedding_post.xml
start compiling
Model saved to llama-3.2-1B-Inst\embedding_post_prefill.xml
start compiling
Model saved to llama-3.2-1B-Inst\decoder_layer_0.xml
start compiling
Model saved to llama-3.2-1B-Inst\decoder_layer_1.xml
start compiling
Traceback (most recent call last):
  File "C:\Users\intel\ritu\ipex-llm\python\llm\example\NPU\HF-Transformers-AutoModels\LLM\llama3.py", line 73, in <module>
    model = AutoModelForCausalLM.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\intel\miniforge3\envs\llm-npu\Lib\unittest\mock.py", line 1378, in patched
    return func(*newargs, **newkeywargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\intel\miniforge3\envs\llm-npu\Lib\site-packages\ipex_llm\transformers\npu_model.py", line 243, in from_pretrained
    model = cls.optimize_npu_model(*args, **optimize_kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\intel\miniforge3\envs\llm-npu\Lib\site-packages\ipex_llm\transformers\npu_model.py", line 317, in optimize_npu_model
    optimize_llm_single_process(
  File "C:\Users\intel\miniforge3\envs\llm-npu\Lib\site-packages\ipex_llm\transformers\npu_models\convert.py", line 458, in optimize_llm_single_process
    convert_llm(model,
  File "C:\Users\intel\miniforge3\envs\llm-npu\Lib\site-packages\ipex_llm\transformers\npu_pipeline_model\convert_pipeline.py", line 215, in convert_llm
    convert_llm_for_deploy(model,
  File "C:\Users\intel\miniforge3\envs\llm-npu\Lib\site-packages\ipex_llm\transformers\npu_pipeline_model\convert_pipeline.py", line 549, in convert_llm_for_deploy
    convert_llama_layer(model, 0, n_splits_linear, n_splits_down_proj,
  File "C:\Users\intel\miniforge3\envs\llm-npu\Lib\site-packages\ipex_llm\transformers\npu_pipeline_model\llama.py", line 294, in convert_llama_layer
    single_decoder = LowBitLlamaMultiDecoderlayer(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\intel\miniforge3\envs\llm-npu\Lib\site-packages\ipex_llm\transformers\npu_models\llama_mp.py", line 216, in __init__
    self.compile(npu_dpu_groups=6)
  File "C:\Users\intel\miniforge3\envs\llm-npu\Lib\site-packages\intel_npu_acceleration_library\backend\factory.py", line 1054, in compile
    backend_lib.compile(self._mm, npu_dpu_groups)
OSError: [WinError -529697949] Windows Error 0xe06d7363
@Goldlionren

I met the same issue last weekend. Finally, I found a way to get it running: use git to clone the entire repo from Hugging Face to a local drive, e.g. D:\llm-npn\Llama-3.2-1B-Instruct, create another empty folder D:\llm-npn\Llama-3.2-1B-Instruct-lowbit, then run the command as: python llama3.py --repo-id-or-model-path "D:/llm-npn/Llama-3.2-1B-Instruct" --save-directory "D:/llm-npn/Llama-3.2-1B-Instruct-lowbit".
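
For reference, with local paths the failing call boils down to roughly the following. This is only a sketch, assuming the same `AutoModelForCausalLM` from `ipex_llm.transformers.npu_model` that appears in the traceback; the keyword arguments are assumed from the ipex-llm NPU examples and may differ from the shipped llama3.py.

```python
# Sketch only: keyword arguments assumed from the ipex-llm NPU examples,
# not copied verbatim from llama3.py.
from ipex_llm.transformers.npu_model import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "D:/llm-npn/Llama-3.2-1B-Instruct"       # local git clone
save_dir = "D:/llm-npn/Llama-3.2-1B-Instruct-lowbit"  # pre-created empty folder

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    optimize_model=True,          # apply NPU-specific optimizations
    load_in_low_bit="sym_int4",   # low-bit weights for the NPU
    save_directory=save_dir,      # converted model is written here
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
```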

@plusbang
Contributor

plusbang commented Feb 5, 2025

Hi @raj-ritu17 , could you please provide your NPU driver version and ipex-llm version?

As introduced in our doc (https://github.com/intel/ipex-llm/blob/main/docs/mddocs/Quickstart/npu_quickstart.md), the 32.0.100.3104 driver is highly recommended, and setting IPEX_LLM_NPU_MTL=1 is necessary for MTL.
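
The variable has to be visible to the process before the model is loaded. A minimal sketch (only the variable name IPEX_LLM_NPU_MTL is taken from the quickstart linked above; everything else is illustrative):

```python
import os

# Per the quickstart, IPEX_LLM_NPU_MTL=1 is required on MTL.
# Set it before importing ipex_llm so the library can pick it up;
# running `set IPEX_LLM_NPU_MTL=1` in the Windows terminal before
# launching the script has the same effect.
os.environ["IPEX_LLM_NPU_MTL"] = "1"

from ipex_llm.transformers.npu_model import AutoModelForCausalLM  # import after setting the variable
```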
