Feature Request: Support for Buildstream (buildstream.build) #1533

Open
jmt-lab opened this issue Dec 11, 2024 · 14 comments

Comments

@jmt-lab

jmt-lab commented Dec 11, 2024

Hi, this product is really nice, especially since it is easy to configure. However, it runs into some compatibility issues with BuildStream (buildstream.build). Since there is no issue for this yet, I figured I'd go ahead and create one. Here are the two issues I've run into, along with some workarounds I've found:

  • Direct use of nativelink as the artifact cache server fails when buildstream tries to initialize remote caches with:
[--:--:--][        ][    main:core activity                 ] WARNING Failed to initialize remote http://localhost:50051: Remote initialisation failed with status UNKNOWN: FetchBlob: 2: Stream removed

Funnily enough, I was able to work around this by putting a local buildbox-casd cache in front of nativelink:

buildbox-casd --cas-remote grpc://localhost:50051 --cas-instance main --bind 127.0.0.1:50052 /tmp/casd

After that, everything appeared to work.

  • As with Buildbarn, it also runs into these fun shenanigans, described in the BuildStream docs:
[Staging the input root as the filesystem root](https://docs.buildstream.build/2.3/using_configuring_remote_execution.html#staging-the-input-root-as-the-filesystem-root)

BuildStream requires that the input root given to the remote execution service be treated as the absolute filesystem root.

This is because BuildStream provides guarantees that all build dependencies, including the base runtime and compilers, are defined by elements and run within a sandboxed and isolated build environment, but the [REAPI](https://github.com/bazelbuild/remote-apis) was originally developed without this determinism and control in mind. Instead, typically it is up to the user to configure a cluster to use a docker image to build payloads with, rather than allowing the REAPI client to control the entire sandbox.

Unfortunately the ability to dictate that the input root be treated as the filesystem root in a container on remote workers in the cluster is not yet standardized in the REAPI protocol.

Note

The input root is referred to as the input_root_digest member of the Action message as defined in the [protocol](https://github.com/bazelbuild/remote-apis/blob/main/build/bazel/remote/execution/v2/remote_execution.proto)

This manifests as permission denied errors in the work environment, since BuildStream expects to use absolute paths. One way to address this could be to provide some way to tell nativelink workers to run everything through buildbox-run-bubblewrap or something of the sort.
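To make the "something of the sort" idea concrete, here is a minimal sketch of how a worker-side wrapper could build a bubblewrap command line that binds the staged input root onto `/`. It is not nativelink's or buildbox's actual mechanism; the function name `bwrap_argv` and every path in the example are hypothetical, and only standard `bwrap` options are used.

```python
import shlex
from typing import List, Sequence


def bwrap_argv(input_root: str, command: Sequence[str], workdir: str = "/") -> List[str]:
    """Build a bubblewrap argv that runs `command` with `input_root` mounted as `/`.

    Hypothetical helper for illustration only; not part of nativelink or buildbox.
    """
    if not command:
        raise ValueError("command must not be empty")
    return [
        "bwrap",
        "--unshare-all",            # new namespaces, which also removes network access
        "--die-with-parent",        # tear the sandbox down if the worker process dies
        "--bind", input_root, "/",  # the staged input root becomes the filesystem root
        "--dev", "/dev",            # minimal /dev inside the sandbox
        "--proc", "/proc",          # fresh /proc inside the sandbox
        "--chdir", workdir,         # absolute path, resolved inside the sandbox
        *command,
    ]


if __name__ == "__main__":
    # Both paths below are made-up examples, not real worker directories.
    argv = bwrap_argv(
        "/tmp/worker/work/input_root",
        ["/usr/bin/gcc", "-o", "hello", "hello.c"],
        workdir="/buildstream/build",
    )
    print(shlex.join(argv))
```

Because the input root is bound at `/`, absolute paths in the action (such as `/usr/bin/gcc` here) resolve against the files BuildStream staged rather than against the worker's host filesystem, which is the behaviour the quoted documentation asks for.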

@MarcusSorealheis
Collaborator

Thank you for raising the issue. We will prioritize this internally and get back to you.

@jmt-lab
Author

jmt-lab commented Dec 12, 2024

Thanks! Let me know if there is anything you want me to try. I'm also going to poke around and see what I can find out myself.

@allada
Member

allada commented Dec 13, 2024

Any chance you have a cluster configuration YAML (k8s, Docker, or something similar) we can use to debug this?

@jmt-lab
Author

jmt-lab commented Dec 13, 2024

I used the docker-compose example you have in your repo for my testing and set up BuildStream with the following:

~/.config/buildstream.conf

artifacts:
  servers:
  - url: http://localhost:50051  # or the CAS port
    type: push
remote-execution:
  execution-service:
    url: http://localhost:50051
  action-cache-service:
    url: http://localhost:50051
  storage-service:
    url: http://localhost:50051

I also used the basic usage tutorial as a simple test project: https://docs.buildstream.build/2.3/using_tutorial.html, specifically the hello.c example from the "running commands" chapter.


algora-pbc bot commented Feb 4, 2025

💎 $2,500 bounty • TraceMachina

Steps to solve:

  1. Start working: Comment /attempt #1533 with your implementation plan
  2. Submit work: Create a pull request including /claim #1533 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to TraceMachina/nativelink!


@kernel-loophole

Hi, I would love to work on this if no one else is working on it.

@MarcusSorealheis
Collaborator

@kernel-loophole feel free! The first to complete the issue is the winner.

@kernel-loophole

kernel-loophole commented Feb 6, 2025

Hi @MarcusSorealheis, where can I find the artifact cache server implementation? Is this the one: https://github.com/TraceMachina/nativelink/blob/main/nativelink-config/src/cas_server.rs

@MarcusSorealheis
Collaborator

Yes, that's the one.

@kernel-loophole

kernel-loophole commented Feb 6, 2025

@MarcusSorealheis how do I build this locally using Nix? I tried following https://www.nativelink.com/docs/contribute/nix/, but running nix run ./nativelink-config/examples/basic_cas.json in the nativelink directory does not work. Or should I build using sudo docker compose up?

@kernel-loophole

@jmt-lab can you guide me through replicating this issue?

@jmt-lab
Author

jmt-lab commented Feb 10, 2025

Hi there, sure. I would use the standard C tutorial project from BuildStream (https://docs.buildstream.build/2.4/tutorial/running-commands.html), since that is what I switched to for testing after first hitting this with my bigger project. I then spun up nativelink using the docker compose example (https://github.com/TraceMachina/nativelink/tree/main/deployment-examples/docker-compose) and set up the link in ~/.config/buildstream.conf like so:

artifacts:
  servers:
  - url: http://localhost:50051
    push: true
remote-execution:
  execution-service:
    url: http://localhost:50052
  action-cache-service:
    url: http://localhost:50051
  storage-service:
    url: http://localhost:50051
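Before running bst against a config like this, it can help to confirm that each endpoint is at least accepting TCP connections. The following is a rough sketch, assuming the config is mirrored as a plain Python dict (PyYAML is not in the standard library); the helper names `endpoints` and `is_listening` are made up for illustration. A successful TCP connect only shows that something is listening on the port, not that the gRPC services behind it are healthy.

```python
import socket
from urllib.parse import urlsplit

# Mirrors the buildstream.conf above as plain data (assumption: same URLs).
CONFIG = {
    "artifacts": {"servers": [{"url": "http://localhost:50051", "push": True}]},
    "remote-execution": {
        "execution-service": {"url": "http://localhost:50052"},
        "action-cache-service": {"url": "http://localhost:50051"},
        "storage-service": {"url": "http://localhost:50051"},
    },
}


def endpoints(config):
    """Return the sorted, de-duplicated (host, port) pairs named in the config."""
    urls = [s["url"] for s in config.get("artifacts", {}).get("servers", [])]
    urls += [svc["url"] for svc in config.get("remote-execution", {}).values()]
    pairs = set()
    for url in urls:
        parts = urlsplit(url)
        pairs.add((parts.hostname, parts.port))
    return sorted(pairs)


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for host, port in endpoints(CONFIG):
        state = "listening" if is_listening(host, port) else "NOT reachable"
        print(f"{host}:{port} {state}")
```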

Then, when you run bst build hello.bst, it tries to initialize the remote cache and emits the FetchBlob: Stream removed warning, which is the first issue. From the log of the local CAS:

2025-02-10T19:28:49.589254Z ERROR nativelink_service::bytestream_server: error: status: InvalidArgument, message: "Cannot upload same UUID simultaneously : In ByteStreamServer::write", details: [], metadata: MetadataMap { headers: {} }
    at nativelink-service/src/bytestream_server.rs:641
    in nativelink_service::bytestream_server::write with request: Streaming
    in nativelink_util::task::http_executor
    in nativelink::services::http_connection with remote_addr: 127.0.0.1:51486, socket_addr: 0.0.0.0:50051
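The error text suggests the server rejects a second write stream that reuses an upload UUID while an earlier one is still open. In the REAPI, a ByteStream write resource name has the form `{instance_name}/uploads/{uuid}/blobs/{hash}/{size}`, so the UUID is chosen by the client. The sketch below is a toy model of that kind of check, not nativelink's implementation; the `ActiveUploads` class and the UUID in the example are hypothetical, and this thread does not establish why the client would reuse a UUID.

```python
import threading


def upload_uuid(resource_name: str) -> str:
    """Extract the client-chosen UUID from a ByteStream write resource name."""
    parts = resource_name.split("/")
    try:
        # The UUID is the path segment that follows "uploads".
        return parts[parts.index("uploads") + 1]
    except (ValueError, IndexError):
        raise ValueError(f"not an upload resource name: {resource_name!r}") from None


class ActiveUploads:
    """Toy registry that rejects a second concurrent write with the same UUID."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = set()

    def begin(self, resource_name: str) -> str:
        uuid = upload_uuid(resource_name)
        with self._lock:
            if uuid in self._active:
                raise RuntimeError(f"Cannot upload same UUID simultaneously: {uuid}")
            self._active.add(uuid)
        return uuid

    def finish(self, uuid: str) -> None:
        # Called when the write stream completes or is dropped.
        with self._lock:
            self._active.discard(uuid)


if __name__ == "__main__":
    registry = ActiveUploads()
    # Hypothetical UUID; the hash is the SHA-256 of the empty blob (size 0).
    name = (
        "main/uploads/11111111-2222-3333-4444-555555555555/blobs/"
        "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855/0"
    )
    first = registry.begin(name)
    try:
        registry.begin(name)  # second stream with the same UUID while the first is open
    except RuntimeError as err:
        print(err)
    registry.finish(first)
```

Under this model, a client that retries a write with the same resource name before the server has noticed the first stream ending would hit the rejection, which would be one thing to check when debugging the buildbox-casd interaction.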

@kernel-loophole

@jmt-lab thanks, really appreciated.

@Aditya-Choudhry

/request can i start working on it