Add support for listing all files in a location, including files in "subdirectories" #73

Open
aucampia opened this issue Jun 20, 2021 · 3 comments

Comments

@aucampia
Contributor

aucampia commented Jun 20, 2021

Currently, List() only returns files directly under a specific location. For example, with Google Cloud Storage (gs), if I have f0.txt, f1.txt, d0/f0.txt, and d0/f1.txt in location loc, calling loc.List() returns only f0.txt and f1.txt, not the files under d0/. It would be nice to have a way to find files at arbitrary depth.

One possible name for the method is Location.ListAll(). Any other suggestions would be appreciated.
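
To make the request concrete, here is a minimal sketch of the two behaviours over a flat list of object keys, the way a bucket stores them. The helper names listDirect and listAll are hypothetical and only illustrate the idea; they are not part of the library.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// listDirect mimics the behaviour described above: it keeps only names that
// sit directly under prefix and skips anything nested deeper.
func listDirect(keys []string, prefix string) []string {
	var out []string
	for _, k := range keys {
		if !strings.HasPrefix(k, prefix) {
			continue
		}
		rest := strings.TrimPrefix(k, prefix)
		if rest == "" || strings.Contains(rest, "/") {
			continue
		}
		out = append(out, rest)
	}
	sort.Strings(out)
	return out
}

// listAll is the hypothetical recursive variant: it keeps every file under
// prefix, at any depth, and returns names relative to prefix.
func listAll(keys []string, prefix string) []string {
	var out []string
	for _, k := range keys {
		if !strings.HasPrefix(k, prefix) {
			continue
		}
		rest := strings.TrimPrefix(k, prefix)
		if rest == "" || strings.HasSuffix(rest, "/") {
			continue // skip the location itself and directory markers
		}
		out = append(out, rest)
	}
	sort.Strings(out)
	return out
}

func main() {
	keys := []string{"loc/f0.txt", "loc/f1.txt", "loc/d0/f0.txt", "loc/d0/f1.txt"}
	fmt.Println(listDirect(keys, "loc/")) // [f0.txt f1.txt]
	fmt.Println(listAll(keys, "loc/"))    // [d0/f0.txt d0/f1.txt f0.txt f1.txt]
}
```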

@funkyshu
Member

This is something C2FO hasn't really had a use case for, but I'm definitely open to improving it. Personally, I've never liked the trio of List functions we provide. It might make more sense to have a generic ListFunc() function that accepts some preset constant functions as well as user-provided functions. An API might look like:

    dirFiles := myLoc.ListFunc(utils.FileFilter)
    subdirs := myLoc.ListFunc(utils.DirFilter)
    recursiveFiles := myLoc.ListFunc(utils.AllRecursiveFiles)
    etc...

The trick is trying to remain efficient when you don't actually intend to make recursive calls. Also, obviously os is going to recurse differently than S3 (which doesn't need to recurse at all).
Our rigid rule that a URI ending in / is a directory (note that we have terrible support for alternate directory delimiters like Windows' \) is actually helpful in determining type (Location vs. File). That could help with recursion, for instance by avoiding a stat call in os just to determine type.
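
A minimal sketch of how such filters could lean on the trailing-slash rule is below. Everything here is an assumption for illustration: the Filter type, the standalone ListFunc signature, and the filter bodies are hypothetical, and the entries are an in-memory slice rather than a real backend.

```go
package main

import (
	"fmt"
	"strings"
)

// Filter reports whether an entry, given relative to the listed location,
// should be kept. Hypothetical type, not an existing library API.
type Filter func(rel string) bool

// isDir applies the trailing-slash rule: no stat call is needed.
func isDir(rel string) bool { return strings.HasSuffix(rel, "/") }

// isDirectChild is true when the entry is not nested in a subdirectory.
func isDirectChild(rel string) bool {
	return !strings.Contains(strings.TrimSuffix(rel, "/"), "/")
}

// Preset filters mirroring the names used in the API idea above.
func FileFilter(rel string) bool        { return isDirectChild(rel) && !isDir(rel) }
func DirFilter(rel string) bool         { return isDirectChild(rel) && isDir(rel) }
func AllRecursiveFiles(rel string) bool { return !isDir(rel) }

// ListFunc keeps the entries accepted by keep, preserving input order.
func ListFunc(entries []string, keep Filter) []string {
	var out []string
	for _, e := range entries {
		if keep(e) {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	entries := []string{"f0.txt", "f1.txt", "d0/", "d0/f0.txt", "d0/f1.txt"}
	fmt.Println(ListFunc(entries, FileFilter))        // [f0.txt f1.txt]
	fmt.Println(ListFunc(entries, DirFilter))         // [d0/]
	fmt.Println(ListFunc(entries, AllRecursiveFiles)) // [f0.txt f1.txt d0/f0.txt d0/f1.txt]
}
```

Filtering a full listing like this does not address the efficiency concern: a real implementation would probably need each filter to declare whether it wants recursion, so that backends can skip descending when it is not required.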

@aucampia
Contributor Author

Thanks for the input. I am not sure when or if I will work on this. We also do not have a use case for it right now, but we may in the future, and I will work on it if we need it.

@safaci2000
Contributor

You could potentially also add pagination to avoid bogging down the system. Recursive calls are very handy, but if a call returns 50,000 objects, it becomes a bottleneck.
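
A minimal sketch of what paged listing could look like, assuming an integer offset as the page token. The listPage function and its signature are hypothetical. Real object stores typically hand back an opaque continuation token instead of an offset.

```go
package main

import "fmt"

// listPage returns at most pageSize names starting at offset token, plus the
// token for the next page. next is -1 when there is nothing left to fetch.
func listPage(names []string, token, pageSize int) (page []string, next int) {
	if pageSize <= 0 || token < 0 || token >= len(names) {
		return nil, -1
	}
	end := token + pageSize
	if end >= len(names) {
		return names[token:], -1
	}
	return names[token:end], end
}

func main() {
	names := make([]string, 0, 5)
	for i := 0; i < 5; i++ {
		names = append(names, fmt.Sprintf("f%d.txt", i))
	}
	// The caller pulls one page at a time instead of one huge slice.
	for token := 0; token != -1; {
		var page []string
		page, token = listPage(names, token, 2)
		fmt.Println(page)
	}
}
```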

@funkyshu funkyshu added the v7 label Dec 5, 2021