50 clowder20 submit file to extractor #51

tcnichol · 2022-08-15T21:37:45Z

These changes allow a file to be submitted to an extractor, and the resulting metadata is posted back. I have not yet handled tags, or cases where new files are uploaded. The easiest way to test is with the wordcount extractor.

With this branch, add a .env file to the pyclowder directory containing:

clowder_version=2.0
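
A minimal sketch of how that setting could be read, assuming it ends up in the process environment (for example via python-dotenv's `load_dotenv()`, which copies `.env` entries into `os.environ`). The helper name `clowder_major_version` and the default of 1 when the variable is unset are hypothetical and not taken from this branch.

```python
import os


def clowder_major_version(env=None, default=1):
    """Return the Clowder major version (e.g. 1 or 2) from a settings mapping.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    The fallback to version 1 when the variable is missing is an assumption.
    """
    env = os.environ if env is None else env
    raw = str(env.get("clowder_version", default)).strip()
    try:
        # Accept both "2" and "2.0" style values.
        return int(float(raw))
    except ValueError:
        raise ValueError(f"unrecognised clowder_version: {raw!r}") from None
```

Returning an integer lets callers branch with a simple `if version >= 2` check.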

Right now I am sending the Bearer token from clowder2.0 and using it in place of the extractor key or secretKey. I am not sure this is a good long-term strategy: if an extractor takes a long time to complete, the token may expire. It seemed like a good enough approach for now.
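
To illustrate the idea, here is a rough sketch of switching between the v1 key and the v2 Bearer token. The function names are hypothetical, and the endpoint paths are assumptions about the two APIs rather than something this description confirms.

```python
def auth_for_request(major_version, credential):
    """Return (headers, params) that carry the credential for a request.

    Assumption: v1 takes the key as a `key` query parameter, while v2
    expects an `Authorization: Bearer <token>` header.
    """
    if major_version >= 2:
        return {"Authorization": f"Bearer {credential}"}, {}
    return {}, {"key": credential}


def file_metadata_url(host, major_version, file_id):
    """Build the metadata endpoint for a file (paths are assumed, not verified)."""
    host = host.rstrip("/")
    if major_version >= 2:
        return f"{host}/api/v2/files/{file_id}/metadata"
    return f"{host}/api/files/{file_id}/metadata.jsonld"
```

The returned pair could be passed straight to an HTTP client, e.g. `requests.post(url, headers=headers, params=params, json=metadata)`. Token expiry is not handled here.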

The branch this works with for clowder2.0 is

https://github.com/clowder-framework/clowder2/tree/register-extractor-submit-file

tcnichol · 2022-10-06T18:08:26Z

This pull request also relies on the following clowder2 pull request:

clowder-framework/clowder2#128

max-zilla · 2023-01-05T15:16:11Z

Ran this with Clowder v1 develop and wordcount worked!

max-zilla

Tested with clowder v1 and v2 both.

tcnichol · 2023-01-17T20:02:20Z

For testing with clowder v2, here is the entry for the 'wordcount' extractor that you can add to listeners. If you run clowder2 with wordcount running at the same time, it will submit files and post metadata back, which should now be visible on main.

{
  "_id": { "$oid": "63b5cd4aeb1180d52266214e" },
  "author": "Rob Kooper <[email protected]>",
  "name": "ncsa.wordcount",
  "version": "2.0",
  "description": "WordCount extractor. Counts the number of characters, words and lines in the text file that was uploaded.",
  "creator": null,
  "created": { "$date": { "$numberLong": "1672858954451" } },
  "modified": { "$date": { "$numberLong": "1672858954451" } },
  "properties": {
    "author": "Rob Kooper <[email protected]>",
    "process": { "file": [ "text/*", "application/json" ] },
    "maturity": "Development",
    "name": "ncsa.wordcount",
    "contributors": [],
    "contexts": [
      {
        "lines": "http://clowder.ncsa.illinois.edu/metadata/ncsa.wordcount#lines",
        "words": "http://clowder.ncsa.illinois.edu/metadata/ncsa.wordcount#words",
        "characters": "http://clowder.ncsa.illinois.edu/metadata/ncsa.wordcount#characters"
      }
    ],
    "repository": [
      {
        "id": { "$oid": "63b5cd4aeb1180d52266214d" },
        "repository_type": "git",
        "repository_url": ""
      }
    ],
    "external_services": [],
    "libraries": [],
    "bibtex": [],
    "default_labels": [],
    "categories": [],
    "parameters": {
      "schema": {
        "X_MIN_START": { "type": "integer", "title": "X_MIN_START" },
        "X_MIN_END": { "type": "integer", "title": "X_MIN_END" },
        "Y_MIN_START": { "type": "integer", "title": "Y_MIN_START" },
        "Y_MIN_END": { "type": "integer", "title": "Y_MIN_END" },
        "ZONE": { "type": "string", "title": "ZONE" }
      }
    },
    "version": "2.0"
  }
}
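
As a quick sanity check before adding the entry, something like the sketch below can confirm which file types it would process. It uses a trimmed copy of the entry, and the glob-style matching in the hypothetical `accepts_file` helper is an assumption about how the `process.file` patterns are interpreted, not a description of clowder2's own matching.

```python
import json
from fnmatch import fnmatch

# Trimmed copy of the listener entry, keeping only the fields used here.
LISTENER_JSON = """
{
  "name": "ncsa.wordcount",
  "version": "2.0",
  "properties": {
    "process": { "file": [ "text/*", "application/json" ] }
  }
}
"""


def accepts_file(listener, mime_type):
    """Return True if any process.file pattern matches the given MIME type."""
    patterns = listener.get("properties", {}).get("process", {}).get("file", [])
    return any(fnmatch(mime_type, pattern) for pattern in patterns)


listener = json.loads(LISTENER_JSON)
```

With this entry, `accepts_file(listener, "text/plain")` is true and `accepts_file(listener, "image/png")` is false.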

tcnichol added 6 commits August 11, 2022 15:38

adding new methods for making pyclowder compatible with clowder v2

8838bb1

at different points in the code, the version will be checked; different methods will use different endpoints and will use the bearer token instead of the key

works up until upload metadata

be720b0

problem: the method in the extractor is not getting the secret key. possible fix: use token for secret key

seems to get rather than post metadata

f30c23a

metadata does post

dd5c4ef

need to add extractor info

removing debug line

3ab801e

extractor info sent to clowder2.0 now fits the ExtractorIn parameters

ff13256

tcnichol requested review from lmarini and max-zilla August 15, 2022 21:37

tcnichol linked an issue Aug 15, 2022 that may be closed by this pull request

clowder2.0 - submit file to extractor #50

Open

tcnichol marked this pull request as draft August 17, 2022 20:22

tcnichol added 6 commits August 17, 2022 16:26

adding new class for api files, metadata

f98a5dc

more to be added later for datasets once that is completed in clowderv2

calling v2 files if that is the version

97470eb

back to unprocessable entry problem

fe490f6

partially fixed problem of metadata, need better solution (also need to make sure future metadata that matches is not reprocessed)

d31e78a

calling methods in api v2 for files and datasets

fb8b0b5

some routes not implemented in v2 clowder, left for later

fix typo

518a18d

tcnichol mentioned this pull request Aug 24, 2022

Register extractor submit file clowder-framework/clowder2#66

Merged

tcnichol added 2 commits October 5, 2022 17:20

using token instead of key

dccc321

making sure to use token instead of key

2852221

tcnichol mentioned this pull request Oct 6, 2022

127 add parameters to extractor submit add submit dataset clowder-framework/clowder2#128

Merged

tcnichol marked this pull request as ready for review October 6, 2022 18:07

tcnichol added 7 commits October 12, 2022 13:13

moving methods to api.v1.datasets

50e4a16

file methods in api.v2 and api.v1

cfb5f4a

python-dotenv and not just dotenv

a32a78d

adding imports

94d5648

removing token from v1 datasets

4d34b4f

using datasets v1

06aeed0

fixing token not passed in

fa4dd8b

lmarini requested a review from robkooper October 24, 2022 21:24

cleanup a few client parameters

4147390

max-zilla self-requested a review January 5, 2023 15:16

max-zilla approved these changes Jan 5, 2023

lmarini and others added 24 commits January 18, 2023 11:12

Added __init__.py to v2 directories.

6780c28

support clowder2 job_id field

bbcf1b9

making 'contents' 'content' to match v2 metadata

916ec25

both content and contents

2d4a1e3

clowder version passed in as argument

70861e2

use environment variable

7d556a0

new method for getting download url

22c9d6e

used in filedigest

handling cases with contexts for v1 and v2

acfd080

fixing a bug in line

f36a707

fixing, pass in client

42a0974

in v2 files have 'name' not 'filename'

f508bf5

'name' not 'filename' in v2

c856883

using host and key as arguments to match v1

b0a94e2

Initial framework to test extractors

da060b0

A few fixes:

4cb4058

upload file for v2 did not properly return the id of the uploaded file. added a text file (looks like the original was not checked in) and gave it a more descriptive name.

adding a new method

5690899

files.get_summary gets the equivalent of the file info in v2

Used new function download_summary to download file summary

55dacf3

different python versions for github actions

5b2c5a2

changing python versions for github actions

392fa2a

should fix build errors

fixing typo (client)

734da54

fixing typo (client)

c7d1926

client.host not host

08b438b

client.host and client.key, not host and key

f668d7e

Merge pull request #54 from clowder-framework/287-clowder2-test-extractor

516fa2f

287 clowder2 test extractor

tcnichol closed this Mar 27, 2023
