use regular expressions to resolve literals; URIs, CURIEs #83
When a property is
"@context":
  "@sigil": $
  $base: https://friends.com/
  knows: {$type: $id}
$id: valexiev
name: Vladimir
knows:
  $id: gkellogg
  name: Gregg
Because writing |
I like these examples... and in this context, the use of CURIEs. |
I feel concerned about both
But the issue is there. Perhaps tags like |
@gkellogg wrote
Can you elaborate? Where?
I also like treating URIs and CURIEs in a uniform way (so @anatoly-scherbakov, not using
If the context knows that
epcis:epcList:
  - https://id.gs1.org/01/70614141123451/21/2018
  - gtin:70614141123451/21/2018
# OR
epcis:epcList: [https://id.gs1.org/01/70614141123451/21/2018, gtin:70614141123451/21/2018]
If the context doesn't know, or
epcis:epcList:
  - !id https://id.gs1.org/01/70614141123451/21/2018
  - !id gtin:70614141123451/21/2018
# OR
epcis:epcList: [!id https://id.gs1.org/01/70614141123451/21/2018, !id gtin:70614141123451/21/2018] |
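Treating URIs and CURIEs uniformly amounts to expanding a CURIE against the prefixes the context knows and passing full URIs through untouched. A minimal sketch follows; the `gtin` prefix mapping and the `expand` helper are assumptions made for illustration, not part of any YAML-LD or JSON-LD API.

```python
# Hypothetical prefix map; the gtin mapping is an assumption made for this sketch.
PREFIXES = {"gtin": "https://id.gs1.org/01/"}


def expand(value: str, prefixes: dict[str, str] = PREFIXES) -> str:
    """Expand a CURIE against known prefixes; return anything else unchanged."""
    prefix, sep, reference = value.partition(":")
    # "//" right after the colon signals a hierarchical IRI such as https://..., not a CURIE.
    if sep and prefix in prefixes and not reference.startswith("//"):
        return prefixes[prefix] + reference
    return value


full = "https://id.gs1.org/01/70614141123451/21/2018"
curie = "gtin:70614141123451/21/2018"
print(expand(full) == expand(curie))
```

Under that assumed mapping, both spellings in the example resolve to the same identifier, which is the uniform treatment being asked for.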
@VladimirAlexiev I didn't know about CURIEs. Something like the following could work, but it probably requires a specialized namespace.
See security considerations related to tags. They are valid in general and not only in this case. (see https://www.ietf.org/archive/id/draft-ietf-httpapi-yaml-mediatypes-03.html#name-arbitrary-code-execution). This can be packed with other tag-related features in a specific namespace. |
The
That was basically my thought, although depending on context resolution is a major shortcoming, as I've tried to avoid duplicating the context processing as is done in the JSON-LD Expansion process, so this might be limited to only the top-most context definition without consideration of scoped contexts. It's definitely a weakness. Relying on specific use of
Admittedly, my understanding of YAML is pretty basic, so there may be some details of the YAML syntax which are either incompatible, or not properly exploited in these ideas. |
In my experience, it takes quite some time to exploit all the possible ideas. I think that once we "release" the basic profile of YAML-LD and its media type registration, the JSON-LD ecosystem will provide enough material to experiment with all those not-yet-standardized ideas in the real world. We will then have enough experience to address the actual issues that arise, design an extension, and identify best practices. |
@anatoly-scherbakov I also worry about non-standard YAML parsers (even before this issue), since I haven't seen any parser to properly handle custom tag definitions. As @gkellogg wrote, you can't even use a URI in the tag definition, but have to use the weird
Ok, but you mean resolving tags to URLs. I mean associating regexes with tags so you don't need to use a tag with the value. CURIEs are a bit of a side topic in this issue:
|
This was only true for one parser I tried written in Perl, but I've lost the reference now. LibYAML, which is fairly widely used, requires an escape of the '#' character, but otherwise seems to parse ASCII-space URIs.
Given the varying state of support for the full spec, it would be good to run some cross-platform tests to identify restrictions on using such features.
In the extended mode, operating on the Representation Graph, we could probably add additional regular expressions to identify types of literals, for example dates, times, dateTimes, and various number formats, similar to how they are specified in Tag Resolution. Given legal relative forms, doing so for an IRI is challenging, but the forms defined in RFC3986/7 can at least determine if one is considered valid. It may be that it would be too eager, and consider "foo" as being an IRI, as it is a valid path component. Limiting it to full/absolute IRIs would help, but it's still very broad; basically, anything with a ':' could be considered an IRI.
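To make the "too eager" concern concrete, here is a rough sketch of an absolute-IRI check built only on the RFC 3986 scheme syntax. The pattern and the `looks_like_absolute_iri` name are illustrative assumptions, not a proposed rule.

```python
import re

# Scheme syntax from RFC 3986: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) ":"
# The rest of the scalar only has to be free of whitespace (a deliberate simplification).
ABSOLUTE_IRI = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:\S*$")


def looks_like_absolute_iri(scalar: str) -> bool:
    """True when the scalar starts with a scheme and contains no whitespace."""
    return ABSOLUTE_IRI.match(scalar) is not None


for scalar in ["https://example.org/yugo", "gtin:70614141123451/21/2018",
               "foo", "12:30", "note:remember", "Note: remember this"]:
    print(f"{scalar!r}: {looks_like_absolute_iri(scalar)}")
```

Restricting to absolute forms rejects "foo" and "12:30", but a plain string such as "note:remember" still passes, which illustrates how broad the rule remains.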
|
Paul Tyson [email protected] to [email protected], Nov 17, 2022:
{
  "@context": {
    "ex": "http://example.org/ns/"
  },
  "ex:thing1": {"ex:foo": 1},
  "ex:thing2": {"ex:foo": "a string"},
  "ex:thing3": {"ex:foo": "http://example.org/yugo"},
  "ex:thing4": {"ex:foo": "2022-11-16T21:04:41"}
}
Is there any way to construct the context to make this come out in RDF like:
_:b0 <http://example.org/ns/thing1> _:b1 .
_:b0 <http://example.org/ns/thing2> _:b2 .
_:b0 <http://example.org/ns/thing3> _:b3 .
_:b0 <http://example.org/ns/thing4> _:b4 .
_:b1 <http://example.org/ns/foo> "1"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:b2 <http://example.org/ns/foo> "a string" .
_:b3 <http://example.org/ns/foo> <http://example.org/yugo> .
_:b4 <http://example.org/ns/foo> "2022-11-16T21:04:41"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
Vladimir: There's no way to do this in JSON-LD.
You're asking to leverage regexes to attach appropriate datatypes to literals. I've only seen this in Perl:
We're discussing similar stuff for YAML-LD, see this issue |
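Since JSON-LD itself offers no such mechanism, the behaviour asked for in the email would have to live in a separate step. The sketch below renders each `ex:foo` value from that example as an N-Triples object term; the `object_term` function and its two patterns are assumptions for illustration, and string escaping is omitted.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Illustrative patterns only; they do not cover the full xsd:dateTime or IRI grammars.
DATETIME = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")
HTTP_IRI = re.compile(r"https?://\S+")


def object_term(value) -> str:
    """Render one JSON value as an N-Triples object term (no string escaping)."""
    if isinstance(value, bool):  # checked before int, because bool is a subclass of int
        return f'"{str(value).lower()}"^^<{XSD}boolean>'
    if isinstance(value, int):
        return f'"{value}"^^<{XSD}integer>'
    if isinstance(value, str) and DATETIME.fullmatch(value):
        return f'"{value}"^^<{XSD}dateTime>'
    if isinstance(value, str) and HTTP_IRI.fullmatch(value):
        return f"<{value}>"
    # Everything else falls back to a plain literal in this sketch.
    return f'"{value}"'


for v in [1, "a string", "http://example.org/yugo", "2022-11-16T21:04:41"]:
    print(object_term(v))
```

The four printed terms correspond to the objects of the `ex:foo` triples in the desired output.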
I made this a few weekends back: https://github.com/transmute-industries/jsonld-to-cypher It has something related internally here: https://github.com/transmute-industries/jsonld-to-cypher/blob/main/src/utils.js#L30 |
As a data architect
I want YAML plain values to be recognized by regular expression
So that I don't have to explicitly tag them
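One way to picture this user story is an ordered list of regular expressions, each paired with a datatype, tried against every plain scalar. The sketch below is only that: the patterns are simplified, and `RESOLVERS` and `resolve_datatype` are hypothetical names rather than part of any YAML library.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Ordered (pattern, datatype) pairs; the first full match wins.
# The patterns are illustrative simplifications, not the complete XSD lexical spaces.
RESOLVERS = [
    (re.compile(r"[+-]?\d+"), XSD + "integer"),
    (re.compile(r"[+-]?\d*\.\d+"), XSD + "decimal"),
    (re.compile(r"[+-]?\d+(\.\d+)?[eE][+-]?\d+"), XSD + "double"),
    (re.compile(r"true|false"), XSD + "boolean"),
    (re.compile(r"\d{4}-\d{2}-\d{2}"), XSD + "date"),
    (re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?"),
     XSD + "dateTime"),
]


def resolve_datatype(scalar: str) -> str:
    """Return the datatype IRI of the first matching pattern, else xsd:string."""
    for pattern, datatype in RESOLVERS:
        if pattern.fullmatch(scalar):
            return datatype
    return XSD + "string"


for scalar in ["1", "1.5", "1e3", "2022-11-16", "2022-11-16T21:04:41", "a string"]:
    print(scalar, "->", resolve_datatype(scalar))
```

Using full matches keeps a scalar such as "2022-11-16" from being claimed by the integer pattern, so the order of the list only matters where lexical forms genuinely overlap.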
"Application specific tag resolution rules should be restricted to resolving the “?” non-specific tag, most commonly to resolving plain scalars. These may be matched against a set of regular expressions to provide automatic resolution of integers, floats, timestamps and similar types."
Examples (Tagging @OR13 and @mgh128 who work with EPCIS data):
Note: the benefit of datatype xsd:anyURI (tag !anyURI) is that:
We could also use explicit delimiters, e.g. <...>, around URNs (URIs, IRIs), which will also enable the use of CURIEs.
E.g. below, each of the props epcis:readPoint, epcis:epcList has 2 identical values (first a full URN, then a CURIE), without having to declare that these are @id properties:
The CURIE spec says
I've seen this once in practice: geo:lat (prefix for e.g. the WGS ontology) vs <geo:1.23,4.56> (point using the geo: URI scheme, so the above prefix, if used in a context, precludes you from using this scheme).
Safe_CURIEs use [...] delimiters, so we could rewrite the above example as follows:
Here we avoid the need for any delimiters in URNs, but the brackets can be confused for "array in flow style":
We'd need extra spaces around the array brackets, and some damn specialized YAML parsers to grok this.
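The two delimiter styles can be told apart lexically, as in the rough sketch below. It works on raw scalar text, so it assumes the text reaches the resolver before a YAML parser has interpreted the brackets as a flow sequence, which is exactly the hard part. The `resolve` function and the binding of `geo` to the W3C WGS84 vocabulary are illustrative assumptions.

```python
import re

# Hypothetical prefix map; "geo" is bound to the W3C WGS84 vocabulary for illustration.
PREFIXES = {"geo": "http://www.w3.org/2003/01/geo/wgs84_pos#"}

ANGLE = re.compile(r"<([^<>\s]+)>")                         # <...> explicit IRI
SAFE_CURIE = re.compile(r"\[([^\[\]:\s]*):([^\[\]\s]*)\]")  # [prefix:reference]


def resolve(scalar: str, prefixes: dict[str, str] = PREFIXES) -> tuple[str, str]:
    """Classify raw scalar text as an IRI or a literal based on its delimiters."""
    m = ANGLE.fullmatch(scalar)
    if m:
        # Angle brackets: take the content as an IRI, never as a CURIE.
        return ("iri", m.group(1))
    m = SAFE_CURIE.fullmatch(scalar)
    if m and m.group(1) in prefixes:
        # Square brackets: expand the CURIE against a known prefix.
        return ("iri", prefixes[m.group(1)] + m.group(2))
    return ("literal", scalar)


for scalar in ["<geo:1.23,4.56>", "[geo:lat]", "geo:lat", "[a, b]"]:
    print(scalar, "->", resolve(scalar))
```

With distinct delimiters, the `geo` prefix no longer shadows the `geo:` URI scheme: `<geo:1.23,4.56>` stays a URI while `[geo:lat]` expands through the prefix.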
@ioggstream @gkellogg @anatoly-scherbakov