Non 14-digit datetime reports "at None" in 404 HTML body #286
Comments
Whoa, here's a mess. The links to the Link and CDXJ TimeMaps (added in #285) also do not properly extract the URI-R (e.g., the Link link goes to http://localhost:5000/timemap/link/)
A big problem here is that 2016 is treated as part of the URI-R (sort of) instead of being captured as part of the datetime (see #286). Maybe we should add a sanity check on the .split('/')[0] value to ensure it's a valid hostname. Compare to a 14-digit fabricated datetime: The partial culprit here is the newly created
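A rough sketch of the kind of check suggested above, written in Go to match the MemGator snippet later in this thread. Instead of validating the hostname, it tests whether the first path segment looks like a partial datetime before treating it as part of the URI-R. The helper name `splitMemento` and the sample paths are hypothetical and not taken from the project's code.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// datetimeRe accepts YYYY followed by up to five two-digit fields
// (MM, DD, hh, mm, ss), i.e. 4, 6, 8, 10, 12 or 14 digits.
var datetimeRe = regexp.MustCompile(`^\d{4}(\d{2}){0,5}$`)

// splitMemento is a hypothetical helper. It splits "<datetime>/<URI-R>"
// into its two parts. If the first segment is not datetime-shaped, the
// whole path is returned as the URI-R and the datetime is empty.
func splitMemento(path string) (datetime, urir string) {
	path = strings.TrimPrefix(path, "/")
	first, rest, found := strings.Cut(path, "/")
	if found && datetimeRe.MatchString(first) {
		return first, rest
	}
	return "", path
}

func main() {
	for _, p := range []string{
		"2016/http://example.com/",
		"20160305192247/example.com/page",
		"example.com/2016/",
	} {
		dt, urir := splitMemento(p)
		fmt.Printf("path=%q datetime=%q urir=%q\n", p, dt, urir)
	}
}
```

With this shape of check, a year-only prefix such as 2016 is captured as the datetime rather than leaking into the URI-R, while a path that starts with a hostname is left untouched.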
Handle non-14-digit datetimes. Closes #286
In MemGator, it is implemented as follows:

var regs = map[string]*regexp.Regexp{
	// some stuff
	"dttmstr": regexp.MustCompile(`^(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?$`),
	// some stuff
}

func paddedTime(dttmstr string) (dttm *time.Time, err error) {
	m := regs["dttmstr"].FindStringSubmatch(dttmstr)
	dts := m[1]
	// Missing month and day default to "01"
	dts += (m[2] + "01")[:2]
	dts += (m[3] + "01")[:2]
	// Missing hour, minute and second default to "00"
	dts += (m[4] + "00")[:2]
	dts += (m[5] + "00")[:2]
	dts += (m[6] + "00")[:2]
	var dtm time.Time
	dtm, err = time.Parse("20060102150405", dts)
	dttm = &dtm
	return
}
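For anyone who wants to try the padding behavior in isolation, here is a self-contained sketch adapted from the snippet above. It is not MemGator's actual code: the name `padDatetime`, the guard for non-matching input, and the sample inputs are additions for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// Same pattern as the "dttmstr" entry quoted above.
var dttmRe = regexp.MustCompile(`^(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?$`)

// padDatetime fills missing fields with the defaults used above:
// "01" for month and day, "00" for hour, minute and second.
// Unlike the quoted snippet, it returns an error when the input
// does not match the pattern (an assumption added for this sketch).
func padDatetime(s string) (time.Time, error) {
	m := dttmRe.FindStringSubmatch(s)
	if m == nil {
		return time.Time{}, fmt.Errorf("invalid datetime %q", s)
	}
	dts := m[1]
	for i, def := range []string{"01", "01", "00", "00", "00"} {
		// An unmatched optional group is "", so the default wins.
		dts += (m[i+2] + def)[:2]
	}
	return time.Parse("20060102150405", dts)
}

func main() {
	for _, s := range []string{"2016", "201602", "20160305192247", "20162"} {
		t, err := padDatetime(s)
		if err != nil {
			fmt.Println(s, "->", err)
			continue
		}
		fmt.Println(s, "->", t.Format("20060102150405"))
	}
}
```

An odd-length input such as 20162 is rejected by the pattern rather than being padded, which avoids guessing whether the trailing 2 was meant as a month.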
That would be an improvement, @ibnesayeed. I was also mistaken in attributing the issue to The more liberal date handler (1-14 digits instead of a hard 14) now handles the request instead of the general handler in replay. #301 would allow for more strategic specification of dates. I agree that 0-padding is not a correct assumption, but the point of this ticket was to make the above display function correctly: when no capture exists for the requested datetime (in this case, an ill-specified one), it should list the URI-Ms for the URI-R instead of showing a blank, unhelpful page. The latter problem you mentioned might also be an issue, though I believe WARC/1.1 supports datetimes beyond 14 digits, so we may want to interpret a datetime of more than 14 digits as one beyond the conventional granularity.
No, I did not mean more than 14 digits here. I was illustrating a side effect of 0-padding that would turn month 2 into 20. Also, the above illustrated MemGator code accepts datetime in the
Right, I understand this. I want to propose the >14-digit datetime as a third bullet to the two cases you mentioned above. They should all be handled in #301.
A datetime with a granularity finer than one second is good for WARC, but perhaps not very practical from a replay lookup perspective at this time (this may change in the future).
On a related note, a WARC containing other facets of the 1.1 spec would be good for testing.