Throw exception when WARC date not parseable per @ibnesayeed and #283
machawk1 committed Jun 19, 2020
1 parent 182d9ef commit 98c1cb4
Showing 1 changed file with 16 additions and 0 deletions.
16 changes: 16 additions & 0 deletions ipwb/util.py
@@ -6,6 +6,7 @@
import os
import sys
import requests
import dataclasses
import ipfshttpclient4ipwb as ipfsapi

import re
@@ -159,13 +160,25 @@ def rfc1123ToDigits14(rfc1123DateString):
    return d.strftime('%Y%m%d%H%M%S')


@dataclasses.dataclass(frozen=True)
class InvalidWARCDateException(Exception):
    target_string: str

    def __str__(self):
        return 'WARC-Date {self.target_string} not parseable.'.format(
            self=self,
        )


def iso8601ToDigits14(warcDatetimeString):
    setLocale()

    iso8601_datestrings = ["%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%dT%H:%MZ",
                           '%Y-%m-%dT%HZ', '%Y-%m-%d', '%Y-%m', '%Y',
                           '%Y-%m-%dT%H:%M:%S.%fZ']

    d = None

    for format in iso8601_datestrings:
        try:
            d = datetime.datetime.strptime(warcDatetimeString, format)
@@ -178,6 +191,9 @@ def iso8601ToDigits14(warcDatetimeString):

    # TODO: Account for conversion if TZ other than GMT not specified

    if d is None:
        raise InvalidWARCDateException(target_string=warcDatetimeString)

    return d.strftime('%Y%m%d%H%M%S')



3 comments on commit 98c1cb4

@machawk1
Member Author

@ibnesayeed Here is an edit to #283 in the spirit of your suggestion to use an indicative exception. This could probably also be done by not initializing d = None and then having a bare line with d in place of the conditional, but I think the explicit d is None check is clearer. Comments welcome.
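
For comparison, here is a minimal sketch of the two approaches mentioned above. It is not the code from ipwb/util.py: the format list is shortened, the exception is a simplified stand-in for the dataclass in the diff, the function names `parse_without_init` and `parse_with_check` are hypothetical, and the `break`/`pass` handling inside the loop is an assumption because that part of the function is collapsed in the diff. Without the initialization, a string that matches no format leaves d unbound, so the failure surfaces as an UnboundLocalError rather than an error that names the offending WARC-Date.

```python
import datetime

# Shortened format list, for illustration only.
FORMATS = ["%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d"]


class InvalidWARCDateException(Exception):
    """Simplified stand-in for the dataclass-based exception in the diff."""

    def __init__(self, target_string):
        super().__init__(target_string)
        self.target_string = target_string

    def __str__(self):
        return 'WARC-Date {} not parseable.'.format(self.target_string)


def parse_without_init(date_string):
    # Hypothetical variant: d is never bound before the loop.
    for fmt in FORMATS:
        try:
            d = datetime.datetime.strptime(date_string, fmt)
            break  # assumed loop handling, not shown in the diff
        except ValueError:
            pass
    # If no format matched, d is unbound and this line raises
    # UnboundLocalError instead of a descriptive exception.
    return d.strftime('%Y%m%d%H%M%S')


def parse_with_check(date_string):
    # Variant mirroring the commit: initialize d, then check explicitly.
    d = None
    for fmt in FORMATS:
        try:
            d = datetime.datetime.strptime(date_string, fmt)
            break  # assumed loop handling, not shown in the diff
        except ValueError:
            pass
    if d is None:
        raise InvalidWARCDateException(date_string)
    return d.strftime('%Y%m%d%H%M%S')
```

Calling `parse_with_check('not-a-date')` raises the descriptive exception, while `parse_without_init('not-a-date')` fails with an UnboundLocalError that says nothing about the input.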

@ibnesayeed
Member

Looks good, though we can leverage f-strings now because we have moved to Python 3.7+.
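
A sketch of what the f-string form of the exception could look like. This is an illustration of the suggestion only and is not taken from any later commit, whose exact contents are not shown here.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class InvalidWARCDateException(Exception):
    target_string: str

    def __str__(self):
        # Same message as the str.format() version, written as an f-string
        return f'WARC-Date {self.target_string} not parseable.'
```

The message is unchanged; the f-string just removes the `.format(self=self)` call.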

@machawk1
Member Author

@machawk1 machawk1 commented on 98c1cb4 Jun 19, 2020

Good catch, @ibnesayeed. I converted to an f-string in 8f65140. I would still like to figure out the sub-second case before closing #283 (see #283 (comment)).
