-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update HSTS check #203
base: develop
Are you sure you want to change the base?
Update HSTS check #203
Conversation
…ons. Removed regex based string manipulation in favor of simple split calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a few questions, and there are a few lines that I think merit comments.
…a comment to explain the hsts_max_age check rationale.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still have an open question regarding how we handle multiple headers, and I added a further question related to the max-age value evaluation.
pshtt/pshtt.py
Outdated
# host, and therefore only strictly positive values indicate | ||
# the presence of HSTS protection. See | ||
# https://tools.ietf.org/html/rfc6797#section-6.1.1 for more | ||
# details. | ||
if endpoint.hsts_max_age is None or endpoint.hsts_max_age <= 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is an argument here for letting endpoint.hsts
indicate only the presence of a syntactically correct HSTS header. That value can be combined with endpoint.hsts_max_age
and endpoint.hsts_all_subdomains
to determine if the domain "supports HSTS." To that end, I think it makes sense to change endpoint.hsts_max_age <= 0
to endpoint.hsts_max_age < 0
since a max-age value of zero is valid according to the RFC.
What do you think, @mcdonnnj?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on changing <=
to <
and only checking for syntactical correctness here. That makes sense for what this part of the code is trying to do (which is just establish that we're getting a valid header and what it contains from the response).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsf9k @mcdonnnj I understand wanting to impart valid HSTS syntax, but rather than saying HSTS=True when the max-age is set to zero, could we instead Fail "HSTS" while still reporting the HSTS header in "HSTS Header" since technically the a value of zero means "forget me as an HSTS host/I don't do HSTS"? Alternatively, we could relabel the "HSTS" column as "Valid HSTS" if you'd prefer that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@climber-girl A valid of zero is valid HSTS, according to the RFC.
Don't forget that pshtt is not the end of the line when it comes to the BOD reporting. The pshtt results are interpreted via cisagov/pshtt_reporter, and that is where the actual report is created. In this case, they would still fail "Supports HSTS" in the report, since max-age is too small.
In the past we have tried to make pshtt and trustymail simply collect information that is then interpreted by the reporting code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsf9k I understand that it is valid per the RFC, hence the secondary alternative I proposed to relabel the current HSTS column that is reported (which would I guess be included in the pshtt_reporter?). For the purposes of collecting info with pshtt, I'm fine with the change you and @mcdonnnj discussed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already thought that the HSTS column was referring to whether it was valid since it can already be False
and still have data in the HSTS Header column. I don't think the column name needs to be changed since that is already how it is used and there may be downstream effects from changing the column name. Generally, though, I like the trustymail
way of having one column for existence and then another column for validity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@climber-girl, if the value is zero and the HSTS
column is True
, then the Domain Uses Strong HSTS
column will still be False
. Therefore the domain will "fail HSTS" in the HTTPS report. The HSTS
column in the raw pshtt results only says whether a valid HSTS header is being served, but the Domain Uses Strong HSTS
includes the max-age check.
…or negative and none. Added links to relevant sections explaining multiple HSTS headers.
Now that we're only checking strictly for compliance to the RFC, the wording needs to be reworked a little.
This PR would update the
hsts_check()
function's method of processing the HSTS header. Currently we do not closely conform to RFCs 7230 and 6797 with regard to valid header formats. This is specifically related to domains returning multipleStrict-Transport-Security
headers. This has allowed some domains to pass checks they might not otherwise because they are responding with an invalid HSTS header format.This change would result in domains that previously passed some checks now failing them. However, if it is a result of them failing to conform to the RFC specification then I believe that is appropriate. This is still a lenient change because we use the first header instead of immediately failing if multiple headers are seen.
This is in part a result of the requests library assuming that a response with multiple header fields with the same field name conform to 7230-3.2.2. It concatenates them with a comma so in a server incorrectly outputting multiple
Strict-Transport-Security
headers it combines them following the guidelines of 7230-3.2.2.Example HTTP response (just
Strict-Transport-Security
fields from curl output):becomes
"max-age=31536000, max-age=31536000; includeSubDomains; preload, max-age=31536000"
The same domain run against the Qualys SSL Labs SSL Server Test:
Per RFC 7230 section 3.2.2:
Per RFC 6797 section 6.1: