Auto check deadlinks #1463

Closed
bwbroersma opened this issue Jul 17, 2024 · 5 comments
@bwbroersma
Collaborator

Using https://github.com/lycheeverse/lychee/ and this quick docker run on the FAQ:

$ docker run --rm -ti lycheeverse/lychee https://{nl,en}.internet.nl/faqs/{ipv6,dnssec,https,appsecpriv,mailauth,starttls,rpki,report,badges,halloffame}/
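The braces in the command above are expanded by the shell (bash brace expansion) before lychee sees them, so lychee receives 20 separate FAQ URLs. As a rough illustration, the same list can be built in Python; the names `LANGS`, `TOPICS`, and `faq_urls` are made up for this sketch.

```python
from itertools import product

# Subdomains and FAQ topics copied from the brace expressions in the command.
LANGS = ["nl", "en"]
TOPICS = ["ipv6", "dnssec", "https", "appsecpriv", "mailauth",
          "starttls", "rpki", "report", "badges", "halloffame"]


def faq_urls() -> list[str]:
    """Return the FAQ URLs in the order bash brace expansion yields them."""
    return [
        f"https://{lang}.internet.nl/faqs/{topic}/"
        for lang, topic in product(LANGS, TOPICS)
    ]


if __name__ == "__main__":
    for url in faq_urls():
        print(url)
```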

Some dead links can be found:

  909/909 ━━━━━━━━━━━━━━━━━━━━ Finished extracting links
Issues found in 5 inputs. Find details below.

[https://en.internet.nl/faqs/rpki/]:
↻ [ERR] https://rpki-maps.nlnetlabs.nl/ | Cached: Error (cached)
✗ [404] https://stats.sidnlabs.nl/en/security.html#domain%20names%20protected%20with%20rpki | Failed: Network error: Not Found
⧖ [TIMEOUT] https://rpki-rfc.routingsecurity.net/ | Timeout

[https://nl.internet.nl/faqs/rpki/]:
✗ [404] https://stats.sidnlabs.nl/nl/security.html#domeinnamen%20beveiligd%20met%20rpki | Failed: Network error: Not Found
✗ [ERR] https://rpki-maps.nlnetlabs.nl/ | Failed: Network error: error sending request for url (https://rpki-maps.nlnetlabs.nl/)
⧖ [TIMEOUT] https://rpki-rfc.routingsecurity.net/ | Timeout

[https://en.internet.nl/faqs/dnssec/]:
✗ [404] https://www.cert.org/blogs/certcc/post.cfm?EntryID=206 | Failed: Network error: Not Found

[https://en.internet.nl/faqs/ipv6/]:
⧖ [TIMEOUT] https://www.akamai.com/us/en/about/our-thinking/state-of-the-internet-report/state-of-the-internet-ipv6-adoption-visualization.jsp | Timeout

[https://nl.internet.nl/faqs/dnssec/]:
✗ [404] https://www.cert.org/blogs/certcc/post.cfm?EntryID=206 | Failed: Network error: Not Found

🔍 909 Total (in 1m 28s) ✅ 878 OK 🚫 6 Errors 💤 22 Excluded
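Several of the failing URLs show up on both the nl and en pages. A minimal sketch of how the report above could be folded into one entry per dead link follows; it assumes the text layout exactly as pasted here (an `[input]:` header followed by `symbol [STATUS] url | message` lines), which is not a stable lychee interface, and the function name `unique_failures` is invented for the example.

```python
import re
from collections import defaultdict

# Header line naming the page that was checked, e.g. "[https://en.internet.nl/faqs/rpki/]:"
INPUT_RE = re.compile(r"^\[(https?://[^\]]+)\]:$")
# Result line, e.g. "✗ [404] https://example.org/ | Failed: ..."
RESULT_RE = re.compile(r"^\S+ \[(\w+)\] (\S+) \| (.*)$")


def unique_failures(report: str) -> dict[str, set[str]]:
    """Map each failing URL to the set of pages that link to it."""
    pages_by_url: dict[str, set[str]] = defaultdict(set)
    page = None
    for raw in report.splitlines():
        line = raw.strip()
        if m := INPUT_RE.match(line):
            page = m.group(1)
        elif (m := RESULT_RE.match(line)) and page is not None:
            pages_by_url[m.group(2)].add(page)
    return dict(pages_by_url)
```

Feeding it the report text gives, for instance, a single entry for the cert.org link with both dnssec FAQ pages attached, so each dead link only needs to be looked at once.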
@bwbroersma added this to the v1.9 milestone Jul 17, 2024
@bwbroersma
Collaborator Author

Output:

$ docker run --rm -ti raviqqe/muffet https://internet.nl/ --max-connections-per-host=2 --header "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0" --buffer-size=$((16*1024*1024)) --timeout=120 --max-response-body-size=$((150*1024*1024)) -r=20 --format=text --color=always "--exclude=^https://internet.nl/halloffame/.*$" --ignore-fragments --exclude=https://twitter.com/internet_nl --exclude=https://www.linkedin.com/company/internet-nl/
https://internet.nl/copyright/
	lookup aucheck.com.au on 9.9.9.9:53: no such host	https://aucheck.com.au/
	lookup sikkerpånettet.dk: no such host	https://sikkerp%C3%A5nettet.dk/
https://internet.nl/privacy/
	401	https://batch.internet.nl
	404	https://www.belastingdienst.nl/wps/wcm/connect/bldcontentnl/belastingdienst/zakelijk/ondernemen/administratie/administratie_opzetten/wat_hoort_er_allemaal_bij_uw_administratie
https://internet.nl/article/uitnodiging-masterclass-DMARC/
	404	https://internet.nl/faqs/mail/#DMARC
https://internet.nl/article/internetnl-vernieuwd-hsts-en-afgedwongen-https-tellen-mee/
	parse "/article/nieuwe-versie-internetnl-met-aanvullingen-\nhttps-test/": net/url: invalid control character in URL	/article/nieuwe-versie-internetnl-met-aanvullingen-
https-test/
https://internet.nl/article/internet-draait-om-samenwerking/
	404	https://internet.nl/static/author//olaf-kolkman/picture.jpg
https://internet.nl/article/nieuwe-versie-internetnl-met-aanvullingen-https-test/
	404	https://www.ncsc.nl/actueel/whitepapers/ict-beveiligingsrichtlijnen-voor-transport-layer-security-tls.html
https://internet.nl/article/improved-internetnl-test-for-modern-internet-standards/
	tls: failed to verify certificate: x509: certificate is valid for dcc.digistatecloud.nl, not dhpa.nl	https://dhpa.nl/
https://internet.nl/article/release-1.8/
	lookup aucheck.com.au on 9.9.9.9:53: no such host	https://aucheck.com.au/
	lookup sikkerpånettet.dk: no such host	https://sikkerp%C3%A5nettet.dk/
https://internet.nl/article/rpki-test-toegevoegd/
	404	https://github.com/internetstandards/Internet.nl/blob/main/documentation/Operational%20Changes.md#change-overview-for-version-15=
https://internet.nl/article/securitytxt-test-toegevoegd/
	404	https://forumstandaardisatie.nl/netherlands-standardisation-forum
	404	https://github.com/internetstandards/Internet.nl/blob/main/documentation/Operational%20Changes.md#change-overview-for-version-16
https://internet.nl/faqs/dnssec/
	404	https://www.cert.org/blogs/certcc/post.cfm?EntryID=206
https://internet.nl/faqs/rpki/
	404	https://stats.sidnlabs.nl/en/security.html#domain%20names%20protected%20with%20rpki
	522	https://rpki-rfc.routingsecurity.net/
	lookup rpki-maps.nlnetlabs.nl on 9.9.9.9:53: no such host	https://rpki-maps.nlnetlabs.nl
https://internet.nl/faqs/ipv6/
	timeout	https://www.akamai.com/us/en/about/our-thinking/state-of-the-internet-report/state-of-the-internet-ipv6-adoption-visualization.jsp
$ docker run --rm -ti raviqqe/muffet https://nl.internet.nl/ --max-connections-per-host=2 --header "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0" --buffer-size=$((16*1024*1024)) --timeout=120 --max-response-body-size=$((150*1024*1024)) -r=20 --format=text --color=always "--exclude=^https://nl.internet.nl/halloffame/.*$" --ignore-fragments --exclude=https://twitter.com/internet_nl --exclude=https://www.linkedin.com/company/internet-nl/
https://nl.internet.nl/article/stilstand-is-achteruitgang/
	404	https://dinl.nl/Digital_Infrastructure_-_Driver_for_the_Online_Ecosystem__2014__v_1_1.pdf
	404	https://nl.internet.nl/static/author//marco-davids/picture.jpg
https://nl.internet.nl/mail/*/*/
	404	https://www.ncsc.nl/onderwerpen/verbindingsbeveiliging/documenten/publicaties/2021/januari/19/ict-beveiligingsrichtlijnen-voor-transport-layer-security-2.1
https://nl.internet.nl/copyright/
	lookup aucheck.com.au on 9.9.9.9:53: no such host	https://aucheck.com.au/
	lookup sikkerpånettet.dk: no such host	https://sikkerp%C3%A5nettet.dk/
https://nl.internet.nl/privacy/
	401	https://batch.internet.nl
	404	https://www.belastingdienst.nl/wps/wcm/connect/bldcontentnl/belastingdienst/zakelijk/ondernemen/administratie/administratie_opzetten/wat_hoort_er_allemaal_bij_uw_administratie
https://nl.internet.nl/article/rpki-test-toegevoegd/
	404	https://github.com/internetstandards/Internet.nl/blob/main/documentation/Operational%20Changes.md#change-overview-for-version-15=
https://nl.internet.nl/faqs/dnssec/
	404	https://www.cert.org/blogs/certcc/post.cfm?EntryID=206
https://nl.internet.nl/faqs/appsecpriv/
	lookup learn.microsoft.com: i/o timeout (following redirect https://learn.microsoft.com/archive/blogs/ie/ie8-security-part-vi-beta-2-update)	https://blogs.msdn.microsoft.com/ie/2008/09/02/ie8-security-part-vi-beta-2-update/
https://nl.internet.nl/article/securitytxt-test-toegevoegd/
	404	https://github.com/internetstandards/Internet.nl/blob/main/documentation/Operational%20Changes.md#change-overview-for-version-16
https://nl.internet.nl/article/release-1.8/
	lookup aucheck.com.au on 9.9.9.9:53: no such host	https://aucheck.com.au/
	lookup sikkerpånettet.dk: no such host	https://sikkerp%C3%A5nettet.dk/
https://nl.internet.nl/article/email-test-on-internetnl-extended/
	404	https://www.ncsc.nl/documenten/factsheets/2019/juni/01/factsheet-beveilig-verbindingen-van-mailservers
https://nl.internet.nl/article/paneldiscussie-the-power-of-internet-standards/
	404	https://www.ncsc.nl/onderwerpen/onderzoekssymposium
https://nl.internet.nl/article/improved-internetnl-test-for-modern-internet-standards/
	tls: failed to verify certificate: x509: certificate is valid for dcc.digistatecloud.nl, not dhpa.nl	https://dhpa.nl/
https://nl.internet.nl/article/DMARC-masterclass-authenticatie-afzender-noodzaak-geworden/
	404	https://nl.internet.nl/standards/
https://nl.internet.nl/article/internetnl-vernieuwd-hsts-en-afgedwongen-https-tellen-mee/
	parse "/article/nieuwe-versie-internetnl-met-aanvullingen-https-\ntest/": net/url: invalid control character in URL	/article/nieuwe-versie-internetnl-met-aanvullingen-https-
test/
https://nl.internet.nl/article/nieuwe-versie-internetnl-met-aanvullingen-https-test/
	404	https://www.ncsc.nl/actueel/whitepapers/ict-beveiligingsrichtlijnen-voor-transport-layer-security-tls.html
https://nl.internet.nl/article/internet-draait-om-samenwerking/
	404	https://nl.internet.nl/static/author//olaf-kolkman/picture.jpg
https://nl.internet.nl/article/het-internet-is-van-ons-allemaal/
	404	https://nl.internet.nl/static/author//gerben-klein-baltink/picture.jpg
https://nl.internet.nl/article/ipv6-vergt-de-medewerking-van-iedereen-in-de-wereld/
	404	https://nl.internet.nl/static/author//erik-huizer/picture.jpg
https://nl.internet.nl/faqs/rpki/
	404	https://stats.sidnlabs.nl/nl/security.html#domeinnamen%20beveiligd%20met%20rpki
	522	https://rpki-rfc.routingsecurity.net/
	lookup rpki-maps.nlnetlabs.nl on 9.9.9.9:53: no such host	https://rpki-maps.nlnetlabs.nl
https://nl.internet.nl/faqs/ipv6/
	timeout	https://www.akamai.com/us/en/about/our-thinking/state-of-the-internet-report/state-of-the-internet-ipv6-adoption-visualization.jsp
$ docker run --rm -ti raviqqe/muffet https://en.internet.nl/ --max-connections-per-host=2 --header "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0" --buffer-size=$((16*1024*1024)) --timeout=120 --max-response-body-size=$((150*1024*1024)) -r=20 --format=text --color=always "--exclude=^https://en.internet.nl/halloffame/.*$" --ignore-fragments --exclude=https://twitter.com/internet_nl --exclude=https://www.linkedin.com/company/internet-nl/
https://en.internet.nl/copyright/
	lookup aucheck.com.au on 9.9.9.9:53: no such host	https://aucheck.com.au/
	lookup sikkerpånettet.dk: no such host	https://sikkerp%C3%A5nettet.dk/
https://en.internet.nl/article/rpki-test-toegevoegd/
	404	https://github.com/internetstandards/Internet.nl/blob/main/documentation/Operational%20Changes.md#change-overview-for-version-15=
https://en.internet.nl/article/securitytxt-test-toegevoegd/
	404	https://forumstandaardisatie.nl/netherlands-standardisation-forum
	404	https://github.com/internetstandards/Internet.nl/blob/main/documentation/Operational%20Changes.md#change-overview-for-version-16
https://en.internet.nl/article/nieuwe-versie-internetnl-met-aanvullingen-https-test/
	404	https://www.ncsc.nl/actueel/whitepapers/ict-beveiligingsrichtlijnen-voor-transport-layer-security-tls.html
	no free connections available to host	https://en.Internet.nl
https://en.internet.nl/article/uitnodiging-masterclass-DMARC/
	404	https://en.internet.nl/faqs/mail/#DMARC
https://en.internet.nl/article/internetnl-vernieuwd-hsts-en-afgedwongen-https-tellen-mee/
	parse "/article/nieuwe-versie-internetnl-met-aanvullingen-\nhttps-test/": net/url: invalid control character in URL	/article/nieuwe-versie-internetnl-met-aanvullingen-
https-test/
https://en.internet.nl/article/improved-internetnl-test-for-modern-internet-standards/
	tls: failed to verify certificate: x509: certificate is valid for dcc.digistatecloud.nl, not dhpa.nl	https://dhpa.nl/
https://en.internet.nl/article/internet-draait-om-samenwerking/
	404	https://en.internet.nl/static/author//olaf-kolkman/picture.jpg
https://en.internet.nl/faqs/starttls/
	lookup words.filippo.io: i/o timeout (following redirect https://words.filippo.io/the-sad-state-of-smtp-encryption/)	https://blog.filippo.io/the-sad-state-of-smtp-encryption/
https://en.internet.nl/privacy/
	401	https://batch.internet.nl
	404	https://www.belastingdienst.nl/wps/wcm/connect/bldcontentnl/belastingdienst/zakelijk/ondernemen/administratie/administratie_opzetten/wat_hoort_er_allemaal_bij_uw_administratie
	lookup www.team-cymru.com: i/o timeout (following redirect https://www.team-cymru.com/community-services/ip-asn-mapping)	https://team-cymru.com/community-services/ip-asn-mapping/
https://en.internet.nl/article/release-1.8/
	lookup aucheck.com.au on 9.9.9.9:53: no such host	https://aucheck.com.au/
	lookup sikkerpånettet.dk: no such host	https://sikkerp%C3%A5nettet.dk/
https://en.internet.nl/faqs/dnssec/
	404	https://www.cert.org/blogs/certcc/post.cfm?EntryID=206
https://en.internet.nl/faqs/rpki/
	404	https://stats.sidnlabs.nl/en/security.html#domain%20names%20protected%20with%20rpki
	522	https://rpki-rfc.routingsecurity.net/
	lookup rpki-maps.nlnetlabs.nl: i/o timeout	https://rpki-maps.nlnetlabs.nl
	lookup rpki.readthedocs.io: i/o timeout	https://rpki.readthedocs.io
	lookup rpki.readthedocs.io: i/o timeout	https://rpki.readthedocs.io/en/latest/ops/resources.html#examples-of-bgp-hijacks
https://en.internet.nl/faqs/ipv6/
	lookup ipv6now.com.au: i/o timeout	https://ipv6now.com.au/primers/IPv6Reasons.php
	timeout	https://www.akamai.com/us/en/about/our-thinking/state-of-the-internet-report/state-of-the-internet-ipv6-adoption-visualization.jsp
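The three runs overlap heavily, and the failures fall into a few kinds (HTTP status, DNS, TLS, timeouts). The sketch below is one possible way to count unique broken URLs per kind. It assumes the output is laid out as pasted above, with each failure on a tab-indented line of the form `error<TAB>url`; the function names and category labels are invented here and are not part of muffet.

```python
from collections import Counter


def classify(error: str) -> str:
    """Put a muffet error string into a coarse, made-up category."""
    if error.isdigit():
        return f"http {error}"
    if "no such host" in error:
        return "dns: no such host"
    if "timeout" in error:
        return "timeout"
    if error.startswith("tls:"):
        return "tls"
    return "other"


def summarize(output: str) -> Counter:
    """Count unique failing URLs per category across one or more runs."""
    seen: set[str] = set()
    counts: Counter = Counter()
    for line in output.splitlines():
        if not line.startswith("\t"):
            continue  # page URLs, shell prompts, wrapped URL fragments
        fields = line.strip().split("\t")
        if len(fields) != 2:
            continue
        error, url = fields
        if url not in seen:
            seen.add(url)
            counts[classify(error)] += 1
    return counts
```

Because rpki-maps.nlnetlabs.nl is reported as "no such host" in one run and as an i/o timeout in another, entries in the timeout bucket may well be transient and worth rechecking before editing content.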

For better support:

  • Change sikkerpånettet.dk to xn--sikkerpnettet-vfb.dk in href.
  • Remove newline in "/article/nieuwe-versie-internetnl-met-aanvullingen-\nhttps-test/" link.
  • Update the links that redirect so they point directly at the redirect target.
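For the newline case, a small check over the page HTML could catch such links before a link checker trips over them. This is only a sketch using Python's standard `html.parser`; the class and function names are invented for the example, and how the site's content is actually stored or rendered is not assumed.

```python
from html.parser import HTMLParser


class HrefChecker(HTMLParser):
    """Collect <a href> values that contain ASCII control characters."""

    def __init__(self) -> None:
        super().__init__()
        self.bad_hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and any(
                ord(ch) < 0x20 or ord(ch) == 0x7F for ch in value
            ):
                self.bad_hrefs.append(value)


def find_hrefs_with_control_chars(html: str) -> list[str]:
    """Return href values that include a newline, tab, or other control char."""
    checker = HrefChecker()
    checker.feed(html)
    checker.close()
    return checker.bad_hrefs
```

Run against a page containing the link from the second bullet, it would return that href with the embedded newline, while well-formed links are ignored.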

@baknu
Contributor

baknu commented Nov 12, 2024

https://internet.nl/article/internet-draait-om-samenwerking/
	404	https://internet.nl/static/author//olaf-kolkman/picture.jpg

This does not seem to be fixable via content. @mxsasha Could you fix this one?

@bwbroersma
Collaborator Author

@mxsasha closed this as not planned Nov 12, 2024
@baknu reopened this Nov 12, 2024
@baknu
Contributor

baknu commented Jan 18, 2025

Fixed all links. Two remarks:

  • In some places, pointed the link to a capture in the Internet Archive.
  • Have not changed sikkerpånettet.dk to xn--sikkerpnettet-vfb.dk in href, because browsers really should support the former.

@baknu closed this as completed Jan 18, 2025
@bwbroersma
Collaborator Author

Writing sikkerpånettet.dk as xn--sikkerpnettet-vfb.dk really is more backward compatible, e.g. with this tooling.
In general, my rule is to write HTML as ASCII text with HTML entities, e.g. for HTML reports. I think I still had some use case where non-ASCII text went wrong, but maybe I'm still stuck in 2010.
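For reference, the ASCII form of the hostname can be derived mechanically. The sketch below uses Python's built-in `idna` codec (which implements the older IDNA 2003 rules) and also accepts the percent-encoded form that muffet printed; the helper name `ascii_href` is made up, and URLs with userinfo are deliberately not handled.

```python
from urllib.parse import unquote, urlsplit, urlunsplit


def ascii_href(url: str) -> str:
    """Rewrite a URL so its hostname uses the ASCII (punycode) form."""
    parts = urlsplit(url)
    # The host may arrive as Unicode or percent-encoded UTF-8.
    host = unquote(parts.hostname or "")
    ascii_host = host.encode("idna").decode("ascii")
    # Assumption: no userinfo in the URL; only an optional port is kept.
    netloc = ascii_host if parts.port is None else f"{ascii_host}:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(ascii_href("https://sikkerpånettet.dk/"))
    print(ascii_href("https://sikkerp%C3%A5nettet.dk/"))
```

Both calls print https://xn--sikkerpnettet-vfb.dk/, the form suggested earlier in the thread.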

Labels
Development
