[Feature]: Add captureTDH for Automated TDH Structure Validation to Enhance SEO #34325

Open
koji-koji opened this issue Jan 15, 2025 · 3 comments

@koji-koji

🚀 Feature Request

Feature Name: captureTDH Function

The proposed captureTDH function captures the structure of Title, Description, and Heading (hereafter referred to as TDH) essential for SEO. It enables automatic verification through a CI pipeline to ensure that these structures are not inadvertently disrupted. This feature effectively detects and prevents issues related to the hierarchical structure of Heading tags, which are common when using component-based frameworks (e.g., React).

  • Role of the captureTDH Function
    • Extracts the Title, Description, and Heading tags from a page and retrieves their structure as an object.
    • Compares the obtained TDH structure with a predefined baseline file (e.g., target-page-name-TDH.ts). If there is a mismatch, the test fails.
  • toMatchTDH Matcher
    • Similar to a Jest custom matcher, it compares the captured TDH structure with the structure defined in the baseline file.
    • The comparison results are reflected in the automated test outcomes within the CI pipeline.
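The extraction step described above could be sketched roughly as follows. This is only an illustration, not an existing Playwright API: `FlatHeading` and `buildHeadingTree` are hypothetical names, and the input is assumed to be the page's headings in document order (for example, collected from an `h1, h2, h3, h4, h5, h6` locator). The output field names follow the verification file example in this issue.

```typescript
// Field names follow the verification file example in this issue.
interface HeadingNode {
  headingLevel: number;
  headingText: string;
  children: HeadingNode[];
}

// Hypothetical input shape: headings listed in document order.
interface FlatHeading {
  level: number;
  text: string;
}

// Nest a document-ordered heading list: each heading becomes a child of the
// nearest preceding heading that has a smaller level.
function buildHeadingTree(flat: FlatHeading[]): HeadingNode[] {
  const roots: HeadingNode[] = [];
  const stack: HeadingNode[] = []; // chain of currently open ancestors

  for (const { level, text } of flat) {
    const node: HeadingNode = { headingLevel: level, headingText: text, children: [] };
    // Close every open heading at the same or a deeper level.
    while (stack.length > 0 && stack[stack.length - 1].headingLevel >= level) {
      stack.pop();
    }
    if (stack.length === 0) {
      roots.push(node);
    } else {
      stack[stack.length - 1].children.push(node);
    }
    stack.push(node);
  }
  return roots;
}

const exampleTree = buildHeadingTree([
  { level: 1, text: 'heading 1 text' },
  { level: 2, text: 'heading 2 text' },
]);
console.log(JSON.stringify(exampleTree, null, 2));
```

A stack keeps the pass linear in the number of headings. A heading that skips a level (an h4 directly under an h2) is still attached to the nearest shallower heading, so detecting such skips would need a separate check.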

Example

Test File Example

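import { test, expect } from '@playwright/test';

// captureTDH and toMatchTDH are the API proposed in this issue; they are not part of Playwright today.
test('TDH structure is correct', async ({ page }) => {
  await page.goto('https://example.com');
  const tdhStructure = await page.captureTDH({ pageName: 'target-page-name-TDH' });
  expect(tdhStructure).toMatchTDH('target-page-name-TDH.ts');
});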

Verification File Example

export const tdh = {
  title: 'page title',
  description: 'page description',
  heading: [{
    headingLevel: 1,
    headingText: 'heading 1 text',
    children: [
      {
        headingLevel: 2,
        headingText: 'heading 2 text',
        children: []
      }
    ]
  }]
}
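To make the comparison against a baseline of this shape concrete, here is a minimal sketch of what a matcher like `toMatchTDH` might do internally. `diffTDH` and `diffHeadings` are hypothetical helper names, and the strict equality rule on every field is an assumption; the issue does not specify how mismatches would be reported.

```typescript
interface HeadingNode {
  headingLevel: number;
  headingText: string;
  children: HeadingNode[];
}

interface TDH {
  title: string;
  description: string;
  heading: HeadingNode[];
}

// Recursively compare two heading lists, recording where they diverge.
function diffHeadings(
  expected: HeadingNode[],
  actual: HeadingNode[],
  path: string,
  problems: string[],
): void {
  if (expected.length !== actual.length) {
    problems.push(`${path}: expected ${expected.length} heading(s), got ${actual.length}`);
  }
  const shared = Math.min(expected.length, actual.length);
  for (let i = 0; i < shared; i++) {
    const e = expected[i];
    const a = actual[i];
    const here = `${path}[${i}]`;
    if (e.headingLevel !== a.headingLevel) {
      problems.push(`${here}: expected level ${e.headingLevel}, got ${a.headingLevel}`);
    }
    if (e.headingText !== a.headingText) {
      problems.push(`${here}: expected text "${e.headingText}", got "${a.headingText}"`);
    }
    diffHeadings(e.children, a.children, `${here}.children`, problems);
  }
}

// Collect human-readable mismatches; an empty array means the structures match.
function diffTDH(expected: TDH, actual: TDH): string[] {
  const problems: string[] = [];
  if (expected.title !== actual.title) {
    problems.push(`title: expected "${expected.title}", got "${actual.title}"`);
  }
  if (expected.description !== actual.description) {
    problems.push(`description: expected "${expected.description}", got "${actual.description}"`);
  }
  diffHeadings(expected.heading, actual.heading, 'heading', problems);
  return problems;
}
```

Returning a list of mismatches with their paths, instead of a single boolean, would let a CI failure point at the exact heading that changed.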

Motivation

  • Importance of SEO
    • Title and Description are fundamental elements of SEO. Setting them properly directly impacts search rankings and click-through rates.
    • The structure of Heading tags is also crucial for SEO. Maintaining the correct hierarchical structure makes it easier for search engines to accurately understand the content.
  • Challenges with Component-Based Frameworks
    • In component-based frameworks like React, even if individual components function correctly, the overall hierarchical structure of Heading tags on a page can unintentionally become disrupted.
    • Manually reviewing this issue is time-consuming and labor-intensive, creating a need for an automated validation method.
  • Leveraging Playwright’s Strengths for Automation
    • Playwright excels at E2E testing, allowing efficient page-level testing.
    • Integrating the captureTDH feature into Playwright enables continuous monitoring and maintenance of SEO and accessibility quality.
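As an illustration of the kind of disruption described under "Challenges with Component-Based Frameworks", the sketch below flags headings that jump more than one level deeper than the heading before them. `findLevelJumps` is a hypothetical helper, and treating a page that does not start at h1 as a jump is a convention assumed here, not something the issue prescribes.

```typescript
// Hypothetical input shape: headings listed in document order.
interface FlatHeading {
  level: number;
  text: string;
}

// Report every heading that is more than one level deeper than its predecessor.
function findLevelJumps(headings: FlatHeading[]): string[] {
  const warnings: string[] = [];
  let previousLevel = 0; // 0 means no heading has been seen yet

  for (const { level, text } of headings) {
    if (level > previousLevel + 1) {
      const before = previousLevel === 0 ? 'the start of the page' : `h${previousLevel}`;
      warnings.push(`h${level} "${text}" follows ${before}`);
    }
    previousLevel = level;
  }
  return warnings;
}

console.log(findLevelJumps([
  { level: 1, text: 'Products' },
  { level: 2, text: 'Featured' },
  { level: 4, text: 'Card title' },
]));
```

Moving back up the hierarchy (an h2 after an h4) is not flagged, since closing a section is normal.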
@yury-s
Member

yury-s commented Jan 15, 2025

Would toMatchAriaSnapshot work for your scenario?

@koji-koji
Author

@yury-s
Thank you for your comment!

In my scenario, toMatchAriaSnapshot also works!
I find it to be a very user-friendly tool for checking accessibility.
You can also perform checks on the levels of Heading tags, which is great.

However, I have not yet fully verified whether it can check detailed nesting.
I would like to better understand this by actually running it.
For SEO optimization, I aim to test the nesting structure of H1 - H6 tags.

There is another reason for proposing captureTDH.
I believe that by implementing captureTDH, you can retrieve and test the details of a page's H1 - H6 tags without needing to know them in advance.
For example, if you want to test 20 pages, writing tests for each page's H1 - H6 structure could be time-consuming.
On the other hand, I believe that with captureTDH, you can simply specify the page name and run the tests, making it much easier for many people to use.

@yury-s
Member

yury-s commented Jan 16, 2025

> However, I have not yet fully verified whether it can check detailed nesting.

Feel free to file a separate bug report or feature request if something is not working in toMatchAriaSnapshot.

> I believe that by implementing captureTDH, you can retrieve and test the details of a page's H1 - H6 tags without needing to know them in advance.
> For example, if you want to test 20 pages, writing tests for each page's H1 - H6 structure could be time-consuming.
> On the other hand, I believe that with captureTDH, you can simply specify the page name and run the tests, making it much easier for many people to use.

What would the test verify if you don't know the expected text values ahead of time? If you just want to match the structure of the document while ignoring the actual text, you can write a toMatchAriaSnapshot expectation with .* regex patterns.
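A structure-only expectation of that kind might look like the output of the sketch below. `toAriaHeadingTemplate` is a hypothetical helper, not part of Playwright; it assumes the aria snapshot template accepts a `/regex/` name followed by a `[level=N]` attribute on a heading line, and that heading text can be quoted with JSON-style escaping.

```typescript
// Hypothetical input shape: headings listed in document order.
interface FlatHeading {
  level: number;
  text: string;
}

// Build one "- heading ..." template line per heading. With matchText = false
// the heading name becomes the regex /.*/ so only the levels are asserted.
function toAriaHeadingTemplate(headings: FlatHeading[], matchText: boolean): string {
  return headings
    .map(({ level, text }) => {
      const name = matchText ? JSON.stringify(text) : '/.*/';
      return `- heading ${name} [level=${level}]`;
    })
    .join('\n');
}

console.log(toAriaHeadingTemplate([
  { level: 1, text: 'heading 1 text' },
  { level: 2, text: 'heading 2 text' },
], false));
```

The resulting string could then be passed to `expect(page.locator('body')).toMatchAriaSnapshot(template)` inside a test. Note that the accessibility tree lists headings flat with a level attribute, so nesting is expressed through the levels and not through indentation.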
