[Feature]: Add captureTDH for Automated TDH Structure Validation to Enhance SEO #34325

koji-koji · 2025-01-15T00:13:07Z

🚀 Feature Request

Feature Name: `captureTDH` Function

The proposed captureTDH function captures the structure of Title, Description, and Heading (hereafter referred to as TDH) essential for SEO. It enables automatic verification through a CI pipeline to ensure that these structures are not inadvertently disrupted. This feature effectively detects and prevents issues related to the hierarchical structure of Heading tags, which are common when using component-based frameworks (e.g., React).

Role of the captureTDH Function
- Extracts the Title, Description, and Heading tags from a page and retrieves their structure as an object.
- Compares the obtained TDH structure with a predefined baseline file (e.g., target-page-name-TDH.ts). If there is a mismatch, the test fails.

toMatchTDH Matcher
- Similar to a Jest custom matcher, it compares the captured TDH structure with the structure defined in the baseline file.
- The comparison results are reflected in the automated test outcomes within the CI pipeline.

Example

Test File Example

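import { test, expect } from '@playwright/test';

// captureTDH and toMatchTDH are the APIs proposed in this issue; they do not exist in Playwright today.
test('TDH structure is correct', async ({ page }) => {
  await page.goto('https://example.com');
  const tdhStructure = await page.captureTDH({ pageName: 'target-page-name-TDH' });
  expect(tdhStructure).toMatchTDH('target-page-name-TDH.ts');
});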

Verification File Example

export const tdh = {
  title: 'page title',
  description: 'page description',
  heading: [{
    headingLevel: 1,
    headingText: 'heading 1 text',
    children: [
      {
        headingLevel: 2,
        headingText: 'heading 2 text',
        children: []
      }
    ]
  }]
}
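
To make the nested baseline shape above concrete, here is a minimal sketch of how a flat, document-order list of headings could be folded into that tree. The names `FlatHeading`, `HeadingNode`, and `buildHeadingTree` are hypothetical and only illustrate the idea; they are not part of Playwright or of any existing implementation of this proposal.

```typescript
// Hypothetical input: one entry per heading, in document order.
interface FlatHeading {
  level: number; // 1 for <h1> ... 6 for <h6>
  text: string;
}

// Hypothetical output: same shape as the `heading` entries in the baseline file.
interface HeadingNode {
  headingLevel: number;
  headingText: string;
  children: HeadingNode[];
}

// Nest headings by level, keeping a stack of the currently open ancestors.
function buildHeadingTree(flat: FlatHeading[]): HeadingNode[] {
  const roots: HeadingNode[] = [];
  const stack: HeadingNode[] = [];
  for (const { level, text } of flat) {
    const node: HeadingNode = { headingLevel: level, headingText: text, children: [] };
    // Close every open heading that is at the same level or deeper.
    while (stack.length > 0 && stack[stack.length - 1].headingLevel >= level) {
      stack.pop();
    }
    if (stack.length === 0) {
      roots.push(node); // no open ancestor: top-level heading
    } else {
      stack[stack.length - 1].children.push(node);
    }
    stack.push(node);
  }
  return roots;
}
```

In a real test the flat list would have to be collected from the page first, for example with `page.locator('h1, h2, h3, h4, h5, h6').evaluateAll(...)`; that step is left out here so the sketch stays independent of a browser.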

Motivation

Importance of SEO
- Title and Description are fundamental elements of SEO. Setting them properly directly impacts search rankings and click-through rates.
- The structure of Heading tags is also crucial for SEO. Maintaining the correct hierarchical structure makes it easier for search engines to accurately understand the content.

Challenges with Component-Based Frameworks
- In component-based frameworks like React, even if individual components function correctly, the overall hierarchical structure of Heading tags on a page can unintentionally become disrupted.
- Manually reviewing this issue is time-consuming and labor-intensive, creating a need for an automated validation method.

Leveraging Playwright’s Strengths for Automation
- Playwright excels in E2E testing, allowing efficient page-level testing.
- Integrating the captureTDH feature into Playwright enables continuous monitoring and maintenance of SEO and accessibility quality.


yury-s · 2025-01-15T01:07:59Z

Would toMatchAriaSnapshot work for your scenario?

koji-koji · 2025-01-16T00:52:34Z

@yury-s
Thank you for your comment!

In my scenario, toMatchAriaSnapshot also works!
I find it to be a very user-friendly tool for checking accessibility.
You can also perform checks on the levels of Heading tags, which is great.

However, I have not yet fully verified whether it can check detailed nesting.
I would like to better understand this by actually running it.
For SEO optimization, I aim to test the nesting structure of H1 - H6 tags.

There is another reason for proposing captureTDH.
I believe that by implementing captureTDH, you can retrieve and test the details of a page's H1 - H6 tags without needing to know them in advance.
For example, if you want to test 20 pages, writing tests for each page's H1 - H6 structure could be time-consuming.
On the other hand, I believe that with captureTDH, you can simply specify the page name and run the tests, making it much easier for many people to use.

yury-s · 2025-01-16T00:59:49Z

> However, I have not yet fully verified whether it can check detailed nesting.

Feel free to file a separate bug report or feature request if something is not working in toMatchAriaSnapshot.

> I believe that by implementing captureTDH, you can retrieve and test the details of a page's H1 - H6 tags without needing to know them in advance.
> For example, if you want to test 20 pages, writing tests for each page's H1 - H6 structure could be time-consuming.
> On the other hand, I believe that with captureTDH, you can simply specify the page name and run the tests, making it much easier for many people to use.

What would the test verify if you don't know expected text values ahead of time? If you just want to match the structure of the document ignoring actual text, you can write toMatchAriaSnapshot expectation with .* regex patterns.
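
As a rough illustration of the regex suggestion above, the sketch below builds an aria snapshot template that pins heading levels while accepting any heading text. The helper name `headingLevelTemplate` is hypothetical, and the `- heading /.*/ [level=N]` line format is an assumption based on the documented aria snapshot syntax; the exact matching rules (for example, whether extra nodes between headings are tolerated) should be checked against the Playwright documentation.

```typescript
// Hypothetical helper: one template line per expected heading, in document order.
// Each line matches a heading of the given level with any accessible name.
function headingLevelTemplate(levels: number[]): string {
  return levels.map((level) => `- heading /.*/ [level=${level}]`).join('\n');
}

// Intended use inside a Playwright test (not executed in this sketch):
//   await expect(page.locator('body')).toMatchAriaSnapshot(headingLevelTemplate([1, 2, 2]));
```

This keeps the per-page expectation down to a list of levels, which is close to the "structure only" check described in the comment.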
