Playwright Page Data Extractor - Automatically extract structured data from dynamic React/Vue/Angular web pages

Playwright Page Data Extractor by agentskillexchange/skills

1 GitHub Star

Install command

npx skills add https://github.com/agentskillexchange/skills --skill 'Playwright Page Data Extractor'

Automation, Data Processing, Testing

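🇨🇳 Chinese Introduction

Playwright Page Data Extractor

Uses Microsoft Playwright’s Node.js API to navigate dynamic web applications, intercept network requests, and extract structured data from React/Vue/Angular single-page applications using auto-waiting strategies.

Overview

The Playwright Page Data Extractor skill uses Microsoft Playwright to reliably extract data from modern JavaScript-heavy web applications. It handles React, Vue, and Angular single-page applications that render content on the client, using Playwright’s auto-waiting mechanisms to make sure content has fully loaded before extraction.

The skill uses Playwright’s page.evaluate() for DOM traversal, page.route() to intercept network requests and capture API responses, and page.waitForSelector() with configurable timeout strategies. The extraction scripts it generates can handle infinite scroll pagination, modal dialogs, and dynamic content loading implemented with IntersectionObserver patterns.

Advanced capabilities include multi-page crawling that uses BrowserContext for session isolation, screenshot-based visual comparison to detect changes, and HAR file recording for offline analysis. The skill supports proxy configuration and geolocation spoofing for retrieving region-specific content, and it generates TypeScript extraction scripts with strongly typed structures for the extracted data.

Installation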

Any Agent

npx skills add agentskillexchange/skills --skill playwright-page-data-extractor

Claude Code

npx skills add agentskillexchange/skills --skill playwright-page-data-extractor -a claude-code

Cursor

npx skills add agentskillexchange/skills --skill playwright-page-data-extractor -a cursor

Codex

npx skills add agentskillexchange/skills --skill playwright-page-data-extractor -a codex

🇺🇸 English

Playwright Page Data Extractor

Uses Microsoft Playwright’s Node.js API to navigate dynamic web applications, intercept network requests, and extract structured data from React/Vue/Angular SPAs with automatic wait strategies.

Overview

The Playwright Page Data Extractor skill uses Microsoft Playwright to extract data from modern JavaScript-heavy web applications. It targets React, Vue, and Angular single-page applications that render content client-side, relying on Playwright’s auto-waiting mechanisms so that the target content has rendered before extraction.
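The wait-then-extract pattern described above might look roughly like the following sketch. It is not the skill's actual generated code. The `LinkRecord` shape, the `extractLinks` name, and the `ExtractorPage` interface are assumptions made for illustration; `ExtractorPage` only mirrors the two `Page` methods the sketch calls, so that the example has no install-time dependency.

```typescript
// Shape of one extracted record; the fields are illustrative assumptions.
interface LinkRecord {
  text: string;
  href: string | null;
}

// Subset of Playwright's Page API used below. With Playwright installed,
// one would import `Page` from 'playwright' and use that type instead.
interface ExtractorPage {
  waitForSelector(selector: string, options?: { timeout?: number }): Promise<unknown>;
  evaluate<R>(pageFunction: (arg: string) => R, arg: string): Promise<R>;
}

async function extractLinks(
  page: ExtractorPage,
  selector: string,
  timeoutMs = 10000,
): Promise<LinkRecord[]> {
  // Block until a matching element is present (Playwright's default state
  // for waitForSelector is "visible"), or reject once the timeout elapses.
  await page.waitForSelector(selector, { timeout: timeoutMs });

  // This callback is serialized and executed inside the browser page,
  // so it may only use the argument passed in and browser globals.
  return page.evaluate((sel) => {
    return Array.from(document.querySelectorAll(sel)).map((el) => ({
      text: (el.textContent ?? '').trim(),
      href: el.getAttribute('href'),
    }));
  }, selector);
}
```

Returning plain objects from the page callback keeps the result serializable, which is what allows it to cross from the browser back into Node.js with a TypeScript type attached.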

The skill uses Playwright’s page.evaluate() for DOM traversal, page.route() for network request interception and API response capture, and page.waitForSelector() with configurable timeout strategies. It generates extraction scripts that handle infinite scroll pagination, modal dialogs, and dynamic content loading via IntersectionObserver patterns.
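As a rough illustration of the page.route() interception mentioned above, the sketch below lets matching requests through unchanged while keeping a copy of each JSON body. The function name `captureJsonResponses`, the example URL pattern, and the three `...Like` interfaces are assumptions for illustration; the interfaces mirror only the `Page.route`, `Route.fetch`, `Route.fulfill`, and `APIResponse.json` calls used here.

```typescript
// Subsets of Playwright's APIResponse, Route and Page APIs used below.
// With Playwright installed, the real types from 'playwright' would be used.
interface ApiResponseLike {
  json(): Promise<unknown>;
}
interface RouteLike {
  fetch(): Promise<ApiResponseLike>;
  fulfill(options: { response: ApiResponseLike }): Promise<void>;
}
interface RoutablePage {
  route(urlPattern: string, handler: (route: RouteLike) => Promise<void>): Promise<void>;
}

// Registers a route handler and returns the array that will fill up with
// parsed JSON bodies as matching requests complete.
async function captureJsonResponses(
  page: RoutablePage,
  urlPattern: string, // e.g. '**/api/items*' (hypothetical endpoint)
): Promise<unknown[]> {
  const captured: unknown[] = [];
  await page.route(urlPattern, async (route) => {
    // Perform the real network request on the page's behalf.
    const response = await route.fetch();
    try {
      // Record the parsed body for later use.
      captured.push(await response.json());
    } catch {
      // Non-JSON body: nothing is recorded, but the page still gets it.
    }
    // Hand the original, unmodified response back to the page.
    await route.fulfill({ response });
  });
  return captured;
}
```

Reading the data from the API response rather than the rendered DOM tends to be less sensitive to markup changes, at the cost of depending on the endpoint's URL and payload shape.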

Advanced capabilities include multi-page crawling with BrowserContext for session isolation, screenshot-based visual comparison for change detection, and HAR file recording for offline analysis. The skill supports proxy configuration, geolocation spoofing for region-specific content, and generates TypeScript extraction scripts with strong typing for extracted data structures.
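The proxy, geolocation, and HAR features listed above all map onto options passed when a BrowserContext is created. The sketch below shows one way such options could be assembled. `CrawlSettings`, `buildContextOptions`, and the example values are hypothetical; `ContextOptionsSubset` mirrors only the `proxy`, `geolocation`, `permissions`, and `recordHar` fields of Playwright's context options.

```typescript
// Hypothetical per-crawl settings; all field names are illustrative.
interface CrawlSettings {
  proxyServer?: string; // e.g. 'http://proxy.example.com:3128'
  geolocation?: { latitude: number; longitude: number };
  harPath?: string; // where the HAR file should be written
}

// Mirrors the subset of Playwright's browser context options used here.
interface ContextOptionsSubset {
  proxy?: { server: string };
  geolocation?: { latitude: number; longitude: number };
  permissions?: string[];
  recordHar?: { path: string };
}

function buildContextOptions(settings: CrawlSettings): ContextOptionsSubset {
  const options: ContextOptionsSubset = {};
  if (settings.proxyServer) {
    options.proxy = { server: settings.proxyServer };
  }
  if (settings.geolocation) {
    options.geolocation = settings.geolocation;
    // Pages can only read the emulated position if the permission is granted.
    options.permissions = ['geolocation'];
  }
  if (settings.harPath) {
    // Playwright writes the HAR file when the context is closed.
    options.recordHar = { path: settings.harPath };
  }
  return options;
}

// Intended use with Playwright (not executed here):
//   const context = await browser.newContext(buildContextOptions(settings));
//   const page = await context.newPage();
//   ...crawl...
//   await context.close(); // flushes the HAR to disk
```

Creating one context per crawl target keeps cookies and storage separate between sessions, which is the isolation the paragraph above refers to.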

Installation

Any Agent

npx skills add agentskillexchange/skills --skill playwright-page-data-extractor

Claude Code

npx skills add agentskillexchange/skills --skill playwright-page-data-extractor -a claude-code

Cursor

npx skills add agentskillexchange/skills --skill playwright-page-data-extractor -a cursor

Codex

npx skills add agentskillexchange/skills --skill playwright-page-data-extractor -a codex

OpenClaw

clawhub install playwright-page-data-extractor

Source

Marketplace: https://agentskillexchange.com/skills/playwright-page-data-extractor/

Repository

agentskillexcha…e/skills

Security Audits

Gen Agent Trust Hub: Pass, Socket: Warn, Snyk: Warn
