Mirror of https://github.com/sbrow/nix.git, synced 2026-02-27 21:31:45 -05:00
feat: Added crawler template.
templates/crawler/README.md (normal file, 26 lines added)
@@ -0,0 +1,26 @@
# TopMarket Scraper

A web scraper built with Crawlee for JavaScript.

## Setup

1. Install dependencies:
```bash
npm install
```

2. Run the scraper:
```bash
npm start
```

## Configuration

Edit `src/main.js` to:
- Change the `startUrls` array to target your desired websites
- Modify the `requestHandler` to extract the data you need
- Adjust `maxRequestsPerCrawl` to control crawling limits
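
As a rough illustration of how the three settings above fit together, here is a minimal sketch of what `src/main.js` might look like. It is not the template's actual file: the choice of `CheerioCrawler`, the `https://example.com` URL, the extracted `title` field, and the limit of 50 are all assumptions, and the `RUN_CRAWLER` guard is a hypothetical convenience.

```javascript
// Hypothetical src/main.js. The crawler class, URL, and extracted fields
// are assumptions made for illustration, not the template's real contents.
const startUrls = ['https://example.com'];

// Called once per fetched page; `$` is the Cheerio handle for the parsed HTML.
async function requestHandler({ request, $, pushData, enqueueLinks }) {
  const title = $('title').text().trim();
  // Each pushData() call adds one record to the default dataset.
  await pushData({ url: request.loadedUrl ?? request.url, title });
  // Queue the links found on this page so the crawl can continue.
  await enqueueLinks();
}

async function main() {
  // Loaded lazily so the handler above can be exercised without Crawlee.
  const { CheerioCrawler } = await import('crawlee');
  const crawler = new CheerioCrawler({
    requestHandler,
    maxRequestsPerCrawl: 50, // stop after 50 requests in total
  });
  await crawler.run(startUrls);
}

// Start the crawl only when asked to, e.g. `RUN_CRAWLER=1 node src/main.js`.
if (process.env.RUN_CRAWLER) {
  main().catch((err) => {
    console.error(err);
    process.exit(1);
  });
}
```

The guard at the bottom exists only so the sketch can be loaded without starting a crawl; a real entry point would most likely call `main()` directly.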

## Output

Scraped data is saved to the `storage/datasets/default` directory in JSON format.
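
To inspect the results after a run, a small Node.js script along these lines could load the records back into memory. This is a sketch under two assumptions: that each record is stored as its own `.json` file in that directory, and that the script is run as CommonJS (for example as a hypothetical `read-dataset.cjs`). The `loadDataset` helper is illustrative and not part of the template.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Read every *.json file in a dataset directory and return the parsed records.
// Assumes one JSON document per file; other files in the directory are ignored.
function loadDataset(dir = path.join('storage', 'datasets', 'default')) {
  if (!fs.existsSync(dir)) return []; // nothing has been scraped yet
  return fs
    .readdirSync(dir)
    .filter((name) => name.endsWith('.json'))
    .sort() // stable, name-based order
    .map((name) => JSON.parse(fs.readFileSync(path.join(dir, name), 'utf8')));
}

const records = loadDataset();
console.log(`Loaded ${records.length} record(s)`);
```

Run it from the project root so the relative `storage/datasets/default` path resolves, or pass a different directory to `loadDataset`.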