# Scraping Guide

This guide explains how to scrape Facebook Marketplace listings with HiveFB: starting and stopping scrapes, choosing a mode, tuning parameters, and monitoring progress.

### You’ll learn

* How to start and stop scrapes safely.
* When to use Standard vs Endless mode.
* How to tune speed vs reliability.
* What “Phase 1” and “Phase 2” output looks like.

## Starting a Scrape

### Basic Scraping

{% stepper %}
{% step %}

### Click "🚀 Start Scraping"

* Located in the sidebar navigation
* Opens the scraping configuration modal

{% hint style="info" %}
**Screenshot placeholder:** Sidebar with “Start Scraping” plus the configuration modal opened.
{% endhint %}
{% endstep %}

{% step %}

### Select Mode

* **Vehicle Mode**: For cars, trucks, motorcycles, RVs, boats, etc.
* **Item Mode**: For general marketplace items
{% endstep %}

{% step %}

### Configure Settings

* **Search Term**: What to search for (e.g., "honda civic", "truck", "suv")
* **Max Listings**: Maximum number of listings to collect (default: 500)
* **Scroll Speed**: Delay between scrolls in milliseconds (default: 2000ms)
* **Page Load Wait**: Time to wait for pages to load (default: 10000ms)
{% endstep %}

{% step %}

### Start the Scrape

* Click "Start Scraping"
* Monitor progress in the Console tab
* Watch the Browser tab to see automation in action

{% hint style="info" %}
**Screenshot placeholder:** Console tab during Phase 1 and during Phase 2.
{% endhint %}
{% endstep %}
{% endstepper %}
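
The settings in the configuration step, together with their defaults, can be pictured as a small configuration record. The sketch below is illustrative only: `ScrapeSettings` and its field names are hypothetical and are not HiveFB's internal API. Only the default values (500 listings, 2000ms, 10000ms) come from this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrapeSettings:
    """Hypothetical record mirroring the fields of the configuration modal."""

    search_term: str
    mode: str = "vehicle"           # "vehicle" or "item" (names are assumptions)
    max_listings: int = 500         # default stated in this guide
    scroll_delay_ms: int = 2000     # default stated in this guide
    page_load_wait_ms: int = 10000  # default stated in this guide

    def __post_init__(self) -> None:
        # Reject values that the modal would not make sense of.
        if not self.search_term.strip():
            raise ValueError("search_term must not be empty")
        if self.mode not in ("vehicle", "item"):
            raise ValueError("mode must be 'vehicle' or 'item'")
        if self.max_listings <= 0:
            raise ValueError("max_listings must be positive")
        if self.scroll_delay_ms <= 0 or self.page_load_wait_ms <= 0:
            raise ValueError("delays must be positive")


# Example: a quick, specific vehicle search that keeps the default delays.
settings = ScrapeSettings(search_term="honda civic", max_listings=200)
```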

## Scraping Modes

### Standard Mode

**How it works:**

* Scrapes listings based on your search term
* Stops when max listings reached or no more results found
* Single search execution

**Best for:**

* One-time searches
* Specific vehicle searches
* Quick data collection

Example:

* Search: "honda civic"
* Max: 200 listings
* Result: up to 200 Honda Civic listings collected (fewer if results run out first)
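
The two stop conditions of Standard Mode (maximum reached, or no more results) can be sketched as a simple loop. This is a minimal illustration under assumptions, not HiveFB's implementation: `run_standard_scrape` and `fetch_next_batch` are hypothetical names, and a "batch" stands in for whatever one scroll of the results page yields.

```python
from typing import Callable, List


def run_standard_scrape(
    fetch_next_batch: Callable[[], List[str]],
    max_listings: int,
) -> List[str]:
    """Collect listings until the cap is hit or a batch comes back empty."""
    collected: List[str] = []
    while len(collected) < max_listings:
        batch = fetch_next_batch()
        if not batch:
            # No more results found: stop early.
            break
        remaining = max_listings - len(collected)
        # Never exceed the cap, even if the batch is larger than what is left.
        collected.extend(batch[:remaining])
    return collected


# Example with canned batches standing in for scrolled result pages.
batches = iter([["listing-1", "listing-2"], ["listing-3"], []])
result = run_standard_scrape(lambda: next(batches, []), max_listings=200)
```

Here the run ends with three listings because the results run out long before the 200-listing cap.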

### Endless Mode

**How it works:**

* Continuously cycles through multiple catalogues
* Runs until you stop it manually; it never stops on its own

**Best for:**

* Ongoing deal hunting
* Multiple search terms
* Continuous monitoring

Setup:

{% stepper %}
{% step %}

### Create multiple catalogues

{% endstep %}

{% step %}

### Click "Start Endless Mode"

{% endstep %}

{% step %}

### Select catalogues to cycle through

{% endstep %}

{% step %}

### Configure cycle settings

{% endstep %}
{% endstepper %}

Example:

* Catalogue 1: "honda civic"
* Catalogue 2: "toyota camry"
* Catalogue 3: "ford f150"
* Result: Continuously cycles through all three
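
The cycling behavior in the example above can be sketched with a round-robin loop. The names `run_endless`, `run_catalogue`, and `should_stop` are hypothetical; HiveFB's actual scheduling is not documented here, so treat this as an illustration of the idea only.

```python
import itertools
from typing import Callable, Sequence


def run_endless(
    catalogues: Sequence[str],
    run_catalogue: Callable[[str], None],
    should_stop: Callable[[], bool],
) -> int:
    """Run catalogues in order, wrapping around, until a stop is requested."""
    runs = 0
    # itertools.cycle repeats the sequence forever; the loop only ends
    # when should_stop() reports that the user asked to stop.
    for name in itertools.cycle(catalogues):
        if should_stop():
            break
        run_catalogue(name)
        runs += 1
    return runs


# Example: record the order of runs and stop after seven of them.
order = []
total = run_endless(
    ["honda civic", "toyota camry", "ford f150"],
    order.append,
    lambda: len(order) >= 7,
)
```

The stop check happens between catalogue runs, so a catalogue that has started always finishes in this sketch.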

### Update Database Mode

**How it works:**

* Re-scrapes existing listings to refresh data
* Updates prices, descriptions, and availability
* Compares new data with existing records

**Best for:**

* Keeping your database current
* Checking price changes
* Updating listing status

Setup:

{% stepper %}
{% step %}

### Click "🔄 Update Database" (if available)

{% endstep %}

{% step %}

### Select listings to update

{% endstep %}

{% step %}

### HiveFB starts the re-scraping process

{% endstep %}
{% endstepper %}
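
The comparison of re-scraped data against existing records can be sketched as a field-by-field diff. The field names (`price`, `description`, `available`) and the function `diff_listing` are assumptions chosen to match the items listed above; HiveFB's real record schema may differ.

```python
from typing import Any, Dict, Tuple

# Hypothetical field names, mirroring "prices, descriptions, and availability".
TRACKED_FIELDS = ("price", "description", "available")


def diff_listing(
    existing: Dict[str, Any],
    fresh: Dict[str, Any],
) -> Dict[str, Tuple[Any, Any]]:
    """Return {field: (old_value, new_value)} for each tracked field that changed."""
    changes: Dict[str, Tuple[Any, Any]] = {}
    for field in TRACKED_FIELDS:
        old, new = existing.get(field), fresh.get(field)
        if old != new:
            changes[field] = (old, new)
    return changes


# Example: only the price changed between the stored and re-scraped record.
stored = {"price": 12500, "description": "Clean title", "available": True}
rescraped = {"price": 11900, "description": "Clean title", "available": True}
changes = diff_listing(stored, rescraped)
```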

## Monitoring Progress

### Console Tab

The Console tab shows real-time information:

**Phase information:**

* Current phase (Phase 1: listing discovery, or Phase 2: deep inspection)
* Phase progress percentage
* Estimated time remaining

**Listing counts:**

* Unique listings discovered
* Listings processed
* Listings remaining

**Status messages:**

* Current operation description
* Errors or warnings
* Completion status

Example Console Output:

```
✅ Phase 1 Complete: 147 unique listings discovered
🔄 Starting Phase 2: Deep inspection
📊 Processing listing 1 of 147...
✅ Extracted: 2018 Honda Civic LX
```
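
The two phases in this output can be read as a discover-then-inspect pipeline. The sketch below illustrates that reading only: `phase_one_discover`, `phase_two_inspect`, and the `inspect` callback are hypothetical names, and the deduplication strategy is an assumption based on the phrase "unique listings discovered".

```python
from typing import Callable, Dict, Iterable, List


def phase_one_discover(result_pages: Iterable[Iterable[str]]) -> List[str]:
    """Phase 1: gather listing IDs, keeping only the first sighting of each."""
    unique: Dict[str, None] = {}  # dicts preserve insertion order
    for page in result_pages:
        for listing_id in page:
            unique.setdefault(listing_id, None)
    return list(unique)


def phase_two_inspect(
    listing_ids: List[str],
    inspect: Callable[[str], dict],
) -> List[dict]:
    """Phase 2: visit each discovered listing once and extract its details."""
    total = len(listing_ids)
    results = []
    for position, listing_id in enumerate(listing_ids, start=1):
        print(f"Processing listing {position} of {total}...")
        results.append(inspect(listing_id))
    return results


# Example: "b" appears on two result pages but is inspected only once.
ids = phase_one_discover([["a", "b"], ["b", "c"]])
details = phase_two_inspect(ids, lambda listing_id: {"id": listing_id})
```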

### Browser Tab

The Browser tab shows a live view of the automation browser, so you can see exactly what HiveFB is doing:

* Watch pages load and scroll in real time
* Use it to debug issues

**What you'll see:**

* Facebook Marketplace pages
* Listing detail pages
* Navigation between pages
* Scroll actions

{% hint style="info" %}
**Screenshot placeholder:** Browser tab mid-scrape while scrolling a results page.
{% endhint %}

## Scraping Parameters

### Search Term

Tips for Best Results:

* Be Specific: Prefer "honda civic 2018" over "civic"
* Use Variations: Try "truck" and "pickup" separately
* Location Matters: Facebook uses your location for results
* Avoid Special Characters: Stick to plain words and numbers

Examples:

* ✅ Good: "honda civic", "ford f150", "toyota camry"
* ❌ Avoid: "honda\*civic", "civic!!!", "best deal ever"
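
A quick pre-check for the "avoid special characters" tip might look like the sketch below. The rule it applies (letters, digits, and spaces only) is a deliberately strict assumption for illustration, and `check_search_term` is a hypothetical helper, not a HiveFB function.

```python
import re
from typing import List

# Assumption: a "simple" term contains only letters, digits, and spaces.
_SIMPLE_TERM = re.compile(r"[A-Za-z0-9 ]+")


def check_search_term(term: str) -> List[str]:
    """Return a list of problems found in a search term (empty if none)."""
    problems: List[str] = []
    cleaned = term.strip()
    if not cleaned:
        problems.append("term is empty")
    elif not _SIMPLE_TERM.fullmatch(cleaned):
        problems.append("term contains special characters")
    return problems


# Example: the good terms pass, the terms with symbols are flagged.
good = check_search_term("ford f150")
bad = check_search_term("honda*civic")
```

A character check like this cannot catch vague phrases such as "best deal ever", which are a wording problem rather than a formatting one.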

### Max Listings

Recommendations:

* 100-200: Quick searches, specific vehicles
* 500: Standard searches (default)
* 1000+: Comprehensive data collection

Considerations:

* More listings = longer scraping time
* Facebook may limit results
* Quality over quantity

### Scroll Speed

Settings:

* 1000-2000ms: Fast (may trigger rate limiting)
* 2000-3000ms: Balanced (recommended)
* 3000-5000ms: Slow (safer, takes longer)

Factors:

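* Shorter delays finish sooner but are more likely to trigger rate limiting
* Longer delays are safer but extend the scrape
* Start from the 2000ms default and raise it if you run into rate limiting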
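
The bands above can be expressed as a small lookup, which makes the boundary cases explicit. The listed ranges share their endpoints (2000ms and 3000ms each appear in two bands), so this sketch assumes that both values count as "balanced"; `classify_scroll_delay` is a hypothetical helper.

```python
def classify_scroll_delay(delay_ms: int) -> str:
    """Map a scroll delay in milliseconds to the bands listed in this guide."""
    if delay_ms < 1000:
        return "below documented range"
    if delay_ms < 2000:
        return "fast"
    if delay_ms <= 3000:
        # Assumption: the shared endpoints 2000 and 3000 count as balanced.
        return "balanced"
    if delay_ms <= 5000:
        return "slow"
    return "above documented range"


# Example: the 2000ms default lands in the recommended band.
default_band = classify_scroll_delay(2000)
```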

### Page Load Wait

Settings:

* 5000-10000ms: Standard pages
* 10000-15000ms: Slow connections
* 15000ms+: Very slow connections

Purpose:

* Ensures pages fully load before scraping
* Prevents missing data
* Adjust if you see incomplete data
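
As a rough planning aid, the effect of Max Listings and Page Load Wait on run time can be estimated with simple arithmetic. The sketch assumes, hypothetically, that every listing page uses the full Page Load Wait; actual runs may be shorter or longer, and `worst_case_wait_minutes` is not a HiveFB function.

```python
def worst_case_wait_minutes(max_listings: int, page_load_wait_ms: int) -> float:
    """Upper-bound estimate of time spent waiting on listing pages, in minutes.

    Assumes each listing page consumes the full page-load wait.
    """
    total_ms = max_listings * page_load_wait_ms
    return total_ms / 60_000  # 60,000 milliseconds per minute


# Example with this guide's defaults: 500 listings, 10000ms page load wait.
estimate = worst_case_wait_minutes(500, 10000)
```

With the defaults this works out to roughly 83 minutes of waiting, which is one reason to start with a smaller Max Listings value.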

## Stopping a Scrape

### Manual Stop

{% stepper %}
{% step %}

### Click Stop Button

* The "✕" button appears when scraping is active
* Located in the scraping interface
* Stops the scrape as soon as the current operation finishes

{% hint style="info" %}
**Screenshot placeholder:** Stop (“✕”) button visible while a scrape is running.
{% endhint %}
{% endstep %}

{% step %}

### What Happens

* Current operation completes safely
* All collected data is saved
* No data loss
{% endstep %}
{% endstepper %}
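
The behavior described above (the current operation completes and nothing collected is lost) matches a cooperative stop, where the stop request is checked between operations and each result is saved as soon as it is produced. The sketch below illustrates that pattern with hypothetical names; it is not HiveFB's code.

```python
import threading
from typing import Callable, Iterable


def scrape_until_stopped(
    listing_ids: Iterable[str],
    inspect: Callable[[str], str],
    stop_event: threading.Event,
    save: Callable[[str], None],
) -> None:
    """Inspect listings one by one, checking for a stop request in between."""
    for listing_id in listing_ids:
        if stop_event.is_set():
            # Stop requested: do not start another operation.
            break
        # The operation in progress always runs to completion,
        # and its result is saved immediately (incremental save).
        save(inspect(listing_id))


# Example: the stop arrives while "b" is being inspected.
stop = threading.Event()
saved = []


def fake_inspect(listing_id: str) -> str:
    if listing_id == "b":
        stop.set()  # simulates the user clicking the stop button
    return listing_id.upper()


scrape_until_stopped(["a", "b", "c"], fake_inspect, stop, saved.append)
```

In this run "b" still completes and is saved, while "c" is never started.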

### Automatic Stop

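Scraping stops automatically when:

* The maximum number of listings is reached
* No more results are found
* An error occurs (with the option to retry)

Endless Mode is the exception: it keeps cycling until you stop it manually.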

### Data Preservation

Always Saved:

* All listings collected up to stop point
* Phase 1 discoveries
* Phase 2 inspections completed
* Progress information

Not Lost:

* Partial data is preserved
* You can continue by starting a new scrape
* The database is updated incrementally as listings are processed

## Best Practices

{% stepper %}
{% step %}

### Start Small

* Begin with 50-100 listings
* Test your search terms
* Verify results quality
* Scale up once comfortable
{% endstep %}

{% step %}

### Use Appropriate Delays

* Don't set scroll speed too fast
* Allow pages to load completely
* Mimic human behavior
* Helps avoid rate limiting
{% endstep %}

{% step %}

### Monitor Progress

* Check Console tab regularly
* Watch for errors
* Verify data quality
* Adjust settings if needed
{% endstep %}

{% step %}

### Use Catalogues

* Save common searches
* Reuse configurations
* Organize your searches
* Enable endless mode for multiple searches
{% endstep %}

{% step %}

### Take Breaks

* Configure break settings
* Helps avoid account issues
* More natural pattern
* Reduces detection risk
{% endstep %}
{% endstepper %}
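
One common way to make delays look less mechanical is to add random jitter around a base value. The sketch below shows the general technique only; whether HiveFB applies jitter internally is not stated in this guide, and `jittered_delay_ms` and its `jitter_fraction` parameter are hypothetical.

```python
import random
from typing import Optional


def jittered_delay_ms(
    base_ms: int,
    jitter_fraction: float = 0.25,
    rng: Optional[random.Random] = None,
) -> int:
    """Return a delay drawn uniformly from base_ms +/- jitter_fraction."""
    source = rng if rng is not None else random
    low = base_ms * (1 - jitter_fraction)
    high = base_ms * (1 + jitter_fraction)
    return round(source.uniform(low, high))


# Example: ten delays around the 2000ms default, each between 1500 and 2500.
rng = random.Random(42)  # seeded only to make the example repeatable
delays = [jittered_delay_ms(2000, rng=rng) for _ in range(10)]
```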

## Common Scenarios

### Scenario: Finding a Specific Vehicle

**Goal**: Find a 2018 Honda Civic

{% stepper %}
{% step %}

### Search term

"honda civic 2018"
{% endstep %}

{% step %}

### Max listings

100
{% endstep %}

{% step %}

### Mode

Standard mode
{% endstep %}

{% step %}

### Review results

Review the results in the dashboard and filter by price or condition
{% endstep %}
{% endstepper %}

### Scenario: Broad Market Research

**Goal**: Understand the truck market in your area

{% stepper %}
{% step %}

### Create catalogue

"trucks"
{% endstep %}

{% step %}

### Search term

"truck"
{% endstep %}

{% step %}

### Max listings

500
{% endstep %}

{% step %}

### Run catalogue

Run the catalogue and analyze the results on the Analytics page
{% endstep %}
{% endstepper %}

### Scenario: Continuous Monitoring

**Goal**: Never miss a deal on multiple vehicle types

{% stepper %}
{% step %}

### Create multiple catalogues

{% endstep %}

{% step %}

### Enable endless mode

{% endstep %}

{% step %}

### Select all catalogues

{% endstep %}

{% step %}

### Let it run continuously

Check hot deals regularly
{% endstep %}
{% endstepper %}

***

### Next pages

* Save searches: [Catalogue Management](/core-workflow/catalogue-management.md)
* Monitor deals: [Hot Deals](/core-workflow/hot-deals.md)


