Formats

Format options control what data types are included in your response. Specify one or more formats to receive HTML, markdown, screenshots, or extracted links alongside your extracted data. Common uses:

Full page content: Get raw HTML for custom processing
Readable text: Convert pages to clean markdown format
Visual records: Capture screenshots for monitoring or archival
Link extraction: Get all URLs from the page for further crawling

You can combine multiple formats in a single request. All specified formats will be included in the response.

Supported parameters

Available in: Extract and Crawl.

Parameter	Type	Description	Default
`formats`	List (String)	Sets what data types are included in your response	`["html"]`

Available formats

Format	Description	Use Case
`html`	Raw HTML content	Custom parsing, archival, full DOM access
`markdown`	Clean markdown conversion	Readable text, content analysis, LLM processing
`screenshot`	Page screenshot (base64)	Visual verification, monitoring, documentation
`links`	All extracted URLs	Link discovery, crawling, sitemap building

Usage

Single HTML format

Request a single format, html (the default). Best for:

Custom HTML parsing
Preserving exact page structure
Accessing all DOM elements and attributes
Archival purposes

from nimble import Nimble

nimble = Nimble(api_key="YOUR-API-KEY")

result = nimble.extract({
    "url": "https://www.example.com",
    "formats": ["html"] # default
})

# Access HTML content
html_content = result["data"]["html"]
print(html_content)

Multiple formats

Combine multiple formats to get different data representations:

from nimble import Nimble

nimble = Nimble(api_key="YOUR-API-KEY")

result = nimble.extract({
    "url": "https://www.example.com",
    "formats": ["html", "markdown", "screenshot", "links"]
})

print(result)

Markdown format

Convert the page to clean, readable markdown. Best for:

Clean text extraction
Content analysis
LLM processing
Human-readable output

from nimble import Nimble

nimble = Nimble(api_key="YOUR-API-KEY")

result = nimble.extract({
    "url": "https://www.example.com/article",
    "formats": ["markdown"]
})

# Access markdown content
markdown_content = result["data"]["markdown"]
print(markdown_content)
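
For LLM processing it is often convenient to pass the markdown along one section at a time. The sketch below splits a markdown string on its `#` headings; `split_markdown_sections` is a hypothetical helper, not part of the Nimble SDK, and it assumes ATX-style headings and no `#` lines inside code fences.

```python
import re

# Matches ATX headings such as "# Title" or "### Sub-section"
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_markdown_sections(markdown: str) -> list:
    """Split markdown into (heading, body) pairs.

    Text that appears before the first heading is returned with an
    empty heading. Assumption: no "#" lines inside fenced code blocks.
    """
    sections = []
    title, body = "", []

    def flush():
        # Keep a section only if it has a heading or non-blank text
        if title or any(line.strip() for line in body):
            sections.append((title, "\n".join(body).strip()))

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title, body = match.group(2).strip(), []
        else:
            body.append(line)
    flush()
    return sections
```

Each returned pair can then be sent to a model separately, or filtered by heading before sending.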

Screenshot format

Capture a visual snapshot of the page. Best for:

Visual verification
Monitoring page changes
Documentation and reporting
Debugging layout issues

from nimble import Nimble
import base64

nimble = Nimble(api_key="YOUR-API-KEY")

result = nimble.extract({
    "url": "https://www.example.com",
    "formats": ["screenshot"],
    "render": True
})

# Access screenshot (base64 encoded)
screenshot_data = result["data"]["screenshot"]

# Decode and save
with open("screenshot.png", "wb") as f:
    f.write(base64.b64decode(screenshot_data))

Screenshots require page rendering to be enabled (`"render": True` in the request). The image is returned as base64-encoded PNG data.
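
Because the screenshot arrives as a base64 string, it can be worth checking that the decoded bytes really are a PNG before writing them to disk. The following is a minimal sketch using only the standard library; `save_screenshot` is a hypothetical helper name, and the check relies on the fixed 8-byte signature that every PNG file starts with.

```python
import base64
from pathlib import Path

# Every PNG file begins with these 8 bytes
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_screenshot(screenshot_b64: str, path: str) -> int:
    """Decode a base64 screenshot, verify the PNG signature, and save it.

    Returns the number of bytes written. Raises ValueError if the input
    is not valid base64 or does not decode to PNG data.
    """
    # validate=True rejects characters outside the base64 alphabet
    image_bytes = base64.b64decode(screenshot_b64, validate=True)
    if not image_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return Path(path).write_bytes(image_bytes)
```

With a response like the one above, this would be called as `save_screenshot(result["data"]["screenshot"], "screenshot.png")`.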

Links format

Extract all URLs found on the page. Best for:

Link discovery
Building sitemaps
Crawling workflows
Finding internal/external links

from nimble import Nimble

nimble = Nimble(api_key="YOUR-API-KEY")

result = nimble.extract({
    "url": "https://www.example.com",
    "formats": ["links"]
})

# Access extracted links
links = result["data"]["links"]
for link in links:
    print(link)

Combining with other features

Formats work seamlessly with parsing, browser actions, and other features:

from nimble import Nimble
from pydantic import BaseModel

class Product(BaseModel):
    name: str
    price: float

nimble = Nimble(api_key="YOUR-API-KEY")

result = nimble.extract({
    "url": "https://www.example.com/product",
    "render": True,
    "formats": ["html", "markdown", "screenshot"],
    "schema": Product,
    "browser_actions": [
        {
            "wait": {
                "delay": 1000
            }
        }
    ]
})

# Access different formats
product_data = result["data"]["parsed"]
html = result["data"]["html"]
markdown = result["data"]["markdown"]
screenshot = result["data"]["screenshot"]

Example response

When formats are specified, all requested data is included in the response. The response includes:

html: Raw HTML if requested
markdown: Converted markdown if requested
screenshot: Base64-encoded PNG if requested
links: Array of extracted URLs if requested
parsed: Structured data if parsing was used
metadata: Execution details and the formats included

{
  "status": "success",
  "data": {
    "html": "<!DOCTYPE html><html><head>...</head><body>...</body></html>",
    "markdown": "# Article Title\n\nThis is the article content...",
    "screenshot": "iVBORw0KGgoAAAANSUhEUgAAA...",
    "links": [
      "https://www.example.com/about",
      "https://www.example.com/contact",
      "https://www.example.com/products",
      "https://external-site.com"
    ],
    "parsed": {
      "title": "Example Article",
      "author": "John Doe"
    }
  },
  "metadata": {
    "driver": "vx8",
    "execution_time_ms": 1850,
    "formats_included": ["html", "markdown", "screenshot", "links"]
  }
}
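
Since each format key appears only when it was requested, client code usually should not index into `data` blindly. The sketch below assumes the response is a plain dict shaped like the example above; `present_formats` and `missing_formats` are hypothetical helpers, not SDK functions.

```python
# The four format names listed under "Available formats"
SUPPORTED_FORMATS = ("html", "markdown", "screenshot", "links")


def present_formats(response: dict) -> dict:
    """Return {format_name: value} for each supported format found under "data"."""
    data = response.get("data") or {}
    return {name: data[name] for name in SUPPORTED_FORMATS if name in data}


def missing_formats(response: dict, requested: list) -> list:
    """List the requested formats that are absent from the response."""
    found = present_formats(response)
    return [name for name in requested if name not in found]
```

For example, `missing_formats(result, ["html", "screenshot"])` returns an empty list when both keys are present, which makes it easy to fail early before any downstream processing.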

Best practices

Format selection

Choose formats based on your needs:

Use html when you need full DOM access
Use markdown for clean text and content analysis
Use screenshot for visual verification
Use links for discovering URLs to crawl

Avoid unnecessary formats:

# ❌ Don't request all formats if you only need one
formats=["html", "markdown", "screenshot", "links"]

# ✅ Request only what you need
formats=["markdown"]

Performance considerations

Each format adds processing time
Screenshots require rendering and are slower
HTML and markdown are faster to generate
Request only needed formats for optimal performance
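
One way to keep requests lean is to build the parameters through a small helper that rejects unknown format names and turns on rendering only when a screenshot is requested. This is a sketch under the assumptions stated on this page (four supported formats, `html` as the default, screenshots needing `render`); `build_extract_params` is a hypothetical name, not an SDK function.

```python
SUPPORTED_FORMATS = {"html", "markdown", "screenshot", "links"}


def build_extract_params(url: str, formats=None) -> dict:
    """Build a request dict with validated formats.

    Falls back to ["html"] when no formats are given, drops duplicates
    while keeping order, and enables rendering only for screenshots.
    """
    formats = list(formats) if formats else ["html"]
    unknown = [name for name in formats if name not in SUPPORTED_FORMATS]
    if unknown:
        raise ValueError(f"unsupported formats: {unknown}")

    # dict.fromkeys preserves first-seen order while removing duplicates
    formats = list(dict.fromkeys(formats))

    params = {"url": url, "formats": formats}
    if "screenshot" in formats:
        params["render"] = True  # screenshots require page rendering
    return params
```

The resulting dict has the same shape as the ones passed to `nimble.extract(...)` in the examples above.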

Link filtering

Process links after extraction:

# Filter internal links only
internal_links = [
    link for link in result["data"]["links"]
    if link.startswith("https://www.example.com")
]

# Filter by file type
pdf_links = [
    link for link in result["data"]["links"]
    if link.endswith(".pdf")
]
