Network Capture intercepts internal API calls made during webpage loading, giving you direct access to structured data in JSON format instead of parsing HTML.
Common uses:
- Dynamic content: Capture lazy-loaded data and real-time updates.
- API access: Interact directly with backend APIs bypassing UI rendering.
- Performance: Reduce overhead by accessing machine-readable responses.
- Accuracy: Get reliable data directly from API endpoints.
Network Capture requires page rendering to be enabled (render: true). For XHR/AJAX calls that don’t need rendering, use the is_xhr parameter instead.
Supported parameters
Available in: Extract.
| Parameter | Type | Description | Default |
|---|---|---|---|
| network_capture | List (Object) | Uses filters to target specific requests | - |
Network Capture Object
| Parameter | Type | Description | Default |
|---|---|---|---|
| method | String | HTTP method filter (GET, POST, PUT, etc.) | Any |
| url.type | Enum | URL matching type: exact or contains | exact |
| url.value | String | The URL or URL portion to match | - |
| resource_type | Array | Filter by resource type: xhr, fetch, stylesheet, script, document | - |
| validation | Boolean | Enable content validation on responses | false |
| wait_for_requests_count | Integer | Minimum number of requests to capture | 0 |
| wait_for_requests_count_timeout | Integer | Maximum wait time (seconds) for the request count | 10 |
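To catch typos before a request is sent, the table above can be turned into a small local check. The helper below is a sketch only: `build_capture_filter` and its argument names are hypothetical and not part of the Nimble SDK, and it assumes the five resource types listed in the table are the complete set.

```python
# Hypothetical helper, not part of the Nimble SDK.
# Assumption: the values listed in the table are the only ones accepted.
ALLOWED_URL_TYPES = {"exact", "contains"}
ALLOWED_RESOURCE_TYPES = {"xhr", "fetch", "stylesheet", "script", "document"}


def build_capture_filter(method=None, url_value=None, url_type="exact",
                         resource_type=None, validation=False,
                         wait_count=0, wait_timeout=10):
    """Build one network_capture filter dict, checking values against the table."""
    if url_type not in ALLOWED_URL_TYPES:
        raise ValueError(f"url.type must be one of {sorted(ALLOWED_URL_TYPES)}")
    unknown = set(resource_type or []) - ALLOWED_RESOURCE_TYPES
    if unknown:
        raise ValueError(f"unsupported resource_type values: {sorted(unknown)}")
    if wait_count < 0 or wait_timeout < 0:
        raise ValueError("wait count and timeout must not be negative")

    capture = {}
    # Only emit keys that differ from the documented defaults.
    if method is not None:
        capture["method"] = method.upper()
    if url_value is not None:
        capture["url"] = {"type": url_type, "value": url_value}
    if resource_type:
        capture["resource_type"] = list(resource_type)
    if validation:
        capture["validation"] = True
    if wait_count:
        capture["wait_for_requests_count"] = wait_count
        capture["wait_for_requests_count_timeout"] = wait_timeout
    return capture


filters = [
    build_capture_filter(method="get", url_value="/graphql", url_type="contains"),
    build_capture_filter(resource_type=["xhr", "script"], wait_count=3, wait_timeout=5),
]
print(filters)
```

The resulting list has the same shape as the network_capture values used in the examples on this page.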
Usage
Filter by exact URL match
Capture a specific API endpoint by matching the complete URL.
from nimble import Nimble
nimble = Nimble(api_key="YOUR-API-KEY")
result = nimble.extract(
    url="https://www.example.com",
    render=True,
    network_capture=[
        {
            "method": "GET",
            "url": {
                "type": "exact",
                "value": "https://www.example.com/api/data"
            }
        }
    ]
)
print(result)
Filter by URL pattern
Use contains to capture requests with URLs matching a pattern. This is useful for capturing file types (like .css or .js), requests with dynamic URL components, or when you don’t know the exact URL.
from nimble import Nimble
nimble = Nimble(api_key="YOUR-API-KEY")
result = nimble.extract(
    url="https://www.example.com",
    render=True,
    network_capture=[
        {
            "url": {
                "type": "contains",
                "value": "/graphql"
            }
        }
    ]
)
print(result)
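The difference between the two matching types can be previewed locally before choosing one. This sketch assumes that exact means plain string equality (so a query string makes the URL different) and that contains means a substring test. The service may normalize URLs differently, and `url_matches` is a hypothetical helper rather than an SDK function.

```python
def url_matches(url_filter, request_url):
    """Return True if request_url satisfies the filter's url rule (assumed semantics)."""
    kind = url_filter.get("type", "exact")  # documented default is exact
    value = url_filter["value"]
    if kind == "exact":
        return request_url == value
    if kind == "contains":
        return value in request_url
    raise ValueError(f"unknown url.type: {kind!r}")


# Made-up request URLs a page might issue while loading.
requests_seen = [
    "https://www.example.com/api/data",
    "https://www.example.com/api/data?page=2",
    "https://www.example.com/graphql?op=Products",
]

exact = {"type": "exact", "value": "https://www.example.com/api/data"}
contains = {"type": "contains", "value": "/graphql"}

exact_hits = [u for u in requests_seen if url_matches(exact, u)]
contains_hits = [u for u in requests_seen if url_matches(contains, u)]
print(exact_hits)
print(contains_hits)
```

Under these assumptions the exact filter skips the paginated URL, which is the usual reason to switch to contains when URLs carry dynamic components.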
Filter by resource type
Capture specific types of resources like XHR, Fetch, or Script requests.
from nimble import Nimble
nimble = Nimble(api_key="YOUR-API-KEY")
result = nimble.extract(
    url="https://www.example.com",
    render=True,
    network_capture=[
        {
            "method": "GET",
            "resource_type": ["xhr", "fetch"]
        }
    ]
)
print(result)
Multiple filters
Combine multiple filters to capture different request types in one call.
from nimble import Nimble
nimble = Nimble(api_key="YOUR-API-KEY")
result = nimble.extract(
    url="https://www.example.com",
    render=True,
    network_capture=[
        {
            "method": "GET",
            "url": {
                "type": "exact",
                "value": "https://www.example.com/api/resource"
            }
        },
        {
            "url": {
                "type": "contains",
                "value": ".css"
            }
        }
    ]
)
print(result)
Wait for requests
Use wait_for_requests_count to ensure you capture a minimum number of network requests. The request duration will be extended until the count is reached or the timeout expires.
from nimble import Nimble
nimble = Nimble(api_key="YOUR-API-KEY")
result = nimble.extract(
    url="https://www.example.com",
    render=True,
    network_capture=[
        {
            "method": "GET",
            "resource_type": ["xhr", "script"],
            "wait_for_requests_count": 3,
            "wait_for_requests_count_timeout": 5
        }
    ]
)
print(result)
This configuration will wait up to 5 seconds to capture at least 3 network requests matching the filter criteria.
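As a mental model for the stop condition described above (return once the count is reached, or once the timeout expires, whichever comes first), the loop below polls a counter against a deadline. It is an illustration only and not Nimble's implementation; `wait_for_count`, `get_count`, and `poll_s` are hypothetical names.

```python
import itertools
import time


def wait_for_count(get_count, min_count, timeout_s, poll_s=0.05):
    """Poll get_count() until it reaches min_count or timeout_s elapses.

    Returns (reached, last_count).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        count = get_count()
        if count >= min_count:
            return True, count       # enough matching requests captured
        if time.monotonic() >= deadline:
            return False, count      # timed out with fewer than min_count
        time.sleep(poll_s)


# Simulated capture counter: one more matching request on every poll.
arrivals = itertools.count(1)
reached, seen = wait_for_count(lambda: next(arrivals), min_count=3,
                               timeout_s=5, poll_s=0.01)
print(reached, seen)
```

With a minimum count of 0, which is the documented default, the loop returns immediately, so no extra waiting happens.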
XHR without rendering
For direct API endpoints that don’t require page rendering, use is_xhr for better performance.
from nimble import Nimble
nimble = Nimble(api_key="YOUR-API-KEY")
result = nimble.extract(
    url="https://api.example.com/endpoint",
    is_xhr=True,
)
print(result)
is_xhr only works when render is false. It sends XHR-specific headers and targets the API URL directly.
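The two constraints on this page (network_capture needs render enabled, and is_xhr needs render disabled) are easy to mix up, so a small pre-flight check can help. The function below is a hypothetical helper, not an SDK feature, and it assumes that an omitted render option counts as false.

```python
def check_extract_options(options):
    """Raise ValueError if render, is_xhr and network_capture are combined inconsistently."""
    render = bool(options.get("render", False))  # assumption: unset means false
    if options.get("is_xhr") and render:
        raise ValueError("is_xhr only works when render is false")
    if options.get("network_capture") and not render:
        raise ValueError("network_capture requires render=True")
    return options


xhr_options = check_extract_options({
    "url": "https://api.example.com/endpoint",
    "is_xhr": True,
})
capture_options = check_extract_options({
    "url": "https://www.example.com",
    "render": True,
    "network_capture": [{"resource_type": ["xhr", "fetch"]}],
})
print(xhr_options)
print(capture_options)
```

A validated dict can then be unpacked into the call used throughout this page, for example nimble.extract(**capture_options).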
Example response
When the request completes successfully, you’ll receive the final page state along with any captured network data.
The response includes:
- data: All related extracted data
- data.html: Final DOM state after all actions
- data.network_capture: Network capture results, one entry per filter, in order
- metadata: Execution details including task ID, driver used, execution time, and more
{
  "status": "success",
  "data": {
    "html": "<!DOCTYPE html><html>...</html>",
    "network_capture": [
      {
        "filter": {
          "method": "GET",
          "resource_type": ["xhr", "script"]
        },
        "result": [
          {
            "request": {
              "resource_type": "script",
              "method": "GET",
              "url": "https://www.example.com/script/0001.js",
              "headers": {}
            },
            "response": {
              "status": 200,
              "headers": {},
              "body": "..."
            }
          },
          {
            "request": {
              "resource_type": "xhr",
              "method": "GET",
              "url": "https://www.example.com/script/0002.js",
              "headers": {}
            },
            "response": {
              "status": 200,
              "headers": {},
              "body": "..."
            }
          }
        ]
      }
    ]
  },
  "metadata": {
    "task_id": ".....",
    "country": "US",
    "driver": "vx10",
    "execution_time_ms": 3450
  }
}
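Once a response shaped like the example above is available as a plain dict, the captured bodies can be pulled out with a short loop. This is a sketch under two assumptions: the SDK result can be treated as a dict with this structure, and each body arrives as a string. `iter_captured_bodies` is a hypothetical helper, and the sample data is made up for illustration.

```python
import json


def iter_captured_bodies(response):
    """Yield (url, status, body) for every captured request in the response.

    Bodies that parse as JSON are returned as Python objects; anything else
    is returned unchanged.
    """
    for entry in response.get("data", {}).get("network_capture", []):
        for item in entry.get("result", []):
            body = item["response"].get("body")
            try:
                parsed = json.loads(body)
            except (TypeError, ValueError):
                parsed = body  # not JSON (or missing): keep the raw value
            yield item["request"]["url"], item["response"]["status"], parsed


# Made-up sample in the shape of the example response.
sample = {
    "status": "success",
    "data": {
        "network_capture": [
            {
                "filter": {"resource_type": ["xhr", "script"]},
                "result": [
                    {
                        "request": {"url": "https://www.example.com/api/items"},
                        "response": {"status": 200, "body": "{\"items\": [1, 2]}"},
                    },
                    {
                        "request": {"url": "https://www.example.com/script/0001.js"},
                        "response": {"status": 200, "body": "console.log(1);"},
                    },
                ],
            }
        ]
    },
}

captured = list(iter_captured_bodies(sample))
for url, status, body in captured:
    print(status, url, type(body).__name__)
```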
Best practices
Use specific URL patterns
Be specific with URL matching:
# ✅ Specific pattern for API endpoints
network_capture = [
    {
        "url": {
            "type": "contains",
            "value": "/api/v1/products"
        }
    }
]
# ❌ Too broad - captures everything
network_capture = [
    {
        "url": {
            "type": "contains",
            "value": "/"
        }
    }
]
Filter by resource type
Narrow down to relevant resources:
# ✅ Capture only XHR and Fetch requests
network_capture = [
    {
        "resource_type": ["xhr", "fetch"]
    }
]
# ✅ Capture scripts and stylesheets
network_capture = [
    {
        "resource_type": ["script", "stylesheet"]
    }
]
Set appropriate wait counts
Use wait_for_requests_count for dynamic content:
# ✅ Wait for specific number of requests
network_capture = [
    {
        "method": "GET",
        "resource_type": ["xhr"],
        "wait_for_requests_count": 3,
        "wait_for_requests_count_timeout": 10
    }
]
# ❌ No wait - may miss delayed requests
network_capture = [
    {
        "method": "GET",
        "resource_type": ["xhr"]
    }
]
Use XHR mode for direct API calls
Skip rendering when accessing APIs directly:
# ✅ Direct API access without rendering
result = nimble.extract(
    url="https://api.example.com/data",
    is_xhr=True
)
# ❌ Unnecessary rendering for API endpoints
result = nimble.extract(
    url="https://api.example.com/data",
    render=True,
    network_capture=[...]
)