http
Sends HTTP/1.1 requests and forwards the response.
http url:string, [method=string, payload=string, headers=record,
response_field=field, metadata_field=field, paginate=string,
paginate_delay=duration, parallel=int, tls=bool, certfile=string,
keyfile=string, password=string, connection_timeout=duration,
max_retry_count=int, retry_delay=duration] { … }
Description
The http operator issues HTTP/1.1 requests and forwards received responses as events.
url: string
URL to connect to.
method = string (optional)
The HTTP method to use for the request. One of the following:
get
head
post
put
del
connect
options
trace
Defaults to get, or to post if payload is specified.
payload = string (optional)
Payload to send with the HTTP request.
headers = record (optional)
Record of headers to send with the request.
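As a sketch (the endpoint and token are placeholders, not a real API), the following sends a JSON payload with custom headers. Because payload is specified, the method defaults to post:

```tql
from {}
http "https://api.example.org/search", payload="{\"q\": \"tenzir\"}", headers={"Content-Type": "application/json", "Authorization": "Bearer <TOKEN>"} {read_json}
```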
response_field = field (optional)
Field to insert the response into.
Defaults to this.
metadata_field = field (optional)
Field to insert metadata into when using the parsing pipeline.
The metadata has the following schema:
| Field | Type | Description |
|---|---|---|
| code | uint64 | The HTTP status code of the response. |
| headers | record | The response headers. |
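For instance, a sketch that keeps the response and its metadata in separate fields and drops everything but successful responses (the field names body and meta are arbitrary choices, not required names):

```tql
from {}
http "https://urlscan.io/api/v1/search?q=tenzir.com", response_field=body, metadata_field=meta {read_json}
where meta.code == 200
```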
paginate = string (optional)
An expression to evaluate against the result of the request (optionally parsed by the given pipeline). If the expression evaluation is successful and non-null, the resulting string is used as the URL for a new GET request with the same headers.
paginate_delay = duration (optional)
The duration to wait between consecutive pagination requests.
Defaults to 0s.
parallel = int (optional)
Maximum number of requests that can be in flight at any time.
Defaults to 1.
tls = bool (optional)
Enables TLS.
certfile = string (optional)
Path to the client certificate.
keyfile = string (optional)
Path to the key for the client certificate.
password = string (optional)
Path to a file containing the password for keyfile.
connection_timeout = duration (optional)
Timeout for the connection.
Defaults to 5s.
max_retry_count = int (optional)
The maximum number of times to retry a failed request. Each request has its own retry count.
Defaults to 0.
retry_delay = duration (optional)
The duration to wait between each retry.
Defaults to 1s.
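As an illustrative sketch (the endpoint is a placeholder), the timeout and retry options combine to make a request more robust against transient failures:

```tql
from {}
http "https://api.example.org/data", connection_timeout=10s, max_retry_count=3, retry_delay=2s {read_json}
```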
{ … }
A pipeline that receives the response body as bytes, allowing parsing per request. This is especially useful in scenarios where the response body can be parsed into multiple events.
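For example, assuming a hypothetical endpoint that serves CSV, the pipeline can parse the response body into one event per row:

```tql
from {}
http "https://example.org/export.csv" {read_csv}
```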
Examples
Make a GET request
Here we make a request to urlscan.io to search for scans of tenzir.com and take the first result.
from {}
http "https://urlscan.io/api/v1/search?q=tenzir.com" {read_json}
unroll results
head 1
{
results: {
submitter: { ... },
task: { ... },
stats: { ... },
page: { ... },
_id: "0196edb1-521e-761f-9d62-1ca4cfad5b30",
_score: null,
sort: [ "1747744570133", "\"0196edb1-521e-761f-9d62-1ca4cfad5b30\"" ],
result: "https://urlscan.io/api/v1/result/0196edb1-521e-761f-9d62-1ca4cfad5b30/",
screenshot: "https://urlscan.io/screenshots/0196edb1-521e-761f-9d62-1ca4cfad5b30.png",
},
total: 9,
took: 296,
has_more: false,
}
Keeping input context
Frequently, the purpose of making real-time requests in a pipeline is to enrich
the incoming data with additional context. In these cases, we want to keep the
original event around. This can be done simply by specifying the
response_field
and metadata_field
options as appropriate.
For example, building on the request above, let's assume we had some initial context that we want to keep around:
from { ctx: {severity: "HIGH"}, domain: "tenzir.com", ip: 0.0.0.0 }
http "https://urlscan.io/api/v1/search?q=" + domain, response_field=scan {read_json}
scan.results = scan.results[0]
{
ctx: {
severity: "HIGH",
},
domain: "tenzir.com",
ip: 0.0.0.0,
scan: {
results: {
submitter: { ... },
task: { ... },
stats: { ... },
page: { ... },
_id: "0196edb1-521e-761f-9d62-1ca4cfad5b30",
_score: null,
sort: [ "1747744570133", "\"0196edb1-521e-761f-9d62-1ca4cfad5b30\"" ],
result: "https://urlscan.io/api/v1/result/0196edb1-521e-761f-9d62-1ca4cfad5b30/",
screenshot: "https://urlscan.io/screenshots/0196edb1-521e-761f-9d62-1ca4cfad5b30.png",
},
total: 9,
took: 88,
has_more: false,
},
}
Paginate an API
We can use the sort and has_more fields in the response to fetch more pages from the API.
let $URL = "https://urlscan.io/api/v1/search?q=example.com"
from {}
http $URL, paginate=$URL + "&search_after=" + results.last().sort.first() + "," + results.last().sort.last().slice(begin=1, end=-1) if has_more? {
read_json
}
head 10
Here we construct the next URL for pagination by extracting values from the response. The search_after query parameter expects the two values from the sort key of the response, joined with a comma, which yields a URL like
https://urlscan.io/api/v1/search?q=example.com&search_after=1747796723608,0196f0cd-6fda-761a-81a6-ae1b18914e61.
The if has_more? clause ensures that pagination only continues as long as the has_more field is true.
Additionally, we cap the number of pages at ten with a simple head 10.