Documentation ¶
Index ¶
- func Bool(b bool) param.Opt[bool]
- func BoolPtr(v bool) *bool
- func DefaultClientOptions() []option.RequestOption
- func File(rdr io.Reader, filename string, contentType string) file
- func Float(f float64) param.Opt[float64]
- func FloatPtr(v float64) *float64
- func Int(i int64) param.Opt[int64]
- func IntPtr(v int64) *int64
- func Opt[T comparable](v T) param.Opt[T]
- func Ptr[T any](v T) *T
- func String(s string) param.Opt[string]
- func StringPtr(v string) *string
- func Time(t time.Time) param.Opt[time.Time]
- func TimePtr(v time.Time) *time.Time
- type Client
- func (r *Client) Delete(ctx context.Context, path string, params any, res any, ...) error
- func (r *Client) Execute(ctx context.Context, method string, path string, params any, res any, ...) error
- func (r *Client) Get(ctx context.Context, path string, params any, res any, ...) error
- func (r *Client) Patch(ctx context.Context, path string, params any, res any, ...) error
- func (r *Client) Post(ctx context.Context, path string, params any, res any, ...) error
- func (r *Client) Put(ctx context.Context, path string, params any, res any, ...) error
- type Error
- type FileExtractTextParams
- type FileExtractTextResponse
- type FileService
- type URLExtractTextParams
- type URLExtractTextParamsProxy
- type URLExtractTextResponse
- type URLService
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func DefaultClientOptions ¶
func DefaultClientOptions() []option.RequestOption
DefaultClientOptions reads the default options from the environment (CRAWLER_DEV_API_KEY, CRAWLER_DEV_BASE_URL). Use it to initialize new clients.
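As a rough illustration of what reading these two variables involves, here is a minimal standard-library sketch. The `envConfig` type and `loadEnvConfig` function are hypothetical names made up for this example and are not part of the package; only the two variable names come from the documentation above.

```go
package main

import (
	"fmt"
	"os"
)

// envConfig is a hypothetical stand-in for the settings carried by the
// default options. It is not a type from this package.
type envConfig struct {
	APIKey  string
	BaseURL string
}

// loadEnvConfig reads the two documented variables through the supplied
// lookup function (os.LookupEnv in real use). Unset variables are left as
// empty strings.
func loadEnvConfig(lookup func(string) (string, bool)) envConfig {
	var cfg envConfig
	if v, ok := lookup("CRAWLER_DEV_API_KEY"); ok {
		cfg.APIKey = v
	}
	if v, ok := lookup("CRAWLER_DEV_BASE_URL"); ok {
		cfg.BaseURL = v
	}
	return cfg
}

func main() {
	cfg := loadEnvConfig(os.LookupEnv)
	// Avoid printing the key itself; only report whether it was found.
	fmt.Printf("api key set: %t, base URL: %q\n", cfg.APIKey != "", cfg.BaseURL)
}
```

Passing the lookup function in as a parameter keeps the sketch easy to exercise without touching the real process environment.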
func Opt ¶
func Opt[T comparable](v T) param.Opt[T]
Types ¶
type Client ¶
type Client struct {
Options []option.RequestOption
Files FileService
URLs URLService
}
Client is a struct with services and top-level methods that help with interacting with the crawler.dev API. You should not instantiate this client directly; use the NewClient method instead.
func NewClient ¶
func NewClient(opts ...option.RequestOption) (r Client)
NewClient generates a new client with the default options read from the environment (CRAWLER_DEV_API_KEY, CRAWLER_DEV_BASE_URL). The options passed in as arguments are applied after these defaults, and all options are passed down to the services and requests that this client makes.
func (*Client) Delete ¶
func (r *Client) Delete(ctx context.Context, path string, params any, res any, opts ...option.RequestOption) error
Delete makes a DELETE request with the given URL, params, and optionally deserializes to a response. See [Execute] documentation on the params and response.
func (*Client) Execute ¶
func (r *Client) Execute(ctx context.Context, method string, path string, params any, res any, opts ...option.RequestOption) error
Execute makes a request with the given context, method, URL, request params, response, and request options. This is useful for hitting undocumented endpoints while retaining the base URL, auth, retries, and other options from the client.
If a byte slice or an io.Reader is supplied to params, it will be used as-is for the request body.
By default, params is serialized into the body using encoding/json. If your type implements a MarshalJSON function, it will be used instead to serialize the request. If a URLQuery method is implemented, the returned [url.Values] will be used as the query string of the URL.
If your params struct uses param.Field, you must provide [MarshalJSON], [URLQuery], and/or [MarshalForm] functions. It is undefined behavior to use a struct that uses param.Field without specifying how it is serialized.
Any "…Params" object defined in this library can be used as the request argument. Note that 'path' arguments will not be forwarded into the url.
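The body rules above (raw bytes and readers pass through, everything else goes through encoding/json) can be sketched with the standard library alone. `bodyFor` and `bodyString` are hypothetical helpers written for this example, not functions from this package, and treating nil params as "no body" is an assumption of the sketch.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// bodyFor is a hypothetical helper that mirrors the documented rule:
// a byte slice or io.Reader is used as-is, anything else is encoded
// with encoding/json (which honors MarshalJSON when it is defined).
func bodyFor(params any) (io.Reader, error) {
	switch p := params.(type) {
	case nil:
		// Assumption of this sketch: nil params means no request body.
		return nil, nil
	case []byte:
		return bytes.NewReader(p), nil
	case io.Reader:
		return p, nil
	default:
		data, err := json.Marshal(p)
		if err != nil {
			return nil, err
		}
		return bytes.NewReader(data), nil
	}
}

// bodyString drains the reader chosen by bodyFor so the result can be
// inspected as text.
func bodyString(params any) (string, error) {
	r, err := bodyFor(params)
	if err != nil || r == nil {
		return "", err
	}
	data, err := io.ReadAll(r)
	return string(data), err
}

func main() {
	inputs := []any{
		[]byte("raw bytes"),
		bytes.NewBufferString("from a reader"),
		map[string]string{"url": "https://example.com"},
	}
	for _, in := range inputs {
		s, err := bodyString(in)
		fmt.Printf("%T -> %q (err: %v)\n", in, s, err)
	}
}
```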
The response body will be deserialized into the res variable, depending on its type:
- A pointer to a *http.Response is populated by the raw response.
- A pointer to a byte slice will be populated with the contents of the response body.
- A pointer to any other type uses this library's default JSON decoding, which respects UnmarshalJSON if it is defined on the type.
- A nil value will not read the response body.
For even greater flexibility, see option.WithResponseInto and option.WithResponseBodyInto.
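To make the dispatch on the type of res concrete, the following sketch applies the same four rules to an *http.Response using only the standard library. `decodeInto` and `fakeResponse` are hypothetical helpers for this example; the library's own decoding may differ in details such as when the body is closed.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// decodeInto is a hypothetical helper that follows the documented rules
// for the res argument. It is a sketch, not the library's implementation.
func decodeInto(resp *http.Response, res any) error {
	switch dst := res.(type) {
	case nil:
		// Nothing was requested, so the body is left unread.
		return nil
	case **http.Response:
		// Hand back the raw response; the caller now owns the body.
		*dst = resp
		return nil
	case *[]byte:
		defer resp.Body.Close()
		data, err := io.ReadAll(resp.Body)
		if err != nil {
			return err
		}
		*dst = data
		return nil
	default:
		// Any other pointer is decoded as JSON; UnmarshalJSON is honored.
		defer resp.Body.Close()
		return json.NewDecoder(resp.Body).Decode(dst)
	}
}

// fakeResponse builds an in-memory response so the sketch needs no network.
func fakeResponse(body string) *http.Response {
	return &http.Response{
		StatusCode: http.StatusOK,
		Body:       io.NopCloser(strings.NewReader(body)),
	}
}

func main() {
	const payload = `{"url":"https://example.com"}`

	var raw []byte
	fmt.Println(decodeInto(fakeResponse(payload), &raw), string(raw))

	var typed struct {
		URL string `json:"url"`
	}
	fmt.Println(decodeInto(fakeResponse(payload), &typed), typed.URL)

	var resp *http.Response
	fmt.Println(decodeInto(fakeResponse(payload), &resp), resp.StatusCode)
}
```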
func (*Client) Get ¶
func (r *Client) Get(ctx context.Context, path string, params any, res any, opts ...option.RequestOption) error
Get makes a GET request with the given URL, params, and optionally deserializes to a response. See [Execute] documentation on the params and response.
func (*Client) Patch ¶
func (r *Client) Patch(ctx context.Context, path string, params any, res any, opts ...option.RequestOption) error
Patch makes a PATCH request with the given URL, params, and optionally deserializes to a response. See [Execute] documentation on the params and response.
type FileExtractTextParams ¶
type FileExtractTextParams struct {
// The file to upload.
File io.Reader `json:"file,omitzero,required" format:"binary"`
// Whether to clean and normalize the extracted text. When enabled (true):
//
// - For HTML content: Removes script, style, and other non-text elements before
// extraction
// - Normalizes whitespace (collapses multiple spaces/tabs, normalizes newlines)
// - Removes empty lines and trims leading/trailing whitespace
// - Normalizes Unicode characters (NFC)
// - For JSON content: Only minimal cleaning to preserve structure
//
// When disabled (false): Returns raw extracted text without any processing.
CleanText param.Opt[bool] `json:"clean_text,omitzero"`
// contains filtered or unexported fields
}
func (FileExtractTextParams) MarshalMultipart ¶
func (r FileExtractTextParams) MarshalMultipart() (data []byte, contentType string, err error)
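MarshalMultipart returns an encoded body together with its content type. The sketch below shows how such a pair can be produced with mime/multipart for a `file` part and an optional `clean_text` field. The helper names `buildForm` and `demoForm` are hypothetical, and the exact part layout the library emits is an assumption based on the struct tags above.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"strconv"
	"strings"
)

// buildForm is a hypothetical helper that writes a multipart/form-data
// body with a "file" part and, when cleanText is non-nil, a "clean_text"
// field. It returns the encoded body and the matching Content-Type value.
func buildForm(file io.Reader, filename string, cleanText *bool) ([]byte, string, error) {
	var buf bytes.Buffer
	w := multipart.NewWriter(&buf)

	part, err := w.CreateFormFile("file", filename)
	if err != nil {
		return nil, "", err
	}
	if _, err := io.Copy(part, file); err != nil {
		return nil, "", err
	}
	if cleanText != nil {
		if err := w.WriteField("clean_text", strconv.FormatBool(*cleanText)); err != nil {
			return nil, "", err
		}
	}
	// Close writes the trailing boundary; without it the body is incomplete.
	if err := w.Close(); err != nil {
		return nil, "", err
	}
	return buf.Bytes(), w.FormDataContentType(), nil
}

// demoForm encodes a small in-memory file with clean_text enabled.
func demoForm() ([]byte, string, error) {
	clean := true
	return buildForm(strings.NewReader("hello world"), "note.txt", &clean)
}

func main() {
	data, contentType, err := demoForm()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(contentType)
	fmt.Printf("%d bytes encoded\n", len(data))
}
```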
type FileExtractTextResponse ¶
type FileExtractTextResponse struct {
ContentType string `json:"contentType"`
ExtractedText string `json:"extractedText"`
Filename string `json:"filename"`
SizeBytes int64 `json:"sizeBytes"`
TextLength int64 `json:"textLength"`
// JSON contains metadata for fields, check presence with [respjson.Field.Valid].
JSON struct {
ContentType respjson.Field
ExtractedText respjson.Field
Filename respjson.Field
SizeBytes respjson.Field
TextLength respjson.Field
ExtraFields map[string]respjson.Field
// contains filtered or unexported fields
} `json:"-"`
}
func (FileExtractTextResponse) RawJSON ¶
func (r FileExtractTextResponse) RawJSON() string
Returns the unmodified JSON received from the API
func (*FileExtractTextResponse) UnmarshalJSON ¶
func (r *FileExtractTextResponse) UnmarshalJSON(data []byte) error
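The JSON metadata struct exists because a plain Go field cannot distinguish "the API sent 0" from "the API sent nothing". The sketch below shows that distinction with encoding/json only. `presentFields` is a hypothetical helper, the sample payload is made up for illustration, and treating a field as present only when its key exists with a non-null value is an assumption about what [respjson.Field.Valid] reports.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// presentFields is a hypothetical helper: it reports which top-level keys
// of a raw JSON object carry a non-null value. Keys that are missing from
// the object are simply absent from the returned map.
func presentFields(raw string) (map[string]bool, error) {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal([]byte(raw), &obj); err != nil {
		return nil, err
	}
	present := make(map[string]bool, len(obj))
	for key, val := range obj {
		present[key] = string(val) != "null"
	}
	return present, nil
}

func main() {
	// Made-up payload: sizeBytes is a real zero, textLength is null,
	// and contentType is missing altogether.
	const raw = `{"filename":"a.pdf","sizeBytes":0,"textLength":null}`

	present, err := presentFields(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, key := range []string{"filename", "sizeBytes", "textLength", "contentType"} {
		fmt.Printf("%-12s present=%t\n", key, present[key])
	}
}
```

With the library's own types, the equivalent check would be made on the JSON field of the response, and RawJSON can be logged when a field is unexpectedly missing.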
type FileService ¶
type FileService struct {
Options []option.RequestOption
}
FileService contains methods and other services that help with interacting with the crawler.dev API.
Note: unlike clients, this service does not read variables from the environment automatically. You should not instantiate this service directly; use the NewFileService method instead.
func NewFileService ¶
func NewFileService(opts ...option.RequestOption) (r FileService)
NewFileService generates a new service that applies the given options to each request. These options are applied after the parent client's options (if there is one), and before any request-specific options.
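The ordering described here (client options, then service options, then request-specific options) means that a later layer overrides an earlier one when both set the same value. A minimal functional-options sketch of that precedence follows. The `settings` and `option` types and the `withBaseURL`, `withMaxRetries`, and `resolve` functions are hypothetical stand-ins, not the library's option package.

```go
package main

import "fmt"

// settings and option are hypothetical stand-ins for the library's
// request configuration and option.RequestOption.
type settings struct {
	BaseURL    string
	MaxRetries int
}

type option func(*settings)

func withBaseURL(u string) option { return func(s *settings) { s.BaseURL = u } }
func withMaxRetries(n int) option { return func(s *settings) { s.MaxRetries = n } }

// resolve applies the three layers in the documented order, so an option
// in a later layer overwrites the same setting from an earlier layer.
func resolve(client, service, request []option) settings {
	var s settings
	for _, layer := range [][]option{client, service, request} {
		for _, apply := range layer {
			apply(&s)
		}
	}
	return s
}

func main() {
	s := resolve(
		[]option{withBaseURL("https://a.example"), withMaxRetries(2)}, // client
		[]option{withMaxRetries(5)},                                   // service
		[]option{withBaseURL("https://b.example")},                    // request
	)
	fmt.Println(s.BaseURL, s.MaxRetries) // prints "https://b.example 5"
}
```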
func (*FileService) ExtractText ¶
func (r *FileService) ExtractText(ctx context.Context, body FileExtractTextParams, opts ...option.RequestOption) (res *FileExtractTextResponse, err error)
Upload a file and extract text content from it. Supports PDF, DOC, DOCX, TXT and other text-extractable document formats.
type URLExtractTextParams ¶
type URLExtractTextParams struct {
// The URL to extract text from.
URL string `json:"url,required"`
// Maximum cache time in milliseconds for the webpage. Must be between 0 (no
// caching) and 259200000 (3 days). Defaults to 172800000 (2 days) if not
// specified.
CacheAge param.Opt[int64] `json:"cache_age,omitzero"`
// Whether to clean extracted text
CleanText param.Opt[bool] `json:"clean_text,omitzero"`
// Maximum number of redirects to follow when fetching the URL. Must be between 0
// (no redirects) and 20. Defaults to 5 if not specified.
MaxRedirects param.Opt[int64] `json:"max_redirects,omitzero"`
// Maximum content length in bytes for the URL response. Must be between 1024 (1KB)
// and 52428800 (50MB). Defaults to 10485760 (10MB) if not specified.
MaxSize param.Opt[int64] `json:"max_size,omitzero"`
// Maximum time in milliseconds before the crawler gives up on loading a URL. Must
// be between 1000 (1 second) and 30000 (30 seconds). Defaults to 10000 (10
// seconds) if not specified.
MaxTimeout param.Opt[int64] `json:"max_timeout,omitzero"`
// When enabled, we use a proxy for the request. If set to true, the 'proxy'
// option is ignored even if it is set. Defaults to false if not specified. Note:
// Enabling stealth_mode consumes an additional credit/quota point (2 credits total
// instead of 1) for this request.
StealthMode param.Opt[bool] `json:"stealth_mode,omitzero"`
// Custom HTTP headers to send with the request (case-insensitive)
Headers map[string]string `json:"headers,omitzero"`
// Proxy configuration for the request
Proxy URLExtractTextParamsProxy `json:"proxy,omitzero"`
// contains filtered or unexported fields
}
func (URLExtractTextParams) MarshalJSON ¶
func (r URLExtractTextParams) MarshalJSON() (data []byte, err error)
func (*URLExtractTextParams) UnmarshalJSON ¶
func (r *URLExtractTextParams) UnmarshalJSON(data []byte) error
type URLExtractTextParamsProxy ¶ added in v0.1.0
type URLExtractTextParamsProxy struct {
// Proxy password for authentication
Password param.Opt[string] `json:"password,omitzero"`
// Proxy server URL (e.g., http://proxy.example.com:8080 or
// socks5://proxy.example.com:1080)
Server param.Opt[string] `json:"server,omitzero"`
// Proxy username for authentication
Username param.Opt[string] `json:"username,omitzero"`
// contains filtered or unexported fields
}
Proxy configuration for the request
func (URLExtractTextParamsProxy) MarshalJSON ¶ added in v0.1.0
func (r URLExtractTextParamsProxy) MarshalJSON() (data []byte, err error)
func (*URLExtractTextParamsProxy) UnmarshalJSON ¶ added in v0.1.0
func (r *URLExtractTextParamsProxy) UnmarshalJSON(data []byte) error
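For a sense of the request body these params produce, here is a standard-library approximation. `urlBody`, `proxyBody`, `ptr`, and `encode` are hypothetical names, and pointer fields with `omitempty` are used as a stand-in for the library's `param.Opt` fields with `omitzero`. The JSON keys come from the struct tags above; the actual output of the library's MarshalJSON may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// proxyBody and urlBody are hypothetical stand-ins for the library's
// params types. A nil pointer is omitted from the JSON entirely, while a
// non-nil pointer is sent even when it points at a zero value.
type proxyBody struct {
	Server   *string `json:"server,omitempty"`
	Username *string `json:"username,omitempty"`
	Password *string `json:"password,omitempty"`
}

type urlBody struct {
	URL         string     `json:"url"`
	StealthMode *bool      `json:"stealth_mode,omitempty"`
	Proxy       *proxyBody `json:"proxy,omitempty"`
}

// ptr returns a pointer to v, which is convenient for optional literals.
func ptr[T any](v T) *T { return &v }

// encode marshals the body and returns it as a string.
func encode(b urlBody) (string, error) {
	data, err := json.Marshal(b)
	return string(data), err
}

func main() {
	minimal, _ := encode(urlBody{URL: "https://example.com"})
	fmt.Println(minimal) // prints {"url":"https://example.com"}

	withProxy, _ := encode(urlBody{
		URL:         "https://example.com",
		StealthMode: ptr(false),
		Proxy:       &proxyBody{Server: ptr("http://proxy.example.com:8080")},
	})
	fmt.Println(withProxy)
}
```

Note how an explicit `false` for stealth_mode is still sent, whereas an unset value is left out so that the API can apply its own default.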
type URLExtractTextResponse ¶
type URLExtractTextResponse struct {
ContentType string `json:"contentType"`
ExtractedText string `json:"extractedText"`
FinalURL string `json:"finalUrl"`
SizeBytes int64 `json:"sizeBytes"`
StatusCode int64 `json:"statusCode"`
TextLength int64 `json:"textLength"`
URL string `json:"url"`
// JSON contains metadata for fields, check presence with [respjson.Field.Valid].
JSON struct {
ContentType respjson.Field
ExtractedText respjson.Field
FinalURL respjson.Field
SizeBytes respjson.Field
StatusCode respjson.Field
TextLength respjson.Field
URL respjson.Field
ExtraFields map[string]respjson.Field
// contains filtered or unexported fields
} `json:"-"`
}
func (URLExtractTextResponse) RawJSON ¶
func (r URLExtractTextResponse) RawJSON() string
Returns the unmodified JSON received from the API
func (*URLExtractTextResponse) UnmarshalJSON ¶
func (r *URLExtractTextResponse) UnmarshalJSON(data []byte) error
type URLService ¶
type URLService struct {
Options []option.RequestOption
}
URLService contains methods and other services that help with interacting with the crawler.dev API.
Note: unlike clients, this service does not read variables from the environment automatically. You should not instantiate this service directly; use the NewURLService method instead.
func NewURLService ¶
func NewURLService(opts ...option.RequestOption) (r URLService)
NewURLService generates a new service that applies the given options to each request. These options are applied after the parent client's options (if there is one), and before any request-specific options.
func (*URLService) ExtractText ¶
func (r *URLService) ExtractText(ctx context.Context, body URLExtractTextParams, opts ...option.RequestOption) (res *URLExtractTextResponse, err error)
Extract text content from a webpage or document accessible via URL. Supports HTML, PDF, and other web-accessible content types.
Directories ¶
| Path | Synopsis |
|---|---|
| encoding/json | Package json implements encoding and decoding of JSON as defined in RFC 7159. |
| encoding/json/shims | This package provides shims over Go 1.2{2,3} APIs which are missing from Go 1.22, and used by the Go 1.24 encoding/json package. |
| packages | |
| shared | |