Upload a file to the server. This endpoint returns a presigned upload URL to be used for the upload.
The presignedUploadURL is valid for 300 seconds (5 minutes) and can be used multiple times.
ZIP archives are supported. When fileType is application/zip, the server will extract the archive after upload and treat each contained file as an independent document (OCR, classification, and webhook callbacks are emitted per extracted file). All extracted files share the same apiRequestId so you can correlate their per-file callbacks back to the original ZIP upload. Poll GET /v1/uploads/{uploadId} to see per-file status (HTTP 207 is returned when some files in the ZIP succeeded and others failed).
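As a rough illustration of the correlation described above, the following sketch groups incoming webhook payloads by api_request_id. Only the api_request_id field comes from this page. The fileName and status keys, and all sample values, are assumptions made for the example and may not match the real callback payload.

```python
from collections import defaultdict


def group_callbacks(callbacks):
    """Group webhook payloads (dicts) by their api_request_id value."""
    grouped = defaultdict(list)
    for payload in callbacks:
        # api_request_id is documented; every other key here is hypothetical.
        grouped[payload["api_request_id"]].append(payload)
    return dict(grouped)


# Hypothetical payloads: two files extracted from one ZIP, plus one unrelated upload.
sample_callbacks = [
    {"api_request_id": "my-batch-request-123", "fileName": "a.pdf", "status": "done"},
    {"api_request_id": "my-batch-request-123", "fileName": "b.pdf", "status": "failed"},
    {"api_request_id": "other-request", "fileName": "c.pdf", "status": "done"},
]

by_request = group_callbacks(sample_callbacks)
```

Keying on api_request_id rather than on file names keeps the grouping stable even when a ZIP contains files with identical names in different folders.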
The input should contain the following information:
fileName: the name of the file to be uploaded (include the .zip extension when uploading a ZIP archive)
fileType: the MIME type of the file to be uploaded. Use application/zip for ZIP archives.
isSplit: whether the file should be processed as separate pages/sections (optional, default: false)
isSplitExcel: whether to split Excel files by worksheets (optional, default: false)
callbackURL: (optional) the URL that will be called after file processing. Must be a valid HTTPS URL.
ocrModel: the OCR model to be used for file processing (optional). Available models:
Beethoven_ENG_O5.6 - OpenAI v6
Beethoven_ENG_G5.5 - Gemini v5
Beethoven_ENG_GP25 - Gemini Pro 2.5
Beethoven_ENG_GP25.1 - Gemini Pro 2.5 v1
Beethoven_ENG_GP25.2 - Gemini Pro 2.5 PDF
Beethoven_CUS_O5.1 - Custom OpenAI v8
Beethoven_CUS_O5.2 - Custom Gemini v13
Unified (google-document-ai-ocr-gemini-v10) - Unified model
Beethoven_ZH_O5.9 - Chinese OpenAI v9
Beethoven_JP_O5.3 - Japanese OpenAI v3
Beethoven_JP_G5.4 - Japanese Gemini fine-tuned
Beethoven_TH_O5.1 - Thai OpenAI v1
Beethoven_TH_G5.1 - Thai Gemini v1
schemaLocking: whether the schema should be locked after the file is uploaded, must be one of true or false (optional)
directoryId: the directory id where the file should be uploaded (optional)
isEphemeral: whether the file and all related data should be deleted after the file is processed, must be one of true or false (optional, default: false)
pageCount: page count of the file, used for early validation against page limits (optional). Not applicable to ZIP archives.
apiRequestId: optional client-supplied correlation ID that groups files uploaded together into a logical batch. Persisted on every resulting document and forwarded as api_request_id on every subsequent webhook callback, so you can reconcile processing events back to the originating request. For ZIP uploads, every extracted file inherits the same apiRequestId. If omitted, the server auto-generates one (shared across all files extracted from the same ZIP).
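The fields listed above could be assembled on the client roughly as follows. This is a minimal sketch, not the official client: the helper name build_upload_request is hypothetical, and the client-side checks (rejecting non-HTTPS callback URLs and pageCount on ZIP uploads) are an assumption about how one might enforce the documented constraints before calling the API. The server remains the authority on validation.

```python
from urllib.parse import urlparse

# Optional field names taken from the list above.
OPTIONAL_FIELDS = {
    "isSplit", "isSplitExcel", "callbackURL", "ocrModel", "schemaLocking",
    "directoryId", "isEphemeral", "pageCount", "apiRequestId",
}


def build_upload_request(file_name, file_type, **options):
    """Assemble the input fields for a presigned-upload request as a dict."""
    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    callback_url = options.get("callbackURL")
    if callback_url is not None and urlparse(callback_url).scheme != "https":
        raise ValueError("callbackURL must be a valid HTTPS URL")

    if file_type == "application/zip":
        if not file_name.lower().endswith(".zip"):
            raise ValueError("include the .zip extension for ZIP archives")
        if options.get("pageCount") is not None:
            raise ValueError("pageCount is not applicable to ZIP archives")

    payload = {"fileName": file_name, "fileType": file_type}
    # Copy only the optional fields that the caller actually supplied.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


payload = build_upload_request(
    "invoices.zip",
    "application/zip",
    callbackURL="https://example.com/callback",
    apiRequestId="my-batch-request-123",
)
```

Omitted optional fields are left out of the payload entirely, so the server-side defaults described above apply.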
The input properties are summarized below:
| Property | Type | Required | Description |
|---|---|---|---|
| fileName | string | Yes | Original name of the file including extension (e.g., “document.pdf”) |
| fileType | string | Yes | MIME type of the file (e.g., “application/pdf”, “image/jpeg”) |
| isSplit | boolean | No | Whether the file should be processed as separate pages/sections. Defaults to false |
| callbackURL | string | No | HTTPS endpoint to receive processing completion notifications |
| ocrModel | string | No | OCR engine to use for text extraction. Available models vary by plan |
| schemaLocking | boolean | No | Whether to lock the schema after processing. Must be true or false |
| isEphemeral | boolean | No | Whether to automatically delete a file 24 hours after upload. |
API key for authentication
File name
"file.pdf"
File type
"application/pdf"
Is split
false
Is split excel - whether to split Excel files by worksheets
false
Callback URL
"https://example.com/callback"
OCR model
Available options: Beethoven_ENG_O5.6, Beethoven_ENG_G5.5, Beethoven_ENG_GP25, Beethoven_ENG_GP25.1, Beethoven_ENG_GP25.2, Beethoven_CUS_O5.1, Beethoven_CUS_O5.2, Beethoven_CUS_GP25.1, Unified (google-document-ai-ocr-gemini-v10), Beethoven_ZH_O5.9, Beethoven_JP_O5.3, Beethoven_JP_G5.4, Beethoven_TH_O5.1, Beethoven_TH_G5.1
"Beethoven_ENG_O5.6"
Schema locking
false
Directory Id
"649e2d2d2d2d2d2d2d2d2d2d"
Is ephemeral
false
Page count of the PDF file. Used for early validation against page limits.
50
Optional client-supplied correlation ID that ties files uploaded in the same logical batch together.
Behaviour:
The value is persisted on every resulting document and forwarded as api_request_id on all subsequent webhook callbacks for this upload, so you can correlate processing events back to the originating request on your side.
Supply your own value when you already have an idempotency key or job ID on the client side (e.g. from your own queue) that you want to reconcile against inbound webhook events.
"my-batch-request-123"
Get a presigned upload URL for uploading a file. After receiving the result, send a PUT request to the presignedUploadURL with the binary file as the request body. The presignedUploadURL is valid for 300 seconds (5 minutes) and can be used multiple times.
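The PUT step described above might look like the following sketch, which uses only the Python standard library. The helper names are hypothetical, and sending a Content-Type header equal to the fileType submitted earlier is an assumption; check whether the presigned URL actually requires it.

```python
import urllib.request


def build_put_request(presigned_upload_url, file_bytes, file_type):
    """Create a PUT request that carries the raw file bytes as its body."""
    return urllib.request.Request(
        presigned_upload_url,
        data=file_bytes,
        method="PUT",
        # Assumption: the Content-Type should match the fileType sent earlier.
        headers={"Content-Type": file_type},
    )


def upload_file(presigned_upload_url, path, file_type):
    """Read the file at `path`, PUT it to the presigned URL, return the HTTP status."""
    with open(path, "rb") as handle:
        request = build_put_request(presigned_upload_url, handle.read(), file_type)
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.status
```

Because the URL expires after 300 seconds, it is safest to request it immediately before calling upload_file rather than caching it.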