Upload a large file using multipart upload

Initiate a multipart upload for large files (typically >100MB). This endpoint returns a presigned URL for each part, which you use to upload file chunks directly to storage.

Each presigned URL is valid for 900 seconds (15 minutes) and can be used multiple times.
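
Because a presigned URL stays valid for 900 seconds and can be reused, a failed PUT can be retried against the same URL until the window closes. The Python sketch below illustrates that idea. The helper name, the 30-second safety margin, the attempt count, and the backoff are illustrative assumptions, not part of the API.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

URL_TTL_SECONDS = 900  # validity window stated in the documentation


def retry_within_ttl(
    action: Callable[[], T],
    issued_at: float,
    max_attempts: int = 3,          # assumption: illustrative default
    safety_margin: float = 30.0,    # assumption: stop a little before expiry
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `action` against the same presigned URL until it succeeds,
    attempts run out, or the URL is about to expire."""
    deadline = issued_at + URL_TTL_SECONDS - safety_margin
    last_error = None
    for attempt in range(max_attempts):
        if clock() >= deadline:
            raise TimeoutError("presigned URL validity window has passed")
        try:
            return action()
        except OSError as exc:  # urllib.error.URLError is a subclass of OSError
            last_error = exc
            if attempt + 1 < max_attempts:
                sleep(min(2 ** attempt, 10))  # simple capped backoff
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error
```

Record `issued_at = time.monotonic()` when the initiate call returns, then wrap each part upload in `retry_within_ttl`.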

The workflow is:

1. Initiate upload: call this endpoint to get presigned URLs for each part.
2. Upload parts: upload each part to its respective presigned URL using PUT requests.
3. Complete upload: call the complete multipart upload endpoint with all part ETags.
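
The upload step can be sketched in Python as follows. It assumes each entry of `presignedUrls` carries `partNumber`, `presignedUrl`, `startByte`, and `size` as in the example response, and that the storage service returns the part's ETag in the `ETag` response header (standard S3 behaviour). The function names and the shape of the returned list are hypothetical; the request body of the complete endpoint is documented on its own page and is not assumed here.

```python
import urllib.request
from typing import Callable


def put_part(url: str, data: bytes) -> str:
    """PUT one chunk to a presigned URL and return the ETag response header."""
    request = urllib.request.Request(url, data=data, method="PUT")
    with urllib.request.urlopen(request) as response:
        return response.headers["ETag"]


def upload_parts(
    path: str,
    parts: list,
    put: Callable[[str, bytes], str] = put_part,
) -> list:
    """Upload every part listed in `presignedUrls` and collect the ETags."""
    results = []
    with open(path, "rb") as source:
        for part in sorted(parts, key=lambda p: p["partNumber"]):
            # Read exactly the byte range assigned to this part.
            source.seek(part["startByte"])
            chunk = source.read(part["size"])
            if len(chunk) != part["size"]:
                raise ValueError(
                    f"part {part['partNumber']}: file is shorter than expected"
                )
            etag = put(part["presignedUrl"], chunk)
            # Hypothetical result shape; adapt it to the complete endpoint.
            results.append({"partNumber": part["partNumber"], "etag": etag})
    return results
```

Passing `put` as a parameter keeps the file-slicing logic testable without network access and lets you swap in a retrying uploader.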

ZIP archives are also supported through this endpoint; large ZIPs are a common use case for multipart upload. When fileType is application/zip, the server extracts the archive after the upload completes and treats each contained file as an independent document: OCR, classification, and webhook callbacks are emitted per extracted file. All extracted files share the same apiRequestId, so you can correlate their per-file callbacks back to the original ZIP upload. Poll GET /v1/uploads/{uploadId} to see per-file status; HTTP 207 is returned when some files in the ZIP succeeded and others failed.
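
A status check for a ZIP upload might look like the Python sketch below. It assumes the status endpoint lives under the same base URL as the curl example and accepts the same `x-api-key` header; neither is stated here. The response body format is not documented in this section, so the sketch returns it as raw bytes and only interprets the 207 status code.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumption: same base URL as the initiate endpoint in the curl example.
BASE_URL = "https://api.orion.file.ai/prod"


def status_url(upload_id: str) -> str:
    """Build the GET /v1/uploads/{uploadId} URL."""
    return f"{BASE_URL}/v1/uploads/{urllib.parse.quote(upload_id, safe='')}"


def fetch_status(upload_id: str, api_key: str) -> tuple:
    """Return (status_code, raw_body) for one poll of the upload status."""
    request = urllib.request.Request(
        status_url(upload_id),
        headers={"x-api-key": api_key},  # assumption: same auth header
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        # Non-2xx responses still carry a status code and body worth inspecting.
        return error.code, error.read()


def is_partial_failure(status_code: int) -> bool:
    """HTTP 207: some files in the ZIP succeeded and others failed."""
    return status_code == 207
```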

The request body accepts the following fields:

fileName: the name of the file to be uploaded (include the .zip extension when uploading a ZIP archive)
fileType: the MIME type of the file (use application/zip for ZIP archives)
fileSize: the size of the file in MB
partSizeLimit: the size limit for each part in MB (optional; calculated if omitted)
isSplit: whether the file should be split after upload (optional, default: false)
isSplitExcel: whether to split Excel files by worksheets (optional, default: false)
callbackURL: the URL that will be called after processing (optional)
ocrModel: the OCR model to use (optional)
schemaLocking: whether the schema should be locked (optional)
directoryId: the ID of the directory where the file should be uploaded (optional)
isEphemeral: whether the file and all related data should be deleted after the file is processed (optional, boolean, default: false)
pageCount: the page count of the file, used for early validation against page limits (optional; not applicable to ZIP archives)
apiRequestId: a client-supplied correlation ID that groups files uploaded together into a logical batch (optional). It is persisted on every resulting document and forwarded as api_request_id on every subsequent webhook callback, so you can reconcile processing events back to the originating request. For ZIP uploads, every extracted file inherits the same apiRequestId. If omitted, the server auto-generates one (shared across all files extracted from the same ZIP).
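
To show how these fields fit together, here is a small Python sketch that builds the request body from a file size in bytes. Two things are assumptions rather than documented behaviour: that "MB" means 1024 × 1024 bytes (suggested by the example `partSize` of 6291456, which is exactly 6 × 1,048,576), and that it is preferable to reject `pageCount` for ZIP uploads on the client side. `build_initiate_payload` is a hypothetical helper name.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes

OPTIONAL_FIELDS = {
    "partSizeLimit", "isSplit", "isSplitExcel", "callbackURL", "ocrModel",
    "schemaLocking", "directoryId", "isEphemeral", "pageCount", "apiRequestId",
}


def build_initiate_payload(file_name: str, file_type: str, size_bytes: int, **optional) -> dict:
    """Assemble the JSON body for the initiate call, omitting unset options."""
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    if file_type == "application/zip":
        if not file_name.lower().endswith(".zip"):
            raise ValueError("ZIP uploads should use a .zip file name")
        # Client-side choice: the docs only say pageCount is not applicable.
        if optional.get("pageCount") is not None:
            raise ValueError("pageCount is not applicable to ZIP archives")

    payload = {
        "fileName": file_name,
        "fileType": file_type,
        "fileSize": size_bytes / BYTES_PER_MB,  # the API expects MB
    }
    # Leave out options that were not provided so server defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For example, `build_initiate_payload("large-file.pdf", "application/pdf", os.path.getsize(path), partSizeLimit=10)` (after `import os`) produces a body equivalent to the one in the curl example, minus the options that were left unset.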

POST /prod/v1/files/upload/multipart

curl --request POST \
  --url https://api.orion.file.ai/prod/v1/files/upload/multipart \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '
{
  "fileName": "large-file.pdf",
  "fileType": "application/pdf",
  "fileSize": 150.5,
  "partSizeLimit": 10,
  "isSplit": false,
  "isSplitExcel": false,
  "callbackURL": "https://example.com/callback",
  "ocrModel": "Beethoven_ENG_O5.6",
  "schemaLocking": false,
  "directoryId": "649e2d2d2d2d2d2d2d2d2d2d",
  "isEphemeral": false,
  "pageCount": 50,
  "apiRequestId": "my-batch-request-123"
}
'

{
  "id": "upload_aws_xyz789",
  "key": "org_123/workspace_456/device/file_abc/document.pdf",
  "s3Path": "s3://bucket/org_123/workspace_456/device/file_abc/document.pdf",
  "partSize": 6291456,
  "totalParts": 2,
  "totalSize": 9123430,
  "presignedUrls": [
    {
      "partNumber": 1,
      "presignedUrl": "https://s3.amazonaws.com/...?signature=...",
      "startByte": 0,
      "endByte": 6291455,
      "size": 6291456
    },
    {
      "partNumber": 2,
      "presignedUrl": "https://s3.amazonaws.com/...?signature=...",
      "startByte": 6291456,
      "endByte": 9123429,
      "size": 2831974
    }
  ],
  "uploadId": "file_abc123xyz",
  "callbackURL": "https://example.com/callback",
  "ocrModel": "Beethoven_ENG_O5.6",
  "schemaLocking": true,
  "isSplit": false,
  "isSplitExcel": false,
  "directoryId": "649e2d2d2d2d2d2d2d2d2d2d",
  "isEphemeral": false
}
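
The byte ranges in the example response above fit together in a regular way: parts are numbered from 1, each `size` equals `endByte - startByte + 1`, and the last `endByte + 1` equals `totalSize`. A client may want to check this before uploading. The Python sketch below assumes these relationships hold for every response, which is inferred from the single example rather than stated by the API. `validate_part_plan` is a hypothetical helper name.

```python
def validate_part_plan(response: dict) -> None:
    """Raise ValueError if the presigned parts do not tile the file exactly."""
    parts = sorted(response["presignedUrls"], key=lambda p: p["partNumber"])
    if len(parts) != response["totalParts"]:
        raise ValueError("totalParts does not match the number of presigned URLs")

    expected_start = 0
    for index, part in enumerate(parts, start=1):
        if part["partNumber"] != index:
            raise ValueError(f"expected part {index}, got {part['partNumber']}")
        if part["startByte"] != expected_start:
            raise ValueError(f"part {index} does not start where the previous one ended")
        if part["endByte"] - part["startByte"] + 1 != part["size"]:
            raise ValueError(f"part {index} size does not match its byte range")
        expected_start = part["endByte"] + 1

    if expected_start != response["totalSize"]:
        raise ValueError("parts do not cover totalSize bytes")
```

With the example values, the two parts cover bytes 0 to 6291455 and 6291456 to 9123429, and 6291456 + 2831974 = 9123430 matches `totalSize`, so the check passes.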

Authorizations

x-api-key (string, header, required)
API key for authentication
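
As an illustration of the header in use, the Python sketch below builds the same request as the curl example using only the standard library. The URL and header names come from that example; the function names are hypothetical, and the sketch assumes the response body is JSON, as the example response suggests.

```python
import json
import urllib.request

INITIATE_URL = "https://api.orion.file.ai/prod/v1/files/upload/multipart"


def build_initiate_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Create the POST request with the JSON body and the x-api-key header."""
    return urllib.request.Request(
        INITIATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def initiate_upload(api_key: str, payload: dict) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(build_initiate_request(api_key, payload)) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect or log the request (without the key) before any network traffic happens.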

Body

application/json

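fileName (string, required)
Name of the file to upload. Include the .zip extension when uploading a ZIP archive.
Example: "large-file.pdf"

fileType (string, required)
MIME type of the file. Use application/zip for ZIP archives.
Example: "application/pdf"

fileSize (number, required)
File size in MB.
Example: 150.5

partSizeLimit (number, optional)
Part size limit in MB. If omitted, a default is calculated.
Example: 10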

isSplit (boolean, optional, default: false)
Whether the file should be split after upload.
Example: false

isSplitExcel (boolean, optional, default: false)
Whether to split Excel files by worksheets.
Example: false

callbackURL (string, optional)
URL that will be called after processing.
Example: "https://example.com/callback"

ocrModel (enum<string>, optional)
OCR model to use.
Available options:
Beethoven_ENG_O5.6
Beethoven_ENG_G5.5
Beethoven_ENG_GP25
Beethoven_ENG_GP25.1
Beethoven_ENG_GP25.2
Beethoven_CUS_O5.1
Beethoven_CUS_O5.2
Beethoven_CUS_GP25.1
Unified (google-document-ai-ocr-gemini-v10)
Beethoven_ZH_O5.9
Beethoven_JP_O5.3
Beethoven_JP_G5.4
Beethoven_TH_O5.1
Beethoven_TH_G5.1
Example: "Beethoven_ENG_O5.6"

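schemaLocking (boolean, optional)
Whether the schema should be locked.
Example: false

directoryId (string, optional)
ID of the directory where the file should be uploaded.
Example: "649e2d2d2d2d2d2d2d2d2d2d"

isEphemeral (boolean, optional, default: false)
Whether the file and all related data should be deleted after the file is processed.
Example: false

pageCount (number, optional)
Page count of the file. Used for early validation against page limits. Not applicable to ZIP archives.
Example: 50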

apiRequestId (string, optional)
Client-supplied correlation ID that ties together files uploaded in the same logical batch.

Behaviour:

The value is persisted on the resulting document and forwarded as api_request_id on all subsequent webhook callbacks for this upload, so you can correlate processing events back to the originating request.
When the uploaded file is a ZIP archive, every file extracted from it inherits the same value, letting you link all per-file callbacks back to the one ZIP upload.
If omitted, the server auto-generates an opaque ID. For ZIP uploads, the generated ID is shared across all extracted files.

Supply your own value when you already have an idempotency key or job ID on the client side (e.g. from your own queue) that you want to reconcile against inbound webhook events.

Example: "my-batch-request-123"

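On the receiving side, the correlation described for apiRequestId amounts to grouping callbacks by their api_request_id value. The Python sketch below shows that grouping. Only the `api_request_id` field name comes from this page; the `file` key in the usage note and the function name are hypothetical, since the webhook payload format is not described here.

```python
from collections import defaultdict


def group_by_request_id(events: list) -> dict:
    """Group webhook payloads (already decoded to dicts) by api_request_id.

    Events without the field are collected under the key None.
    """
    grouped = defaultdict(list)
    for event in events:
        grouped[event.get("api_request_id")].append(event)
    return dict(grouped)
```

For a ZIP upload sent with `"apiRequestId": "my-batch-request-123"`, every per-file callback would land in `grouped["my-batch-request-123"]`, for instance events whose hypothetical `file` values are `a.pdf` and `b.pdf`.
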
Response

Multipart upload initiated successfully. Use the presigned URLs to upload each part.
