
# Upload a large file using multipart upload

> Initiate a multipart upload for large files (typically larger than 100 MB). This endpoint returns a presigned URL for each part, which you use to upload file chunks directly to storage.

Each presigned URL is valid for 900 seconds (15 minutes) and can be used multiple times.

The workflow is:
1. Call this endpoint to get presigned URLs for each part
2. Upload each part to its respective presigned URL using PUT requests
3. Call the complete multipart upload endpoint with all part ETags

**ZIP archives are supported** through this endpoint; large ZIPs are a common use case for multipart upload. When `fileType` is `application/zip`, the server extracts the archive after the upload completes and treats each contained file as an independent document: OCR, classification, and webhook callbacks are emitted per extracted file. All extracted files share the same `apiRequestId`, so you can correlate their per-file callbacks back to the original ZIP upload. Poll `GET /v1/uploads/{uploadId}` to see per-file status; it returns HTTP 207 when some files in the ZIP succeeded and others failed.
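To make the correlation concrete, here is a minimal sketch of grouping incoming webhook payloads by `api_request_id`, so that all callbacks for one ZIP upload end up together. Only the `api_request_id` field name comes from this page; the other payload fields (`file_name`, `status`) and the helper name are hypothetical placeholders, so adapt them to the callbacks you actually receive.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_callbacks_by_request(callbacks: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group webhook payloads by their `api_request_id` value."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for payload in callbacks:
        # Payloads without the field are collected under a placeholder key.
        grouped[payload.get("api_request_id", "<missing>")].append(payload)
    return dict(grouped)


# Hypothetical payloads: only `api_request_id` is taken from the documentation.
callbacks = [
    {"api_request_id": "my-batch-request-123", "file_name": "a.pdf", "status": "processed"},
    {"api_request_id": "my-batch-request-123", "file_name": "b.pdf", "status": "failed"},
    {"api_request_id": "other-request", "file_name": "c.pdf", "status": "processed"},
]
grouped = group_callbacks_by_request(callbacks)
```

Supplying your own `apiRequestId` at upload time means the grouping key is known before the first callback arrives.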

The request body accepts the following fields:
- `fileName` (required): the name of the file to be uploaded. Include the `.zip` extension when uploading a ZIP archive.
- `fileType` (required): the MIME type of the file. Use `application/zip` for ZIP archives.
- `fileSize` (required): the size of the file in MB.
- `partSizeLimit` (optional): the size limit for each part in MB. If omitted, the server calculates a default.
- `isSplit` (optional, default: `false`): whether the file should be split after upload.
- `isSplitExcel` (optional, default: `false`): whether to split Excel files by worksheets.
- `callbackURL` (optional): the URL that will be called after processing.
- `ocrModel` (optional): the OCR model to use.
- `schemaLocking` (optional): whether the schema should be locked.
- `directoryId` (optional): the ID of the directory where the file should be uploaded.
- `isEphemeral` (optional, default: `false`): whether the file and all related data should be deleted after the file is processed. Must be `true` or `false`.
- `pageCount` (optional): the page count of the file, used for early validation against page limits. Not applicable to ZIP archives.
- `apiRequestId` (optional): a client-supplied correlation ID that groups files uploaded together into a logical batch. It is persisted on every resulting document and forwarded as `api_request_id` on every subsequent webhook callback, so you can reconcile processing events back to the originating request. For ZIP uploads, every extracted file inherits the same `apiRequestId`. If omitted, the server auto-generates one (shared across all files extracted from the same ZIP).
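As a rough illustration of how these fields fit together, the sketch below assembles the JSON body and a `urllib` request object without sending it. The server URL, path, and `x-api-key` header come from the OpenAPI definition on this page. The helper name `build_initiate_request` and the bytes-to-MB conversion factor are assumptions: the documentation does not state whether MB means 10^6 or 2^20 bytes.

```python
import json
import urllib.request
from typing import Any, Optional

API_URL = "https://api.orion.file.ai/prod/v1/files/upload/multipart"
# Assumption: 1 MB = 2**20 bytes. Confirm the unit before relying on this.
BYTES_PER_MB = 1024 * 1024


def build_initiate_request(
    api_key: str,
    file_name: str,
    file_type: str,
    size_bytes: int,
    api_request_id: Optional[str] = None,
    **optional_fields: Any,
) -> urllib.request.Request:
    """Build (but do not send) the POST request that initiates a multipart upload."""
    body = {
        "fileName": file_name,
        "fileType": file_type,
        "fileSize": size_bytes / BYTES_PER_MB,  # the API expects MB, not bytes
    }
    if api_request_id is not None:
        body["apiRequestId"] = api_request_id
    # Optional fields such as isSplit or callbackURL are passed through as-is.
    body.update(optional_fields)
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


request = build_initiate_request(
    "YOUR_API_KEY", "large-file.pdf", "application/pdf", 157810688, isSplit=False
)
# Sending is left to the caller, for example: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the body easy to inspect or log before any network call is made.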

## How It Works

<Steps>
  <Step title="1. Initiate Upload">
    Call this endpoint to get presigned URLs for each part
  </Step>

  <Step title="2. Upload Parts">
    Upload each part to its respective presigned URL using PUT requests
  </Step>

  <Step title="3. Complete Upload">
    Call the complete multipart upload endpoint with all part ETags
  </Step>
</Steps>
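Step 2 can be sketched roughly as follows, using the `presignedUrls`, `partNumber`, `startByte`, and `endByte` fields shown in the response example of the OpenAPI definition on this page. The sketch assumes the byte offsets are inclusive and that the storage backend returns an `ETag` response header for each part, as Amazon S3 does for part uploads. The shape of the collected ETag list is hypothetical, because the complete multipart upload endpoint is documented separately.

```python
import io
import urllib.request
from typing import Any, BinaryIO, Callable, Dict, List


def put_part(url: str, chunk: bytes) -> str:
    """PUT one chunk to its presigned URL and return the ETag response header."""
    request = urllib.request.Request(url, data=chunk, method="PUT")
    with urllib.request.urlopen(request) as response:
        return response.headers["ETag"]


def upload_parts(
    source: BinaryIO,
    presigned_urls: List[Dict[str, Any]],
    put: Callable[[str, bytes], str] = put_part,
) -> List[Dict[str, Any]]:
    """Upload each byte range in part order and collect the resulting ETags."""
    etags = []
    for part in sorted(presigned_urls, key=lambda p: p["partNumber"]):
        source.seek(part["startByte"])
        # Assumption: endByte is inclusive, so the length is end - start + 1.
        chunk = source.read(part["endByte"] - part["startByte"] + 1)
        etags.append({"partNumber": part["partNumber"], "etag": put(part["presignedUrl"], chunk)})
    return etags


# Dry run with an in-memory file and a stand-in for put_part (no network access).
sent_sizes = []


def fake_put(url: str, chunk: bytes) -> str:
    sent_sizes.append(len(chunk))
    return f"etag-{len(sent_sizes)}"


parts = [
    {"partNumber": 1, "presignedUrl": "https://example.com/part1", "startByte": 0, "endByte": 5},
    {"partNumber": 2, "presignedUrl": "https://example.com/part2", "startByte": 6, "endByte": 9},
]
etags = upload_parts(io.BytesIO(b"0123456789"), parts, put=fake_put)
```

Because each presigned URL expires after 900 seconds, a real client would want to start uploading soon after initiating and request fresh URLs if a part cannot be sent in time.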


## OpenAPI

````yaml post /prod/v1/files/upload/multipart
openapi: 3.0.0
info:
  title: Public API
  description: >-

    ### Welcome to fileAI’s Public API Documentation.

    This API allows users to check the health of the system, upload and manage
    files, and manage AI Schemas.

    Should you have any questions, please reach out to fileAI via the “Contact a
    Developer” link below.



    [Contact a Developer](mailto:support@file.ai)



    ### Prerequisites


    Before using our API, please ensure you complete the following
    prerequisites:

    - You must have a fileAI account. Sign up or log in
    [here](https://orion.file.ai/en/sign-up)

    - You must have an API Key. After creating your fileAI account, you can
    generate your API Key. Refer to the Authentication section below for more
    details.



    ### Authentication

    All API requests require an API key for authentication.

    - To obtain your API key, please log in to your fileAI account and navigate
    to Project Settings in your dashboard

    - Keep your API key secure and do not share it publicly.


    ![Authentication](https://fileai-static-assets.s3.us-west-2.amazonaws.com/public-api-service/authentication.png)


    ### How to Use Your API Key

    Once you have your API key:

    - Click the Authorize button on the top-right of this page

    - Enter your API Key under Value

    - Click Authorize to start making authenticated requests directly from the
    documentation


    ![How to Use Your API
    Key](https://fileai-static-assets.s3.us-west-2.amazonaws.com/public-api-service/how-to-use-api-keys.png)
        
  version: '1.0'
  contact: {}
servers:
  - url: https://api.orion.file.ai
security: []
tags:
  - name: Public API V1
paths:
  /prod/v1/files/upload/multipart:
    post:
      tags:
        - Public API V1
      summary: Upload a large file using multipart upload
      description: >-
        Initiate a multipart upload for large files (typically >100MB). This
        will return presigned URLs for each part.


        Each presigned URL is valid for 900 seconds (15 minutes) and can be used
        multiple times.


        The workflow is:

        1. Call this endpoint to get presigned URLs for each part

        2. Upload each part to its respective presigned URL using PUT requests

        3. Call the complete multipart upload endpoint with all part ETags


        **ZIP archives are supported** through this endpoint; large ZIPs are a
        common use case for multipart upload. When `fileType` is
        `application/zip`, the server extracts the archive after the upload
        completes and treats each contained file as an independent document:
        OCR, classification, and webhook callbacks are emitted per extracted
        file. All extracted files share the same `apiRequestId`, so you can
        correlate their per-file callbacks back to the original ZIP upload.
        Poll `GET /v1/uploads/{uploadId}` to see per-file status; it returns
        HTTP 207 when some files in the ZIP succeeded and others failed.


        The input should contain:

        - `fileName`: the name of the file to be uploaded (include the `.zip`
        extension when uploading a ZIP archive)

        - `fileType`: the MIME type of the file. Use `application/zip` for ZIP
        archives.

        - `fileSize`: the size of the file in MB

        - `partSizeLimit`: (optional) the size limit for each part in MB

        - `isSplit`: whether the file should be split after upload (optional,
        default: false)

        - `isSplitExcel`: whether to split Excel files by worksheets (optional,
        default: false)

        - `callbackURL`: the URL that will be called after processing (optional)

        - `ocrModel`: the OCR model to use (optional)

        - `schemaLocking`: whether the schema should be locked (optional)

        - `directoryId`: the directory id where the file should be uploaded
        (optional)

        - `isEphemeral`: whether the file and all related data should be deleted
        after the file is processed; must be `true` or `false` (optional,
        default: false)

        - `pageCount`: page count of the file, used for early validation against
        page limits (optional). Not applicable to ZIP archives.

        - `apiRequestId`: optional client-supplied correlation ID that groups
        files uploaded together into a logical batch. Persisted on every
        resulting document and forwarded as `api_request_id` on every subsequent
        webhook callback, so you can reconcile processing events back to the
        originating request. For ZIP uploads, every extracted file inherits the
        same `apiRequestId`. If omitted, the server auto-generates one (shared
        across all files extracted from the same ZIP).
      operationId: PublicAPIController_uploadMultipartFileRequest
      parameters: []
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/UploadMultipartFileForPAInput'
      responses:
        '201':
          description: >-
            Multipart upload initiated successfully. Use the presigned URLs to
            upload each part.
          content:
            application/json:
              example:
                id: upload_aws_xyz789
                key: org_123/workspace_456/device/file_abc/document.pdf
                s3Path: s3://bucket/org_123/workspace_456/device/file_abc/document.pdf
                partSize: 6291456
                totalParts: 2
                totalSize: 9123430
                presignedUrls:
                  - partNumber: 1
                    presignedUrl: https://s3.amazonaws.com/...?signature=...
                    startByte: 0
                    endByte: 6291455
                    size: 6291456
                  - partNumber: 2
                    presignedUrl: https://s3.amazonaws.com/...?signature=...
                    startByte: 6291456
                    endByte: 9123429
                    size: 2831974
                uploadId: file_abc123xyz
                callbackURL: https://example.com/callback
                ocrModel: Beethoven_ENG_O5.6
                schemaLocking: true
                isSplit: false
                isSplitExcel: false
                directoryId: 649e2d2d2d2d2d2d2d2d2d2d
                isEphemeral: false
        '401':
          description: Invalid API key | API key is missing
          content:
            application/json:
              example:
                message: Invalid API key
                error: Unauthorized
                statusCode: 401
              schema:
                type: object
                properties:
                  message:
                    type: string
                  error:
                    type: string
                  statusCode:
                    type: number
        '403':
          description: Access denied. You are in readonly mode.
          content:
            application/json:
              example:
                message: Access denied. You are in readonly mode.
                error: Forbidden
                statusCode: 403
              schema:
                type: object
                properties:
                  message:
                    type: string
                  error:
                    type: string
                  statusCode:
                    type: number
        '422':
          description: Invalid input parameters
          content:
            application/json:
              examples:
                invalidFileName:
                  summary: Invalid fileName
                  value:
                    message: Invalid fileName.
                    error: Unprocessable Entity
                    statusCode: 422
                invalidFileType:
                  summary: Invalid fileType
                  value:
                    message: Invalid fileType.
                    error: Unprocessable Entity
                    statusCode: 422
                invalidFileSize:
                  summary: Invalid fileSize
                  value:
                    message: Invalid fileSize. Must be greater than 0.
                    error: Unprocessable Entity
                    statusCode: 422
                invalidIsSplit:
                  summary: Invalid isSplit
                  value:
                    message: Invalid isSplit. It must be one of true or false.
                    error: Unprocessable Entity
                    statusCode: 422
                invalidIsSplitExcel:
                  summary: Invalid isSplitExcel
                  value:
                    message: Invalid isSplitExcel. It must be one of true or false.
                    error: Unprocessable Entity
                    statusCode: 422
                invalidSchemaLocking:
                  summary: Invalid schemaLocking
                  value:
                    message: Invalid schemaLocking. It must be one of true or false.
                    error: Unprocessable Entity
                    statusCode: 422
                invalidDirectoryId:
                  summary: Invalid directoryId
                  value:
                    message: Invalid directoryId.
                    error: Unprocessable Entity
                    statusCode: 422
                invalidOcrModel:
                  summary: Invalid OCR model
                  value:
                    message: Invalid ocr model
                    error: Unprocessable Entity
                    statusCode: 422
                invalidIsEphemeral:
                  summary: Invalid isEphemeral
                  value:
                    message: Invalid isEphemeral. It must be one of true or false.
                    error: Unprocessable Entity
                    statusCode: 422
              schema:
                type: object
                properties:
                  message:
                    type: string
                  error:
                    type: string
                  statusCode:
                    type: number
        '429':
          description: Too Many Requests
          content:
            application/json:
              example:
                statusCode: 429
                message: Too Many Requests
              schema:
                type: object
                properties:
                  statusCode:
                    type: number
                  message:
                    type: string
      security:
        - x-api-key: []
components:
  schemas:
    UploadMultipartFileForPAInput:
      type: object
      properties:
        fileName:
          type: string
          description: Name of the file to be uploaded
          example: large-file.pdf
        fileType:
          type: string
          description: MIME type of the file
          example: application/pdf
        fileSize:
          type: number
          description: File size in MB
          example: 150.5
        partSizeLimit:
          type: number
          description: Part size limit in MB (optional, default will be calculated)
          example: 10
        isSplit:
          type: boolean
          description: Whether the file should be split after upload
          default: false
          example: false
        isSplitExcel:
          type: boolean
          description: Whether to split Excel files by worksheets
          example: false
        callbackURL:
          type: string
          description: URL that will be called after processing
          example: https://example.com/callback
        ocrModel:
          type: string
          description: OCR model
          enum:
            - Beethoven_ENG_O5.6
            - Beethoven_ENG_G5.5
            - Beethoven_ENG_GP25
            - Beethoven_ENG_GP25.1
            - Beethoven_ENG_GP25.2
            - Beethoven_CUS_O5.1
            - Beethoven_CUS_O5.2
            - Beethoven_CUS_GP25.1
            - Unified (google-document-ai-ocr-gemini-v10)
            - Beethoven_ZH_O5.9
            - Beethoven_JP_O5.3
            - Beethoven_JP_G5.4
            - Beethoven_TH_O5.1
            - Beethoven_TH_G5.1
          example: Beethoven_ENG_O5.6
        schemaLocking:
          type: boolean
          description: Whether the schema should be locked
          example: false
        directoryId:
          type: string
          description: ID of the directory where the file should be uploaded
          example: 649e2d2d2d2d2d2d2d2d2d2d
        isEphemeral:
          type: boolean
          description: Whether the file and all related data should be deleted after the file is processed
          example: false
        pageCount:
          type: number
          description: >-
            Page count of the PDF file. Used for early validation against page
            limits.
          example: 50
        apiRequestId:
          type: string
          description: >-
            Optional client-supplied correlation ID that ties files uploaded in
            the same logical batch together.


            Behaviour:

            - Persisted on the resulting document and forwarded as
            `api_request_id` on all subsequent webhook callbacks for this
            upload, so you can correlate processing events back to the
            originating request on your side.

            - When the uploaded file is a **ZIP archive** (large zips are a
            common use case for multipart upload), every file extracted from it
            inherits this same value, letting you link all per-file callbacks
            back to the one ZIP upload.

            - If omitted, the server auto-generates an opaque ID. For ZIP
            uploads specifically, the generated ID is shared across all
            extracted files.


            Supply your own value when you already have an idempotency key or
            job ID on the client side (e.g. from your own queue) that you want
            to reconcile against inbound webhook events.
          example: my-batch-request-123
      required:
        - fileName
        - fileType
        - fileSize
  securitySchemes:
    x-api-key:
      type: apiKey
      in: header
      name: x-api-key
      description: API key for authentication

````
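Before uploading, it can be useful to sanity-check the part plan in the `201` response. The sketch below uses the values from the response example above and checks that the parts tile the file with no gaps or overlaps. It assumes that `startByte` and `endByte` are inclusive byte offsets and that `size` and `totalSize` are expressed in bytes; this is inferred from the example values, not stated by the specification. The function name `check_part_plan` is hypothetical.

```python
from typing import Any, Dict


def check_part_plan(response: Dict[str, Any]) -> None:
    """Raise ValueError if the presigned parts do not cover the file exactly."""
    parts = sorted(response["presignedUrls"], key=lambda p: p["partNumber"])
    if len(parts) != response["totalParts"]:
        raise ValueError("totalParts does not match the number of presigned URLs")
    expected_start = 0
    for part in parts:
        if part["startByte"] != expected_start:
            raise ValueError(f"part {part['partNumber']} should start at byte {expected_start}")
        # Inclusive range: the byte count is end - start + 1.
        if part["endByte"] - part["startByte"] + 1 != part["size"]:
            raise ValueError(f"part {part['partNumber']} size does not match its byte range")
        expected_start = part["endByte"] + 1
    if expected_start != response["totalSize"]:
        raise ValueError("parts do not add up to totalSize")


# Values copied from the 201 response example (URLs omitted, other fields dropped).
example = {
    "partSize": 6291456,
    "totalParts": 2,
    "totalSize": 9123430,
    "presignedUrls": [
        {"partNumber": 1, "startByte": 0, "endByte": 6291455, "size": 6291456},
        {"partNumber": 2, "startByte": 6291456, "endByte": 9123429, "size": 2831974},
    ],
}
check_part_plan(example)  # passes silently for the documented example
```

A check like this fails fast on a malformed plan instead of surfacing the problem only when the upload is completed.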