Document Extraction

Overview

The "Document Extraction" API is a general-purpose recognition interface: you upload a document and specify, by model ID, which model should process it. The isAsynchronous parameter controls whether the recognition results are returned synchronously or asynchronously.

Request

POST https://api.docai.pro/api/identify/file/v2/identify

Request body

| Name | Type | Mandatory | Description |
| --- | --- | --- | --- |
| modelId | string | true | The unique identifier of the model. |
| file | file | true | The document to be recognized. |
| legalPageNumbers | string | false | Page ranges to be recognized, written as start-end. If not specified, the entire document is recognized. |
| workspaceCode | string | false | The workspace name within the model. If empty, defaults to 'Default Workspace'. |
| isAsynchronous | string | false | Whether to return the recognition results asynchronously. 1: return asynchronously; 0: return synchronously. Default: 0. |

Examples of legalPageNumbers values:

1-1: recognize only the first page;

2-3: recognize only the second and third pages;

1-1, 3-4: recognize only the first, third, and fourth pages.
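The range notation above can be produced from a plain list of page numbers. The following is a minimal sketch, assuming pages are 1-based and that ranges are joined with a comma and a space as in the last example; the helper name `to_legal_page_numbers` is hypothetical and not part of the API.

```python
def to_legal_page_numbers(pages):
    """Collapse 1-based page numbers into a 'start-end, start-end' string.

    Hypothetical helper: the output format is inferred from the documented
    examples ("1-1", "2-3", "1-1, 3-4"), not from a formal grammar.
    """
    ordered = sorted(set(pages))
    if not ordered:
        return ""
    if ordered[0] < 1:
        raise ValueError("page numbers are assumed to start at 1")

    ranges = []
    start = prev = ordered[0]
    for page in ordered[1:]:
        if page == prev + 1:
            # Still inside a consecutive run; extend the current range.
            prev = page
            continue
        ranges.append((start, prev))
        start = prev = page
    ranges.append((start, prev))

    return ", ".join(f"{first}-{last}" for first, last in ranges)
```

For instance, passing the pages 1, 3 and 4 yields the same string as the third example above.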

Response body

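| Name | Type | Mandatory | Description |
| --- | --- | --- | --- |
| fileId | string | true | Unique identifier of the uploaded file |
| recognizeStatus | string | true | Recognition status of the uploaded file (0: failed, 1: successful, 2: recognition in progress) |
| result | object | true | The recognition result (extracted fields) |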

result Message Format

| Name | Type | Mandatory | Description |
| --- | --- | --- | --- |
| fieldList | array | false | List of extracted fields from the document |
| fieldList.pageNumber | integer | false | The page number within the document for this field |
| fieldList.key | string | false | The name of the extracted field |
| fieldList.value | string | false | The value of the extracted field |
| itemList | array | false | Extracted line items from the document |
| itemList.pageNumber | integer | false | The page number within the document for this item |
| itemList.key | string | false | The name of the extracted item |
| itemList.value | string | false | The value of the extracted item |

Example request

curl -i -X POST https://api.docai.pro/api/identify/file/v2/identify \
-H "Content-Type: multipart/form-data" \
-H "Authorization: Bearer $DOCAI_API_KEY" \
-F 'modelId=MODEL_ID' \
-F 'file=@REPLACE_IMAGE_PATH.jpg'
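For readers calling the endpoint from Python, the same multipart request can be assembled with the standard library alone. This is a sketch under assumptions rather than an official client: the function name `build_identify_request` and the `extra_fields` argument are hypothetical, the Bearer authorization scheme is taken from the curl example, and the form field names come from the request body table.

```python
import urllib.request
import uuid

API_URL = "https://api.docai.pro/api/identify/file/v2/identify"


def build_identify_request(api_key, model_id, file_name, file_bytes, extra_fields=None):
    """Build (but do not send) a multipart/form-data POST for the endpoint.

    extra_fields may carry the optional text parameters from the request
    table, e.g. {"isAsynchronous": "1", "legalPageNumbers": "1-1"}.
    """
    boundary = uuid.uuid4().hex
    fields = {"modelId": model_id}
    fields.update(extra_fields or {})

    parts = []
    for name, value in fields.items():
        # Plain text form fields: no filename, so the value is sent as a string.
        text_part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(text_part.encode("utf-8"))

    # The document itself goes in the "file" field.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8"))
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )

# Sending is left to the caller, for example:
#   with urllib.request.urlopen(request) as response:
#       payload = json.load(response)
```

Keeping construction separate from sending makes the request easy to inspect before any network call is made.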

Example of synchronous response

{
  "code": 200,
  "message": null,
  "costTime": null,
  "data": {
    "fileId": "2310091711378393918074800",
    "recognizeStatus": "1",
    "result": {
      "fieldList": [
        {
          "pageNumber": 1,
          "key": "paymentTerms",
          "value": "Payment is due within 15 days"
        },
        {
          "pageNumber": 1,
          "key": "invoiceDate",
          "value": "2019-02-11"
        },
        {
          "pageNumber": 1,
          "key": "documentType",
          "value": "invoice"
        },
        {
          "pageNumber": 1,
          "key": "invoiceNumber",
          "value": "US-001"
        },
        {
          "pageNumber": 1,
          "key": "grossAmount",
          "value": "154.06"
        },
        {
          "pageNumber": 1,
          "key": "totalTax",
          "value": "9.06"
        },
        {
          "pageNumber": 1,
          "key": "dueDate",
          "value": "2019-02-26"
        }
      ],
      "taxList": [
        [
          {
            "pageNumber": 1,
            "key": "taxValue",
            "value": "9.06"
          },
          {
            "pageNumber": 1,
            "key": "taxPercentage",
            "value": "6.25"
          }
        ]
      ],
      "itemList": [
        [
          {
            "pageNumber": 1,
            "key": "totalPriceInclTax",
            "value": "100.00"
          },
          {
            "pageNumber": 1,
            "key": "unitPrice",
            "value": "100.00"
          },
          {
            "pageNumber": 1,
            "key": "quantity",
            "value": "1"
          },
          {
            "pageNumber": 1,
            "key": "description",
            "value": "Front and rear brake cables"
          },
          {
            "pageNumber": 1,
            "key": "totalPriceExclTax",
            "value": "100.00"
          }
        ],
        [
          {
            "pageNumber": 1,
            "key": "totalPriceInclTax",
            "value": "30.00"
          },
          {
            "pageNumber": 1,
            "key": "unitPrice",
            "value": "15.00"
          },
          {
            "pageNumber": 1,
            "key": "quantity",
            "value": "2"
          },
          {
            "pageNumber": 1,
            "key": "description",
            "value": "New set of pedal arms"
          },
          {
            "pageNumber": 1,
            "key": "totalPriceExclTax",
            "value": "30.00"
          }
        ],
        [
          {
            "pageNumber": 1,
            "key": "totalPriceInclTax",
            "value": "15.00"
          },
          {
            "pageNumber": 1,
            "key": "unitPrice",
            "value": "5.00"
          },
          {
            "pageNumber": 1,
            "key": "quantity",
            "value": "3"
          },
          {
            "pageNumber": 1,
            "key": "description",
            "value": "Labor 3hrs"
          },
          {
            "pageNumber": 1,
            "key": "totalPriceExclTax",
            "value": "15.00"
          }
        ]
      ]
    }
  },
  "requestId": "da69edd3cb1d47b4a68db9303d280000",
  "currentTimeMillis": 1696859401185
}
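In the example above, fieldList is a flat list of key/value entries, while itemList holds one inner list of entries per line item. A small sketch of turning that shape into ordinary dictionaries follows; the function names `entries_to_dict` and `flatten_result` are hypothetical, and the nesting is inferred from this one example response rather than from a formal schema.

```python
def entries_to_dict(entries):
    """Map a list of {"pageNumber", "key", "value"} entries to {key: value}.

    If a key appears more than once, the last occurrence wins.
    """
    return {entry["key"]: entry["value"] for entry in entries}


def flatten_result(result):
    """Convert the `result` object into plain dictionaries.

    Assumes fieldList is a flat list of entries and itemList is a list of
    lists (one inner list per line item), as in the example response.
    """
    return {
        "fields": entries_to_dict(result.get("fieldList") or []),
        "items": [entries_to_dict(row) for row in result.get("itemList") or []],
    }
```

The taxList array in the example has the same nested shape as itemList, so it could likely be handled the same way.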

Example of asynchronous response

{
  "code": 200,
  "message": null,
  "costTime": null,
  "data": {
    "fileId": "2310091711378393918074800",
    "recognizeStatus": "1"
  },
  "requestId": "da69edd3cb1d47b4a68db9303d280000",
  "currentTimeMillis": 1696859401185
}
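Both response examples share the same envelope, with the documented fields under data, and the asynchronous one carries no result. A caller can therefore treat result as optional. The sketch below is illustrative only: the name `read_identify_response` is hypothetical, and treating any code other than 200 as an error is an assumption, since only successful responses are shown here. How the result of an asynchronous call is later retrieved is not covered in this section.

```python
# Status labels taken from the recognizeStatus description in the response table.
STATUS_LABELS = {"0": "failed", "1": "successful", "2": "in progress"}


def read_identify_response(payload):
    """Extract fileId, a readable status, and the optional result.

    Assumption: a top-level "code" other than 200 signals an error.
    """
    if payload.get("code") != 200:
        raise RuntimeError(
            f"request failed: code={payload.get('code')}, message={payload.get('message')}"
        )

    data = payload.get("data") or {}
    status = str(data.get("recognizeStatus"))
    return {
        "file_id": data.get("fileId"),
        "status": STATUS_LABELS.get(status, "unknown"),
        # None when the response carries no result (as in the asynchronous example).
        "result": data.get("result"),
    }
```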