Document Extraction
Overview
The "Document Extraction" API is a general-purpose recognition interface: you upload a document and select the model to run against it by model ID. The isAsynchronous parameter controls whether the interface returns the recognition results synchronously or asynchronously.
Request
POST https://api.docai.pro/api/identify/file/v2/identify
Request body
Name | Type | Mandatory | Description |
---|---|---|---|
modelId | string | true | The unique identifier of the model |
file | file | true | The document to be recognized |
legalPageNumbers | string | false | Page numbers to be recognized. If not specified, the entire document is recognized. |
workspaceCode | string | false | The workspace name within the model. If empty, defaults to 'Default Workspace'. |
isAsynchronous | string | false | Whether to return the recognition results asynchronously. 1: return asynchronously; 0: return synchronously. Default: 0. |
Examples of legalPageNumbers values (each entry is a start-end page range; separate multiple ranges with commas):
1-1: recognize only the first page;
2-3: recognize only the second and third pages;
1-1, 3-4: recognize only the first, third, and fourth pages.
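The range notation above can be sketched in Python as follows. This is a minimal illustration inferred only from the three examples; the helper names parse_page_ranges and format_page_ranges are hypothetical and not part of the API, and the service may accept or reject variations (such as a missing space after the comma) that are not covered here.

```python
from __future__ import annotations


def parse_page_ranges(spec: str) -> list[int]:
    """Expand a legalPageNumbers string such as "1-1, 3-4" into page numbers."""
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        # Each entry is assumed to be "start-end", with both ends inclusive.
        start_text, _, end_text = part.partition("-")
        start, end = int(start_text), int(end_text)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        pages.extend(range(start, end + 1))
    return pages


def format_page_ranges(pages: list[int]) -> str:
    """Collapse page numbers into comma-separated start-end ranges."""
    ranges: list[tuple[int, int]] = []
    for page in sorted(set(pages)):
        if ranges and page == ranges[-1][1] + 1:
            # Extend the current run of consecutive pages.
            ranges[-1] = (ranges[-1][0], page)
        else:
            ranges.append((page, page))
    return ", ".join(f"{start}-{end}" for start, end in ranges)
```

For instance, format_page_ranges([1, 3, 4]) produces the string used in the third example, and parse_page_ranges reverses it.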
Response body
Name | Type | Mandatory | Description |
---|---|---|---|
fileId | string | true | Unique identifier of the uploaded file |
recognizeStatus | string | true | Recognition status of the uploaded file (0: failed, 1: successful, 2: recognition in progress) |
result | object | true | The recognition result, in the result message format |
result
Message Format
Name | Type | Mandatory | Description |
---|---|---|---|
fieldList | array | false | List of extracted fields from the document |
pageNumber | integer | false | The page number within the document for this field |
key | string | false | The name of the extracted field |
value | string | false | The value of the extracted field |
itemList | array | false | Extracted line information from the document |
pageNumber | integer | false | The page number within the document for this item |
key | string | false | The name of the extracted item |
value | string | false | The value of the extracted item |
Example request
curl -i -X POST https://api.docai.pro/api/identify/file/v2/identify \
  -H "Content-Type: multipart/form-data" \
  -H "Authorization: Bearer $DOCAI_API_KEY" \
  -F 'modelId=MODEL_ID' \
  -F 'file=@REPLACE_IMAGE_PATH.jpg'
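The same request might be issued from Python roughly as follows. This is a sketch, not an official client: it assumes the third-party requests package is installed, the function names build_form_fields and identify are hypothetical, and the 60-second timeout is an arbitrary choice. The endpoint, the form field names, and the Bearer authorization header are taken from the request description above.

```python
from __future__ import annotations

API_URL = "https://api.docai.pro/api/identify/file/v2/identify"


def build_form_fields(
    model_id: str,
    legal_page_numbers: str | None = None,
    workspace_code: str | None = None,
    asynchronous: bool = False,
) -> dict[str, str]:
    """Assemble the non-file form fields listed in the request body table."""
    fields = {
        "modelId": model_id,
        # The table documents isAsynchronous as a string: "1" or "0".
        "isAsynchronous": "1" if asynchronous else "0",
    }
    # Optional fields are sent only when a value is supplied.
    if legal_page_numbers:
        fields["legalPageNumbers"] = legal_page_numbers
    if workspace_code:
        fields["workspaceCode"] = workspace_code
    return fields


def identify(file_path: str, model_id: str, api_key: str, **options) -> dict:
    """Upload one document and return the decoded JSON response."""
    import requests  # third-party; imported here so the helper above has no dependency

    with open(file_path, "rb") as handle:
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {api_key}"},
            data=build_form_fields(model_id, **options),
            files={"file": handle},  # requests encodes this as multipart/form-data
            timeout=60,  # assumed value; tune for large documents
        )
    response.raise_for_status()
    return response.json()
```

A call such as identify("REPLACE_IMAGE_PATH.jpg", "MODEL_ID", api_key, legal_page_numbers="1-1") would then correspond to the curl example with a page restriction added.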
Example of synchronous response
{
  "code": 200,
  "message": null,
  "costTime": null,
  "data": {
    "fileId": "2310091711378393918074800",
    "recognizeStatus": "1",
    "result": {
      "fieldList": [
        {
          "pageNumber": 1,
          "key": "paymentTerms",
          "value": "Payment is due within 15 days"
        },
        {
          "pageNumber": 1,
          "key": "invoiceDate",
          "value": "2019-02-11"
        },
        {
          "pageNumber": 1,
          "key": "documentType",
          "value": "invoice"
        },
        {
          "pageNumber": 1,
          "key": "invoiceNumber",
          "value": "US-001"
        },
        {
          "pageNumber": 1,
          "key": "grossAmount",
          "value": "154.06"
        },
        {
          "pageNumber": 1,
          "key": "totalTax",
          "value": "9.06"
        },
        {
          "pageNumber": 1,
          "key": "dueDate",
          "value": "2019-02-26"
        }
      ],
      "taxList": [
        [
          {
            "pageNumber": 1,
            "key": "taxValue",
            "value": "9.06"
          },
          {
            "pageNumber": 1,
            "key": "taxPercentage",
            "value": "6.25"
          }
        ]
      ],
      "itemList": [
        [
          {
            "pageNumber": 1,
            "key": "totalPriceInclTax",
            "value": "100.00"
          },
          {
            "pageNumber": 1,
            "key": "unitPrice",
            "value": "100.00"
          },
          {
            "pageNumber": 1,
            "key": "quantity",
            "value": "1"
          },
          {
            "pageNumber": 1,
            "key": "description",
            "value": "Front and rear brake cables"
          },
          {
            "pageNumber": 1,
            "key": "totalPriceExclTax",
            "value": "100.00"
          }
        ],
        [
          {
            "pageNumber": 1,
            "key": "totalPriceInclTax",
            "value": "30.00"
          },
          {
            "pageNumber": 1,
            "key": "unitPrice",
            "value": "15.00"
          },
          {
            "pageNumber": 1,
            "key": "quantity",
            "value": "2"
          },
          {
            "pageNumber": 1,
            "key": "description",
            "value": "New set of pedal arms"
          },
          {
            "pageNumber": 1,
            "key": "totalPriceExclTax",
            "value": "30.00"
          }
        ],
        [
          {
            "pageNumber": 1,
            "key": "totalPriceInclTax",
            "value": "15.00"
          },
          {
            "pageNumber": 1,
            "key": "unitPrice",
            "value": "5.00"
          },
          {
            "pageNumber": 1,
            "key": "quantity",
            "value": "3"
          },
          {
            "pageNumber": 1,
            "key": "description",
            "value": "Labor 3hrs"
          },
          {
            "pageNumber": 1,
            "key": "totalPriceExclTax",
            "value": "15.00"
          }
        ]
      ]
    }
  },
  "requestId": "da69edd3cb1d47b4a68db9303d280000",
  "currentTimeMillis": 1696859401185
}
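Because every extracted value arrives as a pageNumber/key/value entry, callers will often want plain dictionaries instead. The sketch below shows one way to flatten the result object from the synchronous response; pairs_to_dict and flatten_result are hypothetical helper names. It assumes that fieldList is a flat list and that every other list in result (itemList and taxList in the example) is a list of rows, each row being a list of entries. Page numbers are discarded, and a repeated key within one row keeps only the last value.

```python
from __future__ import annotations


def pairs_to_dict(entries: list[dict]) -> dict[str, str]:
    """Turn [{"key": k, "value": v, ...}, ...] into {k: v, ...}."""
    return {entry["key"]: entry["value"] for entry in entries}


def flatten_result(result: dict) -> dict:
    """Flatten data.result into a "fields" dict plus one list of row dicts per table."""
    flat: dict = {"fields": pairs_to_dict(result.get("fieldList") or [])}
    for name, value in result.items():
        if name == "fieldList" or not isinstance(value, list):
            continue
        # Row-oriented lists such as itemList or taxList: one dict per row.
        flat[name] = [pairs_to_dict(row) for row in value]
    return flat
```

Applied to the example above, flatten_result(response["data"]["result"]) would give a "fields" dictionary containing entries such as invoiceNumber, and an "itemList" of three dictionaries, one per invoice line.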
Example of asynchronous response
{
  "code": 200,
  "message": null,
  "costTime": null,
  "data": {
    "fileId": "2310091711378393918074800",
    "recognizeStatus": "1"
  },
  "requestId": "da69edd3cb1d47b4a68db9303d280000",
  "currentTimeMillis": 1696859401185
}
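Since the asynchronous response carries only fileId and recognizeStatus, client code needs to distinguish it from a response that already includes a result. The following sketch maps the documented status codes to an enum and checks whether a result is present. The names RecognizeStatus, read_status, and has_result are hypothetical, and treating "code": 200 as the success indicator is an assumption based solely on the two examples.

```python
from enum import Enum


class RecognizeStatus(Enum):
    """Status codes from the response body table."""
    FAILED = "0"
    SUCCESSFUL = "1"
    IDENTIFYING = "2"


def read_status(envelope: dict) -> RecognizeStatus:
    """Return the recognition status carried in a decoded response."""
    # Assumption: a code other than 200 signals an error at the API level.
    if envelope.get("code") != 200:
        raise RuntimeError(f"request failed: {envelope.get('message')!r}")
    data = envelope.get("data") or {}
    # Raises ValueError if the status is missing or not one of the documented codes.
    return RecognizeStatus(str(data.get("recognizeStatus")))


def has_result(envelope: dict) -> bool:
    """True only when recognition succeeded and a result object is included."""
    data = envelope.get("data") or {}
    return read_status(envelope) is RecognizeStatus.SUCCESSFUL and "result" in data
```

With the asynchronous example above, read_status reports a successful status while has_result is false, so the caller should keep the fileId for later use.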