Classify Extract Document Service.
POST {{baseUrl}}/session/services/classifyextractdocument
The Classify Extract Document Service performs classification and extraction on each item submitted and returns a UIM object containing the classification and extraction results. Optionally, the service performs document separation as configured in the recognition project.
Service Properties
Env - Metadata environment identifier. Value is one of "D", "T", or "P". Default value is "P".
IncludeOcrData - Boolean. If true, the returned UIM data object contains extracted character information; otherwise it does not.
EnableDocumentSeparation - Optional string. Whether to perform automatic document separation according to the Dpp project folder management settings. Default value is "false".
Project - Optional string. The recognition project to use to classify the document. Valid values are "Default" for the Advanced Recognition and "InformationExtraction" for the Information Extraction. If omitted, "Default" is used.
Number of Request Items
This Real-Time Service supports one or more items.
Values Per Request Item
No values are needed or used.
Files Per Request Item
Each item can have one or more files. Each file can be either an embedded file or a reference to a file ID previously posted to the Files Resource. The File Type property for the file is ignored for this service.
If the request item contains more than one image, the document type associated with the first classified page is used for the document. The extraction results for all pages belonging to the document type are merged into a single document. If a given field has conflicting values from different pages, the value is set according to the Extract Page visual property for that field in the document type definition.
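The service properties and per-item file rules described above can be sketched as a small request builder. This is only an illustration, not an official client: the helper names `file_ref` and `build_request` are hypothetical, the payload layout is copied from the curl sample on this page, sequential `nodeId` numbering starting at 1 is an assumption based on that sample, and sending EnableDocumentSeparation as the strings "true"/"false" is an assumption drawn from its "Optional string" description.

```python
VALID_ENVS = {"D", "T", "P"}                         # values listed for Env
VALID_PROJECTS = {"Default", "InformationExtraction"}  # values listed for Project


def file_ref(name, file_id, content_type, file_type):
    """Describe one file by a file ID previously posted to the Files Resource."""
    return {"name": name, "value": file_id,
            "contentType": content_type, "fileType": file_type}


def build_request(items, env="P", include_ocr_data=False,
                  project=None, separate=None):
    """Build the JSON body; `items` is a list of file lists, one per request item."""
    if env not in VALID_ENVS:
        raise ValueError(f"Env must be one of {sorted(VALID_ENVS)}")
    if project is not None and project not in VALID_PROJECTS:
        raise ValueError(f"Project must be one of {sorted(VALID_PROJECTS)}")
    if not items or any(not files for files in items):
        raise ValueError("need one or more items, each with one or more files")

    props = [
        {"name": "Env", "value": env},
        {"name": "IncludeOcrData", "value": bool(include_ocr_data)},
    ]
    if project is not None:
        props.append({"name": "Project", "value": project})
    if separate is not None:
        # Assumption: the string form is what the service expects.
        props.append({"name": "EnableDocumentSeparation",
                      "value": "true" if separate else "false"})

    return {
        "serviceProps": props,
        # No per-item values are needed, so "values" is sent as null.
        "requestItems": [
            {"nodeId": node_id, "values": None, "files": list(files)}
            for node_id, files in enumerate(items, start=1)
        ],
    }
```

Calling `build_request([[file_ref("Wren", "F_28741fcd8ea9468fa23bee502f4b19ebTIF", "image/tiff", "tif")]], env="D", include_ocr_data=True, project="InformationExtraction")` yields a body with the same shape as the curl sample. Leaving `project` and `separate` unset omits those properties so that the documented defaults apply.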
Body
PARAM
Key | Datatype | Required | Description |
HEADERS
Key | Datatype | Required | Description |
Content-Type | string | | |
RESPONSES
status OK
{
  "returnStatus": {
    "status": 200,
    "code": "OK0000",
    "message": "",
    "server": "WS-S117316e70dcd4942884f142d79d41295IS"
  },
  "licenseUsedPercent": 0,
  "id": "ReqCPEXTRAC_3K3_0031108",
  "serviceName": "classifyextractdocument",
  "executionMilliSeconds": 660,
  "licensePagesUsed": 1,
  "licensePagesUsed2": 1,
  "resultItems": [
    {
      "nodeId": 1,
      "errorCode": "",
      "errorMessage": "",
      "values": [
        {
          "name": "ClassificationPageIds",
          "value": [
            "d907a548196c4e35837dad51954cd3ed",
            "748f4bfce54f4fc084ba19cee31bcccc"
          ]
        },
        {
          "name": "UimData",
          "value": {
            "docType": "TestWren",
            "locale": "en-US",
            "flaggedReason": null,
            "extractionId": "3aff08999e844ff6a31aff002b0fcb4a",
            "nodeList": [
              {
                "name": "InvoiceNumber",
                "isArray": false,
                "indexFieldType": "Number",
                "labelText": "Invoice No.",
                "isRequired": true,
                "controlType": "TextBox",
                "data": [
                  {
                    "arrayIndex": 0,
                    "value": 227628,
                    "fieldError": {
                      "errorCode": "ER2208",
                      "recoverable": false,
                      "message": "Out of Bounds:Valid values: 1000 ➜ 10210"
                    },
                    "mustConfirm": true,
                    "choices": null,
                    "locationRect": {
                      "left": 639,
                      "top": 144,
                      "width": 124,
                      "height": 30
                    },
                    "pageId": 1,
                    "confidence": 100,
                    "extractedCharacters": [
                      {
                        "values": [
                          {
                            "value": "2",
                            "confidence": 100
                          }
                        ]
                      }
                    ]
                  }
                ]
              }
            ]
          }
        }
      ]
    }
  ]
}
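The nesting of the sample response (resultItems, then values, then the UimData entry, then nodeList and data) can be walked with a short helper. This is a sketch under assumptions, not part of the service: the function names `check_status` and `summarize_fields` are hypothetical, `fieldError` is assumed to be null or absent when a value has no error, and the first candidate in each `extractedCharacters` entry is assumed to be the preferred character.

```python
def check_status(response):
    """Raise if the envelope does not report success (status 200 in the sample)."""
    status = response.get("returnStatus") or {}
    if status.get("status") != 200:
        raise RuntimeError(f"{status.get('code')}: {status.get('message')}")


def summarize_fields(response):
    """Return one flat row per extracted field value found under UimData."""
    rows = []
    for item in response.get("resultItems") or []:
        for entry in item.get("values") or []:
            if entry.get("name") != "UimData":
                continue  # e.g. the page-ID entry carries no field data
            uim = entry.get("value") or {}
            for node in uim.get("nodeList") or []:
                for datum in node.get("data") or []:
                    error = datum.get("fieldError") or {}
                    chars = datum.get("extractedCharacters") or []
                    rows.append({
                        "nodeId": item.get("nodeId"),
                        "docType": uim.get("docType"),
                        "field": node.get("name"),
                        "value": datum.get("value"),
                        "confidence": datum.get("confidence"),
                        "errorCode": error.get("errorCode"),
                        # Assumption: review is needed if flagged or in error.
                        "needsReview": bool(datum.get("mustConfirm") or error),
                        # Assumption: first candidate per character is preferred.
                        "ocrText": "".join(
                            c["values"][0]["value"] for c in chars if c.get("values")
                        ),
                    })
    return rows
```

For the sample response, this produces a single row for InvoiceNumber with value 227628, error code ER2208, and `needsReview` set, which reflects that a confidence of 100 does not by itself mean the value passed validation.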
Curl
curl -X POST 'https://capture.ot2.opentext.com/cp-rest/v2/session/services/classifyextractdocument' \
  -H 'Content-Type: application/hal+json' \
  -d '{"serviceProps":[{"name":"Env","value":"D"},{"name":"IncludeOcrData","value":true},{"name":"Project","value":"InformationExtraction"}],"requestItems":[{"nodeId":1,"values":null,"files":[{"name":"Wren","value":"F_28741fcd8ea9468fa23bee502f4b19ebTIF","contentType":"image/tiff","fileType":"tif"},{"name":"Wren_p2","value":"F_2061b933c8e5412aa563a1b9c7ebf337TIF","contentType":"image/tiff","fileType":"tif"}]}]}'
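For callers who prefer a script to the command line, the same call can be sketched with the Python standard library. The function names `make_request` and `call_service` are hypothetical. The base URL and Content-Type header are taken from the curl sample. The sample shows no authentication header, so none is set here, and a real deployment may well require one.

```python
import json
import urllib.request

# Base URL as it appears in the curl sample; substitute your own {{baseUrl}}.
BASE_URL = "https://capture.ot2.opentext.com/cp-rest/v2"


def make_request(payload, base_url=BASE_URL):
    """Build (but do not send) the POST request for the service."""
    return urllib.request.Request(
        f"{base_url}/session/services/classifyextractdocument",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/hal+json"},
        method="POST",
    )


def call_service(payload, base_url=BASE_URL, timeout=60):
    """Send the request and return the decoded JSON response body."""
    request = make_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body without touching the network.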