OpenText Developer Cloud public resources API Documentation

Classify Extract Document Service

POST {{baseUrl}}/session/services/classifyextractdocument

The Classify Extract Document Service performs classification and extraction on each submitted item and returns a UIM object containing the classification and extraction results. Optionally, the service also performs document separation as configured in the recognition project.

Service Properties

  • Env - Metadata environment identifier. Value is one of D, T or P. Default value is P.

  • IncludeOcrData - Boolean. If true, the returned UIM data object contains extracted character information; otherwise it does not.

  • EnableDocumentSeparation - Optional string. Whether to perform automatic document separation according to the Dpp project folder management settings. Default value is “false”.

  • Project - Optional string. The recognition project to use to classify the document. Valid values are Default (Advanced Recognition) and InformationExtraction (Information Extraction). If omitted, Default is used.
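
As a minimal sketch, the properties above could be assembled into the serviceProps array that appears in the Curl example on this page. The helper name build_service_props is hypothetical, the client-side validation is an assumption rather than documented service behavior, and the local default of False for IncludeOcrData is this sketch's choice because the source states no default for it.

```python
from typing import Any, Dict, List

VALID_ENVS = {"D", "T", "P"}
VALID_PROJECTS = {"Default", "InformationExtraction"}


def build_service_props(
    env: str = "P",
    include_ocr_data: bool = False,  # sketch default; none is documented
    enable_document_separation: bool = False,
    project: str = "Default",
) -> List[Dict[str, Any]]:
    """Return name/value pairs in the shape used by the Curl example."""
    if env not in VALID_ENVS:
        raise ValueError(f"Env must be one of {sorted(VALID_ENVS)}")
    if project not in VALID_PROJECTS:
        raise ValueError(f"Project must be one of {sorted(VALID_PROJECTS)}")
    return [
        {"name": "Env", "value": env},
        {"name": "IncludeOcrData", "value": include_ocr_data},
        # Documented as an optional string, so the flag is sent as text.
        {
            "name": "EnableDocumentSeparation",
            "value": "true" if enable_document_separation else "false",
        },
        {"name": "Project", "value": project},
    ]


# Same property values as the Curl example, plus the separation flag.
props = build_service_props(
    env="D", include_ocr_data=True, project="InformationExtraction"
)
```

Validating Env and Project before sending avoids a round trip for values the documentation already rules out.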

Number of Request Items

This Real-Time Service supports one or more items.

Values Per Request Item

No values are needed or used.

Files Per Request Item

Each item can have one or more files. Each file can be either an embedded file or a reference to a file ID previously posted to the Files Resource. The File Type property for the file is ignored by this service.

If the request item contains more than one image, then the document type associated with the first classified page is used for the document. The extraction results for all pages belonging to the document type are merged into a single document. If a given field has conflicting values from different pages, then the value is set according to the Extract Page visual property for that field in the document type definition.
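
A request item that references previously uploaded files might be built as in the following sketch. The helper names file_reference and build_request_item are hypothetical. The key names and file IDs are taken from the Curl example on this page. The embedded-file form is not shown in the source, so it is left out here.

```python
from typing import Any, Dict, List


def file_reference(
    name: str, file_id: str, content_type: str, file_type: str
) -> Dict[str, str]:
    """Reference a file previously posted to the Files Resource by its ID."""
    return {
        "name": name,
        "value": file_id,
        "contentType": content_type,
        # Present in the Curl example, but ignored by this service.
        "fileType": file_type,
    }


def build_request_item(node_id: int, files: List[Dict[str, str]]) -> Dict[str, Any]:
    """Build one request item; the service needs at least one file per item."""
    if not files:
        raise ValueError("each request item needs at least one file")
    # No per-item values are needed or used, so "values" is sent as null.
    return {"nodeId": node_id, "values": None, "files": files}


# A two-page item: both pages are merged into a single document result.
item = build_request_item(
    1,
    [
        file_reference("Wren", "F_28741fcd8ea9468fa23bee502f4b19ebTIF", "image/tiff", "tif"),
        file_reference("Wren_p2", "F_2061b933c8e5412aa563a1b9c7ebf337TIF", "image/tiff", "tif"),
    ],
)
```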

 

HEADERS

Key Datatype Required Description
Content-Type string

RESPONSES

status OK

{
  "returnStatus": {
    "status": 200,
    "code": "OK0000",
    "message": "",
    "server": "WS-S117316e70dcd4942884f142d79d41295IS"
  },
  "licenseUsedPercent": 0,
  "id": "ReqCPEXTRAC_3K3_0031108",
  "serviceName": "classifyextractdocument",
  "executionMilliSeconds": 660,
  "licensePagesUsed": 1,
  "licensePagesUsed2": 1,
  "resultItems": [
    {
      "nodeId": 1,
      "errorCode": "",
      "errorMessage": "",
      "values": [
        {
          "name": "ClassificationPageIds,",
          "value": [
            "d907a548196c4e35837dad51954cd3ed",
            "748f4bfce54f4fc084ba19cee31bcccc"
          ]
        },
        {
          "name": "UimData",
          "value": {
            "docType": "TestWren",
            "locale": "en-US",
            "flaggedReason": null,
            "extractionId": "3aff08999e844ff6a31aff002b0fcb4a",
            "nodeList": [
              {
                "name": "InvoiceNumber",
                "isArray": false,
                "indexFieldType": "Number",
                "labelText": "Invoice No.",
                "isRequired": true,
                "controlType": "TextBox",
                "data": [
                  {
                    "arrayIndex": 0,
                    "value": 227628,
                    "fieldError": {
                      "errorCode": "ER2208",
                      "recoverable": false,
                      "message": "Out of Bounds:Valid values: 1000 ➜ 10210"
                    },
                    "mustConfirm": true,
                    "choices": null,
                    "locationRect": { "left": 639, "top": 144, "width": 124, "height": 30 },
                    "pageId": 1,
                    "confidence": 100,
                    "extractedCharacters": [
                      { "values": [ { "value": "2", "confidence": 100 } ] }
                    ]
                  }
                ]
              }
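
To illustrate how a client might read the response, the sketch below walks resultItems, finds the UimData value, and lists fields that carry a fieldError or have mustConfirm set. The function name fields_needing_review is hypothetical. The sample dictionary is a trimmed copy of the response above. Treating a missing or null fieldError as "no error" is an assumption, since the source shows only an error case.

```python
from typing import Any, Dict, List


def fields_needing_review(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect extracted fields that have an error or must be confirmed."""
    flagged: List[Dict[str, Any]] = []
    for item in response.get("resultItems", []):
        for entry in item.get("values", []):
            if entry.get("name") != "UimData":
                continue
            uim = entry.get("value") or {}
            for node in uim.get("nodeList", []):
                for datum in node.get("data", []):
                    # Assumption: fieldError is absent or null when valid.
                    error = datum.get("fieldError")
                    if error or datum.get("mustConfirm"):
                        flagged.append({
                            "nodeId": item.get("nodeId"),
                            "field": node.get("name"),
                            "value": datum.get("value"),
                            "errorCode": error.get("errorCode") if error else None,
                        })
    return flagged


# Trimmed copy of the sample response shown above.
sample = {
    "resultItems": [{
        "nodeId": 1,
        "values": [{
            "name": "UimData",
            "value": {
                "docType": "TestWren",
                "nodeList": [{
                    "name": "InvoiceNumber",
                    "data": [{
                        "value": 227628,
                        "fieldError": {"errorCode": "ER2208", "recoverable": False},
                        "mustConfirm": True,
                    }],
                }],
            },
        }],
    }],
}

flagged = fields_needing_review(sample)
```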
Curl
curl -X POST 'https://capture.ot2.opentext.com/cp-rest/v2/session/services/classifyextractdocument' \
  -H 'Content-Type: application/hal+json' \
  -d '{
    "serviceProps": [
      {"name": "Env", "value": "D"},
      {"name": "IncludeOcrData", "value": true},
      {"name": "Project", "value": "InformationExtraction"}
    ],
    "requestItems": [
      {
        "nodeId": 1,
        "values": null,
        "files": [
          {"name": "Wren", "value": "F_28741fcd8ea9468fa23bee502f4b19ebTIF", "contentType": "image/tiff", "fileType": "tif"},
          {"name": "Wren_p2", "value": "F_2061b933c8e5412aa563a1b9c7ebf337TIF", "contentType": "image/tiff", "fileType": "tif"}
        ]
      }
    ]
  }'
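
For clients that do not use Curl, the same call could be prepared with the Python standard library, as sketched below. The URL, header, and payload are copied from the Curl example. The helper name build_request is hypothetical. The request is only constructed here, not sent, and any authentication your environment requires is not shown in the source, so it is omitted.

```python
import json
import urllib.request
from typing import Any, Dict

URL = (
    "https://capture.ot2.opentext.com/cp-rest/v2"
    "/session/services/classifyextractdocument"
)

payload: Dict[str, Any] = {
    "serviceProps": [
        {"name": "Env", "value": "D"},
        {"name": "IncludeOcrData", "value": True},
        {"name": "Project", "value": "InformationExtraction"},
    ],
    "requestItems": [{
        "nodeId": 1,
        "values": None,
        "files": [
            {"name": "Wren", "value": "F_28741fcd8ea9468fa23bee502f4b19ebTIF",
             "contentType": "image/tiff", "fileType": "tif"},
            {"name": "Wren_p2", "value": "F_2061b933c8e5412aa563a1b9c7ebf337TIF",
             "contentType": "image/tiff", "fileType": "tif"},
        ],
    }],
}


def build_request(url: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Prepare the POST request with a JSON body; nothing is sent here."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/hal+json"},
        method="POST",
    )


request = build_request(URL, payload)

# Sending is left to the caller, for example:
# with urllib.request.urlopen(request) as resp:
#     result = json.load(resp)
```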
