OpenText Developer Cloud public resources API Documentation

Extract Page Service

POST {{baseUrl}}/session/services/extractpage

The Extract Page Service performs extraction on each submitted item and returns a UIM object containing the extraction results.

Service Properties

  • Env - Metadata environment identifier. Value is one of D, T or P. Default value is P.

  • IncludeOcrData - Boolean. If true, the returned UIM data object contains the extracted character information; if false, it does not.

  • Project - Optional string. The recognition project used to classify the document. Valid values are Default (Advanced Recognition) and InformationExtraction (Information Extraction). If omitted, Default is used.
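
The service properties above can be assembled and checked before a request is sent. The following is a minimal sketch, not part of the official API: the helper name build_service_props is hypothetical, the name/value shape follows the curl sample on this page, and the False default for IncludeOcrData is the helper's own choice because the page does not document a default.

```python
# Hypothetical helper: builds the "serviceProps" array for extractpage.
VALID_ENVS = {"D", "T", "P"}
VALID_PROJECTS = {"Default", "InformationExtraction"}


def build_service_props(env="P", include_ocr_data=False, project=None):
    """Return serviceProps in the name/value shape used by the curl sample."""
    if env not in VALID_ENVS:
        raise ValueError(f"Env must be one of {sorted(VALID_ENVS)}, got {env!r}")
    if project is not None and project not in VALID_PROJECTS:
        raise ValueError(f"Project must be one of {sorted(VALID_PROJECTS)}, got {project!r}")
    props = [
        {"name": "Env", "value": env},
        # Assumption: False is this helper's default, not a documented one.
        {"name": "IncludeOcrData", "value": bool(include_ocr_data)},
    ]
    if project is not None:
        # Omitting Project lets the service fall back to Default.
        props.append({"name": "Project", "value": project})
    return props
```

Calling build_service_props("D", False, "InformationExtraction") yields the same three properties as the curl sample.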

Values Per Request Item

  • DocumentTypeName - String. The Document Type name to be used for extraction. This is optional if the TemplateId property is passed.

  • PageIndex - Number. The zero-based page index within the Document Type. If omitted, it defaults to 0. This is optional if the TemplateId property is passed. Unused if the project is InformationExtraction.

  • TemplateId - String. The image template ID assigned in the recognition project that should be used for extraction. If not supplied, then the DocumentTypeName should be specified. Unused if the project is InformationExtraction.
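
The interplay between these three values can be sketched as a small builder. This is an illustration under assumptions: build_item_values is a hypothetical name, PageIndex is sent as a number as described above (the curl sample sends it as the string "0"), and requiring DocumentTypeName for the InformationExtraction project is an inference from TemplateId being unused there, not something the page states outright.

```python
# Hypothetical helper: builds the per-item "values" array.
def build_item_values(document_type_name=None, page_index=None,
                      template_id=None, project="Default"):
    """Return the name/value list for one extractpage request item."""
    values = []
    uses_templates = project != "InformationExtraction"
    if uses_templates and template_id:
        # TemplateId selects the template directly, so DocumentTypeName
        # and PageIndex are optional in this case.
        values.append({"name": "TemplateId", "value": template_id})
        if document_type_name:
            values.append({"name": "DocumentTypeName", "value": document_type_name})
        return values
    if not document_type_name:
        raise ValueError("DocumentTypeName is needed when TemplateId is not used")
    values.append({"name": "DocumentTypeName", "value": document_type_name})
    if uses_templates:
        # PageIndex defaults to 0 when omitted.
        index = 0 if page_index is None else int(page_index)
        values.append({"name": "PageIndex", "value": index})
    return values
```

For the InformationExtraction project the helper drops TemplateId and PageIndex, since both are documented as unused there.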

Files Per Request Item

There can only be one file per request item object. It can either be an embedded file or a reference to a file ID previously posted to the Files Resource. The File Type property for the file is ignored for this service.
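
The one-file rule can be enforced on the client before the request is posted. The sketch below uses hypothetical helper names (file_reference, build_request_item) and covers only the file ID reference form that appears in the curl sample; the embedded-file shape is not shown on this page, so it is left out. The fileType property is omitted because the service ignores it.

```python
# Hypothetical helpers: one request item with exactly one file reference.
def file_reference(name, file_id, content_type):
    """Reference a file ID previously posted to the Files Resource."""
    return {"name": name, "value": file_id, "contentType": content_type}


def build_request_item(node_id, values, files):
    """Assemble one request item, rejecting anything but a single file."""
    files = list(files)
    if len(files) != 1:
        raise ValueError(
            f"extractpage takes exactly one file per request item, got {len(files)}"
        )
    return {"nodeId": node_id, "values": list(values), "files": files}
```

A typical call, using the file ID from the curl sample, would be build_request_item(1, values, [file_reference("Wren", "F_0d7bbf6b26b74960b36aa26523ed4d10TIF", "image/tiff")]).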

If the DocumentTypeName and PageIndex are specified, then the data will be extracted based on the index of the template in the order of the template names (not IDs) in the specified document type. If the PageIndex is greater than the number of templates in the document type, then the image is not processed for data extraction.
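
The template lookup described above can be modelled roughly as follows. This is only a client-side illustration of the documented rule, not the service's implementation: select_template and the template names are hypothetical, the ordering is assumed to be a plain string sort (the service's actual collation is not documented here), and any index outside the list is treated as "not processed".

```python
# Hypothetical model of how PageIndex maps to a template.
def select_template(template_names, page_index=0):
    """Return the template name at page_index in name order, or None."""
    ordered = sorted(template_names)  # ordered by name, not by template ID
    if page_index < 0 or page_index >= len(ordered):
        # Out of range: the image is not processed for data extraction.
        return None
    return ordered[page_index]
```

With templates named "Invoice Page 2" and "Invoice Page 1", index 0 resolves to "Invoice Page 1" regardless of the order in which the templates were created.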

 

Body PARAM

Key Datatype Required Description 



HEADERS

Key           Datatype  Required  Description
Content-Type  string

RESPONSES

status OK

{
  "returnStatus": {
    "status": 200,
    "code": "OK0000",
    "message": "",
    "server": "WS-S6ac611ec08f243cdb3c9977004953b1fIS"
  },
  "licenseUsedPercent": 0,
  "id": "ReqCPEXTRAC_3K3_0031049",
  "serviceName": "extractpage",
  "executionMilliSeconds": 600,
  "licensePagesUsed": 1,
  "licensePagesUsed2": 0,
  "resultItems": [
    {
      "nodeId": 1,
      "errorCode": "",
      "errorMessage": "",
      "values": [
        { "name": "ClassificationId", "value": "d907a548196c4e35837dad51954cd3ed" },
        {
          "name": "UimData",
          "value": {
            "docType": "TestWren",
            "locale": "en-US",
            "flaggedReason": null,
            "extractionId": null,
            "nodeList": [
              {
                "name": "InvoiceNumber",
                "isArray": false,
                "indexFieldType": "Number",
                "labelText": "Invoice No.",
                "isRequired": true,
                "controlType": "TextBox",
                "data": [
                  {
                    "arrayIndex": 0,
                    "value": 227628,
                    "fieldError": {
                      "errorCode": "ER2208",
                      "recoverable": false,
                      "message": "Out of Bounds:Valid values: 1000 ➜ 10210"
                    },
                    "mustConfirm": true,
                    "choices": null,
                    "locationRect": { "left": 639, "top": 144, "width": 124, "height": 30 },
                    "pageId": 1,
                    "confidence": 100,
                    "extractedCharacters": null
                  }
                ]
              }
            ]
          }
        }
      ]
    }
  ]
}
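
A client will usually want to pull out the fields that need attention from a response like the one above. The following sketch walks resultItems, finds the UimData value, and collects every data entry that carries a fieldError or has mustConfirm set. The function name flagged_fields and the shape of its return value are hypothetical; the key names are taken from the sample response only, so other responses may contain fields this sketch does not handle.

```python
# Hypothetical helper: list extracted fields that need review.
def flagged_fields(response):
    """Return one dict per data entry with a fieldError or mustConfirm."""
    flagged = []
    for item in response.get("resultItems", []):
        # Each result item carries a name/value list; index it by name.
        values = {v["name"]: v["value"] for v in item.get("values", [])}
        uim = values.get("UimData") or {}
        for node in uim.get("nodeList") or []:
            for entry in node.get("data") or []:
                error = entry.get("fieldError")
                if error or entry.get("mustConfirm"):
                    flagged.append({
                        "nodeId": item.get("nodeId"),
                        "field": node.get("name"),
                        "value": entry.get("value"),
                        "errorCode": error.get("errorCode") if error else None,
                    })
    return flagged
```

Applied to the sample response, this reports the InvoiceNumber field with value 227628 and error code ER2208.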

Curl
curl -X POST 'https://capture.ot2.opentext.com/cp-rest/v2/session/services/extractpage' \
  -H 'Content-Type: application/hal+json' \
  -d '{
    "serviceProps": [
      {"name": "Env", "value": "D"},
      {"name": "IncludeOcrData", "value": false},
      {"name": "Project", "value": "InformationExtraction"}
    ],
    "requestItems": [
      {
        "nodeId": 1,
        "values": [
          {"name": "DocumentTypeName", "value": "TestWren"},
          {"name": "TemplateId", "value": ""},
          {"name": "PageIndex", "value": "0"}
        ],
        "files": [
          {"name": "Wren", "value": "F_0d7bbf6b26b74960b36aa26523ed4d10TIF", "contentType": "image/tiff", "fileType": "tif"}
        ]
      }
    ]
  }'
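
The same call can be prepared from Python with only the standard library. This is a sketch rather than an official client: build_extractpage_request is a hypothetical name, the base URL and Content-Type value are copied from the curl sample, and no authentication header is set because the sample shows none (a real deployment may well require one).

```python
import json
import urllib.request

# Base URL taken from the curl sample; substitute your own {{baseUrl}}.
BASE_URL = "https://capture.ot2.opentext.com/cp-rest/v2"


def build_extractpage_request(payload, base_url=BASE_URL):
    """Prepare (but do not send) a POST to the extractpage service."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/session/services/extractpage",
        data=body,
        headers={"Content-Type": "application/hal+json"},
        method="POST",
    )
```

To send the prepared request, pass it to urllib.request.urlopen and decode the response body with json.load.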
