Extract Page Service.
POST {{baseUrl}}/session/services/extractpage
The Extract Page Service will perform extraction on each item submitted and return a UIM object containing information from the result.
Service Properties
Env - Metadata environment identifier. Value is one of D, T, or P. Default value is P.
IncludeOcrData - Boolean. If true, the returned UIM data object contains extracted character information; otherwise it does not.
Project - Optional string. The recognition project to use to classify the document. Valid values are Default for Advanced Recognition and InformationExtraction for Information Extraction. If omitted, Default is used.
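The properties above can be assembled into the name/value list that the request body expects. The following is a minimal sketch, not part of the service itself: the helper name build_service_props is hypothetical, and the client-side validation is an assumption added for illustration (the service may report invalid values in its own way).

```python
VALID_ENVS = {"D", "T", "P"}
VALID_PROJECTS = {"Default", "InformationExtraction"}


def build_service_props(env="P", include_ocr_data=False, project=None):
    """Build a serviceProps list (hypothetical helper, not a vendor API)."""
    if env not in VALID_ENVS:
        raise ValueError(f"Env must be one of {sorted(VALID_ENVS)}, got {env!r}")

    props = [
        {"name": "Env", "value": env},
        {"name": "IncludeOcrData", "value": bool(include_ocr_data)},
    ]

    # Project is optional; when it is left out the service uses Default.
    if project is not None:
        if project not in VALID_PROJECTS:
            raise ValueError(f"Project must be one of {sorted(VALID_PROJECTS)}")
        props.append({"name": "Project", "value": project})

    return props
```

Calling build_service_props("D", False, "InformationExtraction") yields the same three entries that appear in the sample request body.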
Values Per Request Item
DocumentTypeName - String. The Document Type name to be used for extraction. This is optional if the TemplateId property is passed. This value is case sensitive when the request uses the InformationExtraction project.
PageIndex - Number. The zero-based page index within the Document Type. If omitted, it defaults to 0. This is optional if the TemplateId property is passed. Unused if the project is InformationExtraction.
TemplateId - String. The image template ID assigned in the recognition project that should be used for extraction. If not supplied, then the DocumentTypeName should be specified. Unused if the project is InformationExtraction.
OcrLocale - Optional string. The locale that is used for reading text from the image when the request uses the InformationExtraction project. If omitted, text is read using the extraction profile's locale.
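Which values apply depends on the project, so a small helper can keep request items consistent. This is a sketch under stated assumptions: build_item_values is a hypothetical name, DocumentTypeName is treated as required for InformationExtraction because TemplateId is unused there (an inference from the descriptions above), and values documented as unused for a project are simply left out.

```python
def build_item_values(project="Default", document_type_name=None,
                      page_index=None, template_id=None, ocr_locale=None):
    """Build the values list for one request item (hypothetical helper)."""

    def pair(name, value):
        return {"name": name, "value": value}

    if project == "InformationExtraction":
        # Assumption: TemplateId is unused here, so a document type is needed.
        if not document_type_name:
            raise ValueError("DocumentTypeName is needed for InformationExtraction")
        values = [pair("DocumentTypeName", document_type_name)]  # case sensitive
        if ocr_locale:
            values.append(pair("OcrLocale", ocr_locale))
        return values

    # Default project: either a template ID or a document type must be given.
    if not template_id and not document_type_name:
        raise ValueError("Pass TemplateId or DocumentTypeName")

    values = []
    if document_type_name:
        values.append(pair("DocumentTypeName", document_type_name))
        # PageIndex is zero-based; the service defaults to 0 when it is omitted.
        if page_index is not None:
            values.append(pair("PageIndex", page_index))
    if template_id:
        values.append(pair("TemplateId", template_id))
    return values
```

Note that the sample request body sends PageIndex as the string "0" although the property is documented as a number; the sketch passes through whatever type the caller supplies.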
Files Per Request Item
This service supports JPEG, PNG, and TIFF files. There can only be one file per request item object. It can either be an embedded file or a reference to a file ID previously posted to the Files Resource. The File Type property for the file is ignored for this service.
If the DocumentTypeName and PageIndex are specified, then the data will be extracted based on the index of the template in the order of the template names (not IDs) in the specified document type. If the PageIndex is greater than the number of templates in the document type, then the image is not processed for data extraction.
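The template lookup described above can be pictured with a short sketch. It is only an illustration of the documented rule, not the service's implementation: select_template and the ID-to-name mapping are hypothetical, plain string ordering is assumed for the template names, and any index outside the list (including one equal to the template count) is treated as "not processed".

```python
def select_template(templates, page_index=0):
    """Return the template ID at page_index, ordering templates by name.

    templates maps template ID -> template name (hypothetical data shape).
    Returns None when the index is out of range, meaning no extraction.
    """
    # Order by template name, not by ID. Assumption: plain string ordering.
    ordered = sorted(templates.items(), key=lambda id_and_name: id_and_name[1])

    if page_index < 0 or page_index >= len(ordered):
        return None

    template_id, _name = ordered[page_index]
    return template_id
```

For example, with templates {"t3": "Alpha", "t1": "Charlie", "t2": "Bravo"}, a PageIndex of 0 selects "t3" because "Alpha" sorts first by name, even though "t1" sorts first by ID.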
Request Body
{"serviceProps":[{"name":"Env","value":"D"},{"name":"IncludeOcrData","value":false},{"name":"Project","value":"InformationExtraction"}],"requestItems":[{"nodeId":1,"values":[{"name":"DocumentTypeName","value":"TestWren"},{"name":"TemplateId","value":""},{"name":"PageIndex","value":"0"}],"files":[{"name":"Wren","value":"F_0d7bbf6b26b74960b36aa26523ed4d10TIF","contentType":"image/tiff","fileType":"tif"}]}]}
HEADERS
Key | Datatype | Required | Description |
---|---|---|---|
Content-Type | string | | |
Accept | string | | |
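As a rough illustration of how the endpoint, body, and headers fit together, the sketch below prepares the POST request with Python's standard library without sending it. The function name build_extract_request is hypothetical, and the header values (application/json) are an assumption, since the table above lists the header names only. Session or authentication handling is not covered here.

```python
import json
import urllib.request


def build_extract_request(base_url, body):
    """Prepare (but do not send) a POST to the extractpage service."""
    url = base_url.rstrip("/") + "/session/services/extractpage"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # Assumed values; the documentation lists only the header names.
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )
```

The prepared request can then be passed to urllib.request.urlopen, with base_url replaced by the actual value of {{baseUrl}} for the deployment.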
RESPONSES
status: OK
{"returnStatus":{"status":200,"code":"OK0000","message":"","server":"WS-S6ac611ec08f243cdb3c9977004953b1fIS"},"licenseUsedPercent":0,"id":"ReqCPEXTRAC_3K3_0031049","serviceName":"extractpage","executionMilliSeconds":600,"licensePagesUsed":1,"licensePagesUsed2":0,"resultItems":[{"nodeId":1,"errorCode":"","errorMessage":"","values":[{"name":"ClassificationId","value":"d907a548196c4e35837dad51954cd3ed"},{"name":"UimData","value":{"docType":"TestWren","locale":"en-US","flaggedReason":null,"extractionId":null,"nodeList":[{"name":"InvoiceNumber","isArray":false,"indexFieldType":"Number","labelText":"Invoice No.","isRequired":true,"controlType":"TextBox","data":[{"arrayIndex":0,"value":227628,"fieldError":{"errorCode":"ER2208","recoverable":false,"message":"Out of Bounds:Valid values: 1000 ➜ 10210"},"mustConfirm":true,"choices":null,"locationRect":{"left":639,"top":144,"width":124,"height":30},"pageId":1,"confidence":100,"extractedCharacters":null}]}]}}]}]}
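A client usually needs only a few fields from this response. The following sketch walks resultItems, finds the UimData value, and flattens each extracted field into one row. The function name summarize_fields and the row layout are hypothetical; the key names come from the sample response above, and treating an empty errorCode as "no item-level error" is an assumption based on that sample.

```python
def summarize_fields(response):
    """Flatten extracted fields from an extractpage response into rows."""
    rows = []
    for item in response.get("resultItems", []):
        # Assumption: an empty errorCode means the item was processed.
        if item.get("errorCode"):
            continue
        values = {v["name"]: v["value"] for v in item.get("values", [])}
        uim = values.get("UimData") or {}
        for node in uim.get("nodeList", []):
            for datum in node.get("data", []):
                field_error = datum.get("fieldError") or {}
                rows.append({
                    "nodeId": item.get("nodeId"),
                    "field": node.get("name"),
                    "value": datum.get("value"),
                    "confidence": datum.get("confidence"),
                    "mustConfirm": datum.get("mustConfirm"),
                    "errorCode": field_error.get("errorCode"),
                })
    return rows


# Trimmed copy of the sample response, keeping only the keys used above.
sample = {"resultItems": [{"nodeId": 1, "errorCode": "", "values": [
    {"name": "ClassificationId", "value": "d907a548196c4e35837dad51954cd3ed"},
    {"name": "UimData", "value": {"docType": "TestWren", "nodeList": [
        {"name": "InvoiceNumber", "data": [{
            "arrayIndex": 0, "value": 227628, "mustConfirm": True,
            "confidence": 100,
            "fieldError": {"errorCode": "ER2208", "recoverable": False},
        }]},
    ]}},
]}]}
```

Running summarize_fields(sample) gives a single row for InvoiceNumber with value 227628, confidence 100, and field error code ER2208, which shows that a field can be extracted with full confidence and still carry a validation error that must be confirmed.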