Extract Page Service

POST {{baseUrl}}/session/services/extractpage

The Extract Page Service performs extraction on each submitted item and returns a UIM object containing the extraction results.

Service Properties

  • Env - Metadata environment identifier. Value is one of D, T or P. Default value is P.

  • IncludeOcrData - Boolean. If true, the returned UIM data object contains the extracted character information. Otherwise, that information is omitted.

  • Project - Optional string. The recognition project to use to classify the document. Valid values are Default (Advanced Recognition) and InformationExtraction (Information Extraction). If omitted, Default is used.
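The service properties above can be assembled as a list of name/value pairs, as in the sample request body. The following is a minimal sketch, not an official client: the function name `build_service_props` is hypothetical, and the validation only mirrors the values listed above.

```python
VALID_ENVS = {"D", "T", "P"}
VALID_PROJECTS = {"Default", "InformationExtraction"}


def build_service_props(env="P", include_ocr_data=False, project=None):
    """Build the serviceProps list (hypothetical helper, not part of the service)."""
    if env not in VALID_ENVS:
        raise ValueError(f"Env must be one of D, T or P, got {env!r}")
    if project is not None and project not in VALID_PROJECTS:
        raise ValueError(f"Unknown Project value: {project!r}")
    props = [
        {"name": "Env", "value": env},
        {"name": "IncludeOcrData", "value": bool(include_ocr_data)},
    ]
    # Project is optional; when it is left out the service uses Default.
    if project is not None:
        props.append({"name": "Project", "value": project})
    return props
```

Omitting `project` leaves the property out of the list entirely, relying on the documented default.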

Values Per Request Item

  • DocumentTypeName - String. The Document Type name to be used for extraction. This is optional if the TemplateId property is passed. This value is case-sensitive when the request uses the InformationExtraction project.

  • PageIndex - Number. The zero-based page index within the document type. Defaults to 0 if omitted. This is optional if the TemplateId property is passed. Unused if the project is InformationExtraction.

  • TemplateId - String. The image template ID assigned in the recognition project that should be used for extraction. If not supplied, then the DocumentTypeName should be specified. Unused if the project is InformationExtraction.

  • OcrLocale - Optional string. The locale that is used for reading text from the image when the request uses the InformationExtraction project. If omitted, text is read using the extraction profile's locale.
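The rules for the per-item values can be sketched as a small builder. This is an illustration under stated assumptions: `build_item_values` is a hypothetical name, requiring DocumentTypeName for the InformationExtraction project is an inference from TemplateId being unused there, and PageIndex is sent as a number even though the sample request body passes it as a string.

```python
def build_item_values(project="Default", document_type_name=None,
                      page_index=None, template_id=None, ocr_locale=None):
    """Return the "values" list for one request item (hypothetical helper)."""
    values = []
    if project == "InformationExtraction":
        # TemplateId and PageIndex are unused for this project, so this sketch
        # requires DocumentTypeName (an inference, not a documented rule).
        if not document_type_name:
            raise ValueError("DocumentTypeName is required for InformationExtraction")
        values.append({"name": "DocumentTypeName", "value": document_type_name})
        if ocr_locale:
            values.append({"name": "OcrLocale", "value": ocr_locale})
        return values
    if template_id:
        # With a TemplateId, DocumentTypeName and PageIndex are optional.
        values.append({"name": "TemplateId", "value": template_id})
    elif document_type_name:
        values.append({"name": "DocumentTypeName", "value": document_type_name})
        # PageIndex defaults to 0 when omitted.
        index = 0 if page_index is None else page_index
        values.append({"name": "PageIndex", "value": index})
    else:
        raise ValueError("Specify either TemplateId or DocumentTypeName")
    return values
```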

Files Per Request Item

This service supports JPEG, PNG, and TIFF files. Each request item object can contain only one file, which is either an embedded file or a reference to a file ID previously posted to the Files Resource. The File Type property for the file is ignored for this service.
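The one-file-per-item rule can be checked before a request is sent. The sketch below assumes the file entry shape shown in the sample request body (name, value, contentType, fileType); `build_request_item` is a hypothetical helper, and the JPEG and PNG MIME types are standard values that the service is assumed, not confirmed, to accept.

```python
# image/tiff appears in the sample request; the other two are assumptions.
SUPPORTED_CONTENT_TYPES = {"image/jpeg", "image/png", "image/tiff"}


def build_request_item(node_id, values, files):
    """Assemble one request item, enforcing exactly one supported file."""
    if len(files) != 1:
        raise ValueError("Each request item must contain exactly one file")
    content_type = files[0].get("contentType")
    if content_type not in SUPPORTED_CONTENT_TYPES:
        raise ValueError(f"Unsupported content type: {content_type!r}")
    return {"nodeId": node_id, "values": values, "files": files}


# Usage: a reference to a file ID previously posted to the Files Resource,
# taken from the sample request body.
item = build_request_item(
    1,
    [{"name": "DocumentTypeName", "value": "TestWren"}],
    [{"name": "Wren", "value": "F_0d7bbf6b26b74960b36aa26523ed4d10TIF",
      "contentType": "image/tiff", "fileType": "tif"}],
)
```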

If DocumentTypeName and PageIndex are specified, the template used for extraction is selected by index from the templates in the specified document type, ordered by template name (not ID). If the PageIndex is greater than the number of templates in the document type, then the image is not processed for data extraction.
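The selection rule can be illustrated with a short sketch. It is a model of the described behavior, not the service's actual code: `select_template` and the template names are hypothetical, plain string ordering is assumed for the sort, and any index outside the zero-based range is treated as "not processed".

```python
def select_template(template_names, page_index=0):
    """Pick a template by zero-based index after ordering by name.

    Returns None when the index falls outside the list, which stands in
    for "the image is not processed for data extraction".
    """
    ordered = sorted(template_names)  # order by template name, not by ID
    if page_index < 0 or page_index >= len(ordered):
        return None
    return ordered[page_index]


# Hypothetical template names for a two-page document type.
names = ["Page B", "Page A"]
first = select_template(names, 0)
second = select_template(names, 1)
missing = select_template(names, 5)
```

Here `first` is "Page A" and `second` is "Page B" because the names are sorted before indexing, while `missing` is None.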

Request Body

{"serviceProps":[{"name":"Env","value":"D"},{"name":"IncludeOcrData","value":false},{"name":"Project","value":"InformationExtraction"}],"requestItems":[{"nodeId":1,"values":[{"name":"DocumentTypeName","value":"TestWren"},{"name":"TemplateId","value":""},{"name":"PageIndex","value":"0"}],"files":[{"name":"Wren","value":"F_0d7bbf6b26b74960b36aa26523ed4d10TIF","contentType":"image/tiff","fileType":"tif"}]}]}
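A request with this body could be prepared as follows. This is a minimal sketch using only the Python standard library: `build_extract_request` is a hypothetical helper, the `application/json` header values are assumptions (the header list gives only a datatype), and any authentication the server requires is not covered. The request is built but not sent.

```python
import json
import urllib.request


def build_extract_request(base_url, service_props, request_items):
    """Prepare (but do not send) a POST to the extractpage service."""
    body = json.dumps(
        {"serviceProps": service_props, "requestItems": request_items}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/session/services/extractpage",
        data=body,
        method="POST",
        # Assumed header values; the documentation lists only the header names.
        headers={"Content-Type": "application/json", "Accept": "application/json"},
    )


# Usage with a placeholder base URL standing in for {{baseUrl}}.
request = build_extract_request(
    "https://example.invalid/api",
    [{"name": "Env", "value": "D"}],
    [{"nodeId": 1, "values": [], "files": []}],
)
# To send it: urllib.request.urlopen(request) and json.load() the response.
```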

HEADERS

Key            Datatype    Required    Description
Content-Type   string
Accept         string

RESPONSES

status: OK

{"returnStatus":{"status":200,"code":"OK0000","message":"","server":"WS-S6ac611ec08f243cdb3c9977004953b1fIS"},"licenseUsedPercent":0,"id":"ReqCPEXTRAC_3K3_0031049","serviceName":"extractpage","executionMilliSeconds":600,"licensePagesUsed":1,"licensePagesUsed2":0,"resultItems":[{"nodeId":1,"errorCode":"","errorMessage":"","values":[{"name":"ClassificationId","value":"d907a548196c4e35837dad51954cd3ed"},{"name":"UimData","value":{"docType":"TestWren","locale":"en-US","flaggedReason":null,"extractionId":null,"nodeList":[{"name":"InvoiceNumber","isArray":false,"indexFieldType":"Number","labelText":"Invoice No.","isRequired":true,"controlType":"TextBox","data":[{"arrayIndex":0,"value":227628,"fieldError":{"errorCode":"ER2208","recoverable":false,"message":"Out of Bounds:Valid values: 1000 ➜ 10210"},"mustConfirm":true,"choices":null,"locationRect":{"left":639,"top":144,"width":124,"height":30},"pageId":1,"confidence":100,"extractedCharacters":null}]}]}}]}]}
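The extracted fields in a response of this shape can be flattened for review. The sketch below relies only on the keys visible in the sample response; `summarize_fields` is a hypothetical helper, and other responses may carry keys or structures not handled here.

```python
def summarize_fields(response):
    """Flatten UimData into one row per extracted field value."""
    rows = []
    for item in response.get("resultItems", []):
        # Each result item carries name/value pairs; UimData holds the fields.
        values = {v["name"]: v["value"] for v in item.get("values", [])}
        uim = values.get("UimData") or {}
        for node in uim.get("nodeList", []):
            for entry in node.get("data", []):
                error = entry.get("fieldError") or {}
                rows.append({
                    "nodeId": item.get("nodeId"),
                    "field": node.get("name"),
                    "value": entry.get("value"),
                    "confidence": entry.get("confidence"),
                    "errorCode": error.get("errorCode"),
                    "mustConfirm": entry.get("mustConfirm"),
                })
    return rows
```

For the sample response, this yields a single row for InvoiceNumber with value 227628, confidence 100, error code ER2208 and mustConfirm set to true.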