Extract Text

POST https://api.pdfrest.com/extracted-text

Extract text from a PDF, with the option to include information on text style and position.

Required form-data body parameters:

  • One of:
    • file - File to be uploaded and processed
    • id - Alphanumeric ID (UUID) of existing file on server to be processed

Optional form-data body parameters:

  • full_text - Controls how all document text is returned as "fullText". Default: document.
    • document - "fullText" will be a string consisting of all extracted text.
    • by_page - "fullText" will be an object containing an array called "pages"; each element is an object containing a page number and the text from that page.
    • off - "fullText" will not be returned.
  • word_style - Get style information for each word. Set to on or off. Default: off.
  • word_coordinates - Get the position of each word. Set to on or off. Default: off.
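As a minimal sketch of how the parameters above fit together, the helper below validates the optional values and returns the form fields for a request that refers to an already-uploaded file by id. The function name build_extract_text_fields is hypothetical, the per-page value is assumed to be spelled by_page, and the argument defaults simply mirror the sample request body rather than a confirmed server default.

```python
# Allowed values, taken from the parameter list above.
# Assumption: the per-page option is spelled "by_page".
FULL_TEXT_VALUES = ("document", "by_page", "off")
ON_OFF_VALUES = ("on", "off")


def build_extract_text_fields(file_id, full_text="document",
                              word_style="off", word_coordinates="off"):
    """Return form fields for an id-based request (hypothetical helper)."""
    if not file_id:
        raise ValueError("file_id is required when no file is uploaded")
    if full_text not in FULL_TEXT_VALUES:
        raise ValueError(f"full_text must be one of {FULL_TEXT_VALUES}")
    for name, value in (("word_style", word_style),
                        ("word_coordinates", word_coordinates)):
        if value not in ON_OFF_VALUES:
            raise ValueError(f"{name} must be 'on' or 'off'")
    return {
        "id": file_id,
        "full_text": full_text,
        "word_style": word_style,
        "word_coordinates": word_coordinates,
    }
```

Validating locally is a design choice, not something the API requires; it only surfaces a misspelled option before the request is sent.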

Request Body

Name              Value                                   Datatype
file              (none)                                  null
id                xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx   string
full_text         document                                string
word_style        off                                     string
word_coordinates  off                                     string
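Because the parameters are sent as form-data, a request body like the one above has to be encoded as multipart/form-data. The sketch below does this for text-only fields using only the Python standard library. The helper name encode_multipart is hypothetical, and file uploads are deliberately out of scope; in practice an HTTP client library would usually handle this encoding.

```python
import uuid


def encode_multipart(fields, boundary=None):
    """Encode text-only form fields as multipart/form-data (illustrative helper).

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # blank line separates part headers from the value
        lines.append(str(value))
    lines.append(f"--{boundary}--")  # closing boundary
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


# Fields from the sample request body; the id is the placeholder shown there.
sample_fields = {
    "id": "xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "full_text": "document",
    "word_style": "off",
    "word_coordinates": "off",
}
sample_body, sample_content_type = encode_multipart(sample_fields)
```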

HEADERS

Key      Datatype  Required  Description
Api-Key  string
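To show where the Api-Key header goes, the sketch below builds (but does not send) a POST request to the endpoint with the standard library. The helper name make_extract_text_request and the tiny hand-written body are illustrative assumptions; the key shown is a placeholder, not a real credential.

```python
import urllib.request

ENDPOINT = "https://api.pdfrest.com/extracted-text"


def make_extract_text_request(api_key, body, content_type):
    """Build a POST request carrying the Api-Key header (hypothetical helper)."""
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={"Api-Key": api_key, "Content-Type": content_type},
    )


# Minimal multipart body with a single "id" field, written out by hand.
boundary = "example-boundary"
body = (
    f"--{boundary}\r\n"
    'Content-Disposition: form-data; name="id"\r\n'
    "\r\n"
    "xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx\r\n"
    f"--{boundary}--\r\n"
).encode("utf-8")

request = make_extract_text_request(
    "your-api-key",  # placeholder
    body,
    f"multipart/form-data; boundary={boundary}",
)
# Sending is left to the caller, for example:
#     with urllib.request.urlopen(request) as response:
#         raw = response.read()
```

Keeping construction separate from sending makes it possible to inspect the headers and body before any network call is made.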