Query PDF
POST https://api.pdfrest.com/pdf-info
Get detailed information about a PDF document and its contents to assess the current state of the file and drive conditional processing.
Some PDF conditions can prevent one or more queries from completing. For example, if the document is password-protected, corrupted, or not actually a PDF, not every query can be processed. The output response always includes an allQueriesProcessed field with a true or false value. When this is false, the response also includes a warning field containing a human-readable string that explains why not all queries could be processed.
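The check described above can be sketched in Python as follows. This is a minimal sketch, assuming the response has already been decoded from JSON into a dict and that allQueriesProcessed arrives as a JSON boolean; the helper name `summarize_pdf_info` and the sample payloads are hypothetical and not part of the API.

```python
from typing import Optional, Tuple


def summarize_pdf_info(response: dict) -> Tuple[bool, Optional[str]]:
    """Report whether every query ran and, if not, the service's warning.

    Assumes `response` is the decoded response body (hypothetical shape).
    """
    complete = response.get("allQueriesProcessed") is True
    # The warning field is only expected when allQueriesProcessed is false.
    warning = None if complete else response.get("warning", "no warning supplied")
    return complete, warning


# Hypothetical payloads, for illustration only.
ok_response = {"allQueriesProcessed": True, "page_count": 3}
bad_response = {"allQueriesProcessed": False, "warning": "example warning text"}

for payload in (ok_response, bad_response):
    complete, warning = summarize_pdf_info(payload)
    if complete:
        print("all queries processed")
    else:
        print("incomplete:", warning)
```

Checking the flag before reading any individual result keeps conditional processing from acting on values that were never populated.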
Required form-data body parameters:
- One of:
file
- File to be uploaded and processed
id
- Alphanumeric ID (UUID) of an existing file on the server to be processed
queries
- Specifies the metadata you wish to retrieve from your input document. Select one or more values from the following:
- tagged - Checks for the presence of structure tags in the input document. Returns true or false
- image_only - Checks whether the document is 'image only', meaning it consists solely of embedded image files, one per page, with no text or other features common to PDF documents apart from some metadata. Returns true or false
- title - The title of the PDF as listed in the metadata. Returns a string which may be empty if the document does not have a title
- subject - The subject of the PDF as listed in the metadata. Returns a string which may be empty if the document does not have a subject
- author - The author of the PDF as listed in the metadata. Returns a string which may be empty if the document does not have an author
- producer - The producer of the PDF as listed in the metadata. Returns a string which may be empty if the document does not have a producer
- creator - The creator of the PDF as listed in the metadata. Returns a string which may be empty if the document does not have a creator
- creation_date - The creation date of the PDF as listed in the metadata. Returns a string which may be empty if the document does not have a creation date
- modified_date - The most recent modification date of the PDF as listed in the metadata. Returns a string which may be empty if the document does not have a modification date
- keywords - The keywords of the PDF as listed in the metadata. Returns a string which may be empty if the document does not have keywords
- doc_language - The language that the file claims to be written in. Returns a string
- page_count - The number of pages in the PDF document. Returns an integer
- contains_annotations - Checks whether the document contains annotations, such as notes, highlighted text, file attachments, crossed out text, and text callout boxes. Returns true or false
- contains_signature - Checks if the document contains any digital signatures. Returns true or false
- pdf_version - Retrieves the version of the PDF standard that the document was created with. Returns a string of the form X.Y.Z where X, Y, and Z are the major, minor, and extension versions respectively
- file_size - Retrieves the size of the input file in bytes. Returns an integer
- filename - The name of the input file. Returns a string
- restrict_permissions_set - Checks whether the document has permission restrictions set that prevent actions such as printing, copying, or signing. Returns true or false
- contains_xfa - Checks whether the document contains XFA forms. Returns true or false
- contains_acroforms - Checks whether the document contains AcroForms. Returns true or false
- contains_javascript - Checks whether the document contains JavaScript. Returns true or false
- contains_transparency - Checks whether the document contains transparent objects. Returns true or false
- contains_embedded_file - Checks whether the document contains one or more embedded files. Returns true or false
- uses_embedded_fonts - Checks whether the document contains fully embedded fonts. Returns true or false
- uses_nonembedded_fonts - Checks whether the document contains non-embedded fonts. Returns true or false
- pdfa - Checks whether the document claims and conforms to a PDF/A standard. Returns true or false
- pdfua_claim - Checks whether the document claims to conform to a PDF/UA standard. Returns true or false
- pdfe_claim - Checks whether the document claims to conform to a PDF/E standard. Returns true or false
- pdfx_claim - Checks whether the document claims to conform to a PDF/X standard. Returns true or false
- requires_password_to_open - Checks whether the document requires a password to open. Returns true or false
- NOTE: A document that requires a password cannot be opened by this route, so most other queries will not be able to return information.
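Because the queries parameter takes names from a fixed list, a client can validate them before sending anything. The following is a minimal sketch, assuming the names are spelled as in the Request Body example and joined with commas as shown there; `KNOWN_QUERIES` and `build_queries` are hypothetical helper names, not part of the API.

```python
# Query names as they appear in the Request Body example.
KNOWN_QUERIES = frozenset({
    "tagged", "image_only", "title", "subject", "author", "producer",
    "creator", "creation_date", "modified_date", "keywords", "doc_language",
    "page_count", "contains_annotations", "contains_signature", "pdf_version",
    "file_size", "filename", "restrict_permissions_set", "contains_xfa",
    "contains_acroforms", "contains_javascript", "contains_transparency",
    "contains_embedded_file", "uses_embedded_fonts", "uses_nonembedded_fonts",
    "pdfa", "requires_password_to_open", "pdfua_claim", "pdfe_claim",
    "pdfx_claim",
})


def build_queries(names):
    """Validate query names and join them into a comma-separated string."""
    names = list(names)
    unknown = [name for name in names if name not in KNOWN_QUERIES]
    if unknown:
        raise ValueError("unknown query name(s): " + ", ".join(unknown))
    if not names:
        raise ValueError("at least one query name is required")
    # dict.fromkeys drops duplicates while keeping the caller's order.
    return ",".join(dict.fromkeys(names))


print(build_queries(["requires_password_to_open", "page_count", "tagged"]))
```

Including requires_password_to_open alongside other queries gives the caller a way to tell why the remaining values might be missing, in line with the note above.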
Request Body
Name | Value | Datatype |
---|---|---|
file | | null |
id | xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | string |
queries | tagged,image_only,title,subject,author,producer,creator,creation_date,modified_date,keywords,doc_language,page_count,contains_annotations,contains_signature,pdf_version,file_size,filename,restrict_permissions_set,contains_xfa,contains_acroforms,contains_javascript,contains_transparency,contains_embedded_file,uses_embedded_fonts,uses_nonembedded_fonts,pdfa,requires_password_to_open,pdfua_claim,pdfe_claim,pdfx_claim | string |
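A request carrying the id and queries fields could be assembled with the Python standard library roughly as follows. This is a sketch under stated assumptions: the multipart encoding is written by hand, the Api-Key header is the one named in the headers table, and `build_pdf_info_request` is a hypothetical helper name. The request is only constructed here, not sent.

```python
import urllib.request
import uuid

PDF_INFO_URL = "https://api.pdfrest.com/pdf-info"


def build_pdf_info_request(api_key: str, file_id: str, queries: str) -> urllib.request.Request:
    """Build a multipart/form-data POST for a file already on the server."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in (("id", file_id), ("queries", queries)):
        # Each form field is one part, delimited by the boundary string.
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    body = ("".join(parts) + f"--{boundary}--\r\n").encode("utf-8")
    return urllib.request.Request(
        PDF_INFO_URL,
        data=body,
        method="POST",
        headers={
            "Api-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Placeholder credentials and ID; replace with real values before sending.
request = build_pdf_info_request(
    "YOUR_API_KEY", "xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", "page_count,tagged"
)
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` and decoding the response body, which is assumed here to be JSON. Uploading a file instead of referencing an id would need an additional part with a filename and the file's bytes.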
HEADERS
Key | Datatype | Required | Description |
---|---|---|---|
Api-Key | string |