Document - Data Source

Use one or more documents as a data source. This section covers how to upload documents.

Click on Document to add documents as a data source:

Add data source menu

Upload

Upload your files by clicking the dedicated area or by dragging and dropping them onto it:

Info
The following file formats are currently supported: .txt, .html, .md, .ods, .docx, .xlsx, .doc, .rtf, .odt, .csv, .pdf, .pptx
document empty
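If you prepare uploads from a local folder, it can help to check file extensions against the supported list before dragging files in. The following is a minimal sketch, not part of the product: the function name `split_by_support` is hypothetical, and the case-insensitive extension comparison is an assumption rather than documented behavior.

```python
from pathlib import Path

# Extensions copied from the supported-formats note above.
SUPPORTED_EXTENSIONS = {
    ".txt", ".html", ".md", ".ods", ".docx", ".xlsx",
    ".doc", ".rtf", ".odt", ".csv", ".pdf", ".pptx",
}


def split_by_support(filenames):
    """Split file names into (supported, unsupported) lists.

    Assumption: extensions are compared case-insensitively, so
    "Report.PDF" is treated like "report.pdf".
    """
    supported, unsupported = [], []
    for name in filenames:
        if Path(name).suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(name)
        else:
            unsupported.append(name)
    return supported, unsupported


if __name__ == "__main__":
    ok, rejected = split_by_support(["notes.md", "scan.png", "Report.PDF"])
    print("Ready to upload:", ok)
    print("Not supported:", rejected)
```

Files that end up in the second list would need to be converted to one of the supported formats before upload.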

Optical Character Recognition (OCR)

To activate OCR for a document, toggle the switch on the right side of that document's row.

OCR converts an image of text into machine-readable text. For example, scanning a document produces an image file whose text cannot be edited, searched, or counted directly.

OCR converts that image into a text document whose contents are stored as text data.

documents list OCR

Once you have added all your documents, click "Finish". The data source page opens:

documents list

Batch Actions on Selected Files

You can select multiple files at once using the checkboxes on the left side of each row.

Once one or more files are selected, a batch action toolbar appears above the table with the following actions:

  • Download: downloads all selected PDF files at once. The button is active only when at least one selected file is a PDF or audio file in Indexed status.
  • Delete: permanently removes all selected files from the connector.
Document table with batch download and delete toolbar
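The rule that controls the Download button can be expressed as a small predicate. This is only an illustrative sketch of the behavior described above, not the product's code: the `SelectedFile` type, its `kind` and `status` fields, and the string values used for them are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SelectedFile:
    name: str
    kind: str    # hypothetical values: "pdf", "audio", "docx", ...
    status: str  # hypothetical values: "Indexed", "Pending", ...


# File kinds that make a selection downloadable, per the rule above.
DOWNLOADABLE_KINDS = {"pdf", "audio"}


def download_enabled(selection):
    """Return True when at least one selected file is a PDF or audio
    file in Indexed status, mirroring the toolbar rule described above."""
    return any(
        f.kind in DOWNLOADABLE_KINDS and f.status == "Indexed"
        for f in selection
    )


if __name__ == "__main__":
    selection = [
        SelectedFile("episode-guide.pdf", "pdf", "Indexed"),
        SelectedFile("notes.docx", "docx", "Indexed"),
    ]
    print(download_enabled(selection))
```

Under this reading, a selection made up only of non-PDF, non-audio files, or of files that are not yet indexed, leaves the button inactive.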
Tip

To try out this data source, use this collection of PDFs about the Simpsons:

These documents come from the Simpsons wiki.

Access a live search interface built on this PDF collection: