Document - data source
You can use one or several documents as a data source. This section describes how to upload them.
Click on Document to add documents as a data source:
Upload
Upload your files (click or drag and drop on the dedicated area):
The following file formats are currently supported: .txt, .html, .md, .ods, .docx, .xlsx, .doc, .rtf, .odt, .csv, .pdf, .pptx
- Empty
- Files selected
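If you are preparing a large batch of files, it can help to check their extensions locally before uploading. The sketch below is not part of the product: `split_by_support` is a hypothetical helper, the extension set is copied from the list of supported formats above, and the case-insensitive comparison is an assumption about how the upload area treats names such as `notes.MD`.

```python
from pathlib import Path

# Extensions copied from the list of supported formats in this section.
SUPPORTED_EXTENSIONS = {
    ".txt", ".html", ".md", ".ods", ".docx", ".xlsx",
    ".doc", ".rtf", ".odt", ".csv", ".pdf", ".pptx",
}


def split_by_support(paths):
    """Return (supported, unsupported) lists of paths, judged by extension only.

    Hypothetical helper: it does not inspect file contents, and it assumes
    extensions are matched case-insensitively.
    """
    supported, unsupported = [], []
    for path in map(Path, paths):
        if path.suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(path)
        else:
            unsupported.append(path)
    return supported, unsupported


if __name__ == "__main__":
    ok, rejected = split_by_support(["homer.pdf", "notes.MD", "photo.png"])
    print("ready to upload:", [p.name for p in ok])
    print("not supported:", [p.name for p in rejected])
```

Image files such as `photo.png` are reported as unsupported here because they are not in the format list. Scanned pages would need to be inside one of the listed formats, such as a PDF, to be uploaded.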
Optical Character Recognition (OCR)
To activate the OCR feature, toggle the switch to the right of the document you want to apply OCR to.
OCR is the process of converting an image of text into a machine-readable text format. For example, if you scan a text document, your computer saves the scan as an image file. You can't use a text editor to edit, search, or count words in that file.
However, you can use OCR to convert the image into a text document, the contents of which will be stored as text data.
Once you have specified all your documents, click "Finish". You will be redirected to the data source page:
If you need some files to try out this data source, you can use this collection of PDFs about the Simpsons, which can be found here:
These documents come from the Simpsons wiki.
Here you can access a search interface based on this PDF collection. Try it yourself!