I used Amazon Kendra to build a semantic search engine. I took two datasets for this task: a CSV file and an Excel file. The CSV dataset is mostly text. The Excel dataset is mostly numbers.

The data is first added to a data source, which is linked to a Kendra index. When the data source is synced, Kendra crawls and then indexes the documents.

What I have tried:

When I sync the data source with the text-based CSV files, crawling and indexing finish within 5 minutes, but the same sync on the Excel file never seems to finish. The Excel file took 2 hours to crawl, and indexing has been running for the past 7 hours. I converted the Excel file to CSV and tried again, but the issue persists. Neither dataset contains more than 20 rows, so the size of the data doesn't seem to be the issue.

The disclaimer on the website says:

Amazon Kendra is syncing the following data source: 'dq-rule-fail'. It can take from a few minutes to a few hours. Syncing is a two-step process. First documents are crawled to determine the ones to index. Then the selected documents are indexed. Sync speeds are limited by factors such as remote repository throughput and throttling, network bandwidth, and the size of documents.

None of the factors listed in the disclaimer seems to apply in my case.
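One way to see which of the two steps a sync is stuck in is to read the sync job history through the API instead of the console. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The function names `fetch_sync_history` and `summarize_sync_jobs` are hypothetical helpers, and the index and data source IDs are placeholders; only `list_data_source_sync_jobs` is the actual Kendra call.

```python
from typing import Any, Dict, List


def fetch_sync_history(index_id: str, data_source_id: str) -> Dict[str, Any]:
    """Call Kendra's ListDataSourceSyncJobs API (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the summarizer below works without it

    kendra = boto3.client("kendra")
    return kendra.list_data_source_sync_jobs(Id=data_source_id, IndexId=index_id)


def summarize_sync_jobs(response: Dict[str, Any]) -> List[str]:
    """Turn a ListDataSourceSyncJobs response into one readable line per job."""
    lines = []
    for job in response.get("History", []):
        metrics = job.get("Metrics", {})
        line = "{}: {} scanned={} added={} failed={}".format(
            job.get("ExecutionId", "?"),
            job.get("Status", "?"),  # e.g. SYNCING (crawl) or SYNCING_INDEXING
            metrics.get("DocumentsScanned", "?"),
            metrics.get("DocumentsAdded", "?"),
            metrics.get("DocumentsFailed", "?"),
        )
        # Failed or incomplete jobs may carry an error message worth surfacing.
        if job.get("ErrorMessage"):
            line += " error=" + job["ErrorMessage"]
        lines.append(line)
    return lines


if __name__ == "__main__":
    # Placeholder IDs: replace with the real index and data source IDs.
    history = fetch_sync_history("YOUR_INDEX_ID", "YOUR_DATA_SOURCE_ID")
    for entry in summarize_sync_jobs(history):
        print(entry)
```

A job that stays in `SYNCING_INDEXING` with zero documents added, or one that reports failed documents or an error message, would point to a problem with the document itself and not with throughput or bandwidth.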

My question: Is Kendra not designed to index numerical data? How do I make sure my data is properly synced?
Posted
Updated 21-Sep-23 21:29pm

1 solution

We have suggested more than once that this site is not the best place for support of Amazon's products. Please make use of the Amazon Kendra support forum.
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


