Amazon Kendra's performance on data rich in text vs data rich in numbers

Question

I used Amazon Kendra to build a semantic search engine. I tested it with two datasets: a CSV file and an Excel file. The CSV dataset is mostly text; the Excel dataset is mostly numbers.

The data is first added to a data source, which is linked to a Kendra index. When the data source is synced, the Kendra index crawls the documents and then indexes them.
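The progress of such a sync can also be inspected outside the console. The following is a minimal sketch, assuming the boto3 SDK and its Kendra `list_data_source_sync_jobs` call; the helper names `summarize_sync_jobs` and `fetch_sync_history` and the sample records are hypothetical and only illustrate the shape of the response.

```python
from typing import Any, Dict, List


def summarize_sync_jobs(history: List[Dict[str, Any]]) -> List[str]:
    """Turn the 'History' list of a sync-job listing into one short line per job."""
    lines = []
    for job in history:
        # Kendra reports the per-job document counters as strings.
        metrics = job.get("Metrics", {})
        line = "{id}: {status} scanned={scanned} added={added} failed={failed}".format(
            id=job.get("ExecutionId", "?"),
            status=job.get("Status", "UNKNOWN"),
            scanned=metrics.get("DocumentsScanned", "0"),
            added=metrics.get("DocumentsAdded", "0"),
            failed=metrics.get("DocumentsFailed", "0"),
        )
        if job.get("ErrorMessage"):
            line += " error=" + job["ErrorMessage"]
        lines.append(line)
    return lines


def fetch_sync_history(index_id: str, data_source_id: str) -> List[Dict[str, Any]]:
    """Fetch sync jobs from Kendra. Needs boto3 and AWS credentials (assumed)."""
    import boto3  # imported lazily so the offline example below runs without it

    client = boto3.client("kendra")
    response = client.list_data_source_sync_jobs(Id=data_source_id, IndexId=index_id)
    return response.get("History", [])


# Hypothetical sample records shaped like the API response, so the sketch runs offline.
sample_history = [
    {"ExecutionId": "job-1", "Status": "SYNCING_INDEXING",
     "Metrics": {"DocumentsScanned": "1", "DocumentsAdded": "0", "DocumentsFailed": "0"}},
    {"ExecutionId": "job-0", "Status": "FAILED", "ErrorMessage": "example error",
     "Metrics": {"DocumentsScanned": "1", "DocumentsAdded": "0", "DocumentsFailed": "1"}},
]
summary = summarize_sync_jobs(sample_history)
for line in summary:
    print(line)
```

With real credentials, `summarize_sync_jobs(fetch_sync_history(index_id, data_source_id))` would list the actual jobs. A job that stays in an indexing state with nothing added, or that reports failed documents or an error message, points at the documents themselves rather than at crawl speed.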

What I have tried:

When I sync the data source with the text-based CSV files, crawling and indexing finish within 5 minutes, but the same sync on the Excel file seems to take forever. The Excel file took 2 hours to crawl, and indexing has now been running for 7 hours. I converted the Excel file to CSV and tried again, but the issue persists. Each dataset has no more than 20 rows, so the size of the data doesn't seem to be the issue.

The disclaimer on the website says:

Amazon Kendra is syncing the following data source: 'dq-rule-fail'. It can take from a few minutes to a few hours. Syncing is a two-step process. First documents are crawled to determine the ones to index. Then the selected documents are indexed. Sync speeds are limited by factors such as remote repository throughput and throttling, network bandwidth, and the size of documents.

None of the factors listed in the disclaimer seems to apply in my case.

My question: Is Kendra not designed to index numerical data? How do I make sure my data is synced properly?

Posted 21-Sep-23 20:49pm

Apoorva666

Updated 21-Sep-23 21:29pm

1 solution

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Richard MacCutchan · Answer 1 · 2023-09-21T23:26:00

Solution 1

We have suggested more than once that this site is not the best place for support of Amazon's products. Please make use of the Kendra support forum at Amazon Kendra.

Posted 21-Sep-23 23:26pm

Richard MacCutchan