Ranking Fixed Length Search Results

moeenkhurshid, 18 Aug 2014, MIT License
Ranking of fixed length columns in search

Introduction

The problem arises when computing and maintaining a search ranking over fixed length datasets. It is a frequent issue that shows up in search results. If the datasets are examined carefully, we can see a correlation in the blank spaces, which semantic search and general search algorithms already take into account, and this correlation can be used for ranking. One parameter we can add to search ranking is the number of spaces a user has typed (when writing free text or natural language queries). The results can then be ranked in ascending or descending order based on this metric.

Background

Data has been around for a long time, and big data is now available in public datasets. In its raw form it usually adds little meaning to the information we extract from it unless we apply specialized algorithms. Most of the algorithms on the market provide only a handful of insights into the data or natural language based datasets, and they usually ignore semantics for NLP. This can frequently be seen in the results that different search engines return for free text or specialized searches, especially when it comes to sorting datasets.

Implementation

In order to rank the results from a fixed length dataset, look at the column in the following test dataset.

[Fixed Length Column 1]
This is a test subject
This is test subject01

If we look closely, the two subjects (rows) differ in how many spaces they contain, even though both have equal field lengths (number of characters). The field lengths alone cannot differentiate the rows or determine their ranking, nor do they provide any meaningful information about the data itself.

However, if we add a second length parameter, measured after removing the space characters from each row in the above dataset, we can easily distinguish between the two. This allows us to extract information based on a certain pattern.

This can be achieved by taking the length of each row in the column after its spaces have been removed:

LENGTH[TRIMMED EMPTY SPACES[ROW]]

Here TRIMMED EMPTY SPACES means that every space in the row is removed, including the spaces between words.

Calculating this length gives one value per row. These values can be used to place the rows in ranking order. They do not reveal much about the data, but they are a simple metric that helps with sorting fixed length datasets.

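Applying the procedure described above to the dataset gives the following outputs. TRIM here stands for removing every space in the row. A standard SQL TRIM strips only leading and trailing spaces and would return 22 for both rows, so in SQL something like REPLACE(row, ' ', '') is needed instead.

LEN(TRIM(This is a test subject)) = 18 (22 with spaces)
LEN(TRIM(This is test subject01)) = 19 (22 with spaces)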
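
The two values can be checked with a short Python sketch. The helper name `non_space_length` is an illustrative assumption and not part of the original procedure. It uses `str.replace` to drop every space, because `str.strip` would remove only leading and trailing whitespace.

```python
def non_space_length(row: str) -> int:
    """Return the number of characters in a row, ignoring every space."""
    return len(row.replace(" ", ""))


# The two rows of the fixed length test column.
rows = ["This is a test subject", "This is test subject01"]

for row in rows:
    # len(row) is the fixed field length; non_space_length(row) is the metric.
    print(f"{row!r}: {len(row)} with spaces, {non_space_length(row)} without")
```

Both rows report the same field length, while the second value differs by one and can therefore serve as a tie-breaker.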

If we rank the data in ascending or descending order of this value, the search results come out in a different order, which gives us a sorting and ranking of the data. The interesting point to note here is that the field length carried no meaning when the data was sorted automatically: it neither differentiated the rows nor provided any useful information. By using this simple parameter we are able to sort the data in a different order.

With the above procedure, the rows that were in ascending order in the dataset can now be arranged in descending order, thanks to the knowledge extracted from the NLP input or, simply put, the free text entered by the user.
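
As a minimal sketch of the ranking step, assuming the metric is the row length with every space removed, the rows can be sorted in either direction with Python's built-in `sorted`. The names `non_space_length` and `rank_rows` are hypothetical and chosen only for this example.

```python
def non_space_length(row: str) -> int:
    """Return the number of characters in a row, ignoring every space."""
    return len(row.replace(" ", ""))


def rank_rows(rows, descending=False):
    """Return a new list of rows ordered by their non-space length."""
    return sorted(rows, key=non_space_length, reverse=descending)


rows = ["This is a test subject", "This is test subject01"]

ascending = rank_rows(rows)                    # fewest non-space characters first
descending = rank_rows(rows, descending=True)  # most non-space characters first

print(ascending)
print(descending)
```

Because `sorted` is stable, rows with the same metric keep their original relative order, so the metric only reorders rows that it can actually tell apart.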

Future Points of Interest

Investigate the correlation in depth, using this and a few other parameters that can contribute to search results on fixed length datasets, and derive from them an equation or pseudocode that can be used in a general purpose context.

History

 

First Published for review on: 17-AUG-2014

 

License

This article, along with any associated source code and files, is licensed under The MIT License

About the Author

moeenkhurshid

United States

Last Updated 18 Aug 2014
Article Copyright 2014 by moeenkhurshid