
Comments and Discussions



The problem and solutions were clearly explained. The author does not claim to have found the "best" solution. That is for the reader to play with.





Hello everybody.
This is a nice article especially for the ones who want to get the basics of the hash table data structure.
Just for fun I have another solutions which is O(n) time and O(1) memory. Bassicaly the thing that makes this a better solutions is the fact that the hash takes extra O(n) space.
So my idea is the following:
Take the first character from the string and then simply check if the rest are the same, cool isn't it?
I also thing that you should include this in your article as it's a nice optimizations, although it's just an idea, and have nothing to do with the main reason of this article.





Thanks for the idea. You mean take the first character and compare it with the rest of the characters? Iff yes then what about the rest of the characters(in my example of string "bhargav" 3rd and 6th characters were same)? Please clarify.





Sorry, I was confused, I thought that the statment asks if the string has only one unique character i.e aaaaaa, zzzz, ccc. I haven't read carefully.





You can find these code in Java as well as Ruby here.





I guess a new programmer might need that hint.
Using the hash to do it is great when considering you can now apply it to words, or Unicode (>256) without caring as much.
I read the intro and the first thing that came to mind was a simple array (like your boolean approach), but I wanted to see what else you brought to the table.
It reminded me of a problem in computer science. We had to store (efficiently) a lot of numbers in a hash table, but there was a guarantee of nonuniqueness, which can slow down hash tables a bit. My buddy implemented a Random Hash. Where the incoming number was used as a seed for a random number generator, if he got a collision, he called random() again, and resolved it. I was BLOWN away to realize how efficient and accurate it was. Because given a specific seed, you could reproduce a sequence of random numbers.
Anyways, thanks for sharing...





It is not a general algorithm but is its too related to language feauture





What language feature? HashSet? That's not a language feature but a library feature, and you'd be hard pressed to find a modern language that doesn't have this collection type in it's core library. If he'd not used HashSet but had implemented the hash buckets himself, would it have been a "general algorithm"? If not, why not? If it would, then why did it matter that he used the available collection type?
William E. Kempf





You are a bit confused. A general algorithm is an abstract implemention of the steps need to perform somthing, as quicksort algorithm. After you can implement your design in your prefered programming language, and you can optimize it by language features. Hashset is a particular datatype, and its implemention differs from on language to another. You wrote about a silly tips implemented using Java. That's it.





I didn't "wrote about a silly tips implemented using Java". I didn't write the article. I'm also not confused. Hash tables are a well known algorithm. Building another algorithm on top of that seems to me to still be a general purpose algorithm. You seem to be picking nits in your down vote here.
William E. Kempf





Please don't be heretic : hashtable is not an alghorithm, it is a data type. Following a O(n) algorithm to check unique character string.
# one prime per letter
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
starting_byte = ?a.ord
primes_product = 1
ARGV[0].each_byte do byte
current_prime = primes[byte  starting_byte]
if primes_product % current_prime == 0
puts "Not unique"
exit
else
primes_product = primes_product * current_prime
end
end
puts "Unique"





Watch the name calling and personal attacks.
Lets see. Pseudo code.
for each byte
let h = hash(byte)
if collision(byte, h) return false
How is that not an algorithm? In fact, the algorithm you provide does exactly this, you've just provided the hash algorithm and the collision detection algorithm. Your solution is more memory efficient (and faster, though it has the same O(n) complexity, due to the memory allocations done by a HashSet). You are picking nits.
William E. Kempf





First of all my reply it is not a personal attack.
Hash is not an algorithm. We can name "hash function" as algorithm. A hash function is any algorithm that maps data of arbitrary length to data of a fixed length. The values returned by a hash function are called hash values, hash codes, hash sums, checksums or simply hashes.
In your pseudocode the word hash is not well used : whic is its funcuntionaliy, to do hash or to add a value to an hashset datatype? My opinion is that you are also confusing hash function and hashset datatype.
Moreover in your Java code you use a boolean return type from the method add of the java class datatype hash. I ask you : do other languages have the some behavior for add method of hashset datatype implementation?
In my first reply i clearly wrote : your code is related to hashset java class imple e tation and it is not a general algorithm. I apologize if my opinion looks like a personal attack. Good luck in your coding experience.





Calling me a heretic is a personal attack, whether you meant it as harshly as it came across or not.
Stop assuming I'm an idiot and you're a genius. I very well know the difference between a "hash function" and a "hashset datatype" and I used hash correctly in the pseudo code (hint, it's a function that produces a hash code). You are picking nits (i.e. you're are ignoring the big picture of what I've said in favor of picking apart minor things I've said, usually incorrectly).
I told you once, and I'll tell you again. I did not write this article. The Java code isn't mine.
Most HashSet UDTs do, actually, return a Boolean indicating whether or not there was a collision on Add. But that's not really relevant. The algorithm isn't about the implementation. The algorithm is simply to use a hash to detect collisions, and the implementation he provided did this by utilizing the hashing functionality builtin to Java objects and the HashSet collection as a means to detect collisions. Like I pointed out, your "algorithm" did precisely this with a different implementation. So if what you posted is a "general algorithm" then so was the author's, as they are the same algorithm with differing implementations. Your implementation was superior, but the algorithm was the same.
William E. Kempf





I don't know if other languages have the same behavior for add method(I know that java and ruby has.) If some language does not return true or false depending on whether the element was added to the collection or not, then how we are supposed to know whether the element was added or not?





I'm struggling to find where Bhargav made the claim that this was a generic algorithm. Taken for what it is, it's an elegant solution  I've certainly seen this problem handled in far clumsier ways.





sample:
String stringToMatch = "abccdef";
Pattern p = Pattern.compile("(\\w)\\1+");
Matcher m = p.matcher(stringToMatch);
if (m.find())
{
System.out.println("Duplicate character " + m.group(1));
}





Try that with stringToMatch = "abcdefc".
William E. Kempf





This could work (if you add some tests etc), but it certainly will not run in O(n).







General News Suggestion Question Bug Answer Joke Rant Admin Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

