|
They're not using .NET - seems that Avast! has hooked the FindFiles function. I found the process reading files whenever I opened a virtual DOKAN drive.
What you need is someone who is very familiar with the WinAPI.
I are troll
|
|
|
|
|
Can you give a link to a page about what you said?
|
|
|
|
|
|
|
|
|
And people supplying the news through Twitter and mobile phones is going to help make it better how? That's too bad. I would like to know what they're talking about though.
|
|
|
|
|
I don't use and have never used Twitter; at this moment in time I fail to see the point of it and am not attracted to it. I have no idea what the original story exactly was, but I'll say no more than "Rumours and conjecture manage to twist stories out of all proportion".
modified 1-Aug-19 21:02pm.
|
|
|
|
|
It would be interesting to know if the guy really came to the solution independently, though it's not likely.
2+2=5 for very large amounts of 2
(always loved that one hehe!)
|
|
|
|
|
Hi,
I have two long (>50,000) lists of names that must be periodically checked for possible matches between them. I've coded up several fast algorithms for computing edit distances (e.g., Levenshtein), but the size of the lists still makes it very costly.
Is there any fast function F(S) that people compute offline on single strings that you could cluster them by, so that two strings far apart in F are also far apart in string distance, and one could reduce the size of the set that must be checked exactly? For example, if my criterion for matching is Leven(s1, s2) < N, then I know that |Length(s1) - Length(s2)| < N, and if I find a length difference >= N I won't even bother running Levenshtein.
I Googled once and came up with suggestions to use Hilbert curves or Z-ordering, and it made my head hurt. But there's gotta be something better than just length...
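For concreteness, the length pre-filter amounts to bucketing by length and only comparing strings whose lengths could possibly match. Here is a Python sketch (the names and function names are illustrative; a simple DP Levenshtein is included just to keep it self-contained):

```python
from collections import defaultdict

def levenshtein(a, b):
    # Classic dynamic-programming edit distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def matches(query, names, n):
    """Return names with Levenshtein distance < n, skipping hopeless lengths."""
    by_len = defaultdict(list)
    for name in names:
        by_len[len(name)].append(name)
    out = []
    # |len(s1) - len(s2)| < n is necessary for leven(s1, s2) < n,
    # so only buckets in this range need the expensive comparison.
    for length in range(len(query) - n + 1, len(query) + n):
        for cand in by_len.get(length, []):
            if levenshtein(query, cand) < n:
                out.append(cand)
    return out
```

With the buckets built once offline, each query only touches 2n-1 length buckets instead of the whole list.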
Thanks a million!
|
|
|
|
|
There's a much simpler solution: Use a hash table.
Insert all strings from the larger list into a hash table with at least twice as many slots as strings. Then check to see if each string from the smaller list is in the hash table.
Hash tables trade space for speed and are very fast. If you want more speed, just increase the number of slots in the table.
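In Python terms (an illustrative sketch; the built-in set is a hash table, and any language's hash set behaves the same way):

```python
# Python's built-in set is a hash table: O(1) average lookup,
# resized automatically so the load factor stays low.
big_list = ["alice", "bob", "carol", "dave"]
small_list = ["eve", "bob", "alice"]

table = set(big_list)  # insert all strings from the larger list
common = [s for s in small_list if s in table]
```

Note this finds exact matches only; it won't catch near-misses like misspellings.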
|
|
|
|
|
I'm not sure there's a good method if one is seeking to find all strings in one set which are within some Leven distance of any string in another set, unless the only distances of interest are small (preferably one, maybe two).
If the strings in question are names, you could hash each one into a 26-bit integer identifying which letters it contains. If desired, some letters could be ignored, or letters folded together, so as to yield e.g. a 16-bit integer. A single edit can change at most two bits of this hash (it can remove one letter from the name and add another), so two names within an edit distance of 2 differ in at most four bits. One could precompute the list of hash-value pairs within that bit distance among the 65,536 different 16-bit hash values. There would be many such collisions, but one might still save some time versus a brute-force comparison of everything.
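A sketch of that letter-set hash in Python (illustrative; the function names are mine):

```python
def letter_mask(name):
    """26-bit integer with bit k set iff letter chr(ord('a') + k) occurs in name."""
    mask = 0
    for ch in name.lower():
        if 'a' <= ch <= 'z':
            mask |= 1 << (ord(ch) - ord('a'))
    return mask

def mask_distance(m1, m2):
    """Letters present in one name but not the other (popcount of the XOR)."""
    return bin(m1 ^ m2).count('1')
```

A pair whose mask distance exceeds the bound for your edit-distance threshold can be skipped without running the full comparison.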
|
|
|
|
|
As others have said, I really think a hash table is the fastest solution to your problem. I'm not sure if there is a more specific approach for this case.
2+2=5 for very large amounts of 2
(always loved that one hehe!)
|
|
|
|
|
It's unclear. Do you have two lists of names (List<string>)? Or two strings that contain multiple names?
If the former, I'd make one into a HashSet and then see if the members of the other appear in it.
If the latter, I'd Split them and proceed as above.
|
|
|
|
|
Way, way too long ago I worked for a state DMV, and they used something called the "Soundex code" to hash surnames into 5-character alphanumeric strings. Similar-sounding names cluster around the same hash values.
You can find the algorithm on Wikipedia.
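For reference, here is a sketch of standard American Soundex in Python (illustrative; the standard code is one letter plus three digits, so a five-character DMV variant may have differed slightly):

```python
def soundex(name):
    """Standard American Soundex: first letter plus three digits.
    Vowels reset the previous code; h and w do not."""
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4", "m": "5", "n": "5", "r": "6"}
    name = "".join(c for c in name.lower() if c.isalpha())
    if not name:
        return ""
    first = name[0].upper()
    digits = []
    prev = codes.get(name[0], "")
    for ch in name[1:]:
        if ch in "hw":
            continue                       # h/w keep the previous code alive
        code = codes.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code                        # vowels map to "", resetting prev
    return (first + "".join(digits) + "000")[:4]
```

For example, "Robert" and "Rupert" both hash to R163, so similar-sounding surnames land in the same bucket.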
Tadeusz Westawic
An ounce of Clever is worth a pound of Experience.
|
|
|
|
|
|
Padmanabha_M wrote: i dont want to use new libraries.
Use the old libraries.
"Republicans are the party that says government doesn't work and then they get elected and prove it." -- P.J. O'Rourke I'm a proud denizen of the Real Soapbox[ ^] ACCEPT NO SUBSTITUTES!!!
|
|
|
|
|
Please help me do a project on image segmentation.
Please send me the algorithms and techniques used for image segmentation; they would be very useful to me.
I need them as soon as possible.
Thanking you,
Sridhar.
|
|
|
|
|
I suggest you read this article http://en.wikipedia.org/wiki/Segmentation_(image_processing)[^], then decide what you want to do, then write some code to do it. If you have problems with the code, post on the appropriate language forum.
Regards
David R
---------------------------------------------------------------
"Every program eventually becomes rococo, and then rubble." - Alan Perlis
|
|
|
|
|
Hello
I'm doing K-means clustering and am about to implement the Mahalanobis distance. I have a problem: sometimes the matrix is singular. I'm not really sure what that means in this case or what to do about it. I'm fairly sure that my code is OK, but here is the code for calculating the covariance matrix:
public static Matrix CovarianceMatrix(List<double[]> dataset)
{
    double[] means = new double[dataset[0].Length];
    Matrix cov = new Matrix(dataset[0].Length, dataset[0].Length);
    // Column means.
    for (int i = 0; i < dataset[0].Length; i++)
    {
        for (int j = 0; j < dataset.Count; j++)
        {
            means[i] += dataset[j][i];
        }
        means[i] /= dataset.Count;
    }
    // Deviations from the mean.
    double[,] subresults = new double[dataset[0].Length, dataset.Count];
    for (int j = 0; j < dataset.Count; j++)
    {
        for (int i = 0; i < dataset[0].Length; i++)
        {
            subresults[i, j] = dataset[j][i] - means[i];
        }
    }
    // Covariance entries; the matrix is symmetric, so only the
    // upper triangle is computed and then mirrored.
    for (int i = 0; i < dataset[0].Length; i++)
    {
        for (int j = i; j < dataset[0].Length; j++)
        {
            double s = 0;
            for (int x = 0; x < dataset.Count; x++)
            {
                s += subresults[i, x] * subresults[j, x];
            }
            cov.SetElement(i, j, s / dataset.Count);
            if (i != j) cov.SetElement(j, i, s / dataset.Count);
        }
    }
    return cov;
}
And here is the code for the distance:
public static double Mahalanobis(double[] vector1, double[] vector2, Matrix covariance)
{
    Matrix v1 = new Matrix(vector1, vector1.Length);
    Matrix v2 = new Matrix(vector2, vector2.Length);
    Matrix m = v1.Subtract(v2);
    return (double)(m.Transpose()).Multiply(covariance.Inverse()).Multiply(m).GetElement(0, 0);
}
If more information (or comments) or a working code sample is preferred, let me know. However, sometimes it can cluster without problems, so I think it is more about how to handle the singularity than about the code itself.
Looking forward to hearing from you.
modified on Friday, May 22, 2009 5:32 AM
|
|
|
|
|
A singular matrix has a determinant of zero. That means you can't invert it.
That's probably happening when you're inverting your covariance matrix: covariance.Inverse()
It also means that your covariance matrix isn't positive definite, and therefore it's not invertible. So the vectors you are sending to the Mahalanobis function are probably linear combinations of one another.
If these are random vectors, it could be that a component of the vector is extraneous. Better check your vectors.
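One common workaround (my suggestion, not part of the original code) is diagonal loading: add a small ridge to the covariance diagonal before inverting, which makes a singular matrix invertible at the cost of a tiny bias. A Python sketch with plain nested lists:

```python
def regularize(cov, eps=1e-6):
    """Diagonal loading: add a small ridge to the diagonal so a
    singular covariance matrix becomes invertible."""
    n = len(cov)
    return [[cov[i][j] + (eps if i == j else 0.0) for j in range(n)]
            for i in range(n)]

# Demo on a 2x2 covariance matrix that is singular (determinant 0):
cov = [[1.0, 1.0],
       [1.0, 1.0]]
loaded = regularize(cov)
det = loaded[0][0] * loaded[1][1] - loaded[0][1] * loaded[1][0]
# det is now strictly positive, so loaded can be inverted.
```

An alternative is to use a pseudo-inverse of the covariance matrix instead of the true inverse; either way, also check whether one of your feature dimensions is a linear combination of the others and can simply be dropped.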
|
|
|
|
|
I am looking for code in C# or in MATLAB that can represent a shape using region-based shape representation and calculate the similarity between two shapes.
|
|
|
|
|
Hi.
I have a text file full of words (let's say one word per line).
I want to find all words starting with a given pattern.
What's the best algorithm to do this?
Thanks,
vSoares
|
|
|
|
|
vSoares wrote: What's the best algorithm to do this?
Well, what is your definition of "best"? Best matches? Best execution time? Best ...?
There are plenty of algorithms out there. Did you try exact string match[^] or this[^]?
|
|
|
|
|
Yes, I tried, but I was expecting someone who has experience with this kind of algorithm to point me in the right direction.
I agree, "best" is very broad. I need a fast search algorithm; I can optimize the source if needed.
It's not an exact string match. It's all strings beginning with a given pattern, something like Google suggestions.
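For prefix matching over a sorted word list, a binary search finds the whole matching range quickly. A Python sketch (illustrative; it assumes no word contains the sentinel character used as the upper bound):

```python
import bisect

def words_with_prefix(sorted_words, prefix):
    """All words starting with prefix, via binary search on a sorted list.
    O(log n + k) where k is the number of matches."""
    lo = bisect.bisect_left(sorted_words, prefix)
    hi = bisect.bisect_right(sorted_words, prefix + "\uffff")
    return sorted_words[lo:hi]
```

Sorting the file once offline makes every subsequent lookup cheap; a trie would give similar query times if the word list changes often.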
vSoares
|
|
|
|