Algorithms

How to convert a plural word to singular?

16-Nov-08 6:04

The original BASIC code has been translated into C here.

I am not familiar with the Wumpus problem, but perhaps you can dissect the code. Google will also give Java implementations and I'm sure you can find others. Maybe looking at the code can help.

hawkgao0129 13-Nov-08 23:49

Especially for irregular words, like "tooth" and "teeth".
Does a snippet exist for this?
Thanks

⁷³Zeppelin 14-Nov-08 0:39

Use an "if" "elseif" structure since words like "tooth" and "teeth" are exceptions in English. Basically you'll need a lookup dictionary. To handle other cases you test if the last (and second to last) letter of the word is a vowel or not.
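A minimal sketch of that idea in Python: a lookup dictionary for the irregular words, followed by a few suffix tests. The names `IRREGULAR` and `singularize` and the particular rules chosen are assumptions made for illustration; this is nowhere near a complete treatment of English.

```python
# Hypothetical sketch: a lookup table for irregular plurals plus suffix rules.
IRREGULAR = {
    "teeth": "tooth",
    "feet": "foot",
    "geese": "goose",
    "mice": "mouse",
    "men": "man",
    "women": "woman",
    "children": "child",
    "sheep": "sheep",
}

VOWELS = set("aeiou")


def singularize(word):
    """Return a best-guess singular form of an English plural noun."""
    w = word.lower()
    # 1. Exceptions first: these cannot be derived from spelling rules.
    if w in IRREGULAR:
        return IRREGULAR[w]
    # 2. "-ies" preceded by a consonant: "cities" -> "city".
    if w.endswith("ies") and len(w) > 3 and w[-4] not in VOWELS:
        return w[:-3] + "y"
    # 3. "-es" after a sibilant ending: "boxes" -> "box", "dishes" -> "dish".
    if w.endswith(("ches", "shes", "sses", "xes")):
        return w[:-2]
    # 4. Plain "-s", but leave words such as "glass", "bus", "analysis" alone.
    if w.endswith("s") and not w.endswith(("ss", "us", "is")):
        return w[:-1]
    return w


print(singularize("teeth"), singularize("cities"), singularize("boxes"))
```

The rules still get words such as "movies" or "buses" wrong, so in practice the dictionary keeps growing as new exceptions turn up.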

User 1716492 14-Nov-08 1:13

John, what is your opinion of this as a resource?

http://www.sequencepublishing.com/academic.html or TheSage dictionary (and other applications) from this company. I have good things to say about TheSage - excellent freeware stuff.

modified 1-Aug-19 21:02pm.

⁷³Zeppelin 14-Nov-08 2:18

Looks quite useful, Richard. I know when I write papers the biggest challenge is trying not to be repetitive. Too many phrases begin: "the model suggests that...", or "so and so, in their paper find that...."

Any resource (free!) that can improve the quality of academic publications is good value in my opinion. In fact, any resource that can improve the quality of writing in university courses is well worth it - even more so if the product is good quality and freely available. Others differ, but I don't mind open source/free software for educational purposes.

Table comparison material

User 1716492 14-Nov-08 1:39

To supplement what 73Zep said, http://aspell.net/ - Free and Open Source spell checker. Have a look at their Manual for some info ...

modified 1-Aug-19 21:02pm.

aggressor_us 12-Nov-08 13:29

Hi,

I urgently need some research info or articles about algorithms for comparing text tables. Something like

An O(ND) Difference Algorithm and Its Variations

but for tables.

Maybe someone has worked in this field and has some old links.

Thanks a lot.

Member 4194593 15-Nov-08 12:22

I have messed around in this area a bit - in this forum, as a matter of fact. I don't know if this would be of any use (depends upon what you mean by "Text Tables") - see http://www.codeproject.com/script/Forums/View.aspx?fid=326859&msg=2485913

Nothing of research quality, just a working program for massive file compares.

Dave.

aggressorus 17-Nov-08 6:09

Hi Dave,

Thank you very much for your reply. I have many tab delimited text documents like:

Person \t Name \t Title
1 \t NoName \t NoTitle
...

I need to find differences in these files. I used GNU diff, but if a single tab is missing it reports the whole line as deleted and re-added.

One document:
1 \t NoName NoTitle
Another document:
1 \t NoName \t NoTitle

So I'm trying to make my own simple algorithm for such tabular text.
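One possible shape for such an algorithm, sketched in Python with the standard `difflib` module: match whole rows first, then re-compare the fields of any rows that were reported as replaced. The function names `split_fields` and `diff_tables` and the tuple layout of the result are assumptions made for this illustration, not an established tool.

```python
import difflib


def split_fields(line):
    """Split one tab-delimited row into trimmed fields."""
    return [field.strip() for field in line.rstrip("\n").split("\t")]


def diff_tables(old_lines, new_lines):
    """Return (old_row_no, new_row_no, tag, old_part, new_part) tuples.

    Rows are matched first; rows reported as replaced one-for-one are
    then compared field by field, so a missing tab shows up as a change
    in a single cell rather than a full delete plus a full add.
    """
    changes = []
    rows = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in rows.get_opcodes():
        if tag == "equal":
            continue
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            for i, j in zip(range(i1, i2), range(j1, j2)):
                old_f = split_fields(old_lines[i])
                new_f = split_fields(new_lines[j])
                cells = difflib.SequenceMatcher(None, old_f, new_f, autojunk=False)
                for ctag, a1, a2, b1, b2 in cells.get_opcodes():
                    if ctag != "equal":
                        changes.append((i + 1, j + 1, ctag, old_f[a1:a2], new_f[b1:b2]))
        else:
            # Pure insertions/deletions, or uneven replacements: report whole rows.
            changes.append((i1 + 1, j1 + 1, tag, old_lines[i1:i2], new_lines[j1:j2]))
    return changes


old = ["Person\tName\tTitle", "1\tNoName NoTitle"]
new = ["Person\tName\tTitle", "1\tNoName\tNoTitle"]
for change in diff_tables(old, new):
    print(change)
```

For the two sample documents this reports only the second field of row 2 as changed. Pairing replaced rows one-for-one is a simplification; uneven replacements fall back to whole-row output.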

Member 4194593 17-Nov-08 12:40

I would guess that something more like SED is what you need. As I remember it from way back (many years), it would search for differences down to single characters. Its name means Stream EDitor.

I'm not too sure whether this would work or not, depending on the implementation. Usually DIFF algos are based on line differences and the whole line is marked as different. Depending on how large the files are, this could take a lot of time (order O(nm)). The method I developed while talking to Skippums (in the referenced thread) sped this up with some tricks.

Later, Skippums had another problem. He wanted an archive method for huge files (exceeding 512 MB) where the files were prior versions of documents, programs, etc. What it amounted to was that he needed a small SED type of file created in order to save only the changes. Once you get into massive files, comparisons like this are extremely difficult and time consuming. I developed a method to minimize this time and was testing it. I asked for a sample file or two of this massive size (I could only come up with 9 and 11 MB libraries). I had sent him this data (the algorithm - my code was in MASM) and he finally responded to my email about a month later, explaining that he had been on vacation and would get back to me. That was in July. I have checked his message log, and that is about when his last message was posted here.

I have also worked with DIFF algos.

If any of this algo info would be of use to you, I would be happy to help you.

Dave.

Buffering data for delimiter separated blocks (theoretical question)

aggressorus 9-Dec-08 7:13

Hi,

Thank you for your reply. I was thinking about how to do it correctly. Diff will not work for sure. I posted a new question.

invictus3 11-Nov-08 11:29

Hi
There is a question I have been wondering about for ages and I was hoping someone could give me an answer to put my mind at rest.

Let's assume I have an input stream (like a file/socket/pipe) and want to parse the incoming data. Let's assume each block of incoming data is terminated by a newline, as in most common internet protocols. This could just as well be parsing HTML/XML or any other structured data. The point is that the data is split into logical blocks by a delimiter rather than by a fixed length. How can I buffer the data while waiting for the delimiter to appear?

The question seems simple enough: just have a large enough byte/char array to fit the entire thing.

But what if the delimiter comes after the fixed size buffer is full? This is actually a question about how to fit a dynamic block of data in a fixed size block. I can only really think of a few alternatives:

1) increase the buffer size when needed. This may require heavy memory reallocation, and may lead to resource exhaustion from specially crafted streams (or perhaps even denial of service in the case of sockets where we want to protect ourselves against exhaustion attacks and drop connections that try to exhaust resources...and an attacker starts sending fake, oversized, packets to trigger the protection)
2) start overwriting old data by using a circular buffer. Perhaps not an ideal method since the logical block would become incomplete
3) dump new data when the buffer is full. However, this way the delimiter will never be found so this choice is obviously not a good option
4) just make the fixed size buffer damn large and assume all incoming logical data blocks are within its bounds...and if it is ever full, just interpret the full buffer as a logical block...

In either case I feel we must just assume that the logical blocks will never exceed a certain size...

Any thoughts on this topic? Obviously there must be a way, since the higher-level languages offer some sort of buffering mechanism with their readLine() stream methods.

Is there any "best way" to solve this or is there always a tradeoff? I really appreciate all thoughts and ideas on this topic since this question has been haunting me every time I have needed to write a parser of some sort.

Re: Buffering data for delimiter separated blocks (theoretical question)

Arash Partow 12-Nov-08 0:57

You're right, reallocation can be very costly. One solution I've seen to be quite effective is to have a memory pool (a set of buffers of common length) and have your stream maintain the chaining: as more memory is required, chain in more blocks of memory from the pool (using a linked-list-like approach).

Finally (socket example), when the frame, packet or token has arrived, provide per-item (usually char) iterator access to the chain of buffers so as to make the chain transparent to the end user.

This obviously has its limits as well; specifically, the largest block you can read will be the size of the pool of memory. However, due to the design choice, if you ever reach that limit all you need to do is create more memory blocks in the pool, which is much cheaper than realloc'ing everything accumulated so far.
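A rough Python sketch of the chaining idea. A plain list of received pieces stands in for the linked chain, no memory pool is modelled, and the class name `ChainedLineBuffer`, the single-byte delimiter and the `max_block` cap are all assumptions made for the illustration.

```python
class ChainedLineBuffer:
    """Collects incoming pieces in a chain until the delimiter arrives."""

    def __init__(self, delimiter=b"\n", max_block=64 * 1024):
        # Assumption: the delimiter is a single byte, so it can never be
        # split across two incoming pieces.
        self.delimiter = delimiter
        self.max_block = max_block  # hypothetical policy limit per block
        self.chunks = []            # the chain; pieces are not recopied while waiting
        self.size = 0

    def _append(self, piece):
        if self.size + len(piece) > self.max_block:
            raise ValueError("block exceeds max_block")
        self.chunks.append(piece)
        self.size += len(piece)

    def feed(self, data):
        """Add received bytes; return the complete blocks now available."""
        blocks = []
        while data:
            head, sep, rest = data.partition(self.delimiter)
            self._append(head)
            if not sep:
                break  # no delimiter yet: keep waiting for more data
            # Delimiter found: hand out the block with one linear copy.
            blocks.append(b"".join(self.chunks))
            self.chunks, self.size = [], 0
            data = rest
        return blocks


buf = ChainedLineBuffer()
print(buf.feed(b"hel"))
print(buf.feed(b"lo\nwor"))
print(buf.feed(b"ld\n"))
```

Joining the chain once per block is a simplification; iterating over the pieces in place, as suggested above, would avoid even that copy.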

Re: Buffering data for delimiter separated blocks (theoretical question)

Alan Balkany 12-Nov-08 3:37

It sounds like you need to make a policy decision that's algorithm-independent: How large can a block be before you decide it's a denial-of-service attack?

This is domain-dependent. It depends on the knowledge you have of blocks' contents from your particular domain, and on the resources of your system. If you have lots of memory, you can afford to accept a big block on the chance it may be real.

The memory-reallocation approach wastes time because each reallocation requires you to recopy all the data received for that block so far. This starts to approach an O(n^2) running time for what should be a linear algorithm.
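As a rough worked estimate, assume (this is not stated above) that the buffer grows by a fixed increment of c bytes each time it fills and that the final block is n bytes long. The k-th reallocation recopies about kc bytes, so the total copying is quadratic in n; if the capacity is doubled instead, the total stays linear:

```latex
\sum_{k=1}^{n/c} kc \;=\; c\,\frac{(n/c)\,(n/c+1)}{2} \;\approx\; \frac{n^{2}}{2c} \;=\; O(n^{2})
\qquad\text{versus}\qquad
\sum_{k=0}^{\log_2 n} 2^{k} \;<\; 2n \;=\; O(n)
```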

It seems like a buffer pool would use up memory waiting for big blocks that may never come. It would also limit the maximum size of a block you could accept.

The linked-list approach is the standard way of dealing with this type of situation; don't preallocate anything, and dynamically allocate blocks as you need them.

Constructing a convex hull.

Member 4194593 11-Nov-08 8:56

I know that I am opening Pandora's box, and I fully expect to be met with an angry horde of rabid Bats, but I have an idea for constructing a convex hull from a data set. What I was looking for was a "Best" algorithm (here come the Bats) so that I might code it as a baseline to see if my algorithm is any faster. I was actually looking for triangulation, but wanted to start with the convex hull first.

I have already googled convex hull, but there is not too much info about relative performance.

The data will be just 2D (x,y): 64K-1 points, with x and y each an unsigned short and no repeated values in either coordinate (randomly select an integer from the set 1 to 64K-1, remove it from the set, decrement the set size, and randomly select another entry from the remainder of the set). x and y will be randomized independently. No point will have x equal to y (if one does, swap its y value with that of the immediately following or preceding pair, whichever does not violate x!=y for either resulting pair). The set of points will be saved to a file as binary pairs; each algorithm will then read the points and create the convex hull.
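The data set described here could be generated along these lines. This is a Python sketch rather than MASM; the name `make_points` and its parameters `n` and `seed` are hypothetical, and writing the pairs to a binary file is left out.

```python
import random


def make_points(n, seed=None):
    """Build n points whose x values and y values are each a random
    permutation of 1..n, with x != y for every point (requires n >= 2)."""
    rng = random.Random(seed)
    xs = list(range(1, n + 1))
    ys = list(range(1, n + 1))
    rng.shuffle(xs)  # shuffling a full set is equivalent to drawing without repeats
    rng.shuffle(ys)
    for i in range(n):
        if xs[i] == ys[i]:
            # Swap y with the following pair, or the preceding one at the end.
            # Because all x values and all y values are distinct, neither
            # resulting pair can end up with x == y.
            j = i + 1 if i + 1 < n else i - 1
            ys[i], ys[j] = ys[j], ys[i]
    return list(zip(xs, ys))


points = make_points(65535, seed=2008)
print(len(points), points[:3])
```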

I will code my algorithm and the base algorithm in MASM, single threaded (the method should be able to be implemented as multi threaded, but for simplicity and timing, I wanted single threaded). I will use the MASM32 timing macros to set the CPU into Real Time Priority to try to get as accurate a timing as possible.

I'm not looking for code, just a detailed algorithm which would allow me to implement this sought after "Best" algorithm to use as a baseline.

Anyone have a guess how long this 64K-1 set might take to encircle? Should I reduce the size of the set?

Bring on the Bats!

Dave.

Arash Partow 12-Nov-08 0:45

Graham scan provides the best solution for 2D, with O(n log n) time (dominated by the sort; the scan itself is linear) and O(n) space.

http://en.wikipedia.org/wiki/Graham_scan

note: make sure you use heapsort and not quicksort.
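For a baseline, here is a short Python sketch of Andrew's monotone chain, a common variant of Graham scan that sorts by coordinates instead of by polar angle. It is one possible formulation and not necessarily the one intended above; the names `cross` and `convex_hull` are chosen for the illustration. Python's built-in sort is used here, and it has an O(n log n) worst case, so the heapsort remark applies mainly to hand-written sorts.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Pop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


print(convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]))
```

Each point is pushed and popped at most once per chain, so the scan after sorting is linear. Integer coordinates keep the cross product exact.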

Re: Constructing a convex hull.

Member 4194593 12-Nov-08 4:58

Arash,

I am "shocked"! You didn't suggest Wykobi!

Yes, I have read all of the CP articles and Wikis related to "Convex Hull" and read all of their references, but there is not too much reference to relative speeds under the same set of conditions; the touted advantages are mainly about how the algorithm extends to three or more dimensions, etc.

My attempt will be along the lines of "divide and conquer" with several twists I dreamed up (literally - went to bed thinking about the problem and woke up with what I thought was a good solution).

Dave.

Member 4194593 12-Nov-08 5:13

Arash,

Sorry for the double post, but please forgive me. I did not thank you for your reply (I did vote you for a good answer to my question). I actually had expected you to be the first to respond, and my initial reply was an attempt to be humorous (think back to 1941 and "Casablanca").

Dave.

Limit finding

Hadi Dayvary 6-Nov-08 10:19

Hi, I want to write a program in VC++ that finds the limit of a math function.
Would you please help me with where to start, or with any source code or other help?
Thanks

www.logicsims.ir

⁷³Zeppelin 6-Nov-08 10:26

The limit for a particular class of functions or an arbitrary function?

If you want to enter any given function and return the mathematical limit, then you'll need a parser and some kind of symbolic math engine in order to carry out operations like L'Hopital's rule, etc...

Not exactly a trivial programming project.
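As an illustration of the kind of step such an engine has to carry out symbolically, here is the standard textbook application of L'Hopital's rule to a 0/0 form (the example is chosen here, not taken from the question):

```latex
\lim_{x \to 0} \frac{\sin x}{x}
\;=\; \lim_{x \to 0} \frac{\frac{d}{dx}\sin x}{\frac{d}{dx}\,x}
\;=\; \lim_{x \to 0} \frac{\cos x}{1}
\;=\; 1
```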

Hadi Dayvary 6-Nov-08 10:37

thanks
I know, for sure it needs a parser. I have written some for drawing math functions, matrices and so on,
but I have no idea about limits!
Please help me if you can
Thanks

www.logicsims.ir

BobInNJ 6-Nov-08 12:57

Hadi,

I am wondering what kind of answer you are looking for. Are you looking for a numerical answer such as 3.14159 or a symbolic answer like PI? Also, do you want this function to work for any function? For example the limit as x goes to 0 of log(x) is not defined. I believe I could give you a better answer if I had a better idea of the real problem you are solving.

By any chance, are you doing this to get a PhD?

Bob

bulg 6-Nov-08 13:59

If you already have some for drawing, you should be able to adapt that. Lim x->c = slope of a function at point c, right? (change in y over change in x) If you add some boundary checking, you might come close without too much code required.
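A numerical estimate along these lines can be sketched in a few lines. Note that the limit of f at c is the value f(x) approaches as x nears c; the slope at c is the derivative, which is itself the limit of the difference quotient. The sketch is in Python rather than VC++, and the name `estimate_limit`, the step `h` and the tolerance `tol` are assumptions chosen for the illustration. It is a heuristic: a function that oscillates near c, such as sin(1/x), can fool it.

```python
import math


def estimate_limit(f, c, h=1e-5, tol=1e-4):
    """Heuristically estimate the two-sided limit of f at c, or return None."""
    try:
        outer = (f(c - h), f(c + h))
        inner = (f(c - h / 10), f(c + h / 10))
    except (ValueError, ZeroDivisionError, OverflowError):
        return None  # f is undefined close to c on at least one side
    # The left and right values must agree with each other...
    two_sided = math.isclose(inner[0], inner[1], rel_tol=tol, abs_tol=tol)
    # ...and must have stopped changing as x moves closer to c.
    settled = all(
        math.isclose(o, i, rel_tol=tol, abs_tol=tol) for o, i in zip(outer, inner)
    )
    if two_sided and settled:
        return (inner[0] + inner[1]) / 2
    return None


print(estimate_limit(lambda x: math.sin(x) / x, 0.0))
print(estimate_limit(lambda x: 1.0 / x, 0.0))
print(estimate_limit(math.log, 0.0))
```

The function is never evaluated at c itself, which is what allows removable singularities such as sin(x)/x at 0 to be handled.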

⁷³Zeppelin 6-Nov-08 21:13