

Let’s say we have an array of integers
int[] myArray = new int[] {1,2,3,4,5};
So the length of this array is 5 [i.e. n=5]; C# array indices run from 0 to n-1.
Yes, the length is 5 and not 4, as pointed out in the next post. My bad, sorry!
Define integer k such that 0<= k < n [n = length of an array]
For example, if k = 2 then the output should be
{3,4,5,1,2}, i.e. starting from the kth position, move the elements to the front of the array, wrapping the first k elements around to the end.
If k = 3, output would be
{4,5,1,2,3}
Here is the challenge.
Yes, this is trivial if we write a loop that starts at 0 and goes up to n, like
for(int i = 0; i < n; i++){}
We want to optimize this loop so that it does not iterate all the way to n-1; anything fewer than n-1 iterations is a good solution.
[Tip: if you want to reverse this array to 5,4,3,2,1, you can use a loop like
for(int i = 0; i < n/2; i++) and swap a[i] with a[n-1-i].]
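For what it's worth, one classic way to do this rotation with no extra array and only a constant number of moves per element is the three-reversal trick. A minimal C# sketch, assuming 0 <= k < n:

```csharp
// Rotate {1,2,3,4,5} left by k = 2 using three in-place reversals:
// reverse the first k elements, reverse the rest, then reverse everything.
using System;

int[] a = { 1, 2, 3, 4, 5 };
int k = 2, n = a.Length;

void Reverse(int[] arr, int lo, int hi)
{
    while (lo < hi)
    {
        (arr[lo], arr[hi]) = (arr[hi], arr[lo]);
        lo++; hi--;
    }
}

Reverse(a, 0, k - 1);   // {2,1,3,4,5}
Reverse(a, k, n - 1);   // {2,1,5,4,3}
Reverse(a, 0, n - 1);   // {3,4,5,1,2}
Console.WriteLine(string.Join(",", a));  // 3,4,5,1,2
```

Each element is touched at most twice, regardless of k.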
modified on Wednesday, October 1, 2008 5:23 PM





abhigad wrote: So the length of this array is 4 [i.e. n=4] since C# array index starts at 0
The length of the array is 5, regardless of whether it is 0-based or 1-based.
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler.
- Alfonso the Wise, 13th Century King of Castile.
This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong.
- Iain Clarke
[My articles]





Your question is unclear.





So you mean:
int[] input = new int[] {...};
int[] output = new int[input.Length];
Array.Copy(input, k, output, 0, input.Length - k);
Array.Copy(input, 0, output, input.Length - k, k);
(Note: `in` and `out` are C# keywords, so the arrays need other names.)
Or is this some school assignment where you have to do the shuffle in place?





Hi!
I want to construct a string of numbers which covers all combinations of those numbers.
Say, for example, that I want all combinations of the numbers 1 2 3 4, with a length of four (or say numbers 1 to 9 but still with a length of four, or if I know that '2' must be in the combination, i.e. 2xxx, x2xx, xx2x or xxx2), like 1234, 1324 etc., but as a sequential string, e.g. 1234232 etc., where every new digit yields a new combination (in this case that string tests 1234, 2342, 3423, 4232).
How can I construct an algorithm to find the shortest possible string covering all combinations? I think it's called an Euler path, but I'm not sure. Did some googling.
Anyone who can push me in the right direction? Maybe an implementation as well?
Thanks in advance!
modified on Wednesday, October 1, 2008 4:41 AM





I have a simple question: what is the best algorithm for external sorting?
External merge sort or external distribution sort, or would you recommend a completely different algorithm?
Background:
I have to write an external sort program in C#, but first I have to decide which approach to choose.
Files larger than 2 GB have to be sorted as fast as possible, and no purely in-memory algorithm can handle such large files.





Merge sort is the classic way of sorting data that's too large to fit into memory, but shortcuts are possible.
One suggestion: Just read the keys of your records into memory, each paired with the record number it occurs in, e.g.:
(1, key1), (2, key2), ..., (n, keyn).
Then sort these pairs on the keys. When these pairs are sorted, the record numbers will have the sorted order, e.g. the first record number in the sorted pairs will be the record with the lowest key, etc...
Then assuming you can fit k records into memory, read in the k lowest records, write them to your output file, read the next k lowest records, append them to your output file and so on.
When done, your output file will have the sorted data.
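The key-pair idea can be sketched like this (the key values are made-up stand-ins for keys read sequentially from the data file):

```csharp
// Sort lightweight (record number, key) pairs instead of full records;
// the sorted record numbers then say which records to read in each batch.
using System;
using System.Linq;

int[] keys = { 42, 7, 19, 3 };   // keys[i] = key of record i (assumption)

int[] order = keys
    .Select((key, recNo) => (Key: key, RecNo: recNo))
    .OrderBy(p => p.Key)
    .Select(p => p.RecNo)
    .ToArray();

// order[0] is the record with the lowest key, and so on.
Console.WriteLine(string.Join(",", order));  // 3,1,2,0
```

The real version would stream the keys from disk rather than hold them in an array literal, but the pairing and sort are the same.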





I tried this approach, but it seems to be very slow because of the many hard-disk seek commands.
The sort of the (key, line number) pairs is rather fast, but the subsequent building of the output file takes quite a while: to get one specific line I compute the absolute address of the line and issue a seek command.
I also tried the direct approach; holding (key, complete line) pairs is about two times faster, even though fewer pairs fit into main memory, because no repeated seeking is necessary.
The results surprised me. Further ideas are welcome.





A further idea: get a 64-bit machine, which will allow more than 4 GB of memory, and do a normal in-memory sort.
If this must run on a 32-bit machine, I could provide a program that reads the whole file into memory compressed, sorts it in its compressed form, and writes out the sorted file uncompressed. It would take some work, so I'd have to charge for it, however. Also, the amount of compression would depend on the characteristics of your data; random numbers wouldn't compress, but real-world data would probably compress to about 10% of its original size.





64-bit: would be nice, but my clients won't change their complete IT infrastructure.
What type of compression do you recommend?
I tried zip and gzip at the lowest compression level, but they slowed down the overall process too much.
Could a simple run-length compression be satisfactory?





Run-length compression would help if you have long sequences of the same symbol in your data.
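For reference, a minimal run-length encoder looks roughly like this (toy byte data, just to show the (count, value) pairing):

```csharp
// Each run of a repeated byte becomes a (count, value) pair,
// so the scheme only pays off when runs are long.
using System;
using System.Collections.Generic;

List<(int Count, byte Value)> Encode(byte[] data)
{
    var result = new List<(int, byte)>();
    int i = 0;
    while (i < data.Length)
    {
        int j = i;
        while (j < data.Length && data[j] == data[i]) j++;  // extend the run
        result.Add((j - i, data[i]));
        i = j;
    }
    return result;
}

var runs = Encode(new byte[] { 7, 7, 7, 7, 2, 9, 9 });
Console.WriteLine(string.Join(" ", runs));  // (4, 7) (1, 2) (2, 9)
```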
I have a compression technique that gives more compression than currently-available commercial products. If you give me an email address I'll send you a link that describes it.





If your keys fit in memory fine, and sort fine, then that is as fast as it's going to get. If the seek to pull data off the hard disk by key is an issue, then you need more RAM.
Think of it this way: you have the sorted keys, so your only challenge is to pull records off disk. That's going to be slow. I'd start by asking why you have a 2 GB flat file in the first place, and why it needs to be sorted so quickly.





Did you understand Mr. Balkany's suggestion? If you can fit 1/16 of the data in RAM along with all the indices, then you should be able to process everything (after the sort) in 16 passes through the data file, with no random seeks.
Suppose, for example, that there are 16,000,000 records and you can hold an array recBuff of 1,000,000 records in RAM along with an array finalPos of 16,000,000 integers. First, fill in finalPos such that finalPos(0) says where record #0 in the original file should go; finalPos(1) says where record #1 should go, etc. This can be done in linear time.
Next, read through the entire source file; after reading record #n from the file, look at finalPos(n). If it's less than 1,000,000 then store the record in recBuff(finalPos(n)). Otherwise discard it. Once this is done, recBuff(0..999999) will hold the first million records. Write them to disk.
Now read through the source file again. This time, look for records where finalPos(n) is in the range 1,000,000 to 1,999,999 and store those records in recBuff(finalPos(n) - 1,000,000). Once all records have been read, recBuff will hold the next million records. Write those to disk.
If recBuff and finalPos fit in RAM without swapping, the program should run very fast. Doubling the number of items in recBuff will double the speed, if it does not cause swapping. If it does cause swapping, it will dog the performance.
If there are so many records that the finalPos array itself takes an excessive amount of space, a temp file could be created which interleaves the source data with the finalPos items (since finalPos is always read in order). That would free up more space for recBuff.
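The multi-pass scheme, shrunk to toy sizes so the mechanics are visible (small arrays stand in for the source and output files):

```csharp
// finalPos[n] is record n's position in sorted order. Each pass reads the
// whole file sequentially and keeps only the records whose final position
// falls inside the current window, so there are no random seeks.
using System;
using System.Collections.Generic;

int[] file = { 30, 10, 40, 20 };   // stands in for the 2 GB source file
int[] finalPos = { 2, 0, 3, 1 };   // computed beforehand from the sorted keys
int bufSize = 2;                   // how many records fit in RAM

var output = new List<int>();
for (int lo = 0; lo < file.Length; lo += bufSize)
{
    int window = Math.Min(bufSize, file.Length - lo);
    var recBuff = new int[window];
    for (int n = 0; n < file.Length; n++)              // one sequential pass
        if (finalPos[n] >= lo && finalPos[n] < lo + window)
            recBuff[finalPos[n] - lo] = file[n];
    output.AddRange(recBuff);                          // append to output file
}
Console.WriteLine(string.Join(",", output));  // 10,20,30,40
```

With 16,000,000 records and a 1,000,000-record buffer this is the 16-pass version described above.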





First of all I have to ask if this is the algorithm called "bucket sort" / "radix sort", because you didn't mention any comparison operations.
The point I don't get is how one record is classified within the current recBuf() boundaries.
I will demonstrate my lack of understanding with an example.
Given an unsorted array: 5,3,2,9,1,8,2,4,7,2,6. Let's assume that my internal memory can only hold 3 values.
First, I read the entire array and my goal is to classify 1,2,2 into the first recBuf(0..2). And that's my problem of understanding: how can I know that "3" belongs to the second recBuf(3..5)?
There are three instances of "2", so the array is not uniformly distributed.





How many records are there, how big are the keys, and how big are the records? Do you have one, two, or more disk drives available for processing?
If the keys are small enough (and there are few enough of them) that they can all fit into memory, you should start by sorting the keys (each one accompanied by an integer giving the location in the original file). Then proceed as I described.
If the number of records is so large that e.g. only 10% of the keys will fit into memory, then I would suggest that you come up with some means of partitioning the keys into, say, 65,536 buckets. It doesn't matter whether the distribution is particularly even, provided that no single bucket holds more than 10%, and preferably no more than 2% or so. Make a pass through the file and count how many keys fall into each bucket.
Once that is done, count how many buckets one could add, starting at the bottom, before they total 10% of the records. Make a pass through the original file, reading into RAM all the records that fall into those buckets; sort them in RAM and write them out. Then repeat the procedure, starting at the bucket after the last one used in the first pass, and sort and write those out.
The exact procedure you should use will vary depending upon what your data looks like and the number of separate disks available. Nonetheless, the key observations are (1) it's often good to sort with records containing just key and a reference to the original record, since the number of such records that can fit in RAM is larger than the number of full records that would fit; (2) it's better to think in terms of reading through a whole file, fetching some data into RAM and ignoring other data, than to think in terms of grabbing lots of little pieces of data scattered through a file; (3) though I haven't touched on this much, for really big jobs, having two or three hard drives will help things a lot.
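The bucket-counting pass might be sketched like this (the bucket mapping and the sample key values are assumptions, stand-ins for keys read sequentially from the file):

```csharp
// Map each key to one of 65,536 buckets and count occupancy in a single
// sequential read, so later passes know which bucket ranges fit in RAM.
using System;

const int Buckets = 65536;
var counts = new int[Buckets];
uint[] keys = { 70000, 5, 131072, 65541, 9 };   // made-up sample keys

foreach (uint key in keys)
{
    int bucket = (int)(key % Buckets);   // any roughly even mapping works
    counts[bucket]++;
}
// Later: walk the buckets from the bottom, accumulating counts until the
// RAM budget (say 10% of the records) is reached; that range forms pass one.
```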





"Mathematicians in California could be in line for a $100,000 prize (£54,000) for finding a new prime number which has 13 million digits."
BBC article[^].
Looks like it was done using distributed computing.
...that mortally intolerable truth; that all deep, earnest thinking is but the intrepid effort of the soul to keep the open independence of her sea; while the wildest winds of heaven and earth conspire to cast her on the treacherous, slavish shore.





The title is misleading, since "huge" is relative only to the current record. But still.





Lol, how long will they need to read the number out at the official presentation?
Regards.

M.D.V.
If something has a solution... why do we have to worry about it? If it has no solution... for what reason do we have to worry about it?
Help me to understand what I'm saying, and I'll explain it better to you
“The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet.” - Michael A. Jackson
Rating helpful answers is nice, but saying thanks can be even nicer.





Hello!
I'm working on a non-recursive implementation of a scapegoat tree ( Scapegoat tree "partial" paper ). In section 6.2 a non-recursive rebalancing algorithm is summarized. I think I almost understand how to use those 2 stacks, but my problem is how to "plug" the nodes into the right position (how should I tell if a node is connected to another?). I think I reduced the problem to determining the height of a node in the rebalanced tree from the weight of the whole tree and the node's position in the in-order traversal. The height determination should be O(1), because the whole rebalance should be O(n) in time and O(log n) in space.
Thanks in advance!





Can anyone suggest an algorithm which is interesting to study?
And also, if possible, any known problems it has.
Thanks.
I desperately need it.





An algorithm to determine all the prime numbers up to some given limit. Start with Atkin's sieve. The problem with Atkin's algorithm is that if not coded properly it can be slower than the Sieve of Eratosthenes (an ancient algorithm).
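As a baseline to compare Atkin's sieve against, the Sieve of Eratosthenes is only a few lines (limit 30 here just for illustration):

```csharp
// composite[i] ends up true for every composite i <= limit;
// the remaining indices >= 2 are the primes.
using System;
using System.Linq;

int limit = 30;
var composite = new bool[limit + 1];
for (int p = 2; p * p <= limit; p++)
    if (!composite[p])                        // p is prime; mark its multiples
        for (int m = p * p; m <= limit; m += p)
            composite[m] = true;

var primes = Enumerable.Range(2, limit - 1)
                       .Where(i => !composite[i])
                       .ToList();
Console.WriteLine(string.Join(" ", primes)); // 2 3 5 7 11 13 17 19 23 29
```

A well-tuned Atkin implementation should beat this for large limits; a careless one often won't.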
...that mortally intolerable truth; that all deep, earnest thinking is but the intrepid effort of the soul to keep the open independence of her sea; while the wildest winds of heaven and earth conspire to cast her on the treacherous, slavish shore.





It all depends on what YOU are interested in. Tell folks what you are interested in, and some ideas might come about.
"The clue train passed his station without stopping."  John Simmons / outlaw programmer
"Real programmers just throw a bunch of 1s and 0s at the computer to see what sticks"  Pete O'Hanlon
"Not only do you continue to babble nonsense, you can't even correctly remember the nonsense you babbled just minutes ago."  Rob Graham





Simulated Annealing.
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler.
- Alfonso the Wise, 13th Century King of Castile.
This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong.
- Iain Clarke
[My articles]





Does that alter NP hard to NP soft?





It's interesting to study.
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler.
- Alfonso the Wise, 13th Century King of Castile.
This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong.
- Iain Clarke
[My articles]




