Click here to Skip to main content
15,906,329 members
Home / Discussions / Algorithms
   

Algorithms

 
QuestionExternal sorting: Which algorithm to select Pin
lizardking3d29-Sep-08 1:45
lizardking3d29-Sep-08 1:45 
AnswerRe: External sorting: Which algorithm to select Pin
Alan Balkany1-Oct-08 3:36
Alan Balkany1-Oct-08 3:36 
GeneralRe: External sorting: Which algorithm to select Pin
lizardking3d5-Oct-08 20:48
lizardking3d5-Oct-08 20:48 
GeneralRe: External sorting: Which algorithm to select Pin
Alan Balkany6-Oct-08 3:29
Alan Balkany6-Oct-08 3:29 
GeneralRe: External sorting: Which algorithm to select Pin
lizardking3d6-Oct-08 4:08
lizardking3d6-Oct-08 4:08 
GeneralRe: External sorting: Which algorithm to select Pin
Alan Balkany6-Oct-08 4:22
Alan Balkany6-Oct-08 4:22 
GeneralRe: External sorting: Which algorithm to select Pin
Mark Churchill6-Oct-08 5:23
Mark Churchill6-Oct-08 5:23 
GeneralRe: External sorting: Which algorithm to select Pin
supercat921-Oct-08 12:49
supercat921-Oct-08 12:49 
Did you understand Mr. Balkany's suggestion? If you can fit 1/16 of the data in RAM along with all the indices, then you should be able to process everything (after the sort) in 16 passes through the data file, with no random seeks.

Suppose, for example, that there are 16,000,000 records and you can hold an array recBuff of 1,000,000 records in RAM along with an array finalPos of 16,000,000 integers. First, fill in finalPos such that finalPos(0) says where record #0 in the original file should go; finalPos(1) says where record #1 should go, etc. This can be done in linear time.

Next, read through the entire source file; after reading record #n from the file, look at finalPos(n). If it's less than 1,000,000 then store the record in recBuff(finalPos(n)). Otherwise discard it. Once this is done, recBuff(0..999999) will hold the first million records. Write them to disk.

Now read through the source file again. This time, look for records where finalPos(n) is in the range 1,000,000 to 1,999,999 and store those records in recBuff(finalPos(n)-1000000). Once all records have been read, recBuff will hold the next million records. Write those to disk.

If recBuff and finalPos fit in RAM without swapping, the program should run very fast. Doubling the number of items in recBuff will double the speed, if it does not cause swapping. If it does cause swapping, it will dog the performance.

If there are so many records that the finalPos array itself takes an excessive amount of space, a temp file could be created which interleaves the source data with the finalPos items (since finalPos is always read in order). That would free up more space for recBuff.
GeneralRe: External sorting: Which algorithm to select Pin
lizardking3d22-Oct-08 1:57
lizardking3d22-Oct-08 1:57 
GeneralRe: External sorting: Which algorithm to select Pin
supercat922-Oct-08 7:38
supercat922-Oct-08 7:38 
NewsHuge new prime number discovered Pin
73Zeppelin27-Sep-08 22:41
73Zeppelin27-Sep-08 22:41 
GeneralRe: Huge new prime number discovered Pin
Bassam Abdul-Baki28-Sep-08 15:16
professionalBassam Abdul-Baki28-Sep-08 15:16 
JokeRe: Huge new prime number discovered Pin
Nelek14-Oct-08 0:52
protectorNelek14-Oct-08 0:52 
QuestionScapegoat Tree Issue Pin
loctarar25-Sep-08 2:47
loctarar25-Sep-08 2:47 
Generalsuggest me an algorithm Pin
niconicx22-Sep-08 23:08
niconicx22-Sep-08 23:08 
GeneralRe: suggest me an algorithm Pin
73Zeppelin22-Sep-08 23:14
73Zeppelin22-Sep-08 23:14 
GeneralRe: suggest me an algorithm Pin
Paul Conrad25-Sep-08 7:13
professionalPaul Conrad25-Sep-08 7:13 
GeneralRe: suggest me an algorithm Pin
CPallini27-Sep-08 22:07
mveCPallini27-Sep-08 22:07 
GeneralRe: suggest me an algorithm Pin
PIEBALDconsult21-Nov-08 5:29
mvePIEBALDconsult21-Nov-08 5:29 
GeneralRe: suggest me an algorithm Pin
CPallini22-Nov-08 21:00
mveCPallini22-Nov-08 21:00 
QuestionI'm sorry if this sounds like a weird question... Pin
mhtirogla18-Sep-08 20:53
mhtirogla18-Sep-08 20:53 
AnswerRe: I'm sorry if this sounds like a weird question... Pin
73Zeppelin18-Sep-08 21:02
73Zeppelin18-Sep-08 21:02 
QuestionInterpX function limitation (within Interp32.xll Excel addin) [modified] Pin
M.Slipper18-Sep-08 7:04
M.Slipper18-Sep-08 7:04 
AnswerRe: InterpX function limitation (within Interp32.xll Excel addin) Pin
73Zeppelin18-Sep-08 12:10
73Zeppelin18-Sep-08 12:10 
GeneralRe: InterpX function limitation (within Interp32.xll Excel addin) Pin
M.Slipper18-Sep-08 14:49
M.Slipper18-Sep-08 14:49 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.