Introduction
This tip describes a method for sorting multi-word strings by keyword. Sometimes the objective is to sort not alphabetically or numerically, but by some keyword. For example, the list below is a partial list of sensors from a crash test dummy, each entry containing multiple “words” to sort by.
Femur R Fx
Femur R Fy
Femur L Mx
Femur L My
Femur L Fz
Femur L Fx
Femur L Fy
Upper Neck Fy
Upper Neck Fx
For the following discussion, a word is a substring surrounded by white space or delimiting characters. R and L refer to right and left, M and F denote moment and force sensors, and x, y, and z give the direction, or axis, of motion for the sensor. Assume the objective is to list the sensors anatomically, head to foot; within each body region, left before right; within each side, force sensors before moment sensors; and within each sensor type, axes in the order x, y, z. The desired sort of the list above would be:
Upper Neck Fx
Upper Neck Fy
Femur L Fx
Femur L Fy
Femur L Fz
Femur L Mx
Femur L My
Femur R Fx
Femur R Fy
The problem requires a series of sorting steps, each based on a different keyword in the string.
Using the Code
The code, written in C#, is shown in the function below. It’s written for string arrays, but could easily be changed to a List<string> type. The function can be called more than once, passing different arrays of keywords.
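    // Public so that it can be called from other classes, as in the example further below.
    // Keywords in SortArr must be upper case; input strings are assumed to be non-empty.
    public string[] SortByArr(string[] inArr, string[] SortArr) {
        // Work on a copy so the caller's array is not overwritten.
        string[] work = (string[])inArr.Clone();
        string[] outarr = new string[work.Length];
        int i = 0;
        // Take the keywords in priority order.
        foreach (string sa in SortArr) {
            for (int j = 0; j < work.Length; j++) {
                // Case-insensitive substring match.
                if (work[j].ToUpper().Contains(sa)) {
                    outarr[i++] = work[j];
                    work[j] = "";   // flag as already placed
                }
            }
        }
        // Append the unmatched strings in their original order.
        foreach (string s in work) {
            if (s != "") outarr[i++] = s;
        }
        return outarr;
    }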
The function takes the string[] array inArr (the unsorted list) and a list of keywords in string[] SortArr, and outputs the sorted list. The keywords in SortArr are listed in the desired sort order. For the example list of strings above, the function could be called first to list by sensor axis, passing { "FX", "FY", "FZ", "MX", "MY", "MZ" } as the SortArr list of keywords. This will sort the list so that sensors are listed in the order Fx, Fy, Fz, Mx, My, Mz. It could be called again with the desired left-right sort keywords, { "LEFT", "RIGHT", "L", "R" }, to make strings containing a “left” come before strings containing “right”.
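The two calls described above can be tried with a small console program. This is only a sketch: the class name KeywordSortDemo and the method name SortByKeywords are hypothetical, and the method is a stand-alone variant of SortByArr that tracks matches in a bool array rather than overwriting the input strings.

```csharp
using System;

public static class KeywordSortDemo
{
    // Hypothetical stand-alone variant of SortByArr (not the article's exact code).
    // Keywords are expected in upper case, listed in the desired output order.
    public static string[] SortByKeywords(string[] input, string[] keywords)
    {
        var output = new string[input.Length];
        var taken = new bool[input.Length];   // marks strings already placed
        int next = 0;
        foreach (string keyword in keywords)
        {
            for (int j = 0; j < input.Length; j++)
            {
                if (!taken[j] && input[j].ToUpperInvariant().Contains(keyword))
                {
                    output[next++] = input[j];
                    taken[j] = true;
                }
            }
        }
        // Unmatched strings keep their original relative order at the end.
        for (int j = 0; j < input.Length; j++)
        {
            if (!taken[j]) output[next++] = input[j];
        }
        return output;
    }

    public static void Main()
    {
        string[] sensors =
        {
            "Femur R Fx", "Femur R Fy", "Femur L Mx", "Femur L My", "Femur L Fz",
            "Femur L Fx", "Femur L Fy", "Upper Neck Fy", "Upper Neck Fx"
        };

        // First pass: sensor axis. Second pass: left before right.
        string[] byAxis = SortByKeywords(sensors, new[] { "FX", "FY", "FZ", "MX", "MY", "MZ" });
        string[] bySide = SortByKeywords(byAxis, new[] { "LEFT", "RIGHT", "L", "R" });

        foreach (string s in bySide) Console.WriteLine(s);
    }
}
```

After the second pass the “Femur L” entries come first, in the order Fx, Fy, Fz, Mx, My. The “Upper Neck” entries are picked up by the “R” keyword, because “UPPER” contains an R; substring matching with short keywords can produce such incidental matches.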
Since the original string list may have variations in the wording, matches can be missed. This can be remedied by passing extra keywords to match. For example, adding “LEFT” to the list of keywords produces a match if “LEFT” was used in the string instead of “L”, and causes no problem if no match is found (apart from some additional search time). For the same reason, the input strings are converted to upper case before matching, so matching is not case-sensitive; the keywords themselves must therefore be given in upper case.
As can be seen in the code, the outer loop goes through each keyword in SortArr, while the inner loop checks whether the current keyword is found in any of the unsorted input strings. Matching is done using the .Contains() method here, but each string could also have been split into separate words using .Split(), with comparison done by direct (boolean “==”) word matches in an additional inner loop.
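The .Split() alternative mentioned above might look like the following sketch. The names WordMatchSort and SortByWholeWords are hypothetical, and the delimiter set (space, tab, underscore, hyphen) is an assumption that would need adjusting to the actual data.

```csharp
using System;

public static class WordMatchSort
{
    // Assumed delimiters; change to suit the real sensor names.
    private static readonly char[] Delimiters = { ' ', '\t', '_', '-' };

    // Like the keyword sort, but a keyword must equal a whole word of the string.
    public static string[] SortByWholeWords(string[] input, string[] keywords)
    {
        // Split each string once into upper-case words.
        var words = new string[input.Length][];
        for (int j = 0; j < input.Length; j++)
        {
            words[j] = input[j].ToUpperInvariant()
                               .Split(Delimiters, StringSplitOptions.RemoveEmptyEntries);
        }

        var output = new string[input.Length];
        var taken = new bool[input.Length];
        int next = 0;
        foreach (string keyword in keywords)
        {
            for (int j = 0; j < input.Length; j++)
            {
                // Array.IndexOf compares whole words for equality.
                if (!taken[j] && Array.IndexOf(words[j], keyword) >= 0)
                {
                    output[next++] = input[j];
                    taken[j] = true;
                }
            }
        }
        for (int j = 0; j < input.Length; j++)
        {
            if (!taken[j]) output[next++] = input[j];
        }
        return output;
    }

    public static void Main()
    {
        string[] sensors = { "Pelvis Fx", "Femur L Fx", "Lumbar Fx" };
        // Only "Femur L Fx" has the whole word "L"; .Contains("L") would match all three.
        foreach (string s in SortByWholeWords(sensors, new[] { "L" }))
            Console.WriteLine(s);
    }
}
```

Whole-word matching avoids incidental hits such as the “L” in “Pelvis”, at the cost of no longer matching abbreviations embedded in longer words.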
When a match is found, the string is saved to a new string array in the order in which it was found. Matched input strings are changed to “” (the empty string) to flag them for removal from further searching (with a List<string> type, this could be done with the .Remove() method). The last foreach loop picks up any leftover strings that weren’t matched and adds them to the end of the new string array. The function then returns the new, sorted string array.
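For the List<string> variant mentioned above, one possible sketch is shown here. The names ListKeywordSort and SortByKeywords are hypothetical, and the sketch uses FindAll() and RemoveAll() rather than individual .Remove() calls, since removing items one at a time while looping over the same list is error-prone.

```csharp
using System;
using System.Collections.Generic;

public static class ListKeywordSort
{
    public static List<string> SortByKeywords(List<string> input, string[] keywords)
    {
        // Copy the input so the caller's list is left intact.
        var remaining = new List<string>(input);
        var output = new List<string>(input.Count);

        foreach (string keyword in keywords)
        {
            Predicate<string> matches = s => s.ToUpperInvariant().Contains(keyword);
            // Move every remaining match to the output, preserving relative order.
            output.AddRange(remaining.FindAll(matches));
            remaining.RemoveAll(matches);
        }

        // Whatever was never matched goes to the end.
        output.AddRange(remaining);
        return output;
    }

    public static void Main()
    {
        var sensors = new List<string> { "Femur R Fx", "Femur L Fx", "Femur L Fy" };
        foreach (string s in SortByKeywords(sensors, new[] { "LEFT", "L" }))
            Console.WriteLine(s);
    }
}
```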
As an example, the function below calls SortByArr several times with various sort keywords to get the desired sorted string array, listed head to tibia, upper before lower, left before right, by force axis, then by moment axis, and ordered x, y, and z. Each subsequent call passes, as input, the output from the previous call, so that each pass builds on the last.
    public string[] AnatomicSort(string[] inArr) {
        // CStr is presumably the class that contains SortByArr.
        CStr cs = new CStr();
        // Lowest precedence first: sensor type and axis.
        string[] outArr = cs.SortByArr(inArr, new string[] {
            "ACX", "AX", "FX",
            "ACY", "AY", "FY",
            "ACZ", "AZ", "FZ",
            "MX", "MY", "MZ" });
        // Left before right (the "right" keywords are omitted).
        outArr = cs.SortByArr(outArr, new string[] { "LEFT", "LE", "L" });
        // Upper before lower.
        outArr = cs.SortByArr(outArr, new string[] { "UPPER", "UP" });
        // Highest precedence last: body region, head to tibia.
        outArr = cs.SortByArr(outArr, new string[] {
            "HEAD", "NECK", "CHST", "CHEST", "THORAX", "LUSP", "LUMBAR",
            "PELV", "PELVIS", "FEMR", "FEMUR", "TIBI", "TIBIA" });
        return outArr;
    }
Note again that adding extra sort keywords like “CHST” or “THORAX”, as shown above, will catch strings that used those words instead of “CHEST” and sort them correctly, although this adds some processing time. Note also that the last sort keyword may sometimes be omitted, as shown in the code. For example, when sorting by “left” and “right”, “right” may be omitted, since once the “left” strings have been moved to the front, the remaining strings can only be “right”.
Points of Interest
The disadvantage of this sort method is that you need to know the keywords before calling the function, but sometimes this can be remedied by searching the list for keywords. For instance, suppose it’s desired to sort by the crash test dummy’s identification (ID) code. Although the IDs could be written in various ways and may not be exactly known in advance, an assumption could be made, for example, that they usually appear as the first word in the string and begin with the character ‘H’. Code could be written to go through the list to find these ID codes, saving only the unique ones in a keyword list, and then passing this list to the sort function as keywords.
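The keyword-discovery step described above could be sketched as follows. The names IdKeywordFinder and FindIdKeywords are hypothetical, and the sample IDs (“H3”, “H2”) are invented for illustration only. Under the stated assumption, any first word starting with ‘H’ is treated as an ID, so a string beginning with “Head” would also be picked up.

```csharp
using System;
using System.Collections.Generic;

public static class IdKeywordFinder
{
    // Collects the unique first words that begin with 'H', in order of first appearance.
    public static string[] FindIdKeywords(IEnumerable<string> lines)
    {
        var seen = new HashSet<string>();
        var ids = new List<string>();
        foreach (string line in lines)
        {
            string[] words = line.ToUpperInvariant()
                                 .Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
            // HashSet.Add returns false for duplicates, so each ID is kept once.
            if (words.Length > 0
                && words[0].StartsWith("H", StringComparison.Ordinal)
                && seen.Add(words[0]))
            {
                ids.Add(words[0]);
            }
        }
        return ids.ToArray();
    }

    public static void Main()
    {
        // Hypothetical sample data.
        string[] sensors = { "H3 Femur L Fx", "H2 Femur L Fx", "h3 Upper Neck Fy", "Femur R Fx" };
        Console.WriteLine(string.Join(", ", FindIdKeywords(sensors)));
    }
}
```

The returned keywords are already upper case, so they can be passed to the sort function directly, or reordered first if a particular ID order is wanted.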
It’s interesting to note that when sorting multiple times with different keyword lists, the order of the sorts is very important. For example, if the “HEAD… TIBIA” sort was done first, any of the subsequent sorts would disorder the list again. At first, it didn’t seem possible that the list could be sorted by this method as required, but the correct sort order happened to be found. In fact, the sort order shown above appears to be the unique solution (for the type of order required by the user).
In general, the sorts should be applied from least to greatest “precedence”, because each pass keeps, within every keyword group, the relative order left by the previous pass. For example, sorting by “UPPER/LOWER” first and then by “LEFT/RIGHT” (switched from above) places higher precedence on “LEFT/RIGHT”, so that it lists left-uppers, then left-lowers, then right-uppers, then right-lowers. However, the desired order of left-uppers, then right-uppers, then left-lowers, then right-lowers requires the “UPPER/LOWER” sort to be done after the “LEFT/RIGHT” sort, as in the code above. Some experimentation may be required in practical usage. It's unknown at this time whether there is a branch of order theory which treats this particular topic, although the approach resembles a least-significant-digit radix sort, where stable passes are applied from the least significant key to the most significant. Further discussion would be welcome.
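The effect of pass order can be checked with a small experiment. In this sketch each keyword pass is written as a stable LINQ OrderBy on the index of the first matching keyword, which is a swapped-in technique that should give the same grouping as the keyword loop, not the original code. The names PassOrderDemo, Pass, and Rank, and the sample strings, are hypothetical.

```csharp
using System;
using System.Linq;

public static class PassOrderDemo
{
    // One keyword pass: strings are grouped by the first keyword they contain;
    // unmatched strings go last. OrderBy is stable, so ties keep their input order.
    public static string[] Pass(string[] input, params string[] keywords)
    {
        return input.OrderBy(s => Rank(s, keywords)).ToArray();
    }

    private static int Rank(string s, string[] keywords)
    {
        string upper = s.ToUpperInvariant();
        int index = Array.FindIndex(keywords, k => upper.Contains(k));
        return index < 0 ? keywords.Length : index;
    }

    public static void Main()
    {
        string[] items =
        {
            "Right Lower Tibia", "Left Lower Tibia", "Right Upper Tibia", "Left Upper Tibia"
        };

        // LEFT pass first, UPPER pass last: UPPER gets the higher precedence.
        string[] upperLast = Pass(Pass(items, "LEFT"), "UPPER");
        // UPPER pass first, LEFT pass last: LEFT gets the higher precedence.
        string[] leftLast = Pass(Pass(items, "UPPER"), "LEFT");

        Console.WriteLine(string.Join(" | ", upperLast));
        // Left Upper Tibia | Right Upper Tibia | Left Lower Tibia | Right Lower Tibia
        Console.WriteLine(string.Join(" | ", leftLast));
        // Left Upper Tibia | Left Lower Tibia | Right Upper Tibia | Right Lower Tibia
    }
}
```

Whichever pass runs last decides the outermost grouping, and earlier passes only break ties inside those groups.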