|
Sorry, honey(only going by your username).
Wish I could help. But I am not even gonna use C# in my life. I'm just gonna be using HTML, Java, Python, and C++.
|
|
|
|
|
Having just seen an episode of the Simpsons, you almost lost me with the first "fat ran". I can almost imagine Homer saying "I have to run too?"
I do not know C# but maybe you are trying to do too much on a single list. Try looking at your problem from a different perspective. Instead of looking at the problem bottom-up try top-down, sideways, mad scientist user or... you know... like Homer
I have a custom list implemented (over the default list) in Java that, because it must be ordered, it does not use only a default list. It uses a default list to keep track of the order of the data, but the actual data is stored in a map of lists.
In your case data could be stored in the map indexed by "To" (your almost non-overlapping property) where each position of the map has a list with the actual data.
Good luck and try not to start the new year frustrated
|
|
|
|
|
Yeah, they could be indexed by To states. I actually do that in some of my previous code. I'd consider it here, but it means rewriting some very complicated code that acts on that FATransition list.
Real programmers use butterflies
|
|
|
|
|
I do not know what is possible to do in C# but, when a default data structure does not directly implement everything I need, I always try to make my data storage classes independent of the underlying storage type (data structure) by using templates and abstract classes.
Let us say that I want to store ClassA instances ordered by something that is a combination of some of its fields. I would probably make it ClassAStorage implements ClassStorageBase<actualstorageclass> where ClassStorageBase<t> implements the methods I need to make ClassStorage easier to work with including some abstract methods that will depend on ClassAStorage to be implemented and will be used in other methods of ClassStorageBase.
This also helps reusing code for, say, a ClassB that can be easily implemented as a ClassBStorage implements ClassStorageBase<anotherstorageclass>.
Simple snippet using my custom list:
- My storage class
public abstract class IndexedSet<TYPE_KEY extends Object,TYPE_DATA extends Object> {
private HashMap<TYPE_KEY,TYPE_DATA > cells;
}
- My ordered storage class
public abstract class IndexedSequence<TYPE_KEY extends Object,TYPE_DATA extends Object> extends IndexedSet<TYPE_KEY,TYPE_DATA>{
private ArrayList<TYPE_KEY> cellsOrder;
public Entry<TYPE_KEY, TYPE_DATA> get(int pos) {
TYPE_KEY k=cellsOrder.get(pos);
return new Pair<TYPE_KEY, TYPE_DATA>(k,internal().get(k));
}
}
- My base storage table (indexed rows that have indexed columns)
public abstract class TableCore<TYPE_KEY extends Object,TYPE_DATA extends Object>{
protected abstract TYPE_KEY getKey(TYPE_ROW r);
}
- finally, my class
public class TableBasic<TYPE_DATA extends Object> extends TableCore<Integer, TYPE_DATA>{
@Override
protected Integer getKey(TYPE_DATA r) {
return r.hashCode();
}
}
- and my other class
public class Table extends TableCore<UUID, TYPE_DATA extends MyData>{
@Override
protected UUID getKey(TYPE_DATA r) {
return r.id();
}
}
And, yes, it is a lot of work.
|
|
|
|
|
Yeah, I mean, I already do something like that in another part of my code, but it's not really appropriate here. It would add maintenance, but since it's only used by one thing, it doesn't need to be heavily abstracted.
Real programmers use butterflies
|
|
|
|
|
I don't quite see the problem here?
I mean, at the risk of being silly and obvious, I will state you should do a binary search when inserting (start an N/2, then N/4 or 3N/4 and so on...) for a quick match and then insert or merge. Elementary my dear Watson... or not?
FA being an ICollection<FATransition> , isn't it?
|
|
|
|
|
Normally, yes, but, I have to check the entry prior, to see if its range overlaps the one I just inserted, and I also have to check the entries after until I get one whose max value is less than or equal to the new max. Basically I need to combine overlapping ranges into one range.
This is further complicated because that only applies to ranges where the To entry is the same.
FA contains a list of FATransitions in addition to other data.
Real programmers use butterflies
|
|
|
|
|
ok, untested code (sorry) but I think it should work, in principle...
public struct FATransition : IComparable<FATransition>
{
public int Min;
public int Max;
public FA To;
public FATransition(int min, int max, FA to)
{
if (max < min)
throw new ArgumentOutOfRangeException(nameof(max));
Min = min;
Max = max;
To = to;
}
public FATransition(FA to)
{
Min = Max = -1;
To = to;
}
public int CompareTo(FATransition other)
{
var c = Min.CompareTo(other.Min);
if (c != 0) return c;
return Max.CompareTo(other.Max);
}
}
public class FA : ICollection<FATransition>
{
List<FATransition> _transitions = new List<FATransition>();
public bool IsDeterministic { get; private set; } = true;
int FindIndex(int min)
{
int iMin = 0, iMax = _transitions.Count;
while (iMin + 1 < iMax)
{
int i = (iMin + iMax) / 2;
var t = _transitions[i];
if (t.Min >= min)
iMax = i;
else
iMin = i;
}
return iMin;
}
public void Add(FATransition item)
{
int start = FindIndex(item.Min);
for (int i = _transitions.Count - 1; i >= start; i--)
{
var t = _transitions[i];
if (t.To == item.To && t.Min <= item.Max)
{
item.Max = t.Max;
_transitions.RemoveAt(i);
}
else if (t.To != item.To && t.Min <= item.Max && t.Max > item.Min)
{
IsDeterministic = false;
}
}
_transitions.Insert(start, item);
}
bool ICollection<FATransition>.Remove(FATransition item)
=> throw new NotSupportedException();
public int Count => _transitions.Count;
bool ICollection<FATransition>.IsReadOnly => false;
public void Clear()
{
_transitions.Clear();
IsDeterministic = true;
}
public bool Contains(FATransition item) => _transitions.Contains(item);
public void CopyTo(FATransition[] array, int arrayIndex)
=> _transitions.CopyTo(array, arrayIndex);
public IEnumerator<FATransition> GetEnumerator() => _transitions.GetEnumerator();
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
modified 30-Dec-21 16:27pm.
|
|
|
|
|
Nope, that just does a sorted insert. That's simple, but not what I need.
What do you do when a range overlaps?
You've done nothing in that case. You just insert the range. You aren't merging overlapping ranges that have the same To.
Edit: I see. never mind, I missed it the first time. sorry.
Real programmers use butterflies
|
|
|
|
|
your answer confuses me, it looks like you haven't read the code (lack of sleep perhaps?), or I misunderstand the question (lack of sleep on my part perhaps).
Because the code quite obviously merge overlap range on insert so... could you be more clear on what's wrong?
For clarity I paste the add method again below with added comment
public void Add(FATransition item)
{
int start = FindIndex(item.Min);
for (int i = _transitions.Count - 1; i >= start; i--)
{
var t = _transitions[i];
if (t.To == item.To && t.Min <= item.Max)
{
item.Max = t.Max;
_transitions.RemoveAt(i);
}
else if (t.To != item.To && t.Min <= item.Max && t.Max > item.Min)
{
IsDeterministic = false;
}
}
_transitions.Insert(start, item);
}
|
|
|
|
|
It was my fault - i had since edited my comment accordingly - I missed it the first time through.
Real programmers use butterflies
|
|
|
|
|
I guess brain ache doesn't help!
No worries!
|
|
|
|
|
just amended the Add() method
for (int i = _transitions.Count - 1; i >= start; i--)
obviously count down to start , not 0!
|
|
|
|
|
That is basically what the map does on my post above.
Unless we are misunderstanding something.
Best regards
|
|
|
|
|
I rarely load "ranges" since they rarely make sense as specified; e.g. 100-200; 200-300; etc.
I just load the max values for each range.
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
|
|
|
|
|
Well I think? I see what you're saying, but in this case I need the minimums too.
In this case the ranges are Unicode Codepoints, and those are often laid out in runs of related characters, so you wind up, with for example, whitespace being, instead of a hundred individual characters, a series of i don't know, like 3 or 4 ranges.
If it were simply that, I'd store a marker id for each type of range (whitespace, digit, letter, etc), but ranges can and must be added to, removed from and intersected with other ranges.
I have to preserve the entire set of data, not just the maximums.
Real programmers use butterflies
|
|
|
|
|
|
Umm, you dramatically misunderstand the problem.
Unicode categories are not the only ranges I need.
Forget they exist, as they will only confuse things.
Let me put it this way.
ASCII has 127 characters.
UTF32 has 0x10FFFF (however many that is)
If I didn't use ranges, I would have to store each character transition individually.
That is not realistic for unicode. Period.
The way to make it realistic is to use ranges.
Not to use unicode categories, because that wouldn't make sense.
Real programmers use butterflies
|
|
|
|
|
Code Points, Code Points
|
|
|
|
|
yeah yeah, codepoints. point is there are a lot of them.
Real programmers use butterflies
|
|
|
|
|
Yeah; I still don't understand dramatically; particularly the drama.
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
|
|
|
|
|
Off topic,
But just wanted to mention that C++20 has std::ranges[^] along with some new additions to <algorithm>[^] that would make this easy.
|
|
|
|
|
That's cool. I heard C# might eventually wind up with something like that too.
Real programmers use butterflies
|
|
|
|
|
Ignore the wrinkle. Instead of trying to ever merge, split into smaller ranges in case of overlap. Exception is if new entry is totally encompassed by existing entry in which case new entry is discarded.
It is only safe to merge if there is not another To that conflicts with that range. For each new entry’s overlap with an existing entry, split both new and existing into matched smaller ranges.
This is classic “bin” (as in bucket) data structure problem with your twist.
[1,5]A
[1,5]B
Insert [3,7]A
[1,3]A
[1,3]B
[3,5]A
[3,5]B
[5,7]A
Or adding another level where the range holds a list which it sounds like you want to avoid.
[1,3] [A,B]
[3,5] [A,B]
[5,7] [A]
Good luck!
|
|
|
|
|
|