|
|
Comments and Discussions
|
|
 |
|

|
Hi,
Is it possible to declare a Multimap with collection inializer ?
Something like :
private static MultimapBK _domains = new MultimapBK<string, string>()
{
{ "MAILSETTING", new List&lt;string&gt; {"MAILPORT", "MAILPASS", "MAILLOGIN", "MAILSERV"} }
};
With MAILSETTING the key and following string the values ?
modified on Monday, September 27, 2010 9:16 AM
|
|
|
|

|
Bharath,
I like your idea - reminds me of a hashtable, something I have not seen in .net 2.0 which is the primary framework we use at the moment. Your article created some interesting feedback which I also enjoyed. Despite the criticisms, I find value in your idea - the various implementation scenarios/concerns brought up by others are also helpful in understanding requirements for a more general solution - but the value of your idea still holds, in my opinion. Thanks.
Regards
Bill Dickinson
|
|
|
|
|

|
There already is a Dictionary class in C#.
|
|
|
|

|
Yea, but not in Silverlight.
|
|
|
|

|
The concept of a multi-key per value map is interesting. I do not think this implementation is the best option, and has been complicated by needless concurrency details and performance has been diminished by intrinsic locking with a poorly performing lock.
|
|
|
|

|
I am surprised! I already wrote to you that this article is changing to an implementation without locks. Well, probably you missed to read that.
This does not mean Concurrent collection is *impossible* and *always bad* to evevn think about it. I just felt, there is a need for MultiMap without concurrency. And I wanted this article to have a simple start.
modified on Saturday, March 14, 2009 9:53 AM
|
|
|
|

|
I'd like to say, it's a good realization.
|
|
|
|
|

|
An aspect I dislike about this solution is the statefulness of iteration, and then tying it back into the managed thread ID to make it thread-safe. It all feels rather cumbersome, and not in keeping with the .Net IEnumerable/IEnumerator interface mechanisms.
Having a method (GetItemsMatching(Key item)?) return an IEnumerator would be, IMHO, far cleaner.
|
|
|
|
|
|

|
You could have provided more explanation about how the implementation is done. How each method works and a comparison with framework's Dictionary(TKey,TValue) class. I don't really see any need of repeating the key value.
What I understood from your implementation is, you are trying to keep multiple values for a key by repeating the key itself. This can be done in a better way using the framework's Dictionary(TKey,TValue) class. See belowDictionary<string,List<string>> multiMap = new Dictionary<string,List<string>>();
multiMap.Add("Alice", new List<string>(new string[] { "Home: 123-456-789" }));
multiMap.Add("Bob", new List<string>(new string[] { "Home: 234-567-7650" }));
multiMap.Add("Mason", new List<string>(new string[] { "Home: 675-444-4444" }));
multiMap.Add("Mahi", new List<string>(new string[] { "Home: 777-666-4444" }));
multiMap["Alice"].Add("Office: 445-676-9999");
multiMap["Mason"].Add("Office: 445-676-9999");
multiMap["Mahi"].Add("Office: 445-676-9999");
foreach (string address in multiMap["Alice"])
Console.WriteLine(address);So, what are you trying to solve?
|
|
|
|

|
The Multi-map collection class abstracts this internal data structure. This way the user gets a simple interface to code.
Well, providing multi-thread-safe collection relieves the user of this class from worrying about synchronization issues.
Also it provides 'thread-aware' iteration to elements in the collection. Consider two threads trying to iterating the elements from the same key. They both will get right sequence of elements.
Thanks for your feedback Navaneeth!
|
|
|
|
|

|
Bharath
Internally you are using a list to store multiple values for the same key. So your lookup is not O(1) but O(n) (unlike a Dictionary). If you had used a HashSet internally, your lookup would be O(1)
Also, locking for every accessor method on the data structure may cause performance issues. If concurrency is an issue, have you looked at Read/Write locking strategies?
Thanks
|
|
|
|

|
Both are very good ideas.
I wanted to support .net ver 2.0 users - hence didn't use HashSet. Also HashSet lacks the capability to point-pick Item by Index. i.e., 'Get 5th Item' - is not possible with HashSet. Could consider an ArrayList. But ArrayList growth pattern (copy to grow) is not desired.
But I could have used Dictionary itself .. like Dictionary<index,>. But Dictionary will use more memory. Already we are using Dictionary at first place (which is good for performance to locate an item quickly). Didn't want to use multiple small dictionaries inside this primary dictionary. However, If High speed is required (at the cost of memory) - Dictionary shall be used.
ReaderWriter Lock is good for performance. Did you read the below MSDN warning on ReaderWriter Lock ?
http://msdn.microsoft.com/en-us/library/system.threading.readerwriterlock.aspx[^]
Thanks!
modified on Sunday, March 8, 2009 8:47 PM
|
|
|
|

|
Bharath K A wrote: ReaderWriter Lock is good for performance.
No. It has got lot of problems. You have to use it's successor ReaderWriterLockSlim[^]. But it requires framework 3.0 or later.
Actually you don't need to worry about thread safety here. Its your class users job to ensure thread safety when they call Add.
|
|
|
|

|
Technically speaking, locking shouldn't be built into a collection in the first place. A collection should provide the ability to support locking externally, in the event that it is used in a scenario that requires it...but locking, even with an RWL, is just wasteful if access will always be performed on a single thread. This is why .NET collections have a SyncRoot property, and have Synchronized static methods to provide a thread-synced wrapper version around the base version. A little poking around with Reflector will provide a lot of insight into better options for providing thread safety without requiring it.
|
|
|
|
|

|
I don't see how parallel concurency has anything to do with a generic map collection class. Standard collections, which is what this article appears to be about, should not include locking internally...it should always be externalized through standard facilities such as SyncRoot and a static Synchronized method (keeping your custom collection in line with all the built in .NET collections and the majority of third party collections).
If you wish to write a concurrent collection, you have a lot more to worry about than just locking...you have to deal with multiple concurrent updates and proper communication of those updates between threads to maintain consistency. Its not as simple as just locking when you add...and there are MUCH more efficient means of locking than Monitor.
Given that, if I needed parallel concurrency, I wouldn't write my own from scratch or use one written from scratch. I would use whatever Microsoft has written in their parallel extensions, and if they did not offer something I needed, I would be extending their parallel base types to accomplish what I needed. At least then I'd be using a piece of code that I knew was thouroughly tested, broadly used and vetted by the community, and generally guaranteed to produce appropriate results without me having to worry that the implementor missed something important.
|
|
|
|

|
Good read. Firstly thanks for your feedback.
I am planning to modify this article to remove internal locking. Because, I think many people just want a MultiMap. It may take a one or two days before you see an update.
Your Comment: If you wish to write a concurrent collection, you have a lot more to worry about than just locking.....
My Answer: I completly agree with you here.
Your Comment: ......and there are MUCH more efficient means of locking than Monitor.
My Answer: Agree with you as efficient lock improves the performance. Partially disagree as locking has little effect on Order(1) read and writes. However, professional libraries may worry about that micro-second performance. Codeproject library is to explain the concept and typically brevity is preferred.
Your Comment: I don't see how parallel concurency has anything to do with a generic map collection class
My Answer: Typical example would be a 'UI thread' refresh that enumerates (or iterates) the collection, while an 'Update thread' updates this collection. If we lock as you say, the UI thread will hang until the 'Update thread' finishes the update. Did I understand your comment correctly here ?
Your Comment:Given that, if I needed parallel concurrency, I wouldn't write my own from scratch or use one written from scratch.
My Answer: I completly disagree here. If everyone did this, we wouldn't be having a high-performing log4net logging library in the world at all. Everyone will be just extending MS Enterprise logging library (which is very slow). So sometimes it is better to write from scratch. Especially concurrent libraries is a technology at infancy now. What do you think ?
|
|
|
|

|
Bharath K A wrote: Typical example would be a 'UI thread' refresh that enumerates (or iterates) the collection, while an 'Update thread' updates this collection. If we lock as you say, the UI thread will hang until the 'Update thread' finishes the update. Did I understand your comment correctly here ?
There are many ways to perform work without blocking a UI thread. Most notable is the BackgroundWorker component, which allows you to perform asynchronous work with UI update notifications. You don't have to bother writing a collection with complex built-in locking once you spin up a background worker...you just use the MutiMap without worrying about concurrency, and fire updates to the UI using the BackgroundWorker. Beyond that, there are serious problems for users if you try to handle concurrent display and updates of a data set. Users don't expect their data to change while they are viewing it...they only expect it to change when they are editing it. In a well designed application, viewing and editing are separated...and editing is usually performed one entity/record/thing at a time, not concurrently with view operations on the same data.
I think you have completely misunderstood the goals of Microsofts Parallel Extensions to .NET. Their intent was much less likely to handle concurrent display and update of data in a UI, and much more to enable .NET developers to fully utilize the multi-core, multi-threaded platforms that are at our disposal. When data items in a collection can be atomically processed, concurrency becomes a real possability, allowing us to greatly improve performance. We can resort to implementing our own concurrency, or we can utilize the wonderful new Parallel Extensions to .NET that microsoft is providing us, and gain a significant boost to general data processing without any significant effort.
Just because we have the ability to concurrently process data, however, does NOT mean we should throw away the tried and true, good practices about UI usability. Users don't expect that what their viewing will be changing while they view it...they expect stability in what they view. They also expect that when they modify data, it won't affect other data they are viewing until they confirm their changes. In the event that you want to display a view that does update while the user is performing some action...say a real-time filter that eliminated non-matching entries as a user types in search criteria...there are better and much simpler alternatives such as utilizing the LINQ extension methods around IEnumerable to augment the criteria of ENUMERATION, or iteration over data (rather than modifying the data itself) to generate the appropriate results. This offers the benefits that your data source remains static, and each thread can manage its own IEnumerable to perform safe, stable, real-time filtering or the like.
That makes the goal of handling concurrent rendering and update of data...as your comments seem to allude to as the purpose of this article...a moot point and demonstration of bad practices (an anti-pattern if you will). If it is your intent to describe an anti-pattern, something developers should not be doing, then your article is great. Otherwise, I think you have entirely missed the point of the parallel extentions for .net and are oblivious to established best practices for UI usability.
Bharath K A wrote: I completly disagree here. If everyone did this, we wouldn't be having a high-performing log4net logging library in the world at all. Everyone will be just extending MS Enterprise logging library (which is very slow). So sometimes it is better to write from scratch. Especially concurrent libraries is a technology at infancy now. What do you think ?
The Enterprise Library is not part of the .NET framework. Its is, for all intents and purposes, a third-party open source library, just like log4net. It is not used nearly as frequently as the .NET framework itself, nor is it maintained in the same way. Comparing log4net and EntLib Logging to a new, core, integrated set of libraries for .NET 4.0, such as Parallel Extensions, is a bit ludicrous. The two are of entirely different classes of library. As for comparing log4net with your attempts at a concurrency implementation...log4net is filling a role that is simple in comparison to concurrent data processing. If I had to choose between an open-source implementation of a parallel execution library and microsofts, I would choose microsofts hands down. They will always understand the nuances of the windows scheduler better than anyone else...making them the best people to write a safe, stable, efficient, and flexible concurrency library. Logging, on the other hand, can be done by anyone...its a simple task that generally requires simple solutions.
modified on Saturday, March 14, 2009 5:42 AM
|
|
|
|

|
Your Comment: .... Users don't expect their data to change while they are viewing it...
My Answer: Really, I think you haven't seen any trading applications.
Your Comment: There are many ways to perform work without blocking a UI thread. Most notable is the BackgroundWorker component, which allows you to perform asynchronous work with UI update notifications.
My Answer: Here the problem is *NOT* threading. Consider the sample below:
// This is UI Thread, that wishes to enumerate
public static void ThreadProc()
{
Dictionary.Enumerator enumDict = myDict.GetEnumerator();
while (true)
{
lock (lockObj)
{
enumDict = myDict.GetEnumerator();
while (enumDict.MoveNext())
{
System.Console.WriteLine("Key = {0}, Value = {1}", enumDict.Current.Key, enumDict.Current.Value);
}
System.Console.WriteLine(" ------ End of List --------");
System.Console.ReadLine();
}
}
}
// This is the Updater thread. If you want this can be a background thread.
public static void ThreadProc1()
{
int Counter = 0;
while (true)
{
lock (lockObj)
{
myDict.Add(Counter.ToString(), "Just Value");
}
Counter++;
}
}
If you read this carefully, you will note that the UI thread may be blocked until the Updater thread finishes. This is what I meant in the article.
Your Comment: ...the goals of Microsofts Parallel Extensions to .NET. Their intent was much less likely to handle concurrent display and update of data in a UI, and much more to enable .NET developers to fully utilize the multi-core, multi-threaded platforms that are at our disposal
My Answer: I partially agree with you. I agree the multi-core, multi-thread part. But I disagree your comment on concurrent display update. There are many serious real-world use-case(s) available for this.
Your Comment: .... and demonstration of bad practices (an anti-pattern if you will). ...
My Answer: If you read nice articles on why allowing concurrent enumeration is quite dangerous (and why Microsoft is not doing it), you will understand clearly the pit-falls and the reason why is is termed 'bad' by some *well-informed* developers. I am with the class of developers who support that 'it is tough to develop concurrent enumeration', but always 'do'able. Also, a mature developer will try to understand if a library supports concurrent enumeration, are all those pit-falls are clearly ironed-out in the library ?. And a mature developer will not turn a blind-eye quickly, but will analyze with a fresh, open midset. If one of the use-case that draw to the above said 'pit-fall', is not handled - she/he will point out that use case/scenario that will lead to bad practices. The below is one such good read:
http://blogs.msdn.com/jaredpar/archive/2009/02/11/why-are-thread-safe-collections-so-hard.aspx
History
[^]
It will be a good discussion, if we foster such comments.
For your other comments: I respect Microsoft and their libraries. But I do not fall in the category of 'It's *ALWAYS* best to use what Microsoft gives. And we *SHOULD*!'. Many real-life use-cases/scenarios/business-needs are so unique that we need to write our own libraries.
That said, I think our discussion will be more constructive (and useful to everyone reading) if the nature of your questions are like:
You Write: Hey, consider this case, When you are enumerating - you are at MoveNext() in one thread, while another thread removes that content. How will you handle this scenario ?
.. instead of your comments like '..Just because we have the ability to concurrently process data, however, does NOT mean we should throw away the tried and true, good practices ...'
I hope you will take this discussion a pure technical debate. I mean no offence to your personal ideas.
|
|
|
|

|
Perhaps, I should have used List.
|
|
|
|

|
Well, at first glance, seems like a usable class, but an article on CP should explain the implementation at least at a basic level.
Stuff like the internal data structures used and likely performance of a collection class are very important, and the first thing somewhat experienced programmers will probably look for in an article like this.
My (humble) opinion is you should add the stuff mentioned to the article, in order to make it possible for someone to learn something or at least compare your implementation to their own ideas, without necessarily downloading the code.
"The motivation for Generic Multi map collection type is to have the Dictionary performance but capability to add more than one value for the same key as stated above. The Generic Multimap collection class is designed to have this capability."
-Well, it should be made clear how is the class designed to have this capability.
I don't think it's necessary to rate a 1, I'd rather not rate at all until I can see something worth rating.
However, do note that an article like yours is more of a code posting, and not a real CP article (or at least not one of those which get higher-end ratings).
Hope you find this a positive criticism.
"Have you heard about the object-oriented way to become wealthy?"
"No..."
"Inheritance."
|
|
|
|

|
Firstly, Thanks for your feedback.
Sure! I will add explanation on implementation details of this generic collection class. This article is already published. So I will either update this article or else add another part (part 2).
And Yes, your feedback is much appreciated.
|
|
|
|
|
 |
|
|
General News Suggestion Question Bug Answer Joke Rant Admin
Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.
|
The generic Dictionary collection class in .NET requires the key to be unique. But many applications/services require a flexible, generic Dictionary collection type that can accept multiple values for a key. This article explains one such generic Dictionary collection type.
| Type | Article |
| Licence | CPOL |
| First Posted | 8 Mar 2009 |
| Views | 68,050 |
| Downloads | 1,075 |
| Bookmarked | 24 times |
|
|