|
This is the thread that hosts the discussion about my proposal for a CP Web Service.
A proposal document is available here[^]. Its current version is 1.0; it will be replaced by a future version if and when it becomes available.
|
|
|
|
|
My initial comments, in no particular order, and based on my own personal goals for the API, are:
1. I want the API to start out focussing on an individual developer: you. The API (initially) is not something to allow members to trawl through thousands, or millions, of member records. It should, instead, provide information relevant to you, and to your profile and your experience on CodeProject.com.
2. I will be leveraging our Object References system which relies on an ObjectType ID / Object ID pair. Eg a member is ObjectTypeID = 1, so member #1 is identified uniquely across all objects using (1,1).
3. IsMVP should in fact be "MemberTypes" since a member could be Mentor, MVP, SubEditor etc.
4. You've made no mention of ratings, yet this is one of the things I see being a focus of many CP scrapers. I will be adding something along the lines of GetVotes(int ObjectTypeID, int ObjectID, DateTime startDate) . This allows you to get all votes for all your items, filtered by ObjectType (eg articles, messages etc), a single object, or all votes past a given date. Vote info will be date, score and weight.
5. Do you really want GetXByY methods, or GetX(sortMethod) methods? My preference, in order to keep the API small and extendable, is the latter.
6. In your Article structure you have a list of member IDs. I would prefer to return usable objects (say, name, ID, profile URL) to save an expected followup lookup.
7. Protocol. You mention using Windows WebServices. I was thinking of starting with JSON data initially with XML to follow immediately afterwards.
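To make points 2 and 4 concrete, here is a rough client-side sketch of how the (ObjectTypeID, ObjectID) pair and the three GetVotes filters could fit together. Nothing here is the actual API: the VoteInfo class, the VoteFilter helper, and the convention that 0 means "no filter" are all assumptions made for illustration, and the filtering would of course happen on the server.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one vote: the object it was cast on, plus the
// date, score and weight mentioned in point 4.
public sealed class VoteInfo
{
    public int ObjectTypeId;   // e.g. 1 = member, per point 2
    public int ObjectId;
    public DateTime DateUtc;
    public int Score;
    public int Weight;
}

public static class VoteFilter
{
    // Mimics GetVotes(ObjectTypeID, ObjectID, startDate) over an in-memory list.
    // ASSUMPTION: passing 0 for an ID means "do not filter on it".
    public static List<VoteInfo> GetVotes(
        IEnumerable<VoteInfo> allVotes, int objectTypeId, int objectId, DateTime startDate)
    {
        return allVotes
            .Where(v => objectTypeId == 0 || v.ObjectTypeId == objectTypeId)
            .Where(v => objectId == 0 || v.ObjectId == objectId)
            .Where(v => v.DateUtc >= startDate)   // "past a given date", read as on or after
            .OrderBy(v => v.DateUtc)
            .ToList();
    }
}
```

With this shape, "all votes on one object type" is GetVotes(votes, typeId, 0, DateTime.MinValue), and "everything since last week" is GetVotes(votes, 0, 0, lastWeekUtc).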
cheers,
Chris Maunder
The Code Project | Co-founder
Microsoft C++ MVP
|
|
|
|
|
Thanks Chris. I'll await other inputs before reacting in extenso. Except this:
1. The need is wider than just for one member's info. CP Vanity lists the "highest achievers", i.e. all people appearing in the first 4 or 5 pages of Who's Who when sorted by article count or message count.
4. the proposal V1.0 has average ratings in the Member structure, and rating and voteCount in the Article and Message structures; it does not include more detailed voting info as that info is currently not available on the web site either.
|
|
|
|
|
I skimmed through it. It seems reasonable.
Have you tried passing a System.DateTimeOffset? (I haven't.)
|
|
|
|
|
Hi PIEBALD,
Thanks for your first reaction.
I haven't felt the need to pass a DateTimeOffset; in fact I never even used one. My thinking was to communicate UTC only, then a client could, and probably would, adjust everything for local time (i.e. I'm not going to ask CP what timezone I'm in). Do you see a need for DTO?
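For what it's worth, the "UTC on the wire, local time on the client" idea needs very little client code. The sketch below is only an illustration; the UtcDisplay helper and its ToZone method are made-up names, not part of the proposal.

```csharp
using System;

public static class UtcDisplay
{
    // Converts a timestamp received from the service (always UTC by
    // convention) into the given time zone for display.
    public static DateTime ToZone(DateTime utc, TimeZoneInfo zone)
    {
        // Deserialization may leave Kind as Unspecified; since the service
        // only ever sends UTC, label the value accordingly before converting.
        DateTime asUtc = DateTime.SpecifyKind(utc, DateTimeKind.Utc);
        return TimeZoneInfo.ConvertTimeFromUtc(asUtc, zone);
    }
}
```

A client would typically call UtcDisplay.ToZone(stamp, TimeZoneInfo.Local), so the service never needs to know the caller's time zone.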
|
|
|
|
|
|
I just tried/wrestled with it -- svcutil insists on creating a System.DateTimeOffset struct that conflicts with the .NET one on the client.
This requires messing with the generated file (generally a no-no as you know) to get things to work.
|
|
|
|
|
I still have no idea why one would use a DateTimeOffset at all; I didn't know it existed, and I don't understand it.
What is wrong with DateTime? It looks like WSDL serializes it properly.
|
|
|
|
|
Hi Luc,
Firstly, thanks for including me in your discussions, it's an honour, and thanks for the reference in the document to my CPRepWatcher!
Looking at the document, you have put together some excellent methods for the webservice. Some comments I would have are:
1) The member structure should maybe be modified to remove some elements, e.g. the Twitter Name. There should be another method that can be called to get this information; if Chris adds other associations in the future, e.g. LinkedIn, Facebook, AnOther, then there would be a requirement for the structure to change. Perhaps have a GetMemberExternalAssociations that returns a key/value pair string dictionary, where each entry would have the site as the key, and the member name or reference URL as the value, e.g. Twitter=daveauld
2) I would also remove things like the post counts, average ratings etc., and have them associated with a separate method, for similar reasons to #1 above. Keep the GetMemberInfo to the basic info that is unlikely to ever change.
3) The GetMemberRating should possibly also be changed to a dictionary list of key/value pairs, for the same reason: as the site evolves, more items may be added, and the structure would have to change. By using a dictionary, the client would only then have to add another handler for the new key.
4) I take it the count parameter in the GetMembersByXXXX methods is the number of items being requested? Or is this also updated by the return call to return the number actually returned? Possibly need to limit this to MaxNumber internally per call; also, if you asked for 1000 and you started at 500 from the end, and it returned only 500, the client would know there are no more and stop querying during repeated calls.
5) What about a GetMemberCount(); to return the number of site members.
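As a rough illustration of why the dictionary idea in points 1 and 3 keeps clients stable, here is a sketch of a client that handles the keys it knows and silently skips the rest. The AssociationReader class and its list of known sites are assumptions for the example only.

```csharp
using System;
using System.Collections.Generic;

public static class AssociationReader
{
    // Sites this hypothetical client knows how to display.
    private static readonly HashSet<string> KnownSites =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "Twitter", "LinkedIn", "Facebook" };

    // Builds display lines for known sites and ignores unknown keys, so a
    // new association added on the server does not break an older client.
    public static List<string> Describe(IDictionary<string, string> associations)
    {
        var lines = new List<string>();
        foreach (KeyValuePair<string, string> entry in associations)
        {
            if (KnownSites.Contains(entry.Key))
            {
                lines.Add(entry.Key + ": " + entry.Value);
            }
        }
        return lines;
    }
}
```

The trade-off is that the dictionary loses compile-time checking: a misspelled key is only noticed at run time, which a fixed structure would have caught earlier.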
That's it for the moment!
Cheers.
Dave
Find Me On: Web| Facebook| Twitter| LinkedIn
CPRepWatcher now available as Packaged Chrome Extension, visit my articles for link.
|
|
|
|
|
Hi Dave,
you're one of the known HTML scrapers, so I included you in the review panel.
thanks for your feedback. I'll delay a full response till all have had the opportunity to provide their feedback.
Two clarifications though:
4. the methods returning a list of IDs have a count indicating how many results the caller hopes to get, and there is an imposed maximum (MAXCOUNT); no it isn't a ref parameter (I don't think Web Services can handle that), the returned results will contain count or fewer items; fewer implies end of list. This is in the doc already (principle 5, on page 3), I probably didn't formulate it well enough.
5. MAXCOUNT and the memberCount are returned by GetGeneralInfo(), see page 5.
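The "count or fewer items; fewer implies end of list" rule can be sketched as a simple client loop. This is only an illustration under assumptions: the fetchPage delegate stands in for whichever list method is being called, the start value is assumed to be a zero-based offset, and the Paging helper is a made-up name.

```csharp
using System;
using System.Collections.Generic;

public static class Paging
{
    // Calls fetchPage(start, count) repeatedly until a page comes back with
    // fewer than maxCount items, which signals the end of the list.
    public static List<int> FetchAll(Func<int, int, IList<int>> fetchPage, int maxCount)
    {
        if (maxCount <= 0)
        {
            throw new ArgumentOutOfRangeException("maxCount");
        }

        var all = new List<int>();
        int start = 0;
        while (true)
        {
            IList<int> page = fetchPage(start, maxCount);
            all.AddRange(page);
            if (page.Count < maxCount)
            {
                break;              // short (or empty) page: nothing more to fetch
            }
            start += page.Count;    // a full page: ask for the next one
        }
        return all;
    }
}
```

Note that when the total is an exact multiple of maxCount, the loop makes one extra call that returns an empty page; that is the price of not having a ref parameter or a separate total.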
Cheers.
|
|
|
|
|
I think we should consider implementing my tip/trick regarding web service access (one method returns all data as XML).
.45 ACP - because shooting twice is just silly ----- "Why don't you tie a kerosene-soaked rag around your ankles so the ants won't climb up and eat your candy ass..." - Dale Earnhardt, 1997 ----- "The staggering layers of obscenity in your statement make it a work of art on so many levels." - J. Jystad, 2001
|
|
|
|
|
|
Sorry about that. I had read the comments everyone else had posted and the wife just got home, and wanted to eat, so I was pressed for time. I had only briefly scanned the document, but am about to give it a good read.
|
|
|
|
|
OK, thanks. Take your time and enjoy.
|
|
|
|
|
0) The GetMembersByXXXX methods should impose some sane limit on the number of users returned. Otherwise, someone could bog down the site with a request for all members.
1) The name GetMembersByID seems redundant. I think it would be more useful if it were GetMembers(int startingID, int count).
2) I don't personally see a need to return a user's name with HTML decoration, but that's just me.
3) The Articles struct needs a DownloadCount field
4) I think the web service should have a single method that accepts a dynamic list of parameters, which includes the name of the stored proc to be called. This REALLY helps in terms of maintainability. All returned data should be in XML format. The programmer can then deserialize the returned data in any way he sees fit.
|
|
|
|
|
Hi Luc,
Great news that this may actually get implemented.
I thought about this a while back, and have a few suggestions for your proposal. Just my 2c.
1) I presume you mean PascalCase ( EndOfFile ) instead of camelCase ( endOfFile ). We are .NET devs, after all.
2) Reputation levels ( Bronze, Silver, Gold, Platinum, ? ) should be handled separately from the CP UI colours returned by GetColors .
enum ReputationLevel
{
    Bronze,
    Silver,
    Gold,
    Platinum,
    ...
}
ReputationLevel[] GetAllReputationLevels();
This enum should be used everywhere a reputation level is specified.
And possibly ( Chris: are the weights the same throughout the site? )
struct ReputationLevelDetails
{
    ReputationLevel Level;
    ColorDef Color;
    int Weight;
}
ReputationLevelDetails[] GetAllReputationLevelDetails();
3) Is the .NET DateTime type portable? I mean, would non-.NET clients have problems consuming this type? Or is it handled by the binding?
4) For performance, I would also add a binary tcp/ip endpoint for .NET clients ( presuming this will be implemented using WCF ).
5) On page 13 "A note on data types". For a .NET client, you can "Add Service Reference" and it will match up .NET types in the WSDL with the correct framework class on the client. Obviously, non-.NET clients will have to work with just the plain data objects. Also, you can choose how to materialize collections on the client: Array, List<T>, ...
6) All DateTime values should be in UTC.
I am mainly interested in votes for my own articles. I would like a method that retrieves a collection of summaries for all my articles, and then a method that gets the voting details for a particular article.
Your Article structure looks ok, but I would add: DownloadCount and LastVoteCast .
The method would just be:
Article[] GetArticleSummaries( int memberId )
For the votes, there are 3 possibilities: new, changed and deleted.
So I would declare a class for describing a vote ( which could be used across the site ):
class Vote
{
    int ObjectTypeId;
    int ObjectId;
    DateTime DateTime;
    int Rating;
    ReputationLevel ReputationLevel;
    int ReputationWeight;
    int ReputationPoints;
}
Then a class that holds a pair of Votes :
class VoteCast
{
    Vote Old;
    Vote Now;
}
The 3 possibilities would be handled like this:
- New: Old = null, Now is populated
- Changed: both populated
- Deleted: Old is populated, Now = null
The method would be:
VoteCast[] GetVotes( int objectTypeId, int objectId, DateTime fromDateTime )
The overall rating and popularity at any point in time can be calculated from this data.
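As a sketch of that last claim, the Old/Now pairs can be replayed to get a running rating. The real CP rating formula is not given in this thread, so the weighted average used below is purely an assumption, and the classes are trimmed to the two fields the calculation needs.

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down versions of the Vote and VoteCast shapes described above.
public sealed class Vote
{
    public int Rating;
    public int ReputationWeight;
}

public sealed class VoteCast
{
    public Vote Old;   // null for a new vote
    public Vote Now;   // null for a deleted vote
}

public static class RatingCalculator
{
    // Replays vote events in order. ASSUMPTION (not the actual CP formula):
    // rating = sum(Rating * ReputationWeight) / sum(ReputationWeight).
    public static double WeightedRating(IEnumerable<VoteCast> casts)
    {
        long weightedSum = 0;
        long totalWeight = 0;
        foreach (VoteCast cast in casts)
        {
            if (cast.Old != null)
            {
                // Changed or deleted: take the previous vote back out.
                weightedSum -= (long)cast.Old.Rating * cast.Old.ReputationWeight;
                totalWeight -= cast.Old.ReputationWeight;
            }
            if (cast.Now != null)
            {
                // New or changed: add the current vote in.
                weightedSum += (long)cast.Now.Rating * cast.Now.ReputationWeight;
                totalWeight += cast.Now.ReputationWeight;
            }
        }
        return totalWeight == 0 ? 0.0 : (double)weightedSum / totalWeight;
    }
}
```

Stopping the replay at any event gives the rating as it stood at that moment, which is what makes the pair representation attractive.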
BTW: I've just finished a contract using WCF. I'm not an expert ( maybe Roman Kiss could help ), but I'll help if I can
Cheers,
Nick
|
|
|
|
|
Hi Nick,
thanks for your input.
1. yes PascalCase it should be.
2. I had been refraining from using enums to avoid potential comm problems; I now know returning enum values is no problem whatsoever, so I will propose to replace all ColorName values with a color enum.
3. DateTime values get ToStringed in SOAP/HTTP, my test says it is now 2010-10-06T12:43:11.939Z; that looks sufficiently portable to me.
4. I'm not familiar with it; I think I'll open the proposal up to multiple communication schemes, with the SOAP/HTTP one as the primary.
5. I'm not familiar with it, will look into it.
6. I already spec'd UTC everywhere.
7. Voting info: you're not the only one asking such, I'll add it to "Possible Extensions". As a rule I did not add new functionality, everything could be gotten through HTTP/HTML as is, with one exception: Members by total rep, something we should have had for a long time.
BTW: IMO there are two major obstacles to voting info: privacy, and volatility of the system.
Thanks again. Shall I add you to my little mailing list?
Cheers.
|
|
|
|
|
Hi Luc,
Luc Pattyn wrote: thanks for your input
No problem
About voting: this information is obtainable at the moment, but you need to poll the site and catch every vote. For me this would be the most important part of a CP API.
Luc Pattyn wrote: Shall I add you to my little mailing list?
Yes please. My email address is cp [at] my CP profile Homepage domain.
Nick
|
|
|
|
|
Okay, I went through the proposed methods and would like to suggest these :
(1) There needs to be a way to get a member's last 1000 posts along with forum information.
Collection<SomeStruct> GetPosts(int memberId, int count);
struct SomeStruct
{
    Url MessageUrl;
    String Forum;
    bool IsThreadStarter;
    DateTime TimeStamp;
    . . .
}
(2) There needs to be a way to get a forum's last 1000 posts with the reverse info to (1) above:
Collection<AnotherStruct> GetPosts(string forum, int count);
or
Collection<AnotherStruct> GetPosts(ForumEnum forum, int count);
struct AnotherStruct
{
    Url MessageUrl;
    int memberId;
    bool IsThreadStarter;
    DateTime TimeStamp;
    . . .
}
(3) Similar to (2), but gets threads instead of posts
Collection<ThreadStruct> GetThreads(ForumEnum forum, int count);
struct ThreadStruct
{
    int UniqueThreadId;
    Url MessageUrl;
    int MemberId;
    int Count;
    DateTime TimeStamp;
}
(4) A method to get all thread messages:
Collection<AnotherStruct> GetThreadPosts(int threadId);
At this point, these are the major omissions I noticed in Luc's spec document, if I think of more, I'll post them here again. Meanwhile Luc and others, do let me know what you think.
|
|
|
|
|
Hmm. Do we need all this?
So far I tried to formalize what had been used (by scraping) in the existing apps, while keeping things pretty general, not excluding future extensions, not tailoring to exactly what would be used today.
We can ask for message information, OK, no problem. Not sure what use it would have. Do you want to create your own forum viewer?
We can ask for extensive voting information, I did not even consider that, as it is not available on the web site as is, and I do not really want to ask for information the web pages themselves aren't offering. The voting system is a delicate system, it is partially anonymous, etc. I don't feel like offering a second view on it, deviating from the first view, which is the web pages themselves. It would open a truckload of cans of worms IMO.
I could come up with a lot more ideas, but do we want them? do we need them?
My wants (not the needs from existing apps) would include:
- give me all threads where members M1 and M2 participate.
- give me all threads where "xyz" is in the subject line.
- give me arbitrary search facilities.
That is because I'm convinced the CP database is a gold mine, and we can barely scratch the surface with the current search facilities. However, here too, I would prefer the web site itself to offer better functionality (just like you would prefer MSDN help to work extremely well inside Visual Studio).
In summary: convince me I want your app, the one that needs these extra functions, right now. Do you have the concept for such a killer app? and will you offer it provided the info is available?
I'm a bit pragmatic here. I don't mind adding functionality if it increases the attractiveness of the proposal, and not just makes it bigger and less likely to get realized any time soon...
How about this: let's agree on some functionality with limited scope (e.g. similar to V1.0, with minor adjustments, based on existing feedback), get that approved, implemented, adopted by some apps, and enjoyed. Then perhaps consider a V2.0 with new features added. Rationale: we can dream up all kinds of things, but is it worth delaying everything? Initially we had no scraping apps; then there was one, not enough to warrant a web service. Now there are four. I studied them all, saw similarities and possible synergy. I would like to grab the opportunity, right now.
|
|
|
|
|
Luc,
I was about to start work on a new CP-app that would have needed those. So no, I didn't think these up as future requirements but rather as immediate needs. None of those will be extra load on the servers since all the info I want can be obtained with SELECT TOP queries and perhaps CP already has wrapper layers around it (since we already have a paged message retrieval functionality on the website).
Of course if you think that asking too many features would put Chris off from doing this, I am fine with getting some minimal functionality now and then asking him to add stuff whenever he gets a chance or whenever we think it will majorly help one of these meta-applications.
|
|
|
|
|
OK, no problem. I'll take a couple of days and amend my proposal, trying to keep a balance. I'll include your little wish list.
|
|
|
|
|
Thank you
|
|
|
|
|
Luc Pattyn [Forum Guidelines] [My Articles] Nil Volentibus Arduum
Please use <PRE> tags for code snippets, they preserve indentation, improve readability, and make me actually look at the code.
modified on Friday, July 8, 2011 1:31 PM
|
|
|
|
|
Hi Luc! Something must be wrong with your Twitter account, I can't subscribe....
|
|
|
|
|