
Comments and Discussions



Thanks, I had fun reading your article. The next step is to prove that every article's rating is equal to 3.
Greetings  Jacek





The rating system is designed to counter votes that do not align with the expected distribution of silver and gold members. Specifically: if someone posts an article and creates new accounts to vote it a 5, or someone has a vendetta against an author and creates accounts to downvote that person. If these votes were not representative of the (assumed) more correct distribution from (hopefully) more sensible silver and gold members, then they would be outweighed by the votes of the gold and silver members.
As mentioned previously, your analysis is correct when the distributions are the same, which is exactly as it should be. When the distributions are not the same, the higher-level votes will, on a vote-for-vote basis, outweigh the lower-level votes.
cheers,
Chris Maunder
CodeProject.com : C++ MVP
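
The scenario in the post above can be sketched in a few lines. The weights (1 for a brand-new account, 5 for a gold member) and the vote counts are assumptions chosen for illustration, not CodeProject's actual values.

```python
def weighted_rating(votes):
    """Weighted mean of (vote, weight) pairs."""
    total_weight = sum(w for _, w in votes)
    return sum(v * w for v, w in votes) / total_weight

# Hypothetical weights: 1 for a new account, 5 for a gold member.
NEW, GOLD = 1, 5

# Ten throwaway accounts vote 5; four gold members vote 2.
skewed = [(5, NEW)] * 10 + [(2, GOLD)] * 4
plain = sum(v for v, _ in skewed) / len(skewed)   # unweighted mean
weighted = weighted_rating(skewed)                # gold votes pull it down

# When both groups vote alike, the weights make no difference.
alike = [(4, NEW)] * 10 + [(4, GOLD)] * 4
alike_weighted = weighted_rating(alike)

print(round(plain, 2), round(weighted, 2), round(alike_weighted, 2))
```

With these made-up numbers the unweighted mean is dominated by the throwaway accounts, while the weighted rating sits between the two groups; when the groups vote identically, both means coincide.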





Well, you seem to be right. On the other hand, there are many more regular users than silver and gold ones. Therefore I also think that the proportion of user types among those who participated in the vote should be taken into account.
Ivan S Zapreev
ivan.zapreev@gmail.com





Yeah, a world without trolls would be somewhat good, but it could get a little boring, especially if we generalized it to other aspects of life. Communism is a system whose main assumption is that people are generally equal, and this is the biggest mistake one can make.
Greetings  Jacek





What about publishing users' votes in their profiles? This way, users would be more accountable for their votes.





This sounds like a good idea to me. It might work!
Ivan S Zapreev
ivan.zapreev@gmail.com





…At least when giving a score of 1. This way, the author would get feedback helping him to improve his article. Other users could also judge the credibility of the voter by reading their comment.





> Note that if vote values depend from person's weights then the rating mean
> value is different and involves the weights distribution.
I like your article, rational and scientific
But the system is designed to favour the gold members, and this is only meaningful if the distributions are not the same, and then E(fg) != E(f)E(g).
In other words, if we think the two distributions will be the same for most of the time, of course we don't need to consider the weight...






It'd be interesting to see how many people voted for each rating, sort of like MSDN articles. I find for example in cases where there's a lot of 1s and a lot of 9s (5s in the case of Code Project), the 1s are negligible because they're usually just entered by the overzealous or bored. But you wouldn't know that's the case unless you could see the rating distribution the way you can on MSDN; you just see a 3 instead.





This would be great indeed!
Ivan S Zapreev
cerndan@mail.ru






I personally do not like the voting model wherein you choose how much you like something, from 1 to 5 (or 1–10, etc.) Each person interprets this range differently. I, for instance, use it as a quality scale starting at 1 for a letter grade similar to C–, and scaling up to 5: ‘perfect A+’, others use the middle as ‘ok’, 1 as ‘horrible’, and 5 as ‘good enough’. The result is that the average of all these numbers does not mean much. It’s marginally better than sheer randomness.
The problem is that people interpret the scale differently, and that there are too many grades. We have lots of ‘markers’ on the net, let’s use them!
I think voting on the Internet should be constrained to just three options with well-defined meanings and color-codings:
 Good/Yes/Quality
 Meh/Average/Maybe/‘I agree’
 Poor/No/‘Disagree’
All items start out with an average rating, and this initial rating has not-insignificant weight, which should be about half the average number of ratings an item on the site receives. Starting out in the middle means the first few votes move the average a small amount, instead of a large amount. Authors can then vote ‘Good’ on their own articles, and not affect the ranking so drastically at the beginning.
By voting ‘yes’, you increase the rating. By voting ‘no’, you decrease the rating, and by voting ‘meh’, you are in effect saying: “The current rating is accurate, increase its weight.”
Gold/Silver/Bronze weighting and whatnot can be added on top of this model pretty trivially.
Another thing I just thought of is a fourth category of member. This category is only achieved when a member posts some number of articles which achieve a good rating by so many other members, or something along those lines. Perhaps these members can have an additional ‘doubleplusgood’ button for voting, which counts as 3 gold-level ‘good’ ratings. Having the extra option would be a good reward for people who achieve this high level of membership. Maybe this group is shortlisted for a draw every month, or something.
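
One possible reading of the three-option model above, as a rough sketch. The numeric scale (-1 for Poor, 0 for the middle, +1 for Good), the class name and the default prior weight are all assumptions made for illustration; the post itself does not fix them.

```python
class ThreeWayRating:
    """Running rating on a -1..+1 scale that starts in the middle."""

    def __init__(self, prior_weight=10.0):
        # The item starts at 0.0 (average), backed by `prior_weight`
        # virtual votes so that early real votes move it only slightly.
        self.weighted_sum = 0.0
        self.weight = prior_weight

    @property
    def rating(self):
        return self.weighted_sum / self.weight

    def vote(self, choice, voter_weight=1.0):
        if choice == "good":
            value = 1.0
        elif choice == "poor":
            value = -1.0
        elif choice == "meh":
            # "The current rating is accurate": vote for the current
            # value, which leaves the rating alone but adds weight.
            value = self.rating
        else:
            raise ValueError(f"unknown choice: {choice!r}")
        self.weighted_sum += value * voter_weight
        self.weight += voter_weight


item = ThreeWayRating(prior_weight=10)
item.vote("good")            # nudges the rating up a little
after_good = item.rating
item.vote("meh")             # rating unchanged, weight grows
after_meh = item.rating
```

The `voter_weight` parameter is where Gold/Silver/Bronze weighting could be layered on top, as the post suggests.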





The only fault I find with the system is that it assumes a person is a rookie because of the following:
Newly joined
Low levels of postings
Few or no articles
In no way does any of this make a person either a guru (sh***y articles, long membership, overposting) or a rookie. Determining a person's weight by guru or rookie status is simply not going to work.
Maybe we should take tests to prove our levels, then true weighting can be established.
And simply posting this in no way makes me a guru.
Which I am definitely not.
Rant finished.
Fear is the mind killer. So most of you have everything to worry about, now don't you!





Testing raises its own inescapable problems, and it's just not going to happen on the site anyway. I agree with your sentiments completely, though.
The way these issues are often tackled on community sites is to have members "trust" each other; someone gets more trust points the more people that trust them, and the more highly trusted the people are that trust them. This can result in cliquishness if not implemented correctly or with too small a group, but with a group the size of the Code Project membership it could work well.
Did you ever read Michael Shermer's column in Scientific American, on being a skeptic? He often talks about the difference between science and pseudoscience. Pseudoscientists often reference each other's work, in an attempt to better trick the public; here you have a trust network that is outside that of science. However, it is neither as large nor as well-established as the trust network populated by genuine scientists.
It helps to have something real to which to "anchor" things; for instance, the pseudoscientists' trust network probably contains very few, if any, Nobel prize winners. In our situation, it could be something like testing as you suggest, or the mechanisms already built into the site: articles, voting, etc. It's not necessary to have "anchors", though; their importance decreases as the population size increases. It's possible for someone like Daniel Stephen Rule to spam day and night and get a respectable score on something just by creating bullsh*t accounts, but it would be impossible for him to stem the tide of public opinion all on his own, no matter how hard he tried.
A related idea: IBM was the first to create a fully functional search engine based on ranking ideas drawn from scientific research papers. I'm talking about the references that scientists make to other papers in their own papers. A paper gets more acclaim if more important scientists refer to it in their own papers; you get the idea. IBM adapted this basic methodology to web searches; sites are either considered to be "hubs" or "authorities" (they can be both, but don't tend to be). Highly respected hubs point to many highly respected authorities; authorities are trusted to contain useful information. The "respect" calculations take place in a series of passes over the network; sites with weak connections wind up dropping off.
Regards,
Jeff Varszegi
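
The "series of passes" over hubs and authorities described above can be sketched roughly as follows. This is a minimal version of the iteration commonly known as HITS, not a description of any particular search engine; the tiny link graph and the page names are invented for illustration.

```python
def hubs_and_authorities(links, passes=20):
    """links maps each page to the list of pages it points to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(passes):
        # Authority: sum of the hub scores of the pages linking to it.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        # Hub: sum of the authority scores of the pages it links to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalize so the scores stay bounded from pass to pass.
        a_norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth


# Hypothetical link graph: two pages that link out, two that are linked to.
links = {"portal": ["docs", "faq"], "blog": ["docs"], "docs": [], "faq": []}
hub, auth = hubs_and_authorities(links)
best_hub = max(hub, key=hub.get)
best_authority = max(auth, key=auth.get)
```

The same two-score idea carries over to a member trust network if "links" are read as "member A trusts member B".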





In the case of IBM (remember I am basically ignorant of these types of things), would you consider a case where a more "important" scientist is disproving the other's work? Wouldn't that give the "disproved" scientist more credibility?
I was being sarcastic about the testing because I have met plenty of programmers who have taken certifications and have high scores but couldn't really do squat in the real world, i.e. book-smart.
As for the Daniel Stephen syndrome, that is a complete idiot who should be banned by IP address. I wrote one article but pulled it after I realized that I hadn't delivered what the population expected.
All the best
Eric C. Tomlinson
No comment, Mr. Senator





That's an interesting question. I don't know as much about the research paper ranking as the search engine itself, and I'd have to go read up again on that to really have an in-depth conversation. I'm not trying to pass myself off as an authority, just throwing some stuff out there that I found interesting in the past.
I agree totally with you about the certification stuff. I have some certifications myself that I think were too easy to get (I only tout them on my resume), and I've met enough overcertified shmucks in the last few years to last me the rest of my life.
A friend of mine said years ago that the main problem is that programming is not a profession, in the way that being a lawyer or a doctor is a profession. It's missing several criteria:
1) The uniform is usually an indicator of professional status
2) Industry-standard certification required to practice
3) Some sort of code of conduct, often accompanied by a "sacred" oath
4) Legal recognition of the above (often but not always excluding the uniform!)
You get the idea. Policemen are closer to being true professionals than we are. The certifications out there exist so that companies can claim wide acceptance of their respective technologies; it's in the best interests of a company to certify as many people as possible, as long as they can maintain some semblance of professionalism as they go about it. I think this explains why almost all testing processes for IT folks are mediocre.
I think that with the pressures exerted on government and other important projects by ISO certification requirements and the like, we're gradually moving towards professionalism for the industry. I hate to hear everybody prattling on about 6 Sigma certification etc. as if it's some magic bullet, but I guess that's more about project management than development anyway.
Regards,
Jeff Varszegi





Jeff Varszegi wrote:
1) The uniform is usually an indicator of professional status
2) Industry-standard certification required to practice
3) Some sort of code of conduct, often accompanied by a "sacred" oath
4) Legal recognition of the above (often but not always excluding the uniform!)
This is an excellent point. But how the hell can we get enough programmers to agree on what we need to do to establish our credibility? We can't even decide on multiple OS.
No comment, Mr. Senator





multiple OS
You're totally right! And I suspect that if we were to split up the profession properly, so that we had graphics specialists, etc., the way they have in the medical field, we'd probably wind up with the most specializations of any profession yet seen. I think that the OS schism is driven by company loyalty/hatred, which I think you also wouldn't see in other professions very much. We're a motley bunch. I'm sure all of these questions will be answered someday, but not during my career. It's a crappy time to be a programmer: we're past the heady dot-com times, but without anything good to replace them. Now we suddenly find ourselves "commoditized" in a global marketplace, without any sort of professional standards on the basis of which to compete; but the companies that are getting the big outsourcing contracts are all trumpeting their certifications. (I'm not railing against those companies in places like India; everybody's gotta eat. It doesn't mean that I have to like what's happening to my livelihood, though.)
Regards,
Jeff Varszegi





I agree that everyone has the right to eat, but I have been called in after an "overseas" implementation to clean up for several months. The documentation was horrible, the delivered product didn't match the requirements and there was no plan in place as to who would maintain it. The company decided to have either consultants or staff on hand (even though it seems more expensive at first look) to develop and maintain the system.
Even those service-related companies, for example Dell, are going to end up bringing back the customer service teams to the US. I had to call for service on my laptop and I had to call 4 times to finally reach a "chap" who could understand me as well as I understood him. Needless to say I was pretty hot under the collar and complained to the company. After 3 months they announced that they were rehiring the displaced US workers and moving the call center back to the US.
I have nothing against Indians or Pakistanis. I have worked with both on multiple projects and have seen nothing less than perfection from them.
No comment, Mr. Senator





I used to be a mathematician. (I finished the Mechanico-Mathematical department of Moscow State University.)
I was amazed to see complex formulas I managed to forget a long time ago.
George





Follow this link to get some more complex formulas:
http://www.codeproject.com/useritems/CodeProject_s_Voting.asp
BTW, I graduated from Novosibirsk State University, Mechanico-Mathematical department.
Ivan S Zapreev
cerndan@mail.ru






Do not worry ....
If you read this article then you wasted your time too
Ivan S Zapreev
cerndan@mail.ru





Hi Ivan. Hi all.
First of all, I had a good time reading big words
about small problems, especially in the posts!
(4th moment and so on)
The temptation of writing what is good and what is
bad in the article is there, but I wouldn't like
to play the same game (big words for small problems).
Anyway, the assumption that votes and weights are
independent is really too strong!
(perhaps they are *uncorrelated*)
If you say so, you say "all people vote the same way"
(same distribution).
But weights are used because experienced people
(should) vote better...
P.S. I won't tell you my vote, but don't worry,
it has a really small weight!
Luca Piccareta





You are right
>>Anyway, the assumption that votes and weights are
>>independent is really too strong!
>>(perhaps they are *uncorrelated*)
is a weaker assumption and in this case all conclusions stay the same.
But please note that you have also paid attention to this article and besides, you also wrote some big words about small problems
BTW, I really think that people vote the same way, as I presume that they vote ideally, i.e. depending on the quality of an article.
Ivan S Zapreev
cerndan@mail.ru





For movie ratings there are no gurus and rookies, but ratings are still useful for estimating the potential interest of a movie based on people's affinity: if someone likes the same movies as I do, I should also see the movies he likes that I have not seen yet, and vice versa. Perhaps this affinity theory is also useful for CodeProject’s article ratings?





Do you mean that if someone likes the same kind of articles as you do, then he will not only read the same articles as you but will also vote the same way?
Ivan S Zapreev
cerndan@mail.ru





Statistically, yes! If I think I am a guru, I will naturally read other gurus' articles first, and vice versa.





I think that this influences only the distribution of voters' weights for each particular article.
Ivan S Zapreev
cerndan@mail.ru





Some speculation on an implementation:
Each user’s vote would have to be stored; as well as their voting history (normalized, probably).
Then, when you are presented with a list of articles, those who vote ‘similarly’ to you will have more weight in how your listing is sorted.
This system would depend on how you vote, so those who vote very little should fall back on a default algorithm, or the algorithm should introduce the ‘similar votes’ term as a function of how many votes you’ve cast. As you vote for articles, the system gets more certain of your similarity to other members of the site. Perhaps an n-space could be used to classify users.
This would all be very expensive in terms of CPU time, but I see several optimizations. One would be to run the algorithm only once every few days, perhaps off-server. The vote history could be discarded after each one of these runs.
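
A rough sketch of the speculation above. Cosine similarity over co-voted articles is just one plausible choice of "similar" (the post does not name a measure), and the function names, the centring on 3 and the fallback value are assumptions.

```python
from math import sqrt

MIDDLE = 3  # centre of the 1..5 voting scale


def similarity(a, b):
    """Cosine similarity of two users' votes on articles both rated.

    `a` and `b` map article ids to votes; votes are centred on 3 so
    that opposite opinions give a negative similarity.
    """
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum((a[k] - MIDDLE) * (b[k] - MIDDLE) for k in shared)
    norm_a = sqrt(sum((a[k] - MIDDLE) ** 2 for k in shared))
    norm_b = sqrt(sum((b[k] - MIDDLE) ** 2 for k in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def personal_score(me, others, article, default=3.0):
    """Similarity-weighted mean of other users' votes on `article`.

    Falls back to `default` when nobody similar has voted, e.g. for
    members who have cast very few votes themselves.
    """
    num = den = 0.0
    for votes in others:
        if article not in votes:
            continue
        s = similarity(me, votes)
        if s <= 0:
            continue  # ignore dissimilar voters
        num += s * votes[article]
        den += s
    return num / den if den else default


me = {"a": 5, "b": 1}
like_me = {"a": 5, "b": 1, "c": 5}
unlike_me = {"a": 1, "b": 5, "c": 1}
score_c = personal_score(me, [like_me, unlike_me], "c")
```

Here only the like-minded voter contributes to the personal score for article "c"; a newcomer with no votes gets the default.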





I entirely agree with you on all points!
An n-space could be used to classify users: yes! And classifying this n-space would produce a psychological profile of the users! (For example, for movie rating, the n-space could correspond to movie categories: fantasy, science fiction, ... and for article voting, it could correspond to level of expertise: didactic article or expert article.)
CPU time: I think this is the reason why IMDB does not yet have an affinity classifier for voting members: it is very expensive or time-consuming, but it would be worth it for predicting movie attractiveness.





See also : Collaborative Filtering : www.vsnetfr.com/lien.aspx?ID=3923





No math here :
guru : people with high membership level
rookies : people with low membership level
(I) "Wi and Vi independent" implies: gurus and rookies vote in a similar way
(D) "Wi and Vi dependent" implies: gurus and rookies don't vote in a similar way
(given your level of math, and to keep this post short, I don't think I need to prove or further explain the above implications, but I can if you ask)
If (I) is true (that is your hypothesis when you write E(fg) = E(f)*E(g)), then the weights don't change the vote. That's correct, and nobody needs statistics to understand that if the gurus alone would give the same score as the rookies alone, the combined score would be the same.
If (D) is true (that is Code Project's hypothesis), the weight system allows the combined score to be closer to the opinion of the gurus. Also, if (D) is true, your demonstration is false.
So, rather than a show of mathematical formulas, I expected, from the title of the article, a discussion of:
Do gurus vote so differently from rookies?
or:
Why weights of 1 2 3 4 5 instead of 10 20 30 40 50?
or:
Why not a score per membership level in addition to the global one?
or:
Why not discard the lowest 10 percent and highest 10 percent of scores to reduce data pollution?
and there are many others.
As a matter of fact, the rating system is an ARBITRARY choice, reflecting codeguru personal opinions on the best way to give feedback on article interest.
Weighted sums as they are used seem OK to me, as they give a weighted mean if (D), and the correct mean for (I) too (you only proved that).
But your conclusion is wrong. When you say:
"It was discovered that although the weight of each person in the system is taken into account, the mean value doesn’t depend from it."
you make a 'statistician rookie' mistake, forgetting that a demonstration has no value as proof if the hypotheses are not checked.
Of course, it is up to you to claim that it's a matter of personal opinion whether experienced programmers judge an article the same way as inexperienced ones,
and up to me to think that as long as the rating system is logical, mathematically correct and makes common-sense assumptions, I won't find it smart or dumb, but just fit for the job.





Hello,
>>(I) "Wi and Vi independent" implies: gurus and rookies vote in a similar way
>>(D) "Wi and Vi dependent" implies: gurus and rookies don't vote in a similar way
>>(given your level of math, and to keep this post short, I don't think I need
>>to prove or further explain the above implications, but I can if you ask)
Please provide me with the proof and also explain what Wi and Vi are and why you speak of their dependence and independence.
>>If (D) is true (that is Code Project's hypothesis), the weight system allows the combined score to
>>be closer to the opinion of the gurus. Also, if (D) is true, your demonstration is false.
Right, because in this case the vote and weight random variables are dependent.
>>So, rather than a show of mathematical formulas, I expected, from the title of the article,
>>a discussion of:
>>Do gurus vote so differently from rookies?
>>or: Why weights of 1 2 3 4 5 instead of 10 20 30 40 50?
>>or: Why not a score per membership level in addition to the global one?
>>or: Why not discard the lowest 10 percent and highest 10 percent of scores to reduce data
>>pollution?
>>and there are many others.
These questions are beyond the scope of my article, but maybe later I will address them too.
>>Weighted sums as they are used seem OK to me, as they give a weighted mean if (D), and
>>the correct mean for (I) too (you only proved that).
This makes sense. That is why I voted 5 for your post :). But what do you mean by “as they give a weighted mean if (D)”? Can you predict and estimate the influence?
>>But your conclusion is wrong.
>>When you say:
>>"It was discovered that although the weight of each person in the system is taken into
>>account, the mean value doesn’t depend from it."
>>you make a 'statistician rookie' mistake, forgetting that a demonstration has no value as proof
>>if the hypotheses are not checked.
I would not be so captious if I were you, but that is your opinion. I do agree that to make a proper model you have to do statistical analysis. This requires a lot of data I do not have access to. That is why I made some assumptions and proved something for them, which is better than doing nothing at all.
>>Of course, it is up to you to claim that it's a matter of personal opinion whether
>>experienced programmers judge an article the same way as inexperienced ones,
>>and up to me to think that as long as the rating system is logical, mathematically correct and
>>makes common-sense assumptions, I won't find it smart or dumb, but just fit for the job.
So many men, so many minds … Good conclusion!
Thanks for the valuable feedback!
Ps: By the way the title of your post seems to be strange...
Ivan S Zapreev
cerndan@mail.ru





BTW, I think that in general you are right and the approach with weights is more general; that is why I have fixed my conclusions.
Thanks,
Ivan S Zapreev
cerndan@mail.ru





Hello,
>>Please provide me with the proof and also explain what Wi and Vi are
>>and why you speak of their dependence and independence.
You are right, I typed too fast. I meant Wt and Vt, and will use Wi and Vi for instances of Wt and Vt.
I speak about dependence because you said in your article:
"If they are independent, then E(fg) = E(f)E(g)"
Well, if they are NOT independent, E(fg) becomes different from E(f)E(g), and your calculation of E(Rn) can't be done this way, because:
sigma(E(WiVi/sigma(Wi))) is not equal to sigma(E(Wi/sigma(Wi)))*E(Vi)
and you can't conclude that E(Rn) doesn't depend on the weights.
Definition of statistical independence:
P(A|B) = P(A)
In English: the probability P of the event A given the event B is the same as the probability of the event A alone.
In our case, A=Vi is a vote among the possible vote values, and B=Wi is the membership level of the voter.
If Vt is independent of Wt, the probability for a rookie to give a 1 on a specific article is the same as the probability for a guru to do so. And the same for 2, 3, 4 and 5. In plain English, it means they vote the same way.
If P(Vt|Wt) is not equal to P(Vt), it means the probability of getting a vote Vi is different when you know Wt, for at least some Vi. In other words, the probability of getting a given vote depends on the voter's weight.
>>But what do you mean by “as they give a weighted mean if (D)”? Can
>>you predict and estimate the influence?
Absolutely not. The impact depends on the shape of the vote distributions for each weight level, so you would need the "real" distribution of the votes. I would postulate that it may differ from one article to another (normal, lognormal, bimodal, etc.), and many articles don't get enough votes to assess the distribution.
The simplest way to describe the weighted sum is to say it's equivalent to having people vote as many times as their weight. A level 5 guru's vote has the same impact on the average vote as 5 level 1 rookies giving the same vote.
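
The "vote as many times as your weight" description can be checked with a short sketch. It assumes integer weights; the votes and weights below are made up.

```python
def weighted_mean(votes):
    """Weighted mean of (vote, weight) pairs."""
    return sum(v * w for v, w in votes) / sum(w for _, w in votes)


def repeated_mean(votes):
    """Plain mean after repeating each vote `weight` times."""
    expanded = [v for v, w in votes for _ in range(w)]
    return sum(expanded) / len(expanded)


# Hypothetical (vote, weight) pairs: one level 5 guru and three others.
votes = [(4, 5), (2, 1), (3, 1), (1, 2)]
by_weight = weighted_mean(votes)
by_repetition = repeated_mean(votes)
```

Both functions return the same number for any list of votes with positive integer weights.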
But you made a good point: the need for stats and a 'model' arises when you want to make predictions.
>> This requires a lot of data I do not have access to
Yes, you need data to make an analysis, but it does not mean you can always find a model from the data. In that case, you would have to use the parsimony principle, and give an estimator of an article's value using as few assumptions as possible: the arithmetic mean.
>> Ps: By the way the title of your post seems to be strange...
I would have expected, after you found that E(Rn) ignores the weights, that you would go back and explain why and in which cases. But you just stopped at a conclusion that offends common sense and made it look like a fact because of the statistical formalism.
Thanks for your article, even (or because) it forced me to look deeper into math books (I didn't remember the mathematical definition of dependence, although I use the concept all the time)
minox





Assume a uniform independent distribution of random variables
such that samples are drawn from the following sample space [1,5].
Consider a possible sample population:
{1, 1, 1, 4, 5}
Given equal weights, the mean of the sample population is:
2.4
Using your notation, let a = 2.4.
We therefore expect to find means of our sample independent
of the weights and equal to 2.4, the expected value of our
distribution.
Now, consider the following two weight vectors:
{.3, .3, .3, .05, .05}
The mean of our distribution is now:
1.35
Consider an alternate set of weights:
{.1, .1, .1, 0.35, 0.35}
The mean of the distribution is now:
2.19
Neither mean is equal to the expected mean of 2.4.
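
The arithmetic in the post above can be reproduced with a short sketch, assuming the "mean" in question is the usual weighted mean (sum of vote times weight, divided by the sum of weights).

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


sample = [1, 1, 1, 4, 5]

equal = weighted_mean(sample, [1, 1, 1, 1, 1])
first = weighted_mean(sample, [.3, .3, .3, .05, .05])
second = weighted_mean(sample, [.1, .1, .1, 0.35, 0.35])

print(round(equal, 2), round(first, 2), round(second, 2))
```

Under that formula the equal-weight and first results match the 2.4 and 1.35 quoted, while the second weight vector comes out at about 3.45 rather than 2.19. Either way, the post's observation stands that a single weighted sample need not equal 2.4.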





Dear John Theal,
It was very interesting to read your counterexample.
Unfortunately, it is not sound.
First of all:
>> Assume a uniform independent distribution of random variables
>> such that samples are drawn from the following sample space [1,5].
In this case the mean value of the vote is 3.
Since you have 5 possible values:
1,2,3,4,5
According to your assumption of a uniform distribution, the probability of each of them equals 1/5.
Thus from the definition of the mean value we have:
1*1/5+2*1/5+3*1/5+4*1/5+5*1/5 = 3
>> Consider a possible sample population:
>> {1, 1, 1, 4, 5}
>> Given equal weights, the mean of the sample population is:
>> 2.4
>> Using your notation, let a = 2.4
>> We therefore expect to find means of our sample independent
>> of the weights and equal to 2.4, the expected value of our
>> distribution.
>> Now, consider the following two weight vectors:
>> {.3, .3, .3, .05, .05}
>> The mean of our distribution is now:
>> 1.35
>> Consider an alternate set of weights:
>> {.1, .1, .1, 0.35, 0.35}
>> The mean of the distribution is now:
>> 2.19
From your reasoning above it is obvious that you do not see the difference between the rating value itself and the mean value of the random variable in the sense of probability theory.
Unfortunately this is completely wrong, and it looks like you need to recall at least the definition of the mean value of a random variable.
The values 1.35 and 2.19 that you have calculated are values of the rating. They represent a concrete example and have almost nothing to do with the mean value of the rating variable.
The point is that there is always dispersion relative to the mean value that can be estimated by calculation of the variance of the random variable.
Sorry for disappointing you but thanks for your interest!
Ivan S Zapreev
cerndan@mail.ru
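
The difference between one realized rating and the mean of the rating variable can be made concrete with a small sketch. It assumes each vote is uniform on 1..5 and that the weights are fixed in advance (so they cannot depend on the votes), then averages the rating over every possible combination of votes. The integer weights are the two weight vectors from the counterexample, rescaled by 20.

```python
from fractions import Fraction
from itertools import product


def expected_rating(weights):
    """Exact E[R] for votes uniform on 1..5, independent of `weights`.

    `weights` must be integers; the rating for one outcome is the
    weighted mean of the votes, and all outcomes are equally likely.
    """
    total_w = sum(weights)
    outcomes = list(product(range(1, 6), repeat=len(weights)))
    total = sum(
        Fraction(sum(v * w for v, w in zip(votes, weights)), total_w)
        for votes in outcomes
    )
    return total / len(outcomes)


# {.3, .3, .3, .05, .05} and {.1, .1, .1, .35, .35}, multiplied by 20.
mean_first = expected_rating([6, 6, 6, 1, 1])
mean_second = expected_rating([2, 2, 2, 7, 7])
```

Individual outcomes such as {1, 1, 1, 4, 5} give ratings scattered around the mean, but the average over all outcomes is exactly 3 for both weight vectors.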






This only showed that you are not aware of the definition of the mean value.
It seems to me that you didn’t read my article carefully but I am not going to explain everything again. Simply try to read it once more.
BTW information that you have provided about Markov chains is quite known and thus represents nothing new to me.
Ps: It is much easier to write about the possibility of doing something than to really do it.
Ivan S Zapreev
cerndan@mail.ru





(These calculations are trivial and self descriptive)
That line gave me a chuckle. I used to be a math geek, but I never got into statistics so my eyes started to glaze over...
Mike
Personal stuff:: Ericahist  Homepage
Shareware stuff:: 1ClickPicGrabber  RightClickEncrypt
CP stuff:: CP SearchBar v2.0.2  C++ Forum FAQ

Pinky, are you pondering what I'm pondering?
I think so Brain, but if we shaved our heads, we'd look like weasels!





I just played the fool to avoid giving the detailed explanation.
This is widely done among mathematicians, is it not?
But still if something is unclear I can give additional explanations.
Ivan S Zapreev
cerndan@mail.ru





This is not a proof.
You have shown nothing of interest here other than the mean of
some (unspecified) distribution is equivalent to the mean that
you have assumed. You should look up the implications of leptokurticity. Then you
can calculate an estimator for a (specific) distribution instead of
showing that the mean of a distribution (assumed to have a mean of a) is equal to a.
I don't buy your independence assumption. Something of more interest
would have considered the kurtosis of the distribution, NOT the mean.





John Theal wrote:
implications of leptokurticity
Hey, I didn't say anything!
top secret AdvancedTextBox





Hi,
>>You have shown nothing of interest here other than the mean of
>>some (unspecified) distribution is equivalent to the mean that
>>you have assumed.
You are right, but the mean value itself shows that the long-run average value of the rating is simply the same as the average value of the vote and doesn't depend on the members’ status.
Moreover, it is the same as the mean value for the other voting types I have discussed in another article. But the way this value is calculated is more complicated.
>>You should look up the implications of leptokurticity.
Sorry, I didn't get it. Can you express it in simpler terms?
>>Then you can calculate an estimator for a (specific) distribution instead
>>of showing that the mean of a distribution (assumed to have a mean of a) is
>>equal to a.
The distribution of the rating might be different from the distribution of the vote itself, especially as the weight distribution is explicitly involved. I only showed that the mean value of the rating distribution doesn't depend on the distribution of weights, although one might expect it to.
>>This isn't even a model, this is a textbook exercise.
True!
Moreover, this is an exercise from an introduction to probability theory that anyone should know! But it still allows us to do some analysis.
By the way, I called it a model only because I made some assumptions about the independence of the random variables, etc.
>>I don't even buy your independence assumption, it is more likely that
>>these votes are conditionally heteroskedastic. Something of more interest
>>would have considered the kurtosis of the distribution, NOT the mean.
This sounds smart but is meaningless to me. Try to use standard terms to express your ideas.
Ivan S Zapreev
cerndan@mail.ru





Kurtosis is derived from the 4th-order moment of the distribution (the mean being the 1st-order moment, the variance the 2nd-order central moment, with the std. dev. as its square root, and skewness coming from the 3rd). It measures the peakedness of the distribution.
Kurtosis, leptokurtic and conditional heteroskedasticity are all common terms used in the literature. You cannot read articles in the statistical journals without knowing them.
You can find more information here: http://mathworld.wolfram.com/Kurtosis.html
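
The moments mentioned in this exchange can be computed directly. The sketch below uses the population definition of kurtosis (fourth central moment divided by the squared variance, without subtracting 3); the two vote samples are invented to show how ratings with the same mean can differ in shape.

```python
def central_moment(data, k):
    """k-th central moment: mean of (x - mean) ** k."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)


def kurtosis(data):
    """Fourth central moment divided by the squared variance."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


# Both hypothetical samples have a mean vote of 3.
polarised = [1] * 5 + [5] * 5   # only 1s and 5s
peaked = [3] * 8 + [1, 5]       # mostly 3s, two extreme votes

k_polarised = kurtosis(polarised)
k_peaked = kurtosis(peaked)
```

The polarised sample has a low kurtosis and the peaked one a high kurtosis, even though an average rating of 3 would be displayed for both.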





Thank you for the information provided.
Why do you think that the 4th order moment is so interesting for the voting system? What about the 2nd order moment?
Ivan S Zapreev
cerndan@mail.ru





All you showed in this article is that assuming independence of votes and weights, the expected rating has the same value as the expected vote.
This seems quite obvious, and in fact I wouldn't trust a rating algorithm that didn't have a similar property. Yet in your conclusion you say that this algorithm is not good. Care to explain why?
Besides, your assumption of independence is incorrect. When a user votes, he can see the current rating and his vote is influenced by it. Also, it's quite possible that people with different weights have different voting distributions.
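
For reference, the result summarized in the post above can be written out in one line. This is a sketch using the thread's own symbols (W_i for weights, V_i for votes, R_n for the rating, a for the expected vote) and assuming every vote V_i is independent of all the weights and has mean a:

```latex
E[R_n]
  = E\!\left[\frac{\sum_{i=1}^{n} W_i V_i}{\sum_{j=1}^{n} W_j}\right]
  = \sum_{i=1}^{n} E\!\left[\frac{W_i}{\sum_{j=1}^{n} W_j}\right] E[V_i]
  = a \, E\!\left[\frac{\sum_{i=1}^{n} W_i}{\sum_{j=1}^{n} W_j}\right]
  = a
```

The second equality is exactly where independence is used; if the vote distribution changes with the voter's weight, that step no longer holds and the expected rating can depend on the weights.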





Hello there,
>>All you showed in this article is that assuming independence of votes and
>>weights, the expected rating has the same value as the expected vote.
Yes, It seems to me that you have got it.
>>This seems quite obvious,
Good for you.
>>and in fact I wouldn't trust a rating algorithm
>>that didn't have a similar property.
Using weights is a good idea; the question is how to use them to get better results.
>>Yet in your conclusion you say that
>>this algorithm is not good. Care to explain why?
I didn't say anything like that. Care to read it more carefully?
The only point is that for the given model the mean values of the three approaches are the same. Why choose the complicated one, then?
>>Besides, your assumption of independence is incorrect. When a user votes, he
>>can see the current rating and his vote is influenced by it.
Very doubtful. This concerns only people who have no personal opinion.
>>Also, it's quite possible that people with different weights have
>>different voting distributions.
Everything is possible. This is only a model and I give the solution to it.
For another model there could be another solution.
Besides, to choose a proper model you have to do statistical analysis.
Until then, your assumptions are no better than mine.
Ivan S Zapreev
cerndan@mail.ru








Type  Article 
Licence  
First Posted  16 Apr 2004 
Views  124,130 
Bookmarked  19 times 

