STAR vs. Score

Keith Edmonds

@Jack-Waugh Yes that is the same concept as I gave the link to. My comments still apply.

Essenzia

@Jack-Waugh
Defect SV
Voting without strategies, range [0,9]:
A[9] B[3] C[0]
The voter thinks that B and C are the 2 frontrunners, therefore:
SV: A[9] B[9] C[0]
Whether with the min-max strategy, or without, the voter wants his vote to be worth the maximum in the clash between B and C (certainly not B[1] C[0]).
STAR: A[9] B[3] C[0]
In the clash between only B and C, the vote would automatically become B[9] C[0], so the voter does not need to lie at the beginning.
If he uses min-max the vote becomes:
A[9] B[1] C[0] which is still better than SV.

STAR majoritarian
Votes without strategy like this:
55%: A[9] B[8] C[0]
45%: A[0] B[8] C[9]
STAR elects A while SV elects B, which has about 1.6 times the points of A (800 vs. 495 per 100 voters).
In a strategic min-max context (and forecast on frontrunners) the votes would become as follows:
55%: A[9] B[0] C[0]
45%: A[0] B[0] C[9]
and A would also win in SV.
The point is that B should win as often as possible in such a context, and with STAR, B certainly never wins (strategy or not).
The practical example would actually be:
55%: A > B
45%: A < B
with A and B finalists. The individual ratings of A and B can change but in the majority methods, A always wins, while in the utilitarian ones, B can win when it has greater utility.
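
As a rough check of the two ballot profiles above (honest and min-max), here is a minimal Python sketch of the SV and STAR tallies. The function names and the tie-break rule in the runoff are assumptions made for illustration; they are not taken from any official STAR specification.

```python
def score_totals(blocs):
    """Sum weighted ratings; blocs is a list of (weight, {candidate: rating})."""
    totals = {}
    for weight, ratings in blocs:
        for cand, rating in ratings.items():
            totals[cand] = totals.get(cand, 0) + weight * rating
    return totals


def score_winner(blocs):
    totals = score_totals(blocs)
    return max(totals, key=totals.get)


def star_winner(blocs):
    totals = score_totals(blocs)
    # The two highest-scoring candidates advance to the automatic runoff.
    first, second = sorted(totals, key=totals.get, reverse=True)[:2]
    for_first = sum(w for w, r in blocs if r[first] > r[second])
    for_second = sum(w for w, r in blocs if r[second] > r[first])
    # Assumption: a tied runoff goes to the higher-scoring finalist.
    return first if for_first >= for_second else second


# Ballot profiles from the post, as percentages of the electorate.
honest = [(55, {"A": 9, "B": 8, "C": 0}), (45, {"A": 0, "B": 8, "C": 9})]
min_max = [(55, {"A": 9, "B": 0, "C": 0}), (45, {"A": 0, "B": 0, "C": 9})]

for name, blocs in (("honest", honest), ("min-max", min_max)):
    print(name, score_totals(blocs),
          "SV:", score_winner(blocs), "STAR:", star_winner(blocs))
```

With the honest ballots SV elects B and STAR elects A; with the min-max ballots both elect A, which matches the claim that B never wins under STAR here.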

Other methods such as STLR and DV avoid the SV defect, without however being majoritarian like STAR. However, they can have other flaws (e.g., STLR can lose its positive sides in the presence of clones, making STAR a better alternative for simplicity; DV fails monotonicity).

cfrank

STAR does still suffer from some of the same drawbacks as Score, mainly that it is majoritarian, although it does manage to approach a consensus model more closely and to reduce the effectiveness of strategy. I do think that STAR is a pretty good system and that it is superior to Score.

One modification I would make to STAR is something I have advocated before, which is an exponential demerit score system. Basically, there are N score choices to give to each candidate, each score option being an order of magnitude higher than the next smallest. Candidates are approved by a lack of demerits, and the two candidates with the fewest demerits have a runoff between them based on which was scored more favorably than the other more often. The runoff is for the same reason as in STAR while the exponential demerit system empowers a broader consensus over a majoritarian front.

I wonder if there is a systematic way to vary the weight between consensus candidates and majority candidates. For example, a candidate can be called (S,P)-consensual if at least a P-fraction of voters scored that candidate at least an S. If you plot all (S,P) such that there is an (S,P)-consensual candidate, you can choose a method to try to trade-off S versus P in a controlled way. For example, you could select a candidate with a minimum value of

((S-Smin)/(Smax-Smin)-1)^2+(1-P)^2

Really any reasonable metric could do.
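
To make the trade-off concrete, here is a small Python sketch that scans every candidate and every integer score S, computes the fraction P of voters who gave at least S, and picks the pair minimizing the expression above. The function name `best_consensus`, the integer grid for S, and the reuse of the 55%/45% ballots from earlier in the thread are all assumptions for illustration only.

```python
def best_consensus(blocs, smin, smax):
    """Return (cost, candidate, S, P) minimizing the distance to (Smax, 1).

    blocs is a list of (weight, {candidate: rating}).
    """
    total = sum(weight for weight, _ in blocs)
    best = None
    for cand in blocs[0][1]:
        for s in range(smin, smax + 1):
            # P: fraction of voters who scored this candidate at least S.
            p = sum(w for w, r in blocs if r[cand] >= s) / total
            cost = ((s - smin) / (smax - smin) - 1) ** 2 + (1 - p) ** 2
            if best is None or cost < best[0]:
                best = (cost, cand, s, p)
    return best


# Ballots borrowed from the 55%/45% example earlier in the thread (range 0-9).
blocs = [(55, {"A": 9, "B": 8, "C": 0}), (45, {"A": 0, "B": 8, "C": 9})]
cost, cand, s, p = best_consensus(blocs, smin=0, smax=9)
print(cand, s, p, round(cost, 4))
```

On these ballots the minimum is reached by B at S = 8 with P = 1, since everyone gave B at least an 8, while A only reaches P = 0.55 at S = 9.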

Keith Edmonds

@cfrank Score is not Majoritarian. It is Utilitarian.

Score is Utilitarian but does not adjust voter impact to reduce strategic incentives. STAR is Majoritarian but does adjust voter impact to reduce strategic incentives. I played with this issue a lot and came to the compromise that STLR voting was the best tradeoff.

https://electowiki.org/wiki/STLR_voting

It's also worth noting that some people actually prefer Majoritarian systems and think that the tyranny of the majority is justified. Many of these people are the IRV supporters. This makes STAR more desirable to them and therefore a good system to campaign for strategically.

Jack Waugh

I think all the replies to date on this topic are from people who either, like me, have skepticism about STAR, or who outright oppose it on various grounds. The STAR advocates haven't yet spoken up. I want one of them to present an example where the results differ and say that they think STAR produced better voter satisfaction overall than Score would have. I expect I will be able to counter that when they worked their example, they did not use the optimal strategy with Score. So the end product I expect to happen from such exchanges is a lack of cases that count for STAR. If the outcomes aren't any better, there is no justification for the extra complexity.

Jack Waugh

STAR is proposed as an "IRV 2.0." In response I propose an "IRV 3.0," which has more to do with IRV, because it accepts votes in IRV style, which STAR does not.

cfrank

@Keith Yes that is true, although I don't really understand why people would be content with tyranny by the majority when it may be possible to avoid it (unless they are a part of the majority). I'm actually not much of a fan of utilitarianism, I am in favor of something that is distributionally just. STLR voting is interesting.

@Jack-Waugh, in my opinion STAR is way better than IRV. I'm not exactly an advocate of STAR, but I think in general that rank-order voting is probably not going to be a great solution. I think that "independent" scores are really the best we can hope for for now, and that means it's what we do with those scores and how that interacts with voters' decision-making that's important.

Keith Edmonds

@Jack-Waugh What STAR does is renormalize everybody's vote weight to give them the same impact. This is an attempt to reduce the amount of strategy needed. I do not think that it would outperform somebody who used optimal strategy with Score. The point is that most people do not or cannot use optimal strategy. STAR then puts people bad at strategy on a closer level to those who are good at strategy. So I do not think you are wrong in what you say. If all people were fully informed, rational and strategic then Score would likely be better. However, people are not any of those things in general. I do not think your line of argument will hold up under this consideration.

An example of where STAR produces a better outcome than Score is

40% = A:5 B:0 C:0
31% = A:0 B:5 C:1
29% = A:0 B:1 C:5

Score gives A and STAR gives B. This is an engineered and somewhat extreme example to illustrate the issue. Is 5 infinitely more than 0, or just 5 more? Is 5 weighted as 4 more than 1, or as 5 times as much? There is no universal metric and different people will choose different metrics. STAR normalizes it all away and compares the two most favoured with full weight to each voter.
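
The arithmetic behind this example can be checked with a short Python sketch. The helper name `tally` and the tie-break in the runoff are assumptions for illustration; the ballots are the three blocs listed above.

```python
blocs = [
    (40, {"A": 5, "B": 0, "C": 0}),
    (31, {"A": 0, "B": 5, "C": 1}),
    (29, {"A": 0, "B": 1, "C": 5}),
]


def tally(blocs):
    """Return (score totals, Score winner, STAR winner, runoff vote counts)."""
    totals = {}
    for weight, ratings in blocs:
        for cand, rating in ratings.items():
            totals[cand] = totals.get(cand, 0) + weight * rating
    ranked = sorted(totals, key=totals.get, reverse=True)
    x, y = ranked[:2]  # the two finalists
    # In the runoff each ballot counts fully for whichever finalist it rates higher.
    x_votes = sum(w for w, r in blocs if r[x] > r[y])
    y_votes = sum(w for w, r in blocs if r[y] > r[x])
    # Assumption: a tied runoff goes to the higher-scoring finalist.
    star = x if x_votes >= y_votes else y
    return totals, ranked[0], star, {x: x_votes, y: y_votes}


totals, score_winner, star_winner, runoff = tally(blocs)
print(totals, score_winner, star_winner, runoff)
```

The totals come out as A 200, B 184, C 176, so Score picks A. A and B are the finalists, and B wins the runoff 60 to 40 because the 29% bloc's 1-vs-0 preference counts as a full vote.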

STAR is a simplified version of Baldwin's Method. When you think about it that way you see the intent.

Essenzia

@Keith
Given these 3 types of ratings (assuming they are the ratings of the 2 frontrunners, after eliminating all the others):
[0,1] - [2,3] - [4,5]
STLR normalizes them like this:
[0,5] - [3.33,5] - [4,5]
Baldwin normalizes them like this:
[0,5] - [0,5] - [0,5]
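
The two normalizations can be reproduced with a small Python sketch. It assumes that STLR rescales a pair proportionally so that the higher rating reaches the top score, which matches the numbers quoted above. The function names are hypothetical, and the second function only mirrors what this post attributes to Baldwin.

```python
def stlr_normalize(a, b, top=5):
    """Scale the pair proportionally so the higher rating becomes `top`."""
    hi = max(a, b)
    if hi == 0:
        return (a, b)  # both zero: nothing to scale
    return (a * top / hi, b * top / hi)


def full_range_normalize(a, b, top=5):
    """Stretch any strict preference to the full range [0, top]."""
    if a == b:
        return (a, b)  # no preference between the two finalists
    return (top, 0) if a > b else (0, top)


for pair in [(0, 1), (2, 3), (4, 5)]:
    print(pair, stlr_normalize(*pair), full_range_normalize(*pair))
```

The pair [2,3] becomes roughly [3.33,5] under the proportional rule and [0,5] under the full-range rule, while [4,5] is left unchanged by the proportional rule.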

For me, STLR uses better normalization but I don't think it's the best.
If a vote like [4,5] remains the same in the clash between the two finalists, the voter will be encouraged from the start to downplay the rating of the worse of the two candidates (i.e., to vote [0,5] from the start).
I prefer this normalization in clash between two finalists:

if you have a couple [0,0] or [5,5] the vote is irrelevant.
if one of the two candidates has a score of 5, the other is put at 0.
if one of the two candidates has a score of 0, the other is put at 5.
if both candidates have intermediate scores, then STLR normalization applies.
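
The four rules above can be sketched in Python as follows. The function name `finalist_normalize` is hypothetical, and the STLR step is assumed to be proportional scaling so that the higher rating reaches 5.

```python
def finalist_normalize(a, b, top=5):
    """Normalize one ballot's ratings (a, b) of the two finalists."""
    hi, lo = max(a, b), min(a, b)
    # Rule 1: [0,0] or [top,top] expresses no preference; leave it alone.
    if hi == lo and hi in (0, top):
        return (a, b)
    # Rules 2 and 3: if either rating sits at an extreme, push the pair
    # apart to the full range in the direction of the voter's preference.
    if hi != lo and (hi == top or lo == 0):
        return (top, 0) if a > b else (0, top)
    # Rule 4: both intermediate, so apply the assumed STLR proportional scaling.
    return (a * top / hi, b * top / hi)


for pair in [(4, 5), (0, 1), (2, 3), (5, 5), (3, 0)]:
    print(pair, finalist_normalize(*pair))
```

Under these rules [4,5] and [0,1] both become [0,5], while [2,3] is only scaled to roughly [3.33,5], so a voter who uses the extremes for favorites and most disliked candidates gets a full-strength vote in the runoff.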

For simplicity, I call START the STAR that uses this normalization.
In this way, at the beginning the voter:

first assigns 5 to his most favorite candidates and 0 to the most hated ones.
then he can feel freer in assigning intermediate scores.

Such normalization is proposed indirectly in Tragni's method, although in that context it is used to make comparisons between couples.

SaraWolk

STAR Voting is designed to maximize both utilitarianism and finding majority supported winners where possible. I look at it like a debate between quality and quantity. Both are important. In STAR Voting the scoring round measures quality of support (how much do the voters like the various candidates?). Then, the runoff measures quantity, or number of supporters (between the two front-runners, which do you prefer?).

As for why I believe STAR Voting is more fair and representative compared to Score, of course I have to start with the disclaimer that Score is a very good system, and they get the same winner most of the time, but in Score Voting if I vote honestly and don't give any front-runners a top score, then my vote's impact is less than if I had strategically given my lesser-evil a top score.

Another shortcoming with Score that is addressed by STAR is if some voters fail to use the full scale. Strategically speaking, the best strategy in both methods is always to give your favorite 5 stars, but it's to be expected that some voters will give a mediocre score to a mediocre favorite, especially if they are new to the system. Unless you normalize scores, Score voting gives voters a chance at an equally weighted vote, but doesn't actually guarantee it. Hopefully this will be corrected with voter education and good instructions, but with STAR there's the added failsafe that the runoff is binary. Ultimately your vote is just as powerful as everyone else's.

These voters, voters who are currently marginalized in our current system, should have just as powerful a vote as a voter who does support a frontrunner. Guaranteed. STAR Voting does that. If your favorite can't win, you can give your favorite 5 stars, give your lesser evil 1 star, and that will still ensure that if it comes down to it, your fully weighted vote will help prevent your worst case scenario. That's how STAR Voting prevents tyranny of the Majority. If your lesser-evil is actually substantially better than your worst case scenario you can give them a better score.

As far as I know STAR may be the only method where even if none of the candidates you like can win, your vote can still make a difference and help prevent your worst case scenario.

SaraWolk

@Keith Exactly. And the intent is not only to reduce the need for strategic voting, but to actually incentivize honest voting, and to ensure that the system is fair and equal. This is the key to eliminating an "electability" bias, or status quo glass ceiling. There are a lot of reasons why and it's not just about any one of these reasons in isolation. I see it as a very empowering voting method overall.

For voters who don't like the frontrunners, their vote is still as powerful as a voter who does have a strong candidate on their side. The full repercussions of this are hard to quantify, but this is one reason that I think STAR is the most powerful single winner voting method to break two party domination.

Jack Waugh

@cfrank said in STAR vs. Score:

in my opinion STAR is way better than IRV.

No question.

Jack Waugh

@SaraWolk Nevertheless, you have not as yet provided an example, starting with voter desire, and leading to different outcomes between STAR and Score.

Jack Waugh

@SaraWolk said in STAR vs. Score:

Strategically speaking, the best strategy in both methods is always to give your favorite 5 stars

Not according to STAR and the Nader problem.

And I suppose that if STAR can exhibit this kind of behavior, so can cardinal Baldwin. I see STAR as, fundamentally, an abbreviated cardinal Baldwin. The abbreviation is achieved by combining the rounds of tallying except the last one.

SaraWolk

STAR does not pass FB criterion, so yes, there is a hypothetical scenario possible where giving your favorite less than 5 could be beneficial, but that does not mean that there's a real election scenario where that's actionable or incentivised in real time with a realistic amount of information on voter behavior available.

The mark of an ideal system is to balance competing considerations and incentives to give something that's robust all around. Score in most cases will get the same outcomes, and so I personally don't think that accuracy is the principle to look at to differentiate between them. The biggest difference is in terms of real world advocacy. Score is a dealbreaker because vote weight isn't normalized. We get attacks on STAR regularly that are not true about STAR, but that are about Score.

Strategic voting aside look at this example:
Voter Vicki is a disenfranchised voter who typically doesn't like the frontrunners in her city, which amazingly uses Score voting to elect the mayor. In this race the frontrunners are named Bad and Worse, and there are a few other options as well. She gets her ballot and fills it out honestly like so:
Bad: 1
Worse: 0
Boring: 2
Lame: 3
Obscure: 5
Because Vicki really dislikes both frontrunners, her vote is predictably less powerful than someone who actually does like one of the frontrunners and dislikes the other. Vicki's vote is thus dependably less powerful than other voters and she remains marginalized. In contrast, STAR Voting guarantees Vicki an equal and fully powerful vote for the finalist she prefers.
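
To put a number on this, here is a minimal Python sketch comparing how hard Vicki's ballot pulls in the Bad vs. Worse contest under each method. It assumes Bad and Worse are the two STAR finalists, and the helper names are hypothetical.

```python
TOP = 5  # highest rating on the 0-5 ballot

vicki = {"Bad": 1, "Worse": 0, "Boring": 2, "Lame": 3, "Obscure": 5}


def score_pull(ballot, x, y, top=TOP):
    """Fraction of a full-strength vote the ballot adds to x over y in Score."""
    return (ballot[x] - ballot[y]) / top


def runoff_pull(ballot, x, y):
    """Vote in a STAR runoff between x and y: +1 for x, -1 for y, 0 if equal."""
    return (ballot[x] > ballot[y]) - (ballot[x] < ballot[y])


print(score_pull(vicki, "Bad", "Worse"))   # honest ballot under Score
print(runoff_pull(vicki, "Bad", "Worse"))  # same ballot in the STAR runoff

# Under Score she would need to raise Bad to 5 to get full strength.
strategic = dict(vicki, Bad=5)
print(score_pull(strategic, "Bad", "Worse"))
```

Her honest ballot moves the Score margin between Bad and Worse by one fifth of what a 5-vs-0 ballot would, while in the runoff it counts as one full vote for Bad.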

Sure, some people could argue that since her strength of preference is weaker it's fair that her vote carry less weight, but most would disagree.

PS. Cardinal Baldwin isn't monotonic. The extra rounds and drawn out process make a difference, so they really aren't the same systems. Just similar.

Jack Waugh

@SaraWolk said in STAR vs. Score:

some people could argue that since her strength of preference is weaker it's fair that her vote carry less weight

No, of course not. Everyone deserves the same weight. But in Score, she has to vote Bad 4 or 5.

Multiround tallying systems are confusing. They produce results that belie the expectation that all balanced systems would behave identically. And for me this expectation came from the logic that if two systems behave differently, at least one of them must be cheating some voter out of some of her rightful power, which contradicts the assumption that both systems are balanced.

I guess some of your points are:

the decision between the top two may matter more than the decision between a random two. So STAR makes sure everyone has full strength in that decision even if they vote their desires without regard to any estimate of where the other voters stand.
STAR performs much better than Score when voters vote that way.
STAR makes it difficult to find a better performing strategy, even though theoretically, one exists. The signal-to-noise ratio for finding it is prohibitively low.

Let's add to the candidate field of your example, Bad II, a clone of Bad, and Worse II, a clone of Worse. Of course, all the voters are aware this has happened, so can adjust their strategies. Since the four bad and worse candidates are the front runners, unless there is a significant upset (difference between perception of where the voters stand and where they turn out to actually stand), the finalists will be Bad and Bad II or Worse and Worse II. How should Vicki vote?

SaraWolk

@Jack-Waugh
Bad1: 1 star
Bad2: 1 star
Worse1: 0 stars
Worse2: 0 stars

I disagree that the fact that Score and STAR don't produce identical results means that one or the other is cheating. Neither is cheating voters. They are both good methods and they optimize for slightly different things.

STAR optimizes for both strength of support and number of supporters.
Score optimizes for strength of support specifically.

Both are valid goals and methods, but there's a real world benefit to narrowing down the list of proposals to help lay people make a good choice. If we promote both loudly (and also list all other good methods we can think of), the considerations would overwhelm most people and lead them to quit researching, or worse, to come to a decision after weighing only a one-sided set of considerations.

Take Condorcet for example. Condorcet has largely failed to get adopted anywhere because of lack of consensus around the best version, despite that all versions are quite a bit better than most methods in use. If Condorcet advocates had come together around a good well rounded proposal and simplified their pitch a long time ago it would likely be the dominant RCV method, but no, they focused on academic debate over cohesive advocacy. We cardinal advocates should take note. Are we debating because we want better democracy in the real world, or because we find the question interesting and enjoy the debate for its own sake?

Jack Waugh

@SaraWolk, I think that if the conditions here are that 99% of the electorate considers the race as being between the Bad party and the Worse party, and they aren't even taking Obscure into consideration as a possible winner, Vicki must give Bad[a] and Bad[b] scores of at least 4 so as to exert sufficient pressure to do her part toward preventing Worse[a] and Worse[b] from being the finalists. The existence of the clones reduces STAR to Score and so the situation demands the same strategy as would be appropriate for Score.

As to why pose questions and try to answer them, it's because I need to learn what is going on with these systems, to try to prevent being pulled into error again.

I am involved with a little political group that thinks it is drafting platform planks for a national-level party. I joined it for the sole purpose of trying to prevent error in its stance on voting systems. I would be concerned for the rhetorical effect on State-level parties if a national-level party publishes a severely misleading stance in this regard. The first draft of a platform on which this group bases its work (from another group) requires a ranking voting system in all cases. I feel that allowing that stance to stand would be severely misleading, because choosing ranking eliminates rating, and there are grounds to judge that several rating systems are more democratic than even the most democratic ranking systems. And especially more so than IRV, which is in practice what people mean when they call for ranked-choice voting.

I believe that of the people involved in the group, I have by far the most knowledge on single-winner voting systems. I think most of the group either don't care, or think I am the one who has the deepest understanding. Of course, they don't think I am infallible, and I have taken care to present myself as fallible. I said, I am not God and my opinions might not be correct, but, I keep saying, I can present arguments to support them. Interest in the details of these arguments has been slight to nonexistent. But I have been asked questions about what opinions I have, going outside of those I initially stated when approaching the other members of this group on this subject. For example, I have been asked whether I think IRV is better than FPtP.

I have been telling this group that STAR is at least as democratic as Score. I don't want egg all over my face from finding out later that it is false.

I supported IRV for years because it made intuitive sense to think that it gives third parties and independents a chance. After all, it tallies in rounds, and in an early round, you get a chance to support, effectively, your favorite candidate, and if that effort fails, you get a say in the final round as between the bad and the worse. It's very strongly intuitively attractive. It took discussion and argument and deeper study to see that my intuition was simply not correct. Intuition in general is not guaranteed to amount to a correct understanding of the facts in all cases, and neither is "common sense." Sometimes I think common sense is correct 80% of the time, and sometimes I think it is so only 20% of the time. Intuition and common sense are heuristics, mental shortcuts, useful for making emergency decisions when we do not have time for study in depth.

Recognizing that there are reasons for seeking deeper and more nearly rigorous understanding, I nevertheless encounter an effective obstacle in that I am neither practiced nor talented in math. I think for people who are, their intuition more closely matches the reality, quite as how people who are good at chess can assess a position. If my level of familiarity with math matched that of Turing, and Euler, and Ramanujan, and von Neumann and the uncredited females he stole ideas from, and Curry, and Emmy Noether, I could probably work this out by myself. But I'm not at that level, and so tend to ask for help.

When you or I or any of the readers asserts that a single-winner voting system gives equal power to the voters, one voter to another, the correctness or incorrectness of that assertion turns on the matter of who is selected as the winner by that system.

Suppose groups of us are engaged in a literal tug of war. But rather than a single rope, there is a hub device and several ropes attached to it, leading to the groups of people who are going to pull on it. The hub device and the ropes are free to move over the ground. A circle is drawn in the grass, and the hub device placed at the center of that. Every group picks up their respective rope and pulls on it. The hub device will stay in the center if our forces balance to zero. Otherwise, it will be pulled toward some point on the circle. If our forces, person for person, are equal, surely only one outcome is possible. How can a contest go two different ways without changing the relative power of the participants? This point still confuses me.

At this point, I do not have a complete mathematical definition of voting equality. The closest thing I have to it is a pair of conditions that I argue are necessary. I give provisional credence to the idea that these conditions may also be sufficient, simply for lack, for now, of clear evidence to the contrary.

First condition: Frohnmayer balance. If one voter can move the needle, another voter must be able to move it back.
Second condition: best known freedom of expression. The best known is that shared by Score/STAR/Approval. Counterexamples that still meet the first condition include Borda count (requiring ranking all candidates), "vote for and against", and "vote for or against."

Clearly, Score and STAR meet both of these conditions.

The first condition is directly related to the final result, as it is defined in terms thereof. Being able to move the needle is defined in terms of effect on the final result under certain conditions, which can happen.

The second condition is indirectly related to the final result. The argument goes that if a system balances the power of voters whose honest stances or even strategic stances match votes that the system allows them to cast, but if there are other voters whose stances do not have corresponding votes that the system allows them to cast, they are being cheated because they are being partially muzzled. Clearly a system that allows them votes corresponding to their stances, and takes those votes fully into account in the tally, is giving them more power than a system that gives them a Sophie's Choice of votes that do not so precisely correspond to their stances as to the possible stances of other voters.

But anyway I'm still left confused about whether equality implies a unique result. Intuitively, it should.

rob

Here is an example where STAR produces a different result than Score on a Nader scenario, assuming that:

Gore and Bush are the two front runners,
it is very close between Gore and Bush,
most voters that like Nader best, prefer Gore over Bush.

These are pretty reasonable assumptions based on the 2000 election. (right?)

Nader voters who attempted to best express their preferences might vote Nader: 5, Gore: 3, Bush: 0. Under Score, lots of people voting this way, rather than giving Gore a 5, could cause Gore to lose. But giving Gore a 5 disallows that voter from expressing their preference for Nader over Gore.

In STAR, they could express that preference without handing the election to Bush (their least favorite), since Gore and Bush end up being the two front runners, and 3 vs 0 counts as much as 5 vs 0 in the second round. In fact, no matter which two are the front runners, they have expressed their vote in the most effective (i.e. strategic) way.
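
A toy electorate makes the difference visible. The bloc sizes and the non-Nader ratings below are hypothetical numbers chosen only so that the race is close; they are not 2000 election data. The sketch compares what happens when the Nader bloc rates Gore 3 versus 5.

```python
def gore_vs_bush(nader_bloc_gore_rating):
    """Tally a hypothetical 100-voter electorate under Score and a STAR runoff."""
    blocs = [
        (49, {"Bush": 5, "Gore": 0, "Nader": 0}),
        (45, {"Gore": 5, "Nader": 2, "Bush": 0}),
        (6, {"Nader": 5, "Gore": nader_bloc_gore_rating, "Bush": 0}),
    ]
    totals = {}
    for weight, ratings in blocs:
        for cand, rating in ratings.items():
            totals[cand] = totals.get(cand, 0) + weight * rating
    score_winner = max(totals, key=totals.get)
    finalists = set(sorted(totals, key=totals.get, reverse=True)[:2])
    # Runoff head count between Gore and Bush (the finalists in this toy case).
    gore_votes = sum(w for w, r in blocs if r["Gore"] > r["Bush"])
    bush_votes = sum(w for w, r in blocs if r["Bush"] > r["Gore"])
    # A runoff tie is not handled specially; it does not occur with these numbers.
    runoff_winner = "Gore" if gore_votes > bush_votes else "Bush"
    return totals, score_winner, finalists, runoff_winner


for rating in (3, 5):
    print(rating, gore_vs_bush(rating))
```

With Gore rated 3 by the Nader bloc, Score elects Bush (245 to 243) while the runoff elects Gore 51 to 49. With Gore rated 5, both elect Gore, but the Nader voters can no longer show that they prefer Nader.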

The big problem with Score in this sort of scenario is that it can help entrench the 2 party system, since a 3rd party candidate like Nader would be discouraged from running (unless he runs under one of the major parties), since he can hurt those that like him, by causing their least liked candidate to win. That is, he has still split the vote, albeit not as strongly as under FPTP.

So yes, you could get different results under Score vs. STAR in that scenario, especially if you assume that not all voters are 100% sure who the front runners will be. (i.e. the more likely people are to wrongly guess that Nader might be a front runner, the more likely it would be for them to rate Gore lower than Nader)

In a 3 person race, I think STAR does really well. I have my doubts when it gets to be more than 3, which is why I'd prefer a method that selected the Condorcet candidate if one exists, and only hold that second round if there is no Condorcet winner.

Jack Waugh

@rob said in STAR vs. Score:

Nader voters who attempted to best express their preferences might vote Nader: 5, Gore: 3, Bush: 0.

But I'm pretty sure that's not the optimal strategy for Score. I think that many of them should vote Gore 5, and a few should vote Gore 4. They don't have to coordinate, to achieve that kind of a mix. If each individual dithers mentally between 5 and 4, the result can be random, so with everyone's random behavior, relative frequency follows probability.