STAR vs. Score

Keith Edmonds

@Jack-Waugh The word strategy appears several times on that page. I do not know exactly how he coded the different strategies, but I figure that could help you on your way to doing research.

I am unaware of any property called "the balance condition" and electowiki does not have such a page. Do you intend to refer to The Test of Balance given here? If so, I think what you are saying is that in Nash equilibrium two systems which pass this criterion should behave such that the strategies of different factions cancel each other out and they both produce the same winner. The flaw in that logic is the assumption of an underlying symmetry in the size of groups and how that interacts with compromise/utilitarian or majoritarian winners. As I said above, STAR is majoritarian and Score is utilitarian. In the absence of strategy these systems can give different winners. So even if all the strategy cancelled you would not expect the same winners.

Jack Waugh

@Keith, by "the balance condition", I meant Frohnmayer balance.

Jack Waugh

@Keith, I was not thinking anything about a Nash equilibrium. I do not know what that is.

Keith Edmonds

@Jack-Waugh Yes that is the same concept as I gave the link to. My comments still apply.

Essenzia

@Jack-Waugh
Defect SV
Voting without strategies, range [0,9]:
A[9] B[3] C[0]
The voter thinks that B and C are the 2 frontrunners, therefore:
SV: A[9] B[9] C[0]
Whether with the min-max strategy, or without, the voter wants his vote to be worth the maximum in the clash between B and C (certainly not B[1] C[0]).
STAR: A[9] B[3] C[0]
In the clash between only B and C, the vote would automatically become B[9] C[0], so the voter does not need to lie at the beginning.
If he uses min-max the vote becomes:
A[9] B[1] C[0] which is still better than SV.
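
A minimal sketch of the point above, in Python. The helper names `score_margin` and `runoff_margin` are hypothetical, and the sketch assumes that in SV a ballot moves the B-versus-C contest by the difference of its two ratings, while a STAR-style runoff counts any preference as one full vote.

```python
MAX_SCORE = 9  # range [0,9], as in the example above

def score_margin(ballot, x, y):
    """Fraction of a full vote this ballot adds to x over y under Score (SV)."""
    return (ballot[x] - ballot[y]) / MAX_SCORE

def runoff_margin(ballot, x, y):
    """Contribution in a STAR-style runoff: +1, 0 or -1, whatever the ratings gap."""
    return (ballot[x] > ballot[y]) - (ballot[x] < ballot[y])

honest = {"A": 9, "B": 3, "C": 0}
strategic_sv = {"A": 9, "B": 9, "C": 0}

print(score_margin(honest, "B", "C"))        # one third of a full vote
print(score_margin(strategic_sv, "B", "C"))  # 1.0, a full vote
print(runoff_margin(honest, "B", "C"))       # 1, a full vote without exaggerating
```

Under these assumptions the honest ballot is worth a third of a vote between B and C in SV, but a whole vote in the STAR runoff.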

STAR majoritarian
Votes without strategy like this:
55%: A[9] B[8] C[0]
45%: A[0] B[8] C[9]
STAR elects A while SV elects B, with roughly 1.6 times the points of A (800 to 495 per 100 voters).
In a strategic min-max context (and forecast on frontrunners) the votes would become as follows:
55%: A[9] B[0] C[0]
45%: A[0] B[0] C[9]
and A would also win in SV.
The point is that we should make B win as much as possible in a similar context and with STAR, B certainly never wins (strategies or not).
The practical example would actually be:
55%: A > B
45%: A < B
with A and B finalists. The individual ratings of A and B can change but in the majority methods, A always wins, while in the utilitarian ones, B can win when it has greater utility.
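
The 55/45 example above can be checked with a short Python sketch. The function names are hypothetical, percentages are used directly as ballot weights, and the runoff tie rule (the higher-scoring finalist wins) is an assumption that does not come into play here.

```python
def score_totals(profile):
    """profile: list of (weight, {candidate: rating}); returns weighted totals."""
    totals = {}
    for weight, ballot in profile:
        for cand, rating in ballot.items():
            totals[cand] = totals.get(cand, 0) + weight * rating
    return totals

def score_winner(profile):
    totals = score_totals(profile)
    return max(totals, key=totals.get)

def star_winner(profile):
    totals = score_totals(profile)
    # The two highest-scoring candidates go to the runoff.
    x, y = sorted(totals, key=totals.get, reverse=True)[:2]
    x_votes = sum(w for w, b in profile if b[x] > b[y])
    y_votes = sum(w for w, b in profile if b[y] > b[x])
    return x if x_votes >= y_votes else y  # assumed tie rule: higher score wins

honest = [(55, {"A": 9, "B": 8, "C": 0}), (45, {"A": 0, "B": 8, "C": 9})]
minmax = [(55, {"A": 9, "B": 0, "C": 0}), (45, {"A": 0, "B": 0, "C": 9})]

print(score_winner(honest), star_winner(honest))  # B A
print(score_winner(minmax), star_winner(minmax))  # A A
```

With honest ballots the totals are A 495, B 800, C 405, so SV elects B, while the A-versus-B runoff goes 55 to 45 for A.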

Other methods such as STLR and DV avoid the SV defect, without however being majoritarian like STAR. However, they can have other flaws (e.g. STLR can lose its positive sides in the presence of clones, making STAR a better alternative for simplicity; DV fails monotonicity).

cfrank

STAR does still suffer from some of the same drawbacks as Score, mainly that it is majoritarian, although it does manage to approach a consensus model more closely and to reduce the effectiveness of strategy. I do think that STAR is a pretty good system and that it is superior to Score.

One modification I would make to STAR is something I have advocated before, which is an exponential demerit score system. Basically, there are N score choices to give to each candidate, each score option being an order of magnitude higher than the next smallest. Candidates are approved by a lack of demerits, and the two candidates with the fewest demerits have a runoff between them based on which was scored more favorably than the other more often. The runoff is for the same reason as in STAR while the exponential demerit system empowers a broader consensus over a majoritarian front.

I wonder if there is a systematic way to vary the weight between consensus candidates and majority candidates. For example, a candidate can be called (S,P)-consensual if at least a P-fraction of voters scored that candidate at least an S. If you plot all (S,P) such that there is an (S,P)-consensual candidate, you can choose a method to try to trade-off S versus P in a controlled way. For example, you could select a candidate with a minimum value of

((S-Smin)/(Smax-Smin)-1)^2+(1-P)^2

Really any reasonable metric could do.
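
As a rough sketch of how such a trade-off could be computed, here is a Python version. It assumes integer scores, takes P for each candidate and threshold S to be the largest fraction of voters who scored that candidate at least S, and uses the metric written above. The function name `consensus_winner` is hypothetical, and the ballots are the 55/45 example from earlier in the thread.

```python
def consensus_winner(profile, s_min, s_max):
    """profile: list of (weight, {candidate: rating}).
    Returns (candidate, S, P, distance) minimizing the metric above."""
    total_weight = sum(w for w, _ in profile)
    best = None
    for cand in profile[0][1]:
        for s in range(s_min, s_max + 1):
            # Largest P for which cand is (S,P)-consensual at this S.
            p = sum(w for w, b in profile if b[cand] >= s) / total_weight
            d = ((s - s_min) / (s_max - s_min) - 1) ** 2 + (1 - p) ** 2
            if best is None or d < best[3]:
                best = (cand, s, p, d)
    return best

profile = [(55, {"A": 9, "B": 8, "C": 0}), (45, {"A": 0, "B": 8, "C": 9})]
cand, s, p, d = consensus_winner(profile, 0, 9)
print(cand, s, p)  # B 8 1.0
```

On these ballots the metric picks B, because every voter scored B at least 8, whereas A reaches a 9 with only 55% of voters.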

Keith Edmonds

@cfrank Score is not Majoritarian. It is Utilitarian.

Score is Utilitarian but does not adjust voter impact to reduce strategic incentives. STAR is Majoritarian but does adjust voter impact to reduce strategic incentives. I played with this issue a lot and came to the compromise that STLR voting was the best tradeoff.

https://electowiki.org/wiki/STLR_voting

It's also worth noting that some people actually prefer Majoritarian systems and think that the tyranny of the majority is justified. Many of these people are the IRV supporters. This makes STAR more desirable to them and therefore a good system to campaign for strategically.

Jack Waugh

I think all the replies to date on this topic are from people who either, like me, have skepticism about STAR, or who outright oppose it on various grounds. The STAR advocates haven't yet spoken up. I want one of them to present an example where the results differ and say that they think STAR produced better voter satisfaction overall than Score would have. I expect I will be able to counter that when they worked their example, they did not use the optimal strategy with Score. So the end product I expect to happen from such exchanges is a lack of cases that count for STAR. If the outcomes aren't any better, there is no justification for the extra complexity.

Jack Waugh

STAR is proposed as an "IRV 2.0." In response I propose an "IRV 3.0," which has more to do with IRV, because it accepts votes in IRV style, which STAR does not.

cfrank

@Keith Yes that is true, although I don't really understand why people would be content with tyranny by the majority when it may be possible to avoid it (unless they are a part of the majority). I'm actually not much of a fan of utilitarianism, I am in favor of something that is distributionally just. STLR voting is interesting.

@Jack-Waugh, in my opinion STAR is way better than IRV. I'm not exactly an advocate of STAR, but I think in general that rank-order voting is probably not going to be a great solution. I think that "independent" scores are really the best we can hope for for now, and that means it's what we do with those scores and how that interacts with voters' decision-making that's important.

Keith Edmonds

@Jack-Waugh What STAR does is renormalize everybody's vote weight to give them the same impact. This is an attempt to reduce the amount of strategy needed. I do not think that it would outperform somebody who used optimal strategy with Score. The point is that most people do not or cannot use optimal strategy. STAR then puts people who are bad at strategy on a closer level to those who are good at strategy. So I do not think you are wrong in what you say. If all people were fully informed, rational and strategic, then Score would likely be better. However, people are not any of those things in general. I do not think your line of argument will hold up under this consideration.

An example of where STAR produces a better outcome than Score is

40% = A:5 B:0 C:0
31% = A:0 B:5 C:1
29% = A:0 B:1 C:5

Score gives A and STAR gives B. This is an engineered and somewhat extreme example to illustrate the issue. Is 5 infinitely more than 0, or just 5 more? Is 5 weighted as 4 more than 1, or as 5 times as much? There is no universal metric and different people will choose different metrics. STAR normalizes it all away and compares the two most favoured with full weight to each voter.
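
The arithmetic for the 40/31/29 example can be spelled out in a few lines of Python. This is only a sketch: percentages are used directly as ballot weights, and the variable names are illustrative.

```python
profile = [
    (40, {"A": 5, "B": 0, "C": 0}),
    (31, {"A": 0, "B": 5, "C": 1}),
    (29, {"A": 0, "B": 1, "C": 5}),
]

# Scoring round: weighted sum of ratings per candidate.
totals = {c: sum(w * b[c] for w, b in profile) for c in "ABC"}
score_winner = max(totals, key=totals.get)

# Runoff between the two highest-scoring candidates, one full vote per ballot.
x, y = sorted(totals, key=totals.get, reverse=True)[:2]
prefer = {
    x: sum(w for w, b in profile if b[x] > b[y]),
    y: sum(w for w, b in profile if b[y] > b[x]),
}
star_winner = max(prefer, key=prefer.get)

print(totals)                     # {'A': 200, 'B': 184, 'C': 176}
print(prefer)                     # {'A': 40, 'B': 60}
print(score_winner, star_winner)  # A B
```

A leads the scoring round 200 to 184, but in the runoff the 29% bloc's B:1 over A:0 counts as a full vote, so B wins 60 to 40.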

STAR is a simplified version of Baldwin's Method. When you think about it that way you see the intent.

Essenzia

@Keith
Given these 3 types of ratings (assuming they are the ratings of the 2 frontrunners, after eliminating all the others):
[0,1] - [2,3] - [4,5]
STLR normalizes them like this:
[0,5] - [3.33,5] - [4,5]
Baldwin normalizes them like this:
[0,5] - [0,5] - [0,5]

For me, STLR uses better normalization but I don't think it's the best.
If a vote like this: [4,5] remains the same in the clash between the two finalists, the voter from the start will be encouraged to downplay the rating of the worst candidate of the 2 (i.e., to vote like this from the start [0,5] ).
I prefer this normalization in clash between two finalists:

if you have a couple [0,0] or [5,5] the vote is irrelevant.
if one of the two candidates has a score of 5, the other is put at 0.
if one of the two candidates has a score of 0, the other is put at 5.
if both candidates have intermediate scores, then STLR normalization applies.

For simplicity, I call START the STAR that uses this normalization.
In this way, at the beginning the voter:

first assigns 5 to his most favorite candidates and 0 to the most hated ones.
then he can feel freer in assigning intermediate scores.

Such normalization is proposed indirectly in Tragni's method, although in that context it is used to make comparisons between couples.
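
The three pairwise normalizations discussed above can be sketched in Python as follows. The function names are hypothetical. The STLR rule is inferred from the three examples given (scale both ratings so the higher one becomes 5), the Baldwin rule is read as min-max normalization, and leaving tied pairs unchanged there is an assumption.

```python
MAX = 5  # top of the [0,5] range used above

def baldwin_like(a, b):
    """Min-max normalization: any strict preference becomes [0, MAX]."""
    if a == b:
        return (a, b)  # assumption: a tied pair expresses no preference
    return (0, MAX) if a < b else (MAX, 0)

def stlr(a, b):
    """Scale both ratings so that the higher one becomes MAX."""
    top = max(a, b)
    if top == 0:
        return (a, b)  # [0,0] cannot be scaled and is irrelevant anyway
    return (a * MAX / top, b * MAX / top)

def start(a, b):
    """START rule: stretch to [0, MAX] when either rating is 0 or MAX, else STLR."""
    if a != b and (a in (0, MAX) or b in (0, MAX)):
        return (0, MAX) if a < b else (MAX, 0)
    return stlr(a, b)

for pair in [(0, 1), (2, 3), (4, 5)]:
    print(pair, baldwin_like(*pair), stlr(*pair), start(*pair))
```

For [2,3] STLR and START both give roughly [3.33,5], while for [4,5] STLR leaves the pair unchanged and START stretches it to [0,5], which is the difference argued for above.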

SaraWolk

STAR Voting is designed to maximize both utilitarianism and finding majority supported winners where possible. I look at it like a debate between quality and quantity. Both are important. In STAR Voting the scoring round measures quality of support (how much do the voters like the various candidates?). Then the runoff measures quantity, or number of supporters (between the two front-runners, which do you prefer?).

As for why I believe STAR Voting is more fair and representative compared to Score, of course I have to start with the disclaimer that Score is a very good system, and they get the same winner most of the time, but in Score Voting if I vote honestly and don't give any front-runners a top score, then my vote's impact is less than if I had strategically given my lesser-evil a top score.

Another shortcoming with Score that is addressed by STAR is if some voters fail to use the full scale. Strategically speaking, the best strategy in both methods is always to give your favorite 5 stars, but it's to be expected that some voters will give a mediocre score to a mediocre favorite, especially if they are new to the system. Unless you normalize scores, Score voting gives voters a chance at an equally weighted vote, but doesn't actually guarantee it. Hopefully this will be corrected with voter education and good instructions, but with STAR there's the added failsafe that the runoff is binary. Ultimately your vote is just as powerful as everyone else's.

These voters, voters who are currently marginalized in our current system, should have just as powerful a vote as a voter who does support a frontrunner. Guaranteed. STAR Voting does that. If your favorite can't win, you can give your favorite 5 stars, give your lesser evil 1 star, and that will still ensure that if it comes down to it, your fully weighted vote will help prevent your worst case scenario. That's how STAR Voting prevents tyranny of the Majority. If your lesser-evil is actually substantially better than your worst case scenario you can give them a better score.

As far as I know STAR may be the only method where even if none of the candidates you like can win, your vote can still make a difference and help prevent your worst case scenario.

SaraWolk

@Keith Exactly. And the intent is not only to reduce the need for strategic voting, but to actually incentivize honest voting, and to ensure that the system is fair and equal. This is the key to eliminating an "electability" bias, or status quo glass ceiling. There are a lot of reasons why and it's not just about any one of these reasons in isolation. I see it as a very empowering voting method overall.

For voters who don't like the frontrunners, their vote is still as powerful as a voter who does have a strong candidate on their side. The full repercussions of this are hard to quantify, but this is one reason that I think STAR is the most powerful single winner voting method to break two party domination.

Jack Waugh

@cfrank said in STAR vs. Score:

in my opinion STAR is way better than IRV.

No question.

Jack Waugh

@SaraWolk Nevertheless, you have not as yet provided an example, starting with voter desire, and leading to different outcomes between STAR and Score.

Jack Waugh

@SaraWolk said in STAR vs. Score:

Strategically speaking, the best strategy in both methods is always to give your favorite 5 stars

Not according to STAR and the Nader problem.

And I suppose that if STAR can exhibit this kind of behavior, so can cardinal Baldwin. I see STAR as fundamentally, abbreviated cardinal Baldwin. The abbreviation is achieved by combining the rounds of tallying except the last one.

SaraWolk

STAR does not pass the FB criterion, so yes, there is a hypothetical scenario possible where giving your favorite less than 5 could be beneficial, but that does not mean that there's a real election scenario where that's actionable or incentivised in real time with a realistic amount of information on voter behavior available.

The mark of an ideal system is to balance competing considerations and incentives to give something that's robust all around. Score in most cases will get the same outcomes, and so I personally don't think that accuracy is the principle to look at to differentiate between them. The biggest difference is in terms of real world advocacy. Score is a dealbreaker because vote weight isn't normalized. We get attacks on STAR regularly that are not true about STAR, but that are about Score.

Strategic voting aside look at this example:
Voter Vicki is a disenfranchised voter who typically doesn't like the frontrunners in her city, which amazingly uses Score voting to elect the mayor. In this race the frontrunners are named Bad and Worse, and there are a few other options as well. She gets her ballot and fills it out honestly like so:
Bad: 1
Worse: 0
Boring: 2
Lame: 3
Obscure: 5
Because Vicki really dislikes both frontrunners, her vote is predictably less powerful than someone who actually does like one of the frontrunners and dislikes the other. Vicki's vote is thus dependably less powerful than other voters and she remains marginalized. In contrast, STAR Voting guarantees Vicki an equal and fully powerful vote for the finalist she prefers.

Sure, some people could argue that since her strength of preference is weaker it's fair that her vote carry less weight, but most would disagree.

PS. Cardinal Baldwin isn't monotonic. The extra rounds and drawn out process make a difference, so they really aren't the same systems. Just similar.

Jack Waugh

@SaraWolk said in STAR vs. Score:

some people could argue that since her strength of preference is weaker it's fair that her vote carry less weight

No, of course not. Everyone deserves the same weight. But in Score, she has to vote Bad 4 or 5.

Multiround tallying systems are confusing. They produce results that belie the expectation that all balanced systems would behave identically. And for me this expectation came from the logic that if two systems behave differently, at least one of them must be cheating some voter out of some of her rightful power, which contradicts the assumption that both systems are balanced.

I guess some of your points are:

the decision between the top two may matter more than the decision between a random two. So STAR makes sure everyone has full strength in that decision even if they vote their desires without regard to any estimate of where the other voters stand.
STAR performs much better than Score when voters vote that way.
STAR makes it difficult to find a better performing strategy, even though theoretically, one exists. The signal-to-noise ratio for finding it is prohibitively low.

Let's add to the candidate field of your example, Bad II, a clone of Bad, and Worse II, a clone of Worse. Of course, all the voters are aware this has happened, so can adjust their strategies. Since the four bad and worse candidates are the front runners, unless there is a significant upset (difference between perception of where the voters stand and where they turn out to actually stand), the finalists will be Bad and Bad II or Worse and Worse II. How should Vicki vote?

SaraWolk

@Jack-Waugh
Bad1: 1 star
Bad2: 1 star
Worse1: 0 stars
Worse2: 0 stars

I disagree that the fact that Score and STAR don't produce identical results means that one or the other is cheating. Neither is cheating voters. They are both good methods and they optimize for slightly different things.

STAR optimizes for both strength of support and number of supporters.
Score optimizes for strength of support specifically.

Both are valid goals and methods, but there's a real world benefit to narrowing down the list of proposals to help lay people make a good choice. If we promote both loudly (and also list all other good methods we can think of), the considerations would overwhelm most people and lead them to quit researching, or worse, to come to a decision after weighing only a one-sided set of considerations.

Take Condorcet for example. Condorcet has largely failed to get adopted anywhere because of lack of consensus around the best version, even though all versions are quite a bit better than most methods in use. If Condorcet advocates had come together around a good, well-rounded proposal and simplified their pitch a long time ago, it would likely be the dominant RCV method, but no, they focused on academic debate over cohesive advocacy. We cardinal advocates should take note. Are we debating because we want better democracy in the real world, or because we find the question interesting and enjoy the debate for its own sake?