Ordinal Score "(S,P)"systems

In a score voting system, there is the issue of what scores to allow voters to submit, as well as of how to bring together the various score ballots and produce an election winner. As such, my goal here is to produce a scorebased system with results that are independent of what particular score set is chosen, as long as the scores are ordinally ranked. My hypothesis is that, even with unlabeled scores, simply blank boxes on a line from left to right with an indication of positivity, would accrue meaning to voters who used such a system as will be described.
In a score system, an (S,P)consensual candidate is one who is scored at least an S by at least a Pfraction of the electorate. It is easy to show that a candidate is (S,P)consensual if and only if it is (R,Q)consensual for all R less than or equal to S and all Q less than or equal to P. The statement that a candidate is (S,P)consensual is essentially an argument supporting their fitness as the election winnerthe strength of the argument increases as S and P increase independently.
Let X be the random variable that is the score assigned to candidate C by a voter chosen uniformly at random. For each candidate, there is an (S,P) production possibilities frontier of points corresponding to the boundary of obtainable (S,P) values such that the candidate is (S,P)consensualit is simply the plot of Prob(X>= S) against scores S.
A candidate C is (S,P)dominated if there is another candidate K whose production possibilities engulfs that of C. A voting system is (S,P)efficient if (S,P)dominated candidates are precluded from winning an election. I think it is almost selfevident that any reasonable scorebased system definitely needs to be (S,P)efficient.
For an example of an (S,P)efficient election strategy, consider the voting system that chooses a candidate with a maximal area bounded in (S,P)space by their productionpossibilities frontier. This precludes (S,P)dominated candidates from winning the election. More generally, any measure over (S,P) space will allow us to assign a score to each candidate consisting of the measure contained within their productionpossibilities frontier, and this is also (S,P)efficient.
Another method that is independent of both the measure and the scores is given by first eliminating all (S,P)dominated candidates, and then using a determined method of generating a "random" productionpossibilities frontier in (S,P)space and calculating the probability that a candidate's frontier will be engulfed by a randomly produced curve. The candidate with the lowest probability of being engulfed seems like a very reasonable choice, although the algorithm to produce the random frontier will vary what exactly this means in practice.
Here is one method to produce a "random" productionpossibilities frontier: if there are N scores, generate N1 real numbers in [0,1] chosen uniformly at random. Let these numbers be X(1), X(2), ..., X(N1). Then the random frontier is given by X(1) at score 1, X(1)*X(2) at score 2, X(1)*X(2)*X(3) at score 3, etc. Then the probability that a frontier {(S1, P1), (S2,P2),...,(Sn,Pn)} is engulfed by a random frontier is
(1P1)(1P2/P1)(1P3/P2)...(1Pn/P(n1))
However, this measure is problematicif P1=1 for example, then the measure is automatically 0 independent of the other values of P(n), which definitely does not seem logically admissible. More generally it is zero if we ever have P(n)=P(n1). Although events like that are rare, if we are looking for a system that is in line with our intuitive social understanding of what should make one frontier superior to another, we should look for an alternative that does not produce these aberrations.
An alternative measure could be the expected value of the fraction of vertices on the frontier that are not engulfed by a random frontier. There is an easily computed (in a computer) form for this fraction: if F(N,P)is given by
F(N,P):=1/(N1)! * INTEGRAL{from t=0 to P}[(log(t))^{N1}]dt=1gamma(N,log(P))/(N1)!
where gamma(s,x) is the lower incomplete Gamma function, then for a score system with N scores, the expected value of the fraction of vertices that will not be engulfed by a random frontier is
1/N * SUM{from k=1 to N} F(K,P(K))
This seems reasonable to me, but I am open to any concerns or questions. Even without first excluding (S,P)dominated candidates, the method of maximizing this fraction is easily shown to be (S,P)efficient, and it does not seem to produce any aberrations. Of course, this is based on a particular model of construction for random frontiers using uniformlydistributed i.i.d. random variables, so it is not canonical. Unfortunately, while the arbitrary nature of the scores has been eliminated, we have not been able to eliminate the arbitrary nature of the production of frontiers.
A more interesting variant of this would be a model that takes account of past values for Prob(X>= S) for each score S and where X is the score given to any fixed candidate C by a voter chosen uniformly at random. Let P(j)=Prob(X>=S(j)) over the population of past, recorded candidates in "relevant" elections using the same frameworki.e. with the same set of allowed scores and over the "same" population of voters. Then if we use the measure that is the expected value of the fraction of vertices that are not engulfed by a "random" frontier based on the past constructions of (S,P)frontiers, a record can be kept for the distribution of the P values for each score from past candidates, and that data can be used to estimate the expected value of the fraction of unengulfed vertices. This is possible because the expected value operator is linear even over variables that are not independent.
Let each score S(K) have a CDF of P values as F{K}(p). Then there is an indicator random variable 1{K} for each vertex (S(K),P(K)) that is 0 if the vertex is engulfed and 1 otherwise. The sum of all of these indicator random variables is the number of vertices that are not engulfed. The expected value of this is the sum of the expected values of these indicator variables. The expected value of the indicator 1{K} is simply the probability that the vertex (S(K),P(K)) is not engulfed, which is simply F{K}(P(K)). Hence the expected number of unengulfed vertices is
SUM{from K=1 to N}F{K}(P(K))
(And the expected fraction is this divided by N+1)

Here is Python code that runs the SPV algorithm and reads and updates priors into a text file named 'SPVdata.txt':
def is_type_list(data): #Checks to see if data is a list of items all of the same type if type(data) is list: typ = type(data[0]) for row in data: if (typ is type(row))==False: return False else: return False return True def isValidSPVData(data, N): #Checks to see if the data is a list of ballots, each of which is a list of integers from 0 to N. if is_type_list(data)==False: return False for item in data: if is_type_list(item)==False: return False for entry in item: if type(entry) != type(0): return False if entry <0 or entry > N: return False def mergeAscendingLists(listA,listB): #Merges two ascending lists into a single ascending list, and also returns the percentiles of the items in the first list with respect to their positions in the merged list. A=len(listA) B=len(listB) mergedList=[] firstListPercentiles=[] #Ensures that listA refers to the smaller of the two lists, and that listB refers to the larger. This is for convenienceit is easier to think about sorting a small list into a large list. swapped=False if B<A: temp=listB listB=listA listA=temp swapped=True temp=B B=A A=temp #We initialize variables to index the items in the small list and the large list, and to keep track of whether the last item inputted into the merged list was from the smaller list, listA. a=0 b=0 lastA=True stretch=0 FirstsInStretch=0 while a<=A1 and b<=B1: itemA=float(listA[a]) itemB=float(listB[b]) if itemA<itemB: mergedList.append(itemA) a+=1 T=a+b stretch+=1 if swapped==False: FirstsInStretch+=1 if a==A or float(listA[a])!=mergedList[Tstretch]: for k in range(FirstsInStretch): firstListPercentiles.append(T) stretch=0 FirstsInStretch=0 lastA=True elif itemA==itemB: mergedList.append(itemA) mergedList.append(itemB) a+=1 b+=1 stretch+=2 FirstsInStretch+=1 T=a+b if a==A: if b==B: for k in range(FirstsInStretch): firstListPercentiles.append(T) stretch=0 FirstsInStretch=0 elif float(listB[b])!=mergedList[Tstretch]: for k in range(FirstsInStretch): firstListPercentiles.append(T) stretch=0 FirstsInStretch=0 elif b==B: if float(listA[a])!=mergedList[Tstretch]: for k in range(FirstsInStretch): firstListPercentiles.append(T) stretch=0 FirstsInStretch=0 else: check=mergedList[Tstretch] if float(listA[a])!=check and float(listB[b])!=check: for k in range(FirstsInStretch): firstListPercentiles.append(T) stretch=0 FirstsInStretch=0 lastA=True else: lastA=False mergedList.append(itemB) b+=1 stretch+=1 T=a+b if swapped==True: FirstsInStretch+=1 if b==B or float(listB[b])!=mergedList[Tstretch]: for k in range(FirstsInStretch): firstListPercentiles.append(T) stretch=0 FirstsInStretch=0 if lastA==True: while b<=B1: mergedList.append(float(listB[b])) b+=1 stretch+=1 T=a+b if swapped==True: FirstsInStretch+=1 if b==B or float(listB[b])!=mergedList[Tstretch]: for k in range(FirstsInStretch): firstListPercentiles.append(T) stretch=0 FirstsInStretch=0 else: while a<=A1: mergedList.append(float(listA[a])) a+=1 stretch+=1 T=a+b if swapped==False: FirstsInStretch+=1 if a==A or float(listA[a])!=mergedList[Tstretch]: for k in range(FirstsInStretch): firstListPercentiles.append(T) for k in range(len(firstListPercentiles)): firstListPercentiles[k]=firstListPercentiles[k]/T return [mergedList, firstListPercentiles] def SPVote(ballots, N, prior=None, record=False, candidates=None): #Catches some potential errors regarding the function inputs if len(ballots)==0: print('There are no ballots. The election result is arbitrary. ') return None if isValidSPVData(ballots, N)==False: print('Error: The ballot entry is invalid SPVote data. ') return None numCands=len(ballots[0]) for ballot in ballots: if len(ballot) != numCands: print('Error: At least one ballot has an incorrect number of scores. ') return None #Defines Cands (the candidates) depending on whether a candidate list was given as an input. Cands = {} if type(candidates)!=type(None): if numCands != len(candidates): print('Error: The number of scores on each ballot does not match the number of candidates. There are ' + str(len(candidates)) + 'candidates, but there are ' + str(numCands) + ' ballot slots. ') return None i=0 for C in candidates: Cands[C]=i i+=1 else: for C in range(numCands): Cands[C]=C numVotes=len(ballots) Scores=range(N+1) #Creates Argument Set Boundaries for each candidate ArgSetBound = {} for C in Cands: ArgSetBound[C]={} for S in Scores: ArgSetBound[C][S]=0 #For each candidate C and score S, counts ballots which give C a score of at least S for ballot in ballots: for C in Cands: candScore = ballot[Cands[C]] for i in range(0,candScore+1): ArgSetBound[C][i]+=1 #Normalizes the counts to probabilities/fractions for C in Cands: for S in Scores: ArgSetBound[C][S]=ArgSetBound[C][S]/numVotes Measure = {} for C in Cands: Measure[C]=0 #If no prior data is given, then a simple variant is used if type(prior)==type(None): increment=1/numCands/(N+1) for C in Cands: for S in Scores: candPs=ArgSetBound[C][S] for K in Cands: if ArgSetBound[K][S] <= candPs: Measure[C]+=increment print(ArgSetBound[C]) print(Measure) #Otherwise, we can use the prior data to assign a more informed SPV measure to candidates: else: with open(prior,'r+') as datafile: #Retrieves the score data for each score as a list of lists of past P values ScoreData = [] lines = [] for line in datafile: lines.append(line) stripped_line = line.strip() line_list = stripped_line.split() ScoreData.append(line_list) #If record mode is turned on, checks to see if the write file is empty. If it is, space is made for the new data. datafile.seek(0) if record==True and len(lines)==0: for S in Scores: datafile.write('\n') #Combines the new score data with the prior score data and records the percentiles of the new score data with respect to their positions in the merged list using the mergeAscendingLists function. datafile.seek(0) ScorePercentiles={} for S in Scores: ScorePercentiles[S]=[] newScoreData = [] for C in Cands: newScoreData.append(ArgSetBound[C][S]) newScoreData.sort() oldDataLength=len(ScoreData) if oldDataLength==0: mergedList=newScoreData L=len(mergedList) for k in range(L): ScorePercentiles[S].append((k+1)/L) elif S<oldDataLength: mergedList=mergeAscendingLists(newScoreData,ScoreData[S]) ScorePercentiles[S]=mergedList[1] mergedList=mergedList[0] #If record mode is turned on, the prior is updated. if record==True: ScoreDataString='' for item in mergedList: ScoreDataString+=str(item)+' ' datafile.write(ScoreDataString+'\n') #The measures of the candidates are then calculated according to their percentile scores: for C in Cands: for S in Scores: ThisScorePosition=0 candPs=ArgSetBound[C][S] for K in Cands: if ArgSetBound[K][S] <= candPs: ThisScorePosition+=1 Measure[C]+=ScorePercentiles[S][ThisScorePosition1]/(N+1) print(Measure) Candidates = ['Alice','Bob','Charlie','Dale','Elaine'] ScoreBins=5 DataFile = 'SPVdata.txt' #DataFile = None RecordMode = True Ballots = [ [0,0,0,4,5], [5,4,0,0,0], [5,0,0,0,0], [5,4,1,0,0], [0,1,4,5,1], [3,2,0,4,5], [0,0,0,0,5], [0,1,2,4,5], [0,0,0,5,5], [0,5,4,1,0], [0,5,4,1,0], [0,5,4,1,0] ] SPVote(Ballots, ScoreBins, DataFile, RecordMode, Candidates)

I hope you are not hoping for this to be adopted for elections. I might be useful in other situations but it is too complex for people to trust.
Anyway, you are correct that the issue of people scoring on a different scale is a real one. There are other systems like Cardinal Baldwin and distributed voting. The issue with these is that they tend to be nonmonotonic. STAR and STLR are the variants were the normalization/scaling is only applied at the last round to keep them monotonic.
I must admit I do not entirely follow your system. Can you give a clear example? Or a step by step procedure. I do read python but what you have given is very long considering score voting is just a sum.

@Keith I don't expect this to be adopted for national elections but it could be something used for small businesses for example. The procedure is actually a lot simpler than what the code or my explanatory powers might suggest. A picture might also help, I may include one soon. Here is a simple example:
Let there be just three ballots and four candidates A, B, C, D for simplicity, with ordinal scores from 0 to 5 and ballots indicated as follows:
[0,1,4,5]
[5,4,0,0]
[0,1,5,3]What we do first is for each of the four candidates, determine the probability that a score given to that candidate by a voter chosen uniformly at random is larger than each score value S. This creates a profile for the candidate. For example, the first candidate A has the following profile:
A:[1, 1/3, 1/3, 1/3, 1/3, 1/3]
because every score is at least 0, and 1/3 of scores are maximal, i.e. greater than or equal to each individual score. The other candidates have profiles
B:[1, 1, 1/3, 1/3, 1/3, 0]
C:[1, 2/3, 2/3, 2/3, 2/3, 1/3]
D:[1, 2/3, 2/3, 2/3, 1/3, 1/3]
You can automatically see that D will not win the election, since C has a strictly superior profile. In fact C also dominates A, so the election is between B and C.
Now we can calculate the final measure for each candidate. Starting with B, we look at each entry in the profile and compare it with the corresponding entries in the profiles of the other candidates. The first entry of every candidate is always a 1 (every score is at least the minimum score), so we can ignore it. For the second entry in the profile, based only on this election, B is the 100th percentile, since its second entry is greater than or equal to 100% of the corresponding entries. So this entry adds a 1.00 to the measure of B. The corresponding entry for C adds only 3/4, since the second entry of C's profile is greater than or equal to only 3/4 of the corresponding entries.
The final measure of a candidate is essentially the sum of the percentiles over each entry of a candidate's profile. For example,
Meas(B)=1+1/2+1/2+3/4+1/4=3
Meas(C)=3/4+1+1+1+1=4.75
So the winner would be C. In the code above I normalized the measures so that they are always between 0 and 1the measure of a candidate is the expected value of the fraction of vertices in their profile that are not engulfed by a random candidate's profile.
This was an example where no prior distribution for the profile entries was given, so we used the ballots to supply the percentiles. It's possible to assign the percentiles according to a distribution based on past election data to give a more informed measure.
One thing that interests me is to compare this system with ordinary score and see where they are in agreement. In this case the same result is given by both methods.
A lot of the code is the mergeAscendingLists function which is just an auxiliary for the SPVote algorithm. Most of the code has to do with updating prior data and calculating percentiles from the prior data. I made it so that you can turn "record mode" on or off, so you can set the prior data to whatever you want, or have no prior data and use the simple variant exemplified here.

I made a video explaining this system: https://app.vmaker.com/record/SGSydGYcwOW9Vf6d

Suppose you allow six grades. Call them 0 (worst), 1, 2, 3, 4, and 5 (best). The candidates are Bush, Gore, and Nader. The Republicans bullet vote, because they do not like Nader any better than they like Gore. They think both of those are basically the same as Stalin. The DINOs bullet vote, because they are not used to the system, and cannot conceive of a better way to keep the hated Bush down than by throwing their full support behind Gore. They either have not heard that Nader is running, or wouldn't pay attention to such an announcement, on the grounds that fringe candidates cannot win.
Consider some different possible proportions of the electorate being DINOs and Republicans, and the remainder being Nader supporters. Let's say the Nader supporters honestly value Gore at 20% of the value of Nader, given a baseline of zero for Bush.
Are there some proportions of numerosities among these three factions such that in Score, the Nader supporters would see a better outcome if they voted Gore 4 than Gore 1? In your system, would they also see a better outcome if they voted Gore 4 than Gore 1, for some of the possible proportions? Are there some proportions for which the best outcome they could get in your system, using their optimal strategy for your system, is better than what they could get with Score using the same granularity in the range, and using their optimal strategy for that system?
I am working on a simulation framework to be based on a slider control that could be used to vary the proportions of the electorate belonging to factions. Coders could add any number of voting systems along with one or more alternative strategies for each system (the easiest to code, in each case, would probably be the naïve strategy, and I think it should be included with every system). Noncoder researchers could define factions, give the true sentiments for these factions, select strategies for them to use from among those provided by the coder contributors, and set their numerosities as a constant count of voters or a multiplier by the master slider control's leftward or rightward aspect. I'm thinking these aspects would evaluate to 1 when the slider is in the middle, and from 0 to 2 otherwise.

@cfrank From your video, I notice that the system would depend on past elections. I repeat that here for readers who don't hear out the video.

@JackWaugh Thank you for your thoughtful reply. Each candidate is given an “independent” ordinal score, so those kinds of results carry over from cardinal score. Altering the score given to Nader will only affect Nader’s score profile, it can’t cause Gore to beat out Bush or vice versa unless the voters change the scores they give to Gore and/or Bush. If a voter wants Nader to win, doesn’t mind Gore and wants Bush to lose, for example, then they should vote something like [B,N,G]=[0,5,3]. If they are really worried about Bush, then they should vote [0,5,5]. But this obviously increases the chance that Nader loses to Gore, so it’s up to the voter to manage their conflicting interests and indicate a ballot they think will serve them most effectively.
The system is very similar to score voting, the alterations I made were only to make the choice of cardinal scores less arbitrary.

@JackWaugh that can be true, but the system can also be made into a deterministic one by choosing a distribution or even by putting in place a deterministic method to select a distribution based on the present ballots.

@cfrank said in Ordinal Score "(S,P)"systems:
The system is very similar to score voting, the alterations I made were only to make the choice of cardinal scores less arbitrary.
Choice by the election designers, or choice by the voters?

@JackWaugh choice by the system designers. If anything I would actually prefer the scores to be more arbitrary for the voters, since this will prevent strategic behavior. That’s a part of why the system abstains from assigning the scores cardinal values. In the learning variant, the scores obtain their cardinal values via the manner in which they are used by the electorate, rather than the other way around.

@cfrank said in Ordinal Score "(S,P)"systems:
For an example of an (S,P)efficient election strategy, consider the voting system that chooses a candidate with a maximal area bounded in (S,P)space by their productionpossibilities frontier. This precludes (S,P)dominated candidates from winning the election.
Isn't this just score voting? Since for random variable X >= 0 with probability 1, EX = integral{0 to inf}[P(X>t)dt]
@cfrank said in Ordinal Score "(S,P)"systems:
Unfortunately, while the arbitrary nature of the scores has been eliminated, we have not been able to eliminate the arbitrary nature of the production of frontiers.
So the method you proposed could be considered part of a general class of methods of the following form:
Suppose X_1, X_2 ..., X_N are random variables between 0 and 1, where X_1 >= X_2 >= ... >= X_N with probability 1.
Then {(1,X_1), (2,X_2), ..., (N, X_N)} defines a random frontier.
For candidate C, let p(s) be the proportion of ballots such that candidate C is scored at least s.
Then the expected proportion of vertices on which p(s) >= X_s is (1/N)*sum{k=1 to N}[X_s <= p(s)]
(Note that if X_1 = ... = X_N ~ Unif(0,1), we get score voting.)Anyway, a criticism I have with the system that you have proposed in which the X_i are products of uniform distributions is that as i becomes larger, the distribution of the X_i become very leftskewed (edit: this should say rightskewed). For a 5 point scale, a candidate who receives a score of 5 on 1% of ballots, and 0 on all others will score about 34%.
Basically, it will be absolutely essential for a candidate to attain a (fairly small) critical mass of maxvalue scores, since that will gain them almost all of the points available on the highervalued scores. Among candidates who reach the critical mass, the winner will probably be whoever gets the most 1s (de facto turning into approval voting where anything greater than a 0 is full approval). 
@marylander Yes, with fixed uniform distributions SP voting does become score voting. But explicitly the system I am currently proposing (see the video explanation here, and don’t mind my brain fart at the beginning: https://app.vmaker.com/record/SGSydGYcwOW9Vf6d) uses distributions that are relevant models for the electoral data, not uniform. The toy model using products uniform distributions was a discarded idea.
I think from the SP framework one can make the argument that standard Score voting is very crude, since adopting the uniform distributions in an SP voting system seems fairly nonsensical compared with alternatives that reflect actual electoral behavior. One could use the order statistics of the uniform distribution, though, and that might not be terrible, but to me it seems like it would just be better to directly incorporate the actual data.
My guess is that if past electoral data were used to produce or to inform the distributions, they would become centered at lower fractions as the ordinal score is increased, and would actually skew to the right. I believe this eliminates the problem you suggested.
But I have no idea what the distributions would actually look like! They might not even be unimodal, and on the large scale they might not even stabilize. That’s partly why I also think it would be important for the distributions to be updated and for there to be a rigorous agreement about how and how often and when that should be done if the voting system were implemented for longrun use.

@cfrank said in Ordinal Score "(S,P)"systems:
My guess is that if past electoral data were used to produce or to inform the distributions, they would become centered at lower fractions as the ordinal score is increased, and would actually skew to the right. I believe this eliminates the problem you suggested.
I made a mistake, I should have said rightskewed.
In the past, various people have proposed using nonlinear levels for score voting. I have generally criticized this on the basis that it will seem arbitrary to voters. This method is not exactly one of those, but it is similar. Increasing a candidate's score from a 0 to a 1, a 1 to a 2, and so on, on a ballot, will all be worth different amounts (probably), but what amount it is worth depends on the distributions and the amount of support each candidate gets.
You have said that you are concerned that the values of score levels are arbitrary, so you might not be concerned that the amount that an additional point is worth under the proposed system is dependent on these factors; perhaps the approach you have proposed to assigning values to score levels gives them meaning. However, I am a bit skeptical of this.
First, the system might not pass Independence of Irrelevant Ballots. If several maxvalue ballots are added, then it will tend to reduce the value of highervalue scores relative to lowervalue ones on the rest of the ballots.
Second, I can think of cases where basing the distributions on past empirical data leads to points being devalued that clearly should not be devalued. Suppose that the first time a system like this is used, there are two polarizing candidates, and each candidate gets about half of the vote, with each ballot giving one candidate a 5 and the other a 0. The next election is similar, but this time, there is a compromise candidate who gets about a 3 from everybody. The compromise candidate will get a score of 60%, but if slightly less than half of voters gave this compromise candidate no points, then they will still score 60%, because no candidate in the first election got more than, say, 51% of scores above 0. While calibrating the distributions on the performances of two candidates is clearly insufficient and this example is probably unfair, I think it does highlight a general problem that a compromise candidate who gets an unprecedented amount of midrange support will stop receiving credit for additional midrange support beyond what has previously occurred.

@marylander Your criticisms are definitely fair and especially I hadn’t considered the midrange score devaluation problem, but I also agree that the calibration of the distributions is probably not an example that would reflect actual use. A winning candidate in this system probably needs to effectively pass thresholds of support of various kinds.
Also, any distribution or class of distributions could be selected as the prior, and more sophisticated methods could be implemented to update the distributions such as Bayesian statistics. It could be that a fixed distribution is chosen, I think the order statistics of uniform variables is not a terrible idea as the initial. But anyway it’s also all somewhat arbitrary, choosing a distribution is the same kind of problem as choosing cardinal score values. Furthermore with relevant updating the midrange support candidate will inform and reinforce the potential for compromise in future elections.
The irrelevant ballot problem can be fixed by simply ignoring or disallowing any ballots where the same score is given to all candidates, or more strictly any ballot that does not make use of both the minimal and maximal scores. I am not sure why rightskewed distributions would be problematic, in fact that is part of the functionality of the system.