# Ordinal Score "(S,P)"-systems

• In a score voting system, there is the question of which scores voters are allowed to submit, as well as how the various score ballots are aggregated to produce an election winner. My goal here is to produce a score-based system whose results are independent of the particular score set chosen, as long as the scores are ordinally ranked. My hypothesis is that even unlabeled scores (simply blank boxes on a line from left to right, with an indication of which direction is positive) would accrue meaning to voters who used a system such as the one described below.

In a score system, an (S,P)-consensual candidate is one who is scored at least an S by at least a P-fraction of the electorate. It is easy to show that a candidate is (S,P)-consensual if and only if it is (R,Q)-consensual for all R less than or equal to S and all Q less than or equal to P. The statement that a candidate is (S,P)-consensual is essentially an argument supporting their fitness as the election winner---the strength of the argument increases as S and P increase independently.

Let X be the random variable that is the score assigned to candidate C by a voter chosen uniformly at random. For each candidate, there is an (S,P) production-possibilities frontier: the boundary of the obtainable (S,P) values for which the candidate is (S,P)-consensual. It is simply the plot of Prob(X >= S) against the scores S.

A candidate C is (S,P)-dominated if there is another candidate K whose production-possibilities frontier engulfs that of C. A voting system is (S,P)-efficient if (S,P)-dominated candidates are precluded from winning an election. It seems almost self-evident that any reasonable score-based system needs to be (S,P)-efficient.
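To make the two definitions concrete, here is a minimal Python sketch; the helper names `frontier` and `dominates` are mine, and "engulfs" is read here as "at least as high at every score, and strictly higher at some score":

```python
def frontier(ballots, c, N):
    # The (S,P) frontier of candidate index c: for each score S in 0..N, the
    # fraction of ballots that give c a score of at least S.
    num_ballots = len(ballots)
    return [sum(1 for b in ballots if b[c] >= S) / num_ballots
            for S in range(N + 1)]

def dominates(fK, fC):
    # True if frontier fK engulfs frontier fC: fK is at least as high at
    # every score, and strictly higher at some score.
    return (all(k >= c for k, c in zip(fK, fC))
            and any(k > c for k, c in zip(fK, fC)))
```

With the three-ballot example given further below, the candidate scored [4, 0, 5] dominates the one scored [5, 0, 3].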

For an example of an (S,P)-efficient election strategy, consider the voting system that chooses a candidate with a maximal area bounded in (S,P)-space by their production-possibilities frontier. This precludes (S,P)-dominated candidates from winning the election. More generally, any measure over (S,P) space will allow us to assign a score to each candidate consisting of the measure contained within their production-possibilities frontier, and this is also (S,P)-efficient.
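It is worth noting how the simplest such measure behaves. Under the uniform measure with unit spacing between consecutive scores, the area bounded by the step frontier is just the sum of its P values; since those values are Prob(X >= S) for S = 0..N, the sum equals 1 + E[X], so this particular choice reduces to ranking candidates by mean score, i.e. ordinary score voting. A sketch (the function name is mine):

```python
def area_measure(frontier_points):
    # Area under the step frontier of values Prob(X >= S), S = 0..N, with
    # unit spacing between scores: simply the sum of the P values.
    # For integer scores this equals 1 + (mean score).
    return sum(frontier_points)
```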

Another method that is independent of both the measure and the scores is given by first eliminating all (S,P)-dominated candidates, then using a fixed method of generating a "random" production-possibilities frontier in (S,P)-space and calculating the probability that a candidate's frontier will be engulfed by the randomly produced curve. The candidate with the lowest probability of being engulfed seems like a very reasonable choice, although what exactly this means in practice depends on the algorithm used to produce the random frontier.

Here is one method to produce a "random" production-possibilities frontier: if there are N scores, generate N-1 real numbers in [0,1] chosen uniformly at random. Let these numbers be X(1), X(2), ..., X(N-1). Then the random frontier is given by X(1) at score 1, X(1)*X(2) at score 2, X(1)*X(2)*X(3) at score 3, etc. Then the probability that a frontier {(S1, P1), (S2,P2),...,(Sn,Pn)} is engulfed by a random frontier is

(1 - P1)(1 - P2/P1)(1 - P3/P2) ... (1 - Pn/P(n-1))

However, this measure is problematic: if P1 = 1, for example, then the measure is automatically 0 regardless of the other values of P(n), which does not seem reasonable. More generally, it is zero whenever P(n) = P(n-1). Although such events are rare, if we want a system in line with our intuitive social understanding of what should make one frontier superior to another, we should look for an alternative that does not produce these aberrations.
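For experimenting with this construction, here is a small Monte Carlo sketch of the random-frontier model; the function names are mine, and `engulf_probability` estimates the engulfment probability empirically rather than through the product formula above (engulfment is read weakly, as "at least as high at every vertex"):

```python
import random

def random_frontier(num_scores):
    # One draw of the random frontier: cumulative products of iid
    # Uniform(0,1) variables, one per score above the minimum.
    frontier = []
    p = 1.0
    for _ in range(num_scores):
        p *= random.random()
        frontier.append(p)
    return frontier

def engulf_probability(frontier, trials=100_000):
    # Monte Carlo estimate of the probability that a random frontier is at
    # least as high as the given frontier at every vertex.
    hits = 0
    for _ in range(trials):
        r = random_frontier(len(frontier))
        if all(rv >= fv for rv, fv in zip(r, frontier)):
            hits += 1
    return hits / trials
```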

An alternative measure could be the expected value of the fraction of vertices on the frontier that are not engulfed by a random frontier. This fraction has a form that is easy to compute numerically: if F(N,P) is given by

F(N,P) := 1/(N-1)! * INTEGRAL{from t=0 to P} (-log(t))^(N-1) dt = 1 - gamma(N, -log(P))/(N-1)!

where gamma(s,x) is the lower incomplete Gamma function, then for a score system with N scores, the expected value of the fraction of vertices that will not be engulfed by a random frontier is

1/N * SUM{from K=1 to N} F(K, P(K))

This seems reasonable to me, but I am open to any concerns or questions. Even without first excluding (S,P)-dominated candidates, the method of maximizing this fraction is easily shown to be (S,P)-efficient, and it does not seem to produce any aberrations. Of course, this is based on a particular model of construction for random frontiers using uniformly-distributed i.i.d. random variables, so it is not canonical. Unfortunately, while the arbitrary nature of the scores has been eliminated, we have not been able to eliminate the arbitrary nature of the production of frontiers.
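For integer N, F(N,P) has an elementary closed form that avoids incomplete-gamma routines: with x = -log(P), the identity gamma(N,x)/(N-1)! = 1 - e^(-x) * SUM{k=0 to N-1} x^k/k! gives F(N,P) = P * SUM{k=0 to N-1} (-log P)^k / k!. A standard-library sketch (the function names are mine):

```python
import math

def F(N, P):
    # F(N,P) = 1 - gamma(N, -log P)/(N-1)!, i.e. the probability that the
    # product of N iid Uniform(0,1) variables falls below P.
    if P <= 0:
        return 0.0
    if P >= 1:
        return 1.0
    x = -math.log(P)
    return P * sum(x**k / math.factorial(k) for k in range(N))

def expected_unengulfed_fraction(frontier):
    # frontier: [P(1), ..., P(N)], the P values at scores 1..N (the score-0
    # vertex, where P is always 1, is omitted).
    N = len(frontier)
    return sum(F(K, frontier[K - 1]) for K in range(1, N + 1)) / N
```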

A more interesting variant would be a model that takes account of past values of Prob(X >= S) for each score S, where X is the score given to a fixed candidate C by a voter chosen uniformly at random. Let P(j) = Prob(X >= S(j)) over the population of past, recorded candidates in "relevant" elections using the same framework---i.e. with the same set of allowed scores and over the "same" population of voters. If we use the measure that is the expected value of the fraction of vertices not engulfed by a "random" frontier drawn from these past (S,P)-frontiers, then a record can be kept of the distribution of P values for each score, and that data can be used to estimate the expected fraction of unengulfed vertices. This works because the expected value operator is linear even over variables that are not independent.

Let F{K}(p) be the CDF of the P values for score S(K). Then there is an indicator random variable 1{K} for each vertex (S(K),P(K)) that is 0 if the vertex is engulfed and 1 otherwise. The sum of these indicator random variables is the number of vertices that are not engulfed, and its expected value is the sum of the expected values of the indicators. The expected value of 1{K} is simply the probability that the vertex (S(K),P(K)) is not engulfed, which is F{K}(P(K)). Hence the expected number of unengulfed vertices is

SUM{from K=1 to N}F{K}(P(K))

(And the expected fraction is this divided by N, the number of vertices.)
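Under this prior-based variant, the F{K} can simply be empirical CDFs built from the recorded P values. A minimal sketch (the names are mine):

```python
def empirical_cdf(past_values):
    # Empirical CDF of the past P values recorded for one score: the
    # fraction of recorded values at or below p.
    data = sorted(past_values)
    n = len(data)
    return lambda p: sum(1 for v in data if v <= p) / n

def expected_unengulfed(frontier, past_by_score):
    # frontier: [P(1), ..., P(N)]; past_by_score: one list of recorded past
    # P values per score. Returns the expected number of unengulfed vertices.
    cdfs = [empirical_cdf(past) for past in past_by_score]
    return sum(cdf(P) for cdf, P in zip(cdfs, frontier))
```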

• Here is Python code that runs the SPV algorithm and reads and updates priors into a text file named 'SPVdata.txt':

``````python
import bisect

def is_type_list(data):
    # Checks that data is a list whose items all have the same type
    if type(data) is not list:
        return False
    if len(data) == 0:
        return True
    typ = type(data[0])
    for row in data:
        if type(row) is not typ:
            return False
    return True

def isValidSPVData(data, N):
    # Checks that data is a list of ballots, each of which is a list of
    # integers from 0 to N.
    if not is_type_list(data):
        return False
    for item in data:
        if not is_type_list(item):
            return False
        for entry in item:
            if type(entry) is not int:
                return False
            if entry < 0 or entry > N:
                return False
    return True

def mergeAscendingLists(listA, listB):
    # Merges two ascending lists into a single ascending list, and also
    # returns the percentiles of the items in listA with respect to their
    # positions in the merged list. A run of equal values counts its highest
    # position, so the percentile of x is (number of merged items <= x)/total.
    merged = sorted([float(x) for x in listA] + [float(x) for x in listB])
    total = len(merged)
    percentiles = [bisect.bisect_right(merged, float(x)) / total for x in listA]
    return [merged, percentiles]

def SPVote(ballots, N, prior=None, record=False, candidates=None):

    # Catches some potential errors regarding the function inputs
    if len(ballots) == 0:
        print('There are no ballots. The election result is arbitrary.')
        return None

    if isValidSPVData(ballots, N) == False:
        print('Error: The ballot entry is invalid SPVote data.')
        return None

    numCands = len(ballots[0])
    for ballot in ballots:
        if len(ballot) != numCands:
            print('Error: At least one ballot has an incorrect number of scores.')
            return None

    # Defines Cands (the candidates) depending on whether a candidate list was
    # given as an input.
    Cands = {}
    if candidates is not None:
        if numCands != len(candidates):
            print('Error: There are ' + str(len(candidates)) + ' candidates, '
                  'but there are ' + str(numCands) + ' ballot slots.')
            return None
        for i, C in enumerate(candidates):
            Cands[C] = i
    else:
        for C in range(numCands):
            Cands[C] = C

    Scores = range(N + 1)

    # Creates the (S,P)-frontier ("argument set boundary") for each candidate
    ArgSetBound = {}
    for C in Cands:
        ArgSetBound[C] = {}
        for S in Scores:
            ArgSetBound[C][S] = 0

    # For each candidate C and score S, counts ballots which give C a score of
    # at least S
    for ballot in ballots:
        for C in Cands:
            candScore = ballot[Cands[C]]
            for i in range(0, candScore + 1):
                ArgSetBound[C][i] += 1

    # Normalizes the counts to probabilities/fractions
    numBallots = len(ballots)
    for C in Cands:
        for S in Scores:
            ArgSetBound[C][S] = ArgSetBound[C][S] / numBallots

    Measure = {}
    for C in Cands:
        Measure[C] = 0

    # If no prior data is given, then a simple variant is used: the measure of
    # a candidate is its average percentile, over all scores, among the
    # candidates in this election.
    if prior is None:
        increment = 1 / numCands / (N + 1)
        for C in Cands:
            for S in Scores:
                candPs = ArgSetBound[C][S]
                for K in Cands:
                    if ArgSetBound[K][S] <= candPs:
                        Measure[C] += increment
            print(ArgSetBound[C])
        print(Measure)
    # Otherwise, we can use the prior data to assign a more informed SPV
    # measure to candidates:
    else:
        open(prior, 'a').close()  # makes sure the prior file exists
        with open(prior, 'r+') as datafile:
            # Retrieves the prior data: one line per score, each line holding
            # the past P values for that score in ascending order
            ScoreData = [line.strip().split() for line in datafile]

            # Combines the new score data with the prior score data, and
            # records the percentiles of the new data with respect to their
            # positions in the merged lists using mergeAscendingLists.
            ScorePercentiles = {}
            newLines = []
            for S in Scores:
                newScoreData = sorted(ArgSetBound[C][S] for C in Cands)
                oldData = ScoreData[S] if S < len(ScoreData) else []
                if len(oldData) == 0:
                    mergedList = newScoreData
                    L = len(mergedList)
                    ScorePercentiles[S] = [(k + 1) / L for k in range(L)]
                else:
                    mergedList, ScorePercentiles[S] = mergeAscendingLists(
                        newScoreData, oldData)
                newLines.append(' '.join(str(item) for item in mergedList))

            # If record mode is turned on, the prior is updated.
            if record == True:
                datafile.seek(0)
                datafile.write('\n'.join(newLines) + '\n')
                datafile.truncate()

        # The measures of the candidates are then calculated according to
        # their percentile scores:
        for C in Cands:
            for S in Scores:
                candPs = ArgSetBound[C][S]
                ThisScorePosition = 0
                for K in Cands:
                    if ArgSetBound[K][S] <= candPs:
                        ThisScorePosition += 1
                Measure[C] += ScorePercentiles[S][ThisScorePosition - 1] / (N + 1)

        print(Measure)

    return Measure

Candidates = ['Alice', 'Bob', 'Charlie', 'Dale', 'Elaine']
ScoreBins = 5
DataFile = 'SPVdata.txt'
#DataFile = None
RecordMode = True
Ballots = [
    [0,0,0,4,5],
    [5,4,0,0,0],
    [5,0,0,0,0],
    [5,4,1,0,0],
    [0,1,4,5,1],
    [3,2,0,4,5],
    [0,0,0,0,5],
    [0,1,2,4,5],
    [0,0,0,5,5],
    [0,5,4,1,0],
    [0,5,4,1,0],
    [0,5,4,1,0]
]
SPVote(Ballots, ScoreBins, DataFile, RecordMode, Candidates)
``````

• I hope you are not hoping for this to be adopted for elections. It might be useful in other situations, but it is too complex for people to trust.

Anyway, you are correct that the issue of people scoring on different scales is a real one. There are other systems, like Cardinal Baldwin and distributed voting. The issue with these is that they tend to be non-monotonic. STAR and STLR are the variants where the normalization/scaling is only applied at the last round to keep them monotonic.

I must admit I do not entirely follow your system. Can you give a clear example, or a step-by-step procedure? I do read Python, but what you have given is very long considering that score voting is just a sum.

• @Keith I don't expect this to be adopted for national elections, but it could be used by small businesses, for example. The procedure is actually a lot simpler than the code or my explanatory powers might suggest. A picture might also help; I may include one soon. Here is a simple example:

Let there be just three ballots and four candidates A, B, C, D for simplicity, with ordinal scores from 0 to 5 and ballots indicated as follows:

[0,1,4,5]
[5,4,0,0]
[0,1,5,3]

What we do first is, for each of the four candidates, determine the probability that a score given to that candidate by a voter chosen uniformly at random is at least S, for each score value S. This creates a profile for the candidate. For example, the first candidate A has the following profile:

A:[1, 1/3, 1/3, 1/3, 1/3, 1/3]

because every score is at least 0, and 1/3 of the scores (the single 5) are greater than or equal to each of the score values 1 through 5. The other candidates have profiles

B:[1, 1, 1/3, 1/3, 1/3, 0]

C:[1, 2/3, 2/3, 2/3, 2/3, 1/3]

D:[1, 2/3, 2/3, 2/3, 1/3, 1/3]

You can automatically see that D will not win the election, since C has a strictly superior profile. In fact C also dominates A, so the election is between B and C.

Now we can calculate the final measure for each candidate. Starting with B, we look at each entry in the profile and compare it with the corresponding entries in the profiles of the other candidates. The first entry of every candidate's profile is always a 1 (every score is at least the minimum score), so we can ignore it. For the second entry, based only on this election, B is at the 100th percentile, since its second entry is greater than or equal to 100% of the corresponding entries. So this entry adds 1.00 to the measure of B. The corresponding entry for C adds only 3/4, since the second entry of C's profile is greater than or equal to only 3/4 of the corresponding entries.

The final measure of a candidate is essentially the sum of the percentiles over each entry of a candidate's profile. For example,

Meas(B)=1+1/2+1/2+3/4+1/4=3

Meas(C)=3/4+1+1+1+1=4.75

So the winner would be C. In the code above I normalized the measures so that they are always between 0 and 1---the measure of a candidate is the expected value of the fraction of vertices in their profile that are not engulfed by a random candidate's profile.
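As a sanity check, the worked example can be reproduced in a few lines of Python (candidate indices 0 through 3 stand for A through D; the variable names are mine):

```python
from fractions import Fraction

ballots = [[0, 1, 4, 5], [5, 4, 0, 0], [0, 1, 5, 3]]
N = 5
num_cands = len(ballots[0])
num_ballots = len(ballots)

# Profile: for each candidate, the fraction of ballots scoring it at least S.
profiles = [[Fraction(sum(1 for b in ballots if b[c] >= S), num_ballots)
             for S in range(N + 1)]
            for c in range(num_cands)]

def measure(c):
    # Sum, over scores 1..N, of the candidate's percentile at that score.
    total = Fraction(0)
    for S in range(1, N + 1):
        at_or_below = sum(1 for k in range(num_cands)
                          if profiles[k][S] <= profiles[c][S])
        total += Fraction(at_or_below, num_cands)
    return total

print([measure(c) for c in range(num_cands)])  # C's measure, 19/4, is largest
```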

This was an example where no prior distribution for the profile entries was given, so we used the ballots to supply the percentiles. It's possible to assign the percentiles according to a distribution based on past election data to give a more informed measure.

One thing that interests me is to compare this system with ordinary score and see where they are in agreement. In this case the same result is given by both methods.

The mergeAscendingLists function is just an auxiliary for the SPVote algorithm, and most of the remaining code has to do with updating prior data and calculating percentiles from it. I made it so that you can turn "record mode" on or off, so you can set the prior data to whatever you want, or have no prior data and use the simple variant exemplified here.

• I made a video explaining this system: https://app.vmaker.com/record/SGSydGYcwOW9Vf6d