Ordinal Score "(S,P)"-systems



  • In a score voting system, there is the issue of what scores to allow voters to submit, as well as of how to bring together the various score ballots and produce an election winner. As such, my goal here is to produce a score-based system with results that are independent of what particular score set is chosen, as long as the scores are ordinally ranked. My hypothesis is that, even with unlabeled scores, simply blank boxes on a line from left to right with an indication of positivity, would accrue meaning to voters who used such a system as will be described.

    In a score system, an (S,P)-consensual candidate is one who is scored at least an S by at least a P-fraction of the electorate. It is easy to show that a candidate is (S,P)-consensual if and only if it is (R,Q)-consensual for all R less than or equal to S and all Q less than or equal to P. The statement that a candidate is (S,P)-consensual is essentially an argument supporting their fitness as the election winner---the strength of the argument increases as S and P increase independently.

    Let X be the random variable that is the score assigned to candidate C by a voter chosen uniformly at random. For each candidate, there is an (S,P) production possibilities frontier of points corresponding to the boundary of obtainable (S,P) values such that the candidate is (S,P)-consensual---it is simply the plot of Prob(X>= S) against scores S.

    A candidate C is (S,P)-dominated if there is another candidate K whose production possibilities engulfs that of C. A voting system is (S,P)-efficient if (S,P)-dominated candidates are precluded from winning an election. I think it is almost self-evident that any reasonable score-based system definitely needs to be (S,P)-efficient.

    For an example of an (S,P)-efficient election strategy, consider the voting system that chooses a candidate with a maximal area bounded in (S,P)-space by their production-possibilities frontier. This precludes (S,P)-dominated candidates from winning the election. More generally, any measure over (S,P) space will allow us to assign a score to each candidate consisting of the measure contained within their production-possibilities frontier, and this is also (S,P)-efficient.

    Another method that is independent of both the measure and the scores is given by first eliminating all (S,P)-dominated candidates, and then using a determined method of generating a "random" production-possibilities frontier in (S,P)-space and calculating the probability that a candidate's frontier will be engulfed by a randomly produced curve. The candidate with the lowest probability of being engulfed seems like a very reasonable choice, although the algorithm to produce the random frontier will vary what exactly this means in practice.

    Here is one method to produce a "random" production-possibilities frontier: if there are N scores, generate N-1 real numbers in [0,1] chosen uniformly at random. Let these numbers be X(1), X(2), ..., X(N-1). Then the random frontier is given by X(1) at score 1, X(1)*X(2) at score 2, X(1)*X(2)*X(3) at score 3, etc. Then the probability that a frontier {(S1, P1), (S2,P2),...,(Sn,Pn)} is engulfed by a random frontier is

    (1-P1)(1-P2/P1)(1-P3/P2)...(1-Pn/P(n-1))

    However, this measure is problematic---if P1=1 for example, then the measure is automatically 0 independent of the other values of P(n), which definitely does not seem logically admissible. More generally it is zero if we ever have P(n)=P(n-1). Although events like that are rare, if we are looking for a system that is in line with our intuitive social understanding of what should make one frontier superior to another, we should look for an alternative that does not produce these aberrations.

    An alternative measure could be the expected value of the fraction of vertices on the frontier that are not engulfed by a random frontier. There is an easily computed (in a computer) form for this fraction: if F(N,P)is given by

    F(N,P):=1/(N-1)! * INTEGRAL{from t=0 to P}[(-log(t))^{N-1}]dt=1-gamma(N,-log(P))/(N-1)!

    where gamma(s,x) is the lower incomplete Gamma function, then for a score system with N scores, the expected value of the fraction of vertices that will not be engulfed by a random frontier is

    1/N * SUM{from k=1 to N} F(K,P(K))

    This seems reasonable to me, but I am open to any concerns or questions. Even without first excluding (S,P)-dominated candidates, the method of maximizing this fraction is easily shown to be (S,P)-efficient, and it does not seem to produce any aberrations. Of course, this is based on a particular model of construction for random frontiers using uniformly-distributed i.i.d. random variables, so it is not canonical. Unfortunately, while the arbitrary nature of the scores has been eliminated, we have not been able to eliminate the arbitrary nature of the production of frontiers.

    A more interesting variant of this would be a model that takes account of past values for Prob(X>= S) for each score S and where X is the score given to any fixed candidate C by a voter chosen uniformly at random. Let P(j)=Prob(X>=S(j)) over the population of past, recorded candidates in "relevant" elections using the same framework---i.e. with the same set of allowed scores and over the "same" population of voters. Then if we use the measure that is the expected value of the fraction of vertices that are not engulfed by a "random" frontier based on the past constructions of (S,P)-frontiers, a record can be kept for the distribution of the P values for each score from past candidates, and that data can be used to estimate the expected value of the fraction of unengulfed vertices. This is possible because the expected value operator is linear even over variables that are not independent.

    Let each score S(K) have a CDF of P values as F{K}(p). Then there is an indicator random variable 1{K} for each vertex (S(K),P(K)) that is 0 if the vertex is engulfed and 1 otherwise. The sum of all of these indicator random variables is the number of vertices that are not engulfed. The expected value of this is the sum of the expected values of these indicator variables. The expected value of the indicator 1{K} is simply the probability that the vertex (S(K),P(K)) is not engulfed, which is simply F{K}(P(K)). Hence the expected number of unengulfed vertices is

    SUM{from K=1 to N}F{K}(P(K))

    (And the expected fraction is this divided by N+1)



  • Here is Python code that runs the SPV algorithm and reads and updates priors into a text file named 'SPVdata.txt':

    def is_type_list(data):
        #Checks to see if data is a list of items all of the same type
        if type(data) is list:
            typ = type(data[0])
            for row in data:
                if (typ is type(row))==False:
                    return False
        else:
            return False
        return True
    
    def isValidSPVData(data, N):
        #Checks to see if the data is a list of ballots, each of which is a list of integers from 0 to N.
        if is_type_list(data)==False:
            return False
        for item in data:
            if is_type_list(item)==False:
                return False
            for entry in item:
                if type(entry) != type(0):
                    return False
                if entry <0 or entry > N:
                    return False
    
    def mergeAscendingLists(listA,listB):
        #Merges two ascending lists into a single ascending list, and also returns the percentiles of the items in the first list with respect to their positions in the merged list.
        A=len(listA)
        B=len(listB)
        mergedList=[]
        firstListPercentiles=[]
    
        #Ensures that listA refers to the smaller of the two lists, and that listB refers to the larger. This is for convenience---it is easier to think about sorting a small list into a large list.
        swapped=False
        if B<A:
            temp=listB
            listB=listA
            listA=temp
            swapped=True
    
            temp=B
            B=A
            A=temp
    
        #We initialize variables to index the items in the small list and the large list, and to keep track of whether the last item inputted into the merged list was from the smaller list, listA.
        a=0
        b=0
        lastA=True
        stretch=0
        FirstsInStretch=0
        while a<=A-1 and b<=B-1:
            itemA=float(listA[a])
            itemB=float(listB[b])
            if itemA<itemB:
                mergedList.append(itemA)
    
                a+=1
                T=a+b
                stretch+=1
                if swapped==False:
                    FirstsInStretch+=1
                if a==A or float(listA[a])!=mergedList[T-stretch]:
                    for k in range(FirstsInStretch):
                        firstListPercentiles.append(T)
                    stretch=0
                    FirstsInStretch=0
    
                lastA=True
            elif itemA==itemB:
                mergedList.append(itemA)
                mergedList.append(itemB)
                a+=1
                b+=1
                stretch+=2
                FirstsInStretch+=1
                T=a+b
                if a==A:
                    if b==B:
                        for k in range(FirstsInStretch):
                            firstListPercentiles.append(T)
                        stretch=0
                        FirstsInStretch=0
                    elif float(listB[b])!=mergedList[T-stretch]:
                        for k in range(FirstsInStretch):
                            firstListPercentiles.append(T)
                        stretch=0
                        FirstsInStretch=0
                elif b==B:
                    if float(listA[a])!=mergedList[T-stretch]:
                        for k in range(FirstsInStretch):
                            firstListPercentiles.append(T)
                        stretch=0
                        FirstsInStretch=0
                else:
                    check=mergedList[T-stretch]
                    if float(listA[a])!=check and float(listB[b])!=check:
                        for k in range(FirstsInStretch):
                            firstListPercentiles.append(T)
                        stretch=0
                        FirstsInStretch=0
                lastA=True
            else:
                lastA=False
    
                mergedList.append(itemB)
    
                b+=1
                stretch+=1
                T=a+b
                if swapped==True:
                    FirstsInStretch+=1
                if b==B or float(listB[b])!=mergedList[T-stretch]:
                    for k in range(FirstsInStretch):
                        firstListPercentiles.append(T)
                    stretch=0
                    FirstsInStretch=0
    
        if lastA==True:
            while b<=B-1:
                mergedList.append(float(listB[b]))
    
                b+=1
                stretch+=1
                T=a+b
                if swapped==True:
                    FirstsInStretch+=1
                if b==B or float(listB[b])!=mergedList[T-stretch]:
                    for k in range(FirstsInStretch):
                        firstListPercentiles.append(T)
                    stretch=0
                    FirstsInStretch=0
        else:
            while a<=A-1:
                mergedList.append(float(listA[a]))
    
                a+=1
                stretch+=1
                T=a+b
                if swapped==False:
                    FirstsInStretch+=1
                if a==A or float(listA[a])!=mergedList[T-stretch]:
                    for k in range(FirstsInStretch):
                        firstListPercentiles.append(T)
    
        for k in range(len(firstListPercentiles)):
            firstListPercentiles[k]=firstListPercentiles[k]/T
    
        return [mergedList, firstListPercentiles]
    
    def SPVote(ballots, N, prior=None, record=False, candidates=None):
    
        #Catches some potential errors regarding the function inputs
        if len(ballots)==0:
            print('There are no ballots. The election result is arbitrary. ')
            return None
    
        if isValidSPVData(ballots, N)==False:
            print('Error: The ballot entry is invalid SPVote data. ')
            return None
    
        numCands=len(ballots[0])
        for ballot in ballots:
            if len(ballot) != numCands:
                print('Error: At least one ballot has an incorrect number of scores. ')
                return None
    
        #Defines Cands (the candidates) depending on whether a candidate list was given as an input.
        Cands = {}
    
        if type(candidates)!=type(None):
            if numCands != len(candidates):
                print('Error: The number of scores on each ballot does not match the number of candidates. There are ' + str(len(candidates)) + 'candidates, but there are ' + str(numCands) + ' ballot slots. ')
                return None
            i=0
            for C in candidates:
                Cands[C]=i
                i+=1
        else:
            for C in range(numCands):
                Cands[C]=C
    
        numVotes=len(ballots)
        Scores=range(N+1)
    
        #Creates Argument Set Boundaries for each candidate
        ArgSetBound = {}
        for C in Cands:
            ArgSetBound[C]={}
            for S in Scores:
                ArgSetBound[C][S]=0
    
        #For each candidate C and score S, counts ballots which give C a score of at least S
        for ballot in ballots:
            for C in Cands:
                candScore = ballot[Cands[C]]
                for i in range(0,candScore+1):
                    ArgSetBound[C][i]+=1
        #Normalizes the counts to probabilities/fractions
        for C in Cands:
            for S in Scores:
                ArgSetBound[C][S]=ArgSetBound[C][S]/numVotes
        
        Measure = {}
        for C in Cands:
            Measure[C]=0
    
        #If no prior data is given, then a simple variant is used
        if type(prior)==type(None):
            increment=1/numCands/(N+1)
            for C in Cands:
                for S in Scores:
                    candPs=ArgSetBound[C][S]
                    for K in Cands:
                        if ArgSetBound[K][S] <= candPs:
                            Measure[C]+=increment
                print(ArgSetBound[C])
            print(Measure)
        #Otherwise, we can use the prior data to assign a more informed SPV measure to candidates:
        else:
            with open(prior,'r+') as datafile:
                #Retrieves the score data for each score as a list of lists of past P values
                ScoreData = []
                lines = []
                for line in datafile:
                    lines.append(line)
                    stripped_line = line.strip()
                    line_list = stripped_line.split() 
                    ScoreData.append(line_list)
        
                #If record mode is turned on, checks to see if the write file is empty. If it is, space is made for the new data.
                datafile.seek(0)
                if record==True and len(lines)==0:
                    for S in Scores:
                        datafile.write('\n')
                
                #Combines the new score data with the prior score data and records the percentiles of the new score data with respect to their positions in the merged list using the mergeAscendingLists function.
                datafile.seek(0)  
                ScorePercentiles={}
                for S in Scores:
                    ScorePercentiles[S]=[]
                    newScoreData = []
                    for C in Cands:
                        newScoreData.append(ArgSetBound[C][S])
                    newScoreData.sort()
                    oldDataLength=len(ScoreData)
                    if oldDataLength==0:
                        mergedList=newScoreData
                        L=len(mergedList)
                        for k in range(L):
                            ScorePercentiles[S].append((k+1)/L)
                    elif S<oldDataLength:
                        mergedList=mergeAscendingLists(newScoreData,ScoreData[S])
                        ScorePercentiles[S]=mergedList[1]
                        mergedList=mergedList[0]
    
                    #If record mode is turned on, the prior is updated.
                    if record==True:
                        ScoreDataString=''
                        for item in mergedList:
                            ScoreDataString+=str(item)+' '
                        datafile.write(ScoreDataString+'\n')
                    
                #The measures of the candidates are then calculated according to their percentile scores:
                for C in Cands:
                    for S in Scores:
                        ThisScorePosition=0
                        candPs=ArgSetBound[C][S]
                        for K in Cands:
                            if ArgSetBound[K][S] <= candPs:
                                ThisScorePosition+=1
                        Measure[C]+=ScorePercentiles[S][ThisScorePosition-1]/(N+1)
    
                print(Measure)
                
    Candidates = ['Alice','Bob','Charlie','Dale','Elaine']
    ScoreBins=5  
    DataFile = 'SPVdata.txt'
    #DataFile = None
    RecordMode = True
    Ballots = [
        [0,0,0,4,5],
        [5,4,0,0,0],
        [5,0,0,0,0],
        [5,4,1,0,0],
        [0,1,4,5,1],
        [3,2,0,4,5],
        [0,0,0,0,5],
        [0,1,2,4,5],
        [0,0,0,5,5],
        [0,5,4,1,0],
        [0,5,4,1,0],
        [0,5,4,1,0]
    ]
    SPVote(Ballots, ScoreBins, DataFile, RecordMode, Candidates)
    


  • I hope you are not hoping for this to be adopted for elections. I might be useful in other situations but it is too complex for people to trust.

    Anyway, you are correct that the issue of people scoring on a different scale is a real one. There are other systems like Cardinal Baldwin and distributed voting. The issue with these is that they tend to be non-monotonic. STAR and STLR are the variants were the normalization/scaling is only applied at the last round to keep them monotonic.

    I must admit I do not entirely follow your system. Can you give a clear example? Or a step by step procedure. I do read python but what you have given is very long considering score voting is just a sum.



  • @Keith I don't expect this to be adopted for national elections but it could be something used for small businesses for example. The procedure is actually a lot simpler than what the code or my explanatory powers might suggest. A picture might also help, I may include one soon. Here is a simple example:

    Let there be just three ballots and four candidates A, B, C, D for simplicity, with ordinal scores from 0 to 5 and ballots indicated as follows:

    [0,1,4,5]
    [5,4,0,0]
    [0,1,5,3]

    What we do first is for each of the four candidates, determine the probability that a score given to that candidate by a voter chosen uniformly at random is larger than each score value S. This creates a profile for the candidate. For example, the first candidate A has the following profile:

    A:[1, 1/3, 1/3, 1/3, 1/3, 1/3]

    because every score is at least 0, and 1/3 of scores are maximal, i.e. greater than or equal to each individual score. The other candidates have profiles

    B:[1, 1, 1/3, 1/3, 1/3, 0]

    C:[1, 2/3, 2/3, 2/3, 2/3, 1/3]

    D:[1, 2/3, 2/3, 2/3, 1/3, 1/3]

    You can automatically see that D will not win the election, since C has a strictly superior profile. In fact C also dominates A, so the election is between B and C.

    Now we can calculate the final measure for each candidate. Starting with B, we look at each entry in the profile and compare it with the corresponding entries in the profiles of the other candidates. The first entry of every candidate is always a 1 (every score is at least the minimum score), so we can ignore it. For the second entry in the profile, based only on this election, B is the 100th percentile, since its second entry is greater than or equal to 100% of the corresponding entries. So this entry adds a 1.00 to the measure of B. The corresponding entry for C adds only 3/4, since the second entry of C's profile is greater than or equal to only 3/4 of the corresponding entries.

    The final measure of a candidate is essentially the sum of the percentiles over each entry of a candidate's profile. For example,

    Meas(B)=1+1/2+1/2+3/4+1/4=3

    Meas(C)=3/4+1+1+1+1=4.75

    So the winner would be C. In the code above I normalized the measures so that they are always between 0 and 1---the measure of a candidate is the expected value of the fraction of vertices in their profile that are not engulfed by a random candidate's profile.

    This was an example where no prior distribution for the profile entries was given, so we used the ballots to supply the percentiles. It's possible to assign the percentiles according to a distribution based on past election data to give a more informed measure.

    One thing that interests me is to compare this system with ordinary score and see where they are in agreement. In this case the same result is given by both methods.

    A lot of the code is the mergeAscendingLists function which is just an auxiliary for the SPVote algorithm. Most of the code has to do with updating prior data and calculating percentiles from the prior data. I made it so that you can turn "record mode" on or off, so you can set the prior data to whatever you want, or have no prior data and use the simple variant exemplified here.



  • I made a video explaining this system: https://app.vmaker.com/record/SGSydGYcwOW9Vf6d



  • Suppose you allow six grades. Call them 0 (worst), 1, 2, 3, 4, and 5 (best). The candidates are Bush, Gore, and Nader. The Republicans bullet vote, because they do not like Nader any better than they like Gore. They think both of those are basically the same as Stalin. The DINOs bullet vote, because they are not used to the system, and cannot conceive of a better way to keep the hated Bush down than by throwing their full support behind Gore. They either have not heard that Nader is running, or wouldn't pay attention to such an announcement, on the grounds that fringe candidates cannot win.

    Consider some different possible proportions of the electorate being DINOs and Republicans, and the remainder being Nader supporters. Let's say the Nader supporters honestly value Gore at 20% of the value of Nader, given a baseline of zero for Bush.

    Are there some proportions of numerosities among these three factions such that in Score, the Nader supporters would see a better outcome if they voted Gore 4 than Gore 1? In your system, would they also see a better outcome if they voted Gore 4 than Gore 1, for some of the possible proportions? Are there some proportions for which the best outcome they could get in your system, using their optimal strategy for your system, is better than what they could get with Score using the same granularity in the range, and using their optimal strategy for that system?

    I am working on a simulation framework to be based on a slider control that could be used to vary the proportions of the electorate belonging to factions. Coders could add any number of voting systems along with one or more alternative strategies for each system (the easiest to code, in each case, would probably be the naïve strategy, and I think it should be included with every system). Non-coder researchers could define factions, give the true sentiments for these factions, select strategies for them to use from among those provided by the coder contributors, and set their numerosities as a constant count of voters or a multiplier by the master slider control's leftward or rightward aspect. I'm thinking these aspects would evaluate to 1 when the slider is in the middle, and from 0 to 2 otherwise.



  • @cfrank From your video, I notice that the system would depend on past elections. I repeat that here for readers who don't hear out the video.



  • @Jack-Waugh Thank you for your thoughtful reply. Each candidate is given an “independent” ordinal score, so those kinds of results carry over from cardinal score. Altering the score given to Nader will only affect Nader’s score profile, it can’t cause Gore to beat out Bush or vice versa unless the voters change the scores they give to Gore and/or Bush. If a voter wants Nader to win, doesn’t mind Gore and wants Bush to lose, for example, then they should vote something like [B,N,G]=[0,5,3]. If they are really worried about Bush, then they should vote [0,5,5]. But this obviously increases the chance that Nader loses to Gore, so it’s up to the voter to manage their conflicting interests and indicate a ballot they think will serve them most effectively.

    The system is very similar to score voting, the alterations I made were only to make the choice of cardinal scores less arbitrary.



  • @Jack-Waugh that can be true, but the system can also be made into a deterministic one by choosing a distribution or even by putting in place a deterministic method to select a distribution based on the present ballots.



  • @cfrank said in Ordinal Score "(S,P)"-systems:

    The system is very similar to score voting, the alterations I made were only to make the choice of cardinal scores less arbitrary.

    Choice by the election designers, or choice by the voters?



  • @Jack-Waugh choice by the system designers. If anything I would actually prefer the scores to be more arbitrary for the voters, since this will prevent strategic behavior. That’s a part of why the system abstains from assigning the scores cardinal values. In the learning variant, the scores obtain their cardinal values via the manner in which they are used by the electorate, rather than the other way around.


Log in to reply