Computing Voter Transitions: The Elections for the Catalan Parliament, from 2010 to 2012

Purpose: To estimate the transition rates corresponding to the 2010 and 2012 elections to the Catalan Parliament for each of the four constituencies into which Catalonia is divided for this purpose. The main features of the results, which are obtained by means of mathematical programming, are discussed. Design/methodology/approach: Mathematical programming optimization models are formulated in order to find the transition rates that yield the best fit between the actual results in 2012 and those computed by applying the transition rates to the 2010 results. The transition rate matrices are estimated for each of the four constituencies, since the set of options is not the same for all of them. No assumptions other than those of numerical consistency are adopted. Findings: The transition rate models provide a satisfactory goodness of fit. Mathematical programming turns out to be an easy-to-use tool for estimating the transition rates and, at the same time, a very flexible one, since, if necessary, it allows the constraints corresponding to additional assumptions to be incorporated. Originality/value: The transition rates from 2010 to 2012 in Catalonia are particularly interesting, since the 2012 results implied a significant change in the composition of the Catalan Parliament. To the best of our knowledge, no other scientific journal paper has dealt with this


Introduction
When considering the results of two consecutive elections in the same electoral area, a common way of interpreting the results of the later poll is to see them as the consequence of voter transitions from the options preferred in the earlier one. Politicians, the media, political scientists and most citizens are interested in the changes in the preferences of the people who have the right to vote.
According to Hawkes (1969), the historian Trevor Lloyd was the first to pose the question.
Formally, given the results of two consecutive (or even simultaneous) elections for each one of the divisions (constituencies, municipalities, polling stations or any other partition) of an electoral area, the problem is to find the matrix of transition rates from the options available in the first election (rows) to the options in the second one (columns). Of course, if one considers only the aggregate results of the whole territory or of any set of constituencies, there are in general infinitely many solutions for the matrix. On the other hand, when a single transition rate matrix is applied to diverse constituencies or groups of constituencies, the computed results will not always coincide with the real ones. Clearly, the elements of the matrix must be nonnegative and those belonging to any given row must sum to 1 (these two conditions imply that the elements must be less than or equal to 1).
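The two observations above (many matrices fit the aggregate, yet a single matrix does not fit every division) can be illustrated with a minimal sketch. The two-option data and the function name `apply_transitions` are hypothetical and chosen only for illustration; they are not taken from the elections studied here.

```python
def apply_transitions(p, R):
    """Modelled proportions at t': sum over k of p[k] * R[k][j] for each option j."""
    return [sum(p[k] * R[k][j] for k in range(len(p))) for j in range(len(R[0]))]

# Hypothetical toy example with two options (not data from the elections).
aggregate_t = [0.5, 0.5]  # aggregate proportions at t

# Three row-stochastic matrices: nonnegative entries, each row sums to 1.
candidates = {
    "all stay":   [[1.0, 0.0], [0.0, 1.0]],
    "all switch": [[0.0, 1.0], [1.0, 0.0]],
    "half move":  [[0.5, 0.5], [0.5, 0.5]],
}

# Every candidate reproduces the same aggregate result at t' ...
aggregate_results = {name: apply_transitions(aggregate_t, R)
                     for name, R in candidates.items()}

# ... but they disagree in a division whose composition differs from the aggregate.
division_t = [0.8, 0.2]
division_results = {name: apply_transitions(division_t, R)
                    for name, R in candidates.items()}

for name in candidates:
    print(name, aggregate_results[name], division_results[name])
```

This is why results at the level of divisions are needed to discriminate among the matrices that agree with the aggregate.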
A survey can be used to estimate the elements of the transition rate matrix. However, the results are highly unreliable, for many reasons that are discussed, for instance, in Brown and Payne (1986) and in Van der Ploeg, Van de Pol and Kampen (2006). Moreover, unless the number of elements in the sample is very high, many elements of the matrix (those corresponding to small values of the transition rate) will be equal to zero. Therefore, at the expense of greater modelling and computing efforts, the use of the results of both elections is a more reliable way to obtain the matrix. Hawkes (1969) tried "to estimate the number of people voting for a particular party at one election who subsequently vote for another specified party at the next election". In order to do this, the author proposed three methods. However, these did not guarantee that the results fulfil the conditions stated above. Therefore, the author concluded that although "the attempt has not been as successful as one would wish, some useful results are obtainable". Miller (1972) and Upton (1977) adopt an approach that belongs to the same stream as Hawkes (1969). Other related works are Hayes (1976) and Moores (1987).
Instead, Irwin and Meeter (1969) and McCarthy and Ryan (1977) use quadratic programming to estimate the transition rates, thus guaranteeing from the outset that the above-mentioned conditions are fulfilled, since they are imposed by means of constraints in the mathematical program. Tziafetas (1986) uses an approach similar to that of McCarthy and Ryan (1977) and a variant of it, which consists in using the absolute values of the deviations instead of their squares. Upton (1978), concerning McCarthy and Ryan's approach, observes that it gives a high proportion of zeroes in the matrix and concludes that it overestimates the proportion of stayers (voters who do not change their preferences between the two elections or, more precisely, electors who vote for options whose name is the same in both elections), as opposed to movers. This criticism, which we do not deem fully justified, has been adopted by other authors, for instance Johnston and Hay (1983).
The estimation of voters' transition rates is often seen as a particular case of the ecological inference problem, i.e., deducing individual behaviours from aggregate data, a problem that Robinson (1950) deemed impossible with the methods available at that time. This notwithstanding, many methods have been proposed for dealing with it. Some of them assume, perhaps implicitly, that the behaviour pattern is the same or very similar in all the areas (ecological regressions; see: Goodman, 1953; Goodman, 1959; concerning transition rates: Fülle, 1994; van der Ploeg et al., 2006). Others consider that the behaviour pattern may depend on the areas and usually adopt a probabilistic approach (ecological inference; see: Glynn & Wakefield, 2010; Greiner & Quinn, 2009; Grofman & Merrill, 2004; King, 1997; concerning transition rates, i.e., probabilities: Andreadis & Chadjipadelis, 2009; Antweiler, 2007; Brown & Payne, 1986; Johnston & Hay, 1983).
The purpose of the present paper is to determine, for the 2010 and 2012 elections to the Parliament of Catalonia, a transition matrix for each constituency that (i) fulfils the nonnegativity and sum-to-one constraints; (ii) applied to the aggregate results of the first election, gives exactly the results of the second one for each of the options available in the latter; and (iii) minimises a function of the discrepancies between the results obtained with the matrix and those given by the count of votes in every division. Note that we neither formulate any assumption about the differences or coincidences between the behaviour patterns of the electors corresponding to diverse polling stations nor adopt a probabilistic point of view.
Therefore, the problem can be stated as a mathematical programming model, which for some kinds of discrepancy functions is easy to solve. Even though our purpose is to find transition matrices for any set of divisions (and for different kinds of divisions), without introducing any a priori assumption about the values of the transition rates, we will comment on some of the results given by the models in order to facilitate their interpretation.
The layout of the rest of the article is as follows. Section 2 presents the problem and its mathematical programming formulations. The data and the results obtained are presented and discussed in Section 3. Section 4 ends the paper with some short conclusions.

Statement of the Problem and its Mathematical Programming Formulation
It is assumed that we have the results corresponding to two elections in the same electoral area, such that the first one happened at t and the second one at t' (≥ t).
When t' is very close to t (or even equal to t) it may be that the censuses corresponding to both elections are identical. If they actually are not, however, it is usual to circumvent this difficulty by assuming that the behaviour of the electors not belonging to the intersection of both censuses is not different from that of those who belong to it. In fact, this is equivalent to assuming that both censuses are identical, and this assumption is reasonable when t' is not far from t (Brown & Payne, 1986; Hawkes, 1969; McCarthy & Ryan, 1977), as happens with the two elections considered in this paper, separated by only two years (2010 and 2012).
The electoral area is partitioned into constituencies and these, in turn, into polling stations.
Therefore, the results are available for all the polling stations of the electoral area. It may happen that the polling stations belonging to a given constituency do not coincide from one election to another (because some are created, suppressed or divided), and in this case one can only compare the results corresponding to the polling stations common to both elections.
On the other hand, the available options (including blank vote, null vote and abstention) may be different from one constituency to another (in the case of the elections for the Catalan Parliament, they are, since the candidates to win the seats are different, even for the options with the same name). Therefore, the constituencies have to be considered separately.
Hence, the data must always refer to a given constituency or to a subset of polling stations belonging to a given constituency. The considered polling units can be grouped to form divisions (therefore, a division is a set of one or more polling stations; every considered polling station must belong to one and only one division). The divisions may be, for instance, municipalities, districts or any sets of polling units that are convenient for the analysis.
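As an illustration of how polling stations might be grouped into divisions, the following sketch sums the census and the counts of each division and then expresses every option, abstention included, as a proportion of the census. The record layout (`division`, `census`, `votes`) and all the numbers are hypothetical.

```python
from collections import defaultdict

def aggregate_divisions(stations):
    """Group polling stations into divisions.

    Returns (census, proportions): the census of each division and, for each
    division, the proportion of its census obtained by every option.
    """
    census = defaultdict(int)
    votes = defaultdict(lambda: defaultdict(int))
    for s in stations:
        d = s["division"]
        census[d] += s["census"]
        for option, count in s["votes"].items():
            votes[d][option] += count
    proportions = {
        d: {option: count / census[d] for option, count in votes[d].items()}
        for d in census
    }
    return dict(census), proportions

# Hypothetical polling stations; abstention is treated as one more option,
# so the counts of each station add up to its census.
stations = [
    {"division": "D1", "census": 500, "votes": {"A": 200, "B": 150, "abstention": 150}},
    {"division": "D1", "census": 300, "votes": {"A": 100, "B": 120, "abstention": 80}},
    {"division": "D2", "census": 400, "votes": {"A": 100, "B": 200, "abstention": 100}},
]

census, proportions = aggregate_divisions(stations)
print(census)
print(proportions)
```

Each polling station contributes to exactly one division, which matches the requirement that the divisions form a partition of the considered polling stations.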
The notation that we use for the data is as follows:
$m$: Number of divisions (common to both elections).
$n$, $n'$: Number of options at $t$ and $t'$, respectively.
$p_{ik}$: Proportion of votes obtained at $t$ by option $k$ in division $i$ ($i = 1, \dots, m$; $k = 1, \dots, n$).
$p'_{ij}$: Proportion of votes obtained at $t'$ by option $j$ in division $i$ ($i = 1, \dots, m$; $j = 1, \dots, n'$).
$c_i$: Census (i.e., number of people having the right to vote) of division $i$ at $t'$, assumed to be equal to that at $t$ ($i = 1, \dots, m$).
And for the decision variables:
$r_{kj}$: Transition rate from option $k$ at $t$ to option $j$ at $t'$ ($k = 1, \dots, n$; $j = 1, \dots, n'$).
The matrix $R$ made up of the transition rates $r_{kj}$ must belong to the set $F$ defined by the following constraints:
$$\sum_{j=1}^{n'} r_{kj} = 1, \qquad k = 1, \dots, n \tag{1}$$
$$\sum_{i=1}^{m} c_i \sum_{k=1}^{n} p_{ik}\, r_{kj} = \sum_{i=1}^{m} c_i\, p'_{ij}, \qquad j = 1, \dots, n' \tag{2}$$
$$r_{kj} \ge 0, \qquad k = 1, \dots, n;\; j = 1, \dots, n' \tag{3}$$
Equation 1 imposes that the transition rates from an option at t to all the options at t' must sum to one; Equation 2, that the total number of votes obtained in the constituency by an option at t' equals the number that results when applying the transition rates to the numbers of votes corresponding to t (under the assumption that the censuses at t and at t' are the same); Equation 3 enforces the obvious non-negativity condition.
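A minimal sketch of a membership test for the set F, following the verbal description of the three constraints, may help to fix ideas. The function name `in_F`, the tolerance and the two-division data are assumptions made only for this illustration.

```python
def in_F(R, p, p_new, c, tol=1e-6):
    """Check rows summing to one, exact aggregate results at t', and non-negativity.

    R[k][j]     : transition rate from option k at t to option j at t'
    p[i][k]     : proportion of option k at t in division i
    p_new[i][j] : proportion of option j at t' in division i
    c[i]        : census of division i (assumed equal at t and t')
    """
    m, n, n_new = len(c), len(R), len(R[0])
    rows_ok = all(abs(sum(R[k]) - 1.0) <= tol for k in range(n))
    nonneg_ok = all(R[k][j] >= -tol for k in range(n) for j in range(n_new))
    aggregate_ok = True
    for j in range(n_new):
        modelled = sum(c[i] * sum(p[i][k] * R[k][j] for k in range(n)) for i in range(m))
        actual = sum(c[i] * p_new[i][j] for i in range(m))
        if abs(modelled - actual) > tol * max(1.0, abs(actual)):
            aggregate_ok = False
    return rows_ok and nonneg_ok and aggregate_ok

# Hypothetical data: two divisions, two options at t and at t'.
c = [1000, 2000]
p = [[0.6, 0.4], [0.3, 0.7]]
p_new = [[0.64, 0.36], [0.40, 0.60]]

R_ok = [[0.9, 0.1], [0.2, 0.8]]    # matches the aggregate, not each division
R_bad = [[0.9, 0.1], [0.3, 0.7]]   # rows sum to one, but the aggregate is off

print(in_F(R_ok, p, p_new, c), in_F(R_bad, p, p_new, c))
```

In this toy case `R_ok` reproduces the constituency totals exactly while leaving a discrepancy in each division, which is the situation that the objective functions are meant to resolve.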
These constraints define a set of matrices that generally has infinitely many elements. One way to select one of these elements is to minimise the discrepancies between the actual results of the election at t' and those resulting from the application of the transition rates. Of course, the selected matrix depends on the measure of the discrepancies that is used.
In this way we define four models, M1 to M4, which share the constraints that define F and differ in their objective functions: (M1) the sum of the squared discrepancies between actual and modelled numbers of votes, (M2) the sum of the squared discrepancies between actual and modelled proportions, (M3) the value of the maximum discrepancy between actual and modelled numbers of votes and (M4) the value of the maximum discrepancy between actual and modelled proportions.
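Written out, the four objectives described above might take the following form. This is one reading consistent with the verbal descriptions, not necessarily the exact algebraic form used by the models; it assumes that $p'_{ij}$ denotes the actual proportion obtained at $t'$ by option $j$ in division $i$, and $d_{ij}(R)$ and $z$ are auxiliary symbols introduced here. The last line shows the standard epigraph rewriting of a minimax objective, which makes it linear.

```latex
\begin{align*}
d_{ij}(R) &= p'_{ij} - \sum_{k=1}^{n} p_{ik}\, r_{kj}
  && \text{(discrepancy in proportions)} \\
\text{(M1)}\quad & \min_{R \in F}\; \sum_{i=1}^{m} \sum_{j=1}^{n'} \bigl( c_i\, d_{ij}(R) \bigr)^{2} \\
\text{(M2)}\quad & \min_{R \in F}\; \sum_{i=1}^{m} \sum_{j=1}^{n'} d_{ij}(R)^{2} \\
\text{(M3)}\quad & \min_{R \in F}\; \max_{i,j}\; c_i \left| d_{ij}(R) \right| \\
\text{(M4)}\quad & \min_{R \in F}\; \max_{i,j}\; \left| d_{ij}(R) \right| \\
\text{(M3, epigraph form)}\quad & \min_{R \in F,\; z}\; z
  \quad \text{s.t.}\quad -z \le c_i\, d_{ij}(R) \le z \quad \forall\, i, j
\end{align*}
```

The epigraph form of M4 is obtained in the same way by dropping the factor $c_i$.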
Since the constraints that define the set F are linear, M1 and M2 are quadratic programs. For their part, M3 and M4 can be reformulated as linear programs. As is well known, however, minimax problems usually have multiple optima, since the objective function does not take into account the values of the discrepancies that are strictly less than the optimum value. Therefore, after solving M3 and M4, more satisfactory solutions may be obtained using a second criterion (the sum of the absolute values of all the discrepancies); this leads to two further quadratic programming models, M3' and M4'.

Data and Results
To solve the six mathematical programming models described in the preceding section for each one of the constituencies, the software CPLEX 12.2 was used. The computing times were not significant in any case. The transition matrices obtained with models M3 and M4 can be disregarded, since those corresponding to models M3' and M4', respectively, are always preferable; only the optimum values of their respective objective functions are used (as inputs for models M3' and M4', respectively).
In order to evaluate the goodness of fit of the solutions provided by the different models we will use the following criteria, respectively related to the objective functions of models M1, M2, M3' and M4':
• $R^2_v$, the coefficient of determination corresponding to the numbers of votes. Its numerator is the objective function of M1 and its denominator is a constant computed from $\bar{p}'_j$ (i.e., the proportion of votes obtained globally by option j in the set of divisions).
• $R^2_p$, the coefficient of determination corresponding to the proportions, whose numerator is the objective function of M2.
• $\Delta^{\max}_v$, the maximum discrepancy between actual and modelled numbers of votes.
• $\Delta^{\max}_p$, the maximum discrepancy between actual and modelled proportions of votes.
Coefficients of determination, which are useful as complementary information to assess the goodness of fit of the models, may be defined for each option as well. Of course, models M1 and M2 will always yield the best values of $R^2_v$ and $R^2_p$, respectively (since the denominators in the definition of these criteria are constants and the respective numerators are the objective functions of M1 and M2). In a similar way, models M3-M3' and M4-M4' will be the best for $\Delta^{\max}_v$ and $\Delta^{\max}_p$, respectively. The behaviour of the mentioned models with regard to the other criteria is difficult to forecast, except that M1 and M2 are more robust than M3' and M4' in relation to outliers.
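The four criteria could be computed along the following lines. This sketch assumes the usual form of a coefficient of determination (one minus the ratio of the residual sum of squares to the total sum of squares about the global proportion of each option); the exact denominator used for the criteria is an assumption here, and the function name, dictionary keys and data are hypothetical.

```python
def fit_criteria(R, p, p_new, c):
    """Coefficients of determination and maximum discrepancies for a matrix R."""
    m, n, n_new = len(c), len(R), len(R[0])
    total_census = sum(c)
    # Global proportion of each option at t' over the whole set of divisions.
    p_bar = [sum(c[i] * p_new[i][j] for i in range(m)) / total_census
             for j in range(n_new)]
    ss_res_v = ss_tot_v = ss_res_p = ss_tot_p = 0.0
    max_v = max_p = 0.0
    for i in range(m):
        for j in range(n_new):
            modelled = sum(p[i][k] * R[k][j] for k in range(n))
            d = p_new[i][j] - modelled      # discrepancy in proportions
            dev = p_new[i][j] - p_bar[j]    # deviation from the global proportion
            ss_res_p += d * d
            ss_tot_p += dev * dev
            ss_res_v += (c[i] * d) ** 2     # the same, in numbers of votes
            ss_tot_v += (c[i] * dev) ** 2
            max_p = max(max_p, abs(d))
            max_v = max(max_v, c[i] * abs(d))
    return {
        "R2_votes": 1.0 - ss_res_v / ss_tot_v,
        "R2_proportions": 1.0 - ss_res_p / ss_tot_p,
        "max_votes": max_v,
        "max_proportions": max_p,
    }

# Hypothetical data: two divisions, two options at t and at t'.
c = [1000, 2000]
p = [[0.6, 0.4], [0.3, 0.7]]
p_new = [[0.64, 0.36], [0.40, 0.60]]
R = [[0.9, 0.1], [0.2, 0.8]]

criteria = fit_criteria(R, p, p_new, c)
print(criteria)
```

Restricting the two inner sums to a single option j would give the per-option coefficients of determination mentioned above.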
The transition rate matrices and the values of all the criteria described above for Barcelona, Girona, Lleida and Tarragona can be found in:
Table 2 shows the values of the four criteria corresponding to the four models M1, M2, M3' and M4' and to the four constituencies.
One can see that the values of the two coefficients of determination, $R^2_v$ (numbers of votes) and $R^2_p$ (proportions), are good (high). The latter, with M2, reaches 0.97 for Tarragona, while the best value of $R^2_v$ is 0.86 (Barcelona and Girona). Concerning these two criteria, the results are very similar for both M1 and M2, although, of course, M1 gets the best values for $R^2_v$ and M2 for $R^2_p$. The worst value of $R^2_v$, among those given by M1, is 0.77 (Tarragona) and the worst value of $R^2_p$, among those given by M2, is 0.75 (Lleida).
For their part, the values of the maximum discrepancies, $\Delta^{\max}_v$ (numbers of votes) and $\Delta^{\max}_p$ (proportions), are high, even for M3' and M4'. Moreover, although they are not very much better than those obtained with M1 and M2, when the objective of minimising these criteria is imposed (M3' and M4') the values of the coefficients of determination deteriorate.
Since the 16 transition rate matrices obtained for the four constituencies with the four models, even though they show some common characteristics, are fairly different, it is not possible to analyse them here in detail, and we will limit ourselves to some general comments and to a closer look at the transition rate matrix given by M1 for the constituency of Barcelona.
In 2010, seven options won seats in the Catalan Parliament. In 2012, seven options were again present in the Parliament, but they were not the same as in 2010, because SI (4 seats in 2010) did not obtain representation in 2012 and CUP, which did not present lists of candidates in 2010, won 3 seats in 2012. Table 3 shows, for the options that won seats either in 2010 or in 2012, and for abstention, the results they obtained in both elections for the four constituencies as a whole. For its part, Table 4 shows analogous data for the constituency of Barcelona. The Barcelona constituency stands out for the high proportion of stayers shown by abstention and by the options that won seats both in 2010 and in 2012 (from M1: CiU, 77%; PSC-PSOE, 82%; PP, 91%; ICV-EUiA, 94%; ERC, 100%; C's, 100%; abstention, 75%). Figure 1 shows the main flows, from 2010 to 2012, given by M1 in this constituency.
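Proportions of stayers such as those quoted above are the entries of the transition rate matrix whose row and column correspond to options with the same name. A minimal sketch of how they might be read off a matrix follows; the option names X, Y, Z and W and the rates are hypothetical and are not the estimated values.

```python
def stayer_rates(options_t, options_t2, R):
    """Transition rate from each option at t to the same-named option at t'.

    Options that do not appear at t' have no stayers by definition and are omitted.
    """
    column = {name: j for j, name in enumerate(options_t2)}
    return {
        name: R[k][column[name]]
        for k, name in enumerate(options_t)
        if name in column
    }

# Hypothetical options and rates: Z is only present at t, W only at t'.
options_t = ["X", "Y", "Z"]
options_t2 = ["X", "Y", "W"]
R = [
    [0.90, 0.05, 0.05],   # from X
    [0.20, 0.70, 0.10],   # from Y
    [0.50, 0.30, 0.20],   # from Z
]

stayers = stayer_rates(options_t, options_t2, R)
print(stayers)
```

Matching by name rather than by position is needed because the sets of options at t and at t' need not coincide.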

Conclusions
This paper describes the problem of computing the transition rates corresponding to two consecutive elections held in the same electoral area. Four mathematical programming models are proposed to deal with it and are applied, regarding the 2010 and 2012 elections to the Catalan Parliament, to the four constituencies into which Catalonia is divided for this purpose. The models include constraints to guarantee that the total number of votes obtained in the constituency by an option at t' equals the number that results when applying the transition rates to the number of votes corresponding to t. The main features of the results obtained are discussed. Mathematical programming proves to be a powerful and flexible tool for calculating transition rate matrices.

Journal of Industrial Engineering and Management - http://dx.doi.org/10.3926/jiem.1189

Figure 1. Main flows from 2010 to 2012, given by M1 for the constituency of Barcelona, for the elections to the Catalan Parliament. The numbers next to the arrows are, expressed as a percentage, the corresponding transition rates (those under 1% are omitted). All the options represented in the figure, including abstention and with the exception of CiU, also receive flows from a certain number of minor options

Table 1. Sizes of the data sets used in the computational experiment. In all cases, the number of options includes null and blank votes and abstention

Table 2. Values of the four criteria corresponding to the four models and the four constituencies: Barcelona (BAR), Girona (GIR), Lleida (LLE) and Tarragona (TAR). The best values of the criteria, for each constituency, are highlighted in bold. Of course, for the coefficients of determination of the numbers of votes and of the proportions and for the maximum discrepancies in numbers of votes and in proportions, they are those obtained, respectively, with models M1, M2, M3' and M4'.
Note that the percentages are over the census (not, as is more usual, over the valid votes for the lists of candidates)

Table 3. Main global results of the elections to the Catalan Parliament held in 2010 and 2012. Note that the percentages are over the census (not, as is more usual, over the valid votes for the lists of candidates)

Table 4. Main results, in the constituency of Barcelona, of the elections to the Catalan Parliament held in 2010 and 2012

… voters in 2010 changed to C's in 2012. With the exception of Barcelona, ICV-EUiA shows a low proportion of stayers, and its movers go to CUP and ERC. C's enjoys high proportions of stayers and gets votes from PSC-PSOE and PP. SI has no stayers in any constituency, and its movers go mainly to CiU and also to ERC; in Barcelona it receives a small proportion of the votes obtained by CiU in 2010. The votes for CUP come mainly from ICV-EUiA and from numerous minor options that won seats in neither 2010 nor 2012; in Barcelona, moreover, CUP receives a significant number of votes from CiU.