====================================================================
Authors: Jonas Gehrlein
Last updated: 07.12.2020
====================================================================
Validator selection¶
Introduction¶
The validator elections are essential for the security of the network, where nominators have the important task to evaluate and select the most trustworthy and competent validators. However, in reality this task is quite challenging and comes with significant effort. The vast amount of data on validators (which is constantly increasing) requires a substantial technical expertise and engagement. Currently, the process is too cumbersome and many nominators are either not staking or avoiding spending too much time to go through the large amount of data. Therefore, we need to provide tools, which both aid nominators in the selection process, while still ensuring that the outcome is beneficial for the network.
The following writeup provides an overview of several potential steps, which benefit the nominators while maintaining their freedom of choice. As a first step, it is helpful to illustrate why recommendations should be based on user’s preferences and cannot be universal for all individuals.
Problem¶
It is not desirable to provide an exogenous recommendation of a set of validators, because user’s preferences (especially riskpreferences) are quite different. Therefore, a comparison between metrics on different scales (e.g., selfstake in DOTs vs. performance in %) is not exogenously not possible. In addition, the shape of the marginal utility functions even within one dimension is unclear and based on individual’s preferences. It is outside our competence to decide on various tradeoffs of the selection process on behalf of nominators. To illustrate this issue, consider the following simple example:
Commission 
SelfStake 
Identity 
EraPoints 


Validator 1 
4% 
26 DOTs 
Yes 
Average 
Validator 2 
7% 
280 DOTs 
No 
Average  1% 
Validator 3 
1% 
1 DOT 
No 
Average + 5% 
All validators in the table have different profiles, where none is dominated. Validator 3 potentially yield high profits but does not have much selfstake (skininthegame) and is without registered identity. Validator 1 charges a higher fee for their service but might leverage a reputable identity. Validator 2 requires substantial fees but has the most selfstake. One could easily think of different preferences of users, who would prefer any one of those validators. While probably every user could make a choice from that selection, the problem gets increasingly difficult for a set of 2001000 validators.
Code of conduct for recommendations¶
As mentioned before, we cannot and do not want to give an exogenous recommendation to the users. We prefer methods, which values this insight and generates a recommendation based on their stated preferences. While valuing the preferences of the users, we still can nudge their decisions in a direction beneficial for the network (e.g., to promote decentralization). Nevertheless, the recommendation should be as objective as possible and should not discriminate against any specific validator.
Organization¶
Validator selection is divided into several chapters. In the sections “Underlying dataset” (Link), we illustrate which data might be useful and how additional metrics can be generated. Afterwards, we can apply a simple concept from economics to significantly reduce the size of potentially intresting validators. Afterwards, This is the first step to give users a way to choose at hand. Then, we discuss some ideas to further curate the set of validators to promote goals of the network. As a last section the UTAStar method illustrates a sophisticated approach to estimate the individual marginal preference functions of the user and make a more precise recommendation.
Underlying Data¶
This section explains which data can be gathered about validators in Polkadot and Kusama and are relevant for a selection process. Those metrics indicated with a * are used in the final dataset, the other variables are used to generate additional metrics. Currently, we focus on quantitative onchain data as those are verifiable and easy to process. This purely quantitative approach should be regarded as complementary to a selection process based on qualitative data, where nominators are e.g., voting for validators based on their identity or influence / engagement in the community.
Retrievable data¶
Name 
Historical 
OnChain 
Description 

Public Address* 
No 
Yes 
The public identifier of the validator. 
Identity* 
No 
Yes 
Is there a verified onchain identity? 
Selfstake* 
No 
Yes 
The amount of tokens used to selfelect. Can be seen as skininthegame. 
OtherStake 
No 
Yes 
The amount of allocated stake (potentially) by other nominators. 
TotalStake 
No 
Yes 
The sum of selfstake and otherstake. 
Commission 
Maybe 
Yes 
The amount of commission in % which is taken by the validator for their service. 
EraPoints 
Yes 
Yes 
The amount of points gathered per era. 
Number of Nominators* 
No 
Yes 
The amount of nominators allocated to a validator. 
EraPoints: The erapoints are awarded to a validator for performing beneficial action for the network. Currently this is mainly driven by block production. In general, the distribution of erapoints should be uniformly distributed in the long run. However, this can vary if validators operates on a superior setup (stronger machine, more robust internet connection). In addition, there is significant statistical noise from randomness in the shortterm, which can create deviations from the uniform distribution.
Generated metrics¶
Some of the retrieved onchain data might be not very useful for nominators or can serve some additional metrics, which help in the selection process.
Name 
Historical 
OnChain 
Description 

Average Adjusted EraPoints 
Yes 
Yes 
The average adjusted erapoints from previous eras. 
Performance 
Yes 
Yes 
The performance of a validator determined by erapoints and commission. 
Relative Performance* 
Yes 
Yes 
The performance normalized to the set of validators. 
Outperforming MLE 
Yes 
Yes 
An indicator how often a validator has outperformed the average erapoints. Should be 0.5 for an average validator. 
Average Performer* 
 
Yes 
A statistical test of the outperforming MLE against the uniform distribution. Indicates if a validator statistically over or underperforms. 
Active Eras* 
Yes 
Yes 
The number of active eras. 
Relative total stake* 
No 
Yes 
The total stake normalized to the set of validators. 
Operator Size* 
No 
Yes 
The number of validators which share a similar onchain identity. 
Average Adjusted EraPoints To get a more robust estimate of the erapoints, additional data from previous eras should be gathered. Since the total erapoints are distributed among all active validators, and the set of active validators might change, it could bias the results. To counter that, we can adjust the erapoints of each era by the active set size of that era. As this is the only biasing factor on the theoretical percapita erapoints, we can thereby make the historic data comparable.
It is unclear how many previous eras should be used as having a too long history might bias the results towards the average while too short of a history diminishes the robustness of the metric. One idea could be to use the average of \(activeeras\).
Performance: The performance of a validator from the point of view of a nominator is determined by the amount of erapoints gathered by that validator, the nominator’s share of the total stake and the commission a validator is charging. In addition, the performance level is linear in the bond of the nominator and is thereby independent from that. We can combine those metrics into one:
The relative performance is then simply defined by: \(\frac{performance  min(performance)}{max(performance)  min(performance)}\) This gives a more understandable measure as the performance is normalized between 0 and 1. Additionally, it is robust to potential changes within the network (e.g. with a larger number of validators the erapoints are reduced per era) and prevents false anchoring effects.
Outperforming MLE: By gathering the historic erapoints per validator during past eras, we can calculate how often a validator outperformed the average. As erapoints should be distributed uniformly, a validator should outperform the average 50% of times. However, as mentioned before, in reality additional factors as hardwaresetup and internet connection can influence this. This helps nominators to select the best performing validators while creating incentives for validators to optimize their setup.
Significance MLE: As the expected value of the outperforming MLE is 0.5 and the distribution should be uniformly, we can calculate whether a validator significantly over or underperforms by: \(z = \frac{outperformingMLE  0.5}{\sqrt{\frac{0.5 \times (10.5)}{numberActive}}}\)
If \(z > 1.645\) we can say that the respective validator outperforms significantly (10% significance level), while \(z < 1.645\) indicates significant underperformance.
Operator Size: Based on the identity of a validator, we can estimate how many validators are run by the same entity. It is both in the interest of users and the network that there are not too many operators and that those operators are not too large. Selecting validators of larger operators might increase the risk of superlinear slashing, because it is reasonable to assume that those operators follow similar security practices. A failure of one validator might mean a failure of several of those validators which increases the punishment superlinearly. A counterargument to this might be that larger operators are much more sophisticated with their setup and processes. Therefore, this objective measure should be left to the user to judge.
Filtering Phase¶
DominanceFiltering¶
After constructing the dataset as elaborated in the section “underlying data”, we can start reducing the set of validators to reduce the amount of information a nominator has to process. One concept is to remove dominated validators. As we do not make qualitative judgements e.g., which “identity” is better or worse than another, we can remove validators who are inferior to another, since there is no rational reason to nominate them. A validator is dominated by another validator if all her properties are equal and at least one property is worse. Consider the following example:
Example:¶
Number 
Public Address 
Identity 
Selfstake 
Nominators 
Relative Performance 
Outperformer 
Active Eras 
Operator Size 

1 
1N6xclmDjjA 
0 
10 
10 
0 
0 
3 
0 
2 
1ohS7itG5Np 
0 
200 
40 
0.7 
0 
4 
2 
3 
1xgFnMhdOui 
1 
100 
89 
0.3 
0 
16 
3 
4 
1vO7JLtSm4F 
1 
5000 
89 
1 
1 
29 
3 
Validator 1 is dominated by Validator 2, which means that it is worse in every dimension (note, as mentioned above a user might prefer larger operators in which case this would not be true). Validator 3 is dominated by Validator 3 and therefore can be removed from the set. By this process the set can be reduced to two validators. In practice, this shows to be quite powerful to vastly reduce the set size.
Further curation¶
Here we have the opportunity to do additional cleanup to the remaining set. As mentioned in the code of conduct, those should be optional but we can suggest default values for users. * Include at least 1 inactive validator. (we might suggest some inactive nodes based on some other processes.) * Reduce risk of superlinear slashing (i.e., remove validators from operators). * Remove validators who run on the same machine (some analysis of IP addresses possible?).
Manual selection¶
After the set has been reduced by removing dominated validators and giving some filter option the user can easily select preferred validators manually. In this step, the selection is purely based on personal preferences and for example a nominator might order the validators by their relative performance and select those who also satisfy some requirements on a minimum selfstake.
UTAStar¶
This method takes the filtered table from section LINK as input and therefore can be seen as a natural extension to the method before.
Overview¶
UTA (UTilité Additive) belongs to the methods of preference disaggregation (JacquetLagrèze & Siskos, 1982). UTAStar is an improvement on the original algorithm. The general idea is that the marginal utility functions of a decision makers (DM) on each dimension of an alternative (i.e. criterion) can be deduced from apriori ranked lists of alternatives. It uses linear programming to search for utility functions which satisfy the initial ranking of the DM while giving other properties (such as the maximum utility is normalized to 1).
Some notation:¶
This writeup relies strongly on Siskos et al., 2005 * \(u_i\): marginal utility function of criteria i. * \(g_1,g_2,...g_n\): Criteria. * \(g_i(x)\): Evaluation of alternative x on the \(i^{th}\) crterion. * \(\textbf{g}(x)\): Vector of performances of alternative \(x\) on \(n\) criteria. * \(x_1, x_2, ..., x_m \in X_L:\) Learning set which contain alternatives presented to the DM to give a ranking on. Note, that the index on the alternative is dropped.
Model¶
The UTAStar method infers an unweighted additive utility function:
\(u(\textbf{g}) = \sum_{i=1}^{n} u_i(g_i)\) where \(\textbf{g}\) is a vector of performances. with the following constraints:
where \(u_i, i=1,2...,n\) are non decreasing valued functions which are normalized between 0 and 1 (also called utility functions).
Thereby the value of each alternative \(x \in X_L\): $$ u’[\textbf{g}(x)]=\sum_{i=1}^{n}u_i[g_i(x)])+ \sigma^{+}(x) + \sigma^{}(x) \forall x \in X_L$$ where \(\sigma^{+}(x)\) and \(\sigma^{}(x)\) are the under and overestimation error. is a potential error relative to \(u'[\textbf{g}(x)]\)
The corresponding utility functions are defined in a piecewise linear form to be estimated by linear interpolation. For each criterion, the interval \([g_{i\star}, g_i^\star]\) is cut into \((\alpha_i  1)\) intervals and the endpoints \(g_i^j\) are given by:
The marginal utility function of x is approximated by linear interpolation and thus for \(g_i(x) \in [g_i^j  g_i^{j+1}]\)
The learning set \(X_L\) is rearranged such that \(x_1\) (best) is the head and \(x_m\) is the tail (worst). This ranking is given by the user.
\(\Delta(x_k, x_{k+1}) = u'[\textbf{g}(x_k)]  u'(\textbf{g}(x_{k+1}))\) then we can be sure that the following holds:
where \(\delta\) is a small and positive number which is an exogenous parameter set as the minimum discrepancy between the utilities of two consecutive options. In order to ensure monotonicity we further transform the utility differences between two consecutive interval endpoints:
Algorithm¶
Step 1: Express the global value of the alternatives in the learning set \(u[g(x_k)], k=1,2,...m\) in terms of marginal values \(u_i(g_i)\) and then transform to \(w_{ij}\) according to the above mentioned formula and by means of
Step 2: Introduce two error functions \(\sigma^{+}\) and \(\sigma^{}\) on \(X_L\) by writing each pair of consecutive alternatives as:
Step 3: Solve the linear problem:
Step 4: Robustness analysis to find find suitable solutions for the above LP.
Step 5: Apply utility functions to the full set of validators and return the 16 best scoring ones.
Step 6: Make some ad hoc adjustments to the final set (based on input of the user). For example: * include favorites * at most one validator per operator * at least X inactive validators * etc.
Remaining Challenges¶
There remain a few challenges when we want to apply the theory to our validator selection problem. 1) One challenge is how to construct the learning set. The algorithm needs sufficient information to generate the marginal utility functions. * Find methods to guarantee performance dispersion of the different criteria. * Use machine learning approaches to iteratively provide smaller learning sets which gradually improve the information gathered. * Potentially use simulations to simulate a wide number of learning sets and all potential rankings on them to measure which learning set improves the information the most. 2) UTAStar assumes piecewise linear monotone marginal utility functions. Other, methods improve on that but might be more difficult to implement.