Computing a balanced solution

This is a technical note with algorithmic considerations related to the validator election protocol under NPoS. We consider a scenario where a committee of validators has already been elected, and we explore the best way to assign the nominators’ stake to them. The reader should already be familiar with our research paper, and in particular the concept of balanced solutions defined in it. Although we prove in that paper that balanced solutions can be computed efficiently, few details are given there; we present those details in this note.

After establishing some notation, we introduce the balancing problem and explain why this is exactly the problem we need to solve. We then establish two algorithmic ways to solve the balancing problem, namely 1) using parametric flow algorithms, and 2) using a heuristic called star balancing, and we compare them.

1. Notation

We consider an instance of NPoS consisting of a bipartite graph \((N\cup A, E)\), where \(N\) is the set of nominators, \(A\) is a committee of elected validators of size \(k\), with \(k:=|A|\ll |N|\), and there is an edge \(nv\in E\) whenever nominator \(n\) approves of validator \(v\in A\). We are also given a vector \(s\in\mathbb{R}^N_{\geq 0}\) of nominator stakes, where \(s_n\) is the stake of nominator \(n\). An edge weight vector \(w\in \mathbb{R}^E\) is feasible if it is component-wise non-negative and satisfies the constraint \(\sum_{v\in A: \ nv\in E} w_{nv} \leq s_n\) for each nominator \(n\in N\). We say that \(w\) is tight if the previous inequality holds with equality for each nominator \(n\) that has at least one neighbor in \(A\).

Let \(B\in \{0,1\}^{A\times E}\) be the node-edge incidence matrix for the validator set \(A\). For any \(w\in \mathbb{R}_{\geq 0}^E\), the total support that \(w\) assigns to each validator in \(A\) is given by the vector \(supp_w :=Bw\in \mathbb{R}^A\), so that for any validator \(v\in A\), its support \(supp_w(v)=(Bw)_v = \sum_{n\in N: \ nv\in E} w_{nv}\) is the total amount of stake that \(w\) assigns to \(v\) from the nominators.

Given an instance as above, the balancing problem consists of finding a tight vector \(w\) that minimizes the squared \(\ell_2\) norm of the support vector, i.e. minimize the value

\[val(w):= \|supp_w\|^2 = \|Bw\|^2.\]

Clearly, an optimal solution to this problem corresponds precisely to a balanced solution, as defined in our paper.
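As a small illustration, the Python sketch below computes the support vector \(supp_w=Bw\) and the objective \(val(w)=\|Bw\|^2\) for a toy instance; the instance data and the dictionary-of-edges representation are purely illustrative assumptions, not taken from any existing implementation.

```python
# Toy instance: two nominators, committee A = {v1, v2} (illustrative data only).
stakes = {"n1": 10.0, "n2": 6.0}            # nominator stakes s_n
w = {("n1", "v1"): 7.0, ("n1", "v2"): 3.0,  # a tight edge weight vector w:
     ("n2", "v2"): 6.0}                     # each nominator spends all of s_n

# Support of each validator: supp_w(v) = sum of w_{nv} over its incident edges.
supp = {}
for (n, v), weight in w.items():
    supp[v] = supp.get(v, 0.0) + weight

# Objective value: val(w) = ||supp_w||_2^2.
val = sum(x * x for x in supp.values())

print(supp)  # {'v1': 7.0, 'v2': 9.0}
print(val)   # 7^2 + 9^2 = 130.0
```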

2. Algorithms

There are three possible ways to solve the balancing problem:

  1. Via convex programming: it can be solved with numerical methods for convex quadratic programs, but this is too computationally expensive to consider any further.

  2. Via parametric flow algorithms: We show in the research paper that the balancing problem can potentially be solved in time \(O(|E|k + k^3)\) using some advanced techniques for parametric flow problems.

  3. Via a simple combinatorial heuristic: the star balancing heuristic starts with any tight vector \(w\) and iteratively approaches an optimal vector \(w^*\) by following a local weight-balancing rule. It runs in time \(\tilde{O}(|E|k^2)\), where the tilde hides logarithmic factors.

At first glance, the worst-case complexity bound is much better for technique 2 than for technique 3. However, we point out that Babenko et al. (2007) studied a parametric max flow problem closely related to the balancing problem and performed experimental evaluations of both of these techniques, over real data for an application in revenue optimization as well as over synthetic data. They concluded that the performance of star balancing is comparable to that of parametric flow algorithms, except for instances with degenerate graph topologies. In fact, they conjecture that these two techniques have similar complexities whenever the underlying graph has moderately good expansion properties.

In view of this, and of the fact that star balancing is vastly easier to implement than the algorithm based on parametric flow, we suggest that star balancing be used for NPoS.

3. The star balancing heuristic

Star balancing is a combinatorial randomized algorithm that outputs a solution arbitrarily close to optimal with high probability (this is what is known as a polynomial-time randomized approximation scheme, or PRAS). We remark that a different analysis of this algorithm can be found in Tarjan et al. (2006). We show the following.

Theorem: For any fixed parameters \(\varepsilon, \delta>0\), the star balancing algorithm returns a tight weight vector \(w\) whose value \(val(w)\) is within a multiplicative factor \((1+\varepsilon)\) of the minimum with probability at least \((1 - \delta)\), and runs in time \(O(|E|k^2 \log (k/\varepsilon \delta)).\)

Algorithm: Star balancing.

Consider an instance \((N\cup A, E, s)\). For each nominator \(n\in N\) let \(A_n\subseteq A\) be its set of neighbors in \(A\).

Fix constants \(\varepsilon, \delta>0\). The algorithm starts with an arbitrary tight vector \(w\) and improves it iteratively over \(r\) rounds, where the precise value of \(r\) is given in Lemma 5 below and satisfies \(r = O(|N|k^2\log(k/\varepsilon \delta))\).

  1. Find any tight vector \(w\).

  2. Repeat \(r\) times:

     a. Select a nominator \(n\in N\) uniformly at random.

     b. Modify the weights of the edges incident to \(n\), keeping \(w\) tight and observing the non-negativity constraints, so that the supports of the neighboring validators are as close to each other as possible (see the sketch below), i.e. so that

    \[\forall v,v'\in A_n, \ supp_w(v)>supp_w(v') \rightarrow w_{nv}=0.\]
  3. Return \(w\).
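For concreteness, here is a minimal Python sketch of the three steps above, assuming the instance is given as adjacency lists and a stake map; all names and the data layout are illustrative, and this is not the production NPoS code. Step 2.b is realized as a water-filling subroutine: it raises the lowest supports among \(A_n\) to a common level, which guarantees that any neighbor left with strictly higher support receives zero weight from \(n\), exactly the condition displayed above.

```python
import math
import random

def water_fill(base, budget):
    """Distribute `budget` >= 0 over positions with starting levels `base`
    so that the resulting levels are as equal as possible (step 2.b)."""
    order = sorted(range(len(base)), key=lambda i: base[i])
    adds = [0.0] * len(base)
    for idx in range(len(order)):
        m = idx + 1  # number of lowest levels currently being raised
        next_level = base[order[idx + 1]] if idx + 1 < len(order) else math.inf
        prefix = [base[order[j]] for j in range(m)]
        # Can the budget be exhausted before reaching the next distinct level?
        if m * next_level - sum(prefix) >= budget:
            # Final common level t solves sum_j (t - prefix[j]) = budget.
            t = (budget + sum(prefix)) / m
            for j in range(m):
                adds[order[j]] = t - prefix[j]
            break
    return adds

def star_balancing(nominators, adjacency, stakes, k, eps, delta, seed=0):
    """Star balancing heuristic: returns a tight edge weight vector w
    as a dict {(nominator, validator): weight}. `k` is the committee size."""
    rng = random.Random(seed)

    # Step 1: an arbitrary tight vector -- give all of s_n to n's first neighbor.
    w, supp = {}, {}
    for n in nominators:
        for v in adjacency[n]:
            supp.setdefault(v, 0.0)
        if adjacency[n]:
            v0 = adjacency[n][0]
            w[(n, v0)] = stakes[n]
            supp[v0] += stakes[n]

    # Step 2: r = ceil(|N| k^2 ln(k / (eps * delta))) balancing rounds.
    r = math.ceil(len(nominators) * k * k * math.log(k / (eps * delta)))
    for _ in range(r):
        n = rng.choice(nominators)
        neighbors = adjacency[n]
        if not neighbors:
            continue
        # Remove n's current contribution, then water-fill its stake back in.
        base = []
        for v in neighbors:
            supp[v] -= w.pop((n, v), 0.0)
            base.append(supp[v])
        for v, x in zip(neighbors, water_fill(base, stakes[n])):
            if x > 0.0:
                w[(n, v)] = x
            supp[v] += x

    # Step 3: return the balanced (tight) weight vector.
    return w
```

Each round only touches the edges incident to the chosen nominator; note that the sort inside this particular water-filling sketch adds a logarithmic factor on top of the \(O(|A_n|)\) per-round bound used in the analysis below.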

Running time: Consider a round of the algorithm. If nominator \(n\) is selected, the running time of the round is \(O(|A_n|)\), assuming that floating-point arithmetic operations take constant time. Hence, the average running time per round is proportional to \(\frac{1}{|N|}\sum_{n\in N} |A_n|=\frac{|E|}{|N|}\). Together with the bound on \(r\), we obtain a global running time of \(O(r|E|/|N|) = O(|E|k^2\log(k/\varepsilon \delta)).\)

Analysis: For each \(i\leq r\), let \(w^i\) be the state of weight vector \(w\) at the end of the \(i\)-th round, and let \(w^0\) be the initial vector. Let \(w^*\) be an optimal solution. Let’s start with an easy observation.

Lemma 1: \(val(w^0)\leq k\cdot val(w^*)\).

Proof: Recall that the objective value to minimize is \(val(w)=\|Bw\|^2_2=\|supp_w\|_2^2\). As both \(w^0\) and \(w^*\) are tight, the \(\ell_1\) norms of their support vectors coincide. Hence \(val(w^0)=\|Bw^0\|_2^2 \leq \|Bw^0\|_1^2 = \|Bw^*\|_1^2 \leq k\cdot \|Bw^*\|_2^2 = k\cdot val(w^*),\) where the last inequality is Cauchy-Schwarz applied to the \(k\)-dimensional support vector.

\(\square\)

Next we show that, in expectation, the progress in objective value achieved in each round is proportional to the difference between the current and optimal values.

Lemma 2: For each round \(i\in\{1,\cdots,r\}\) that starts with vector \(w^{i-1}\) and ends with vector \(w^i\), the expected objective value of \(w^i\) is such that \(val(w^{i-1}) - \mathbb{E}[val(w^{i})] \geq \frac{1}{k^2|N|} [val(w^{i-1}) - val(w^*)].\)

Proof: We fix a round \(i\), and for notational convenience we drop the superscripts \(i\) and \(i-1\) within the scope of this proof. In particular, we let \(w\) be the initial vector, and let \(w'^n\) be the final vector in the case that nominator \(n\) is picked in the round. Clearly, the expected progress in objective value equals the average progress \(\frac{1}{|N|}\sum_{n\in N} [val(w) - val(w'^n)]\). To lower bound the latter, it is sufficient to exhibit a different family of weight vectors \(\{w^n\}_{n\in N}\) such that \(val(w'^n)\leq val(w^n)\) for each \(n\), and then bound the average progress when moving from \(w\) to a member of that family.

Define the vector \(f:=w-w^*\in\mathbb{R}^E\). The following is a technical observation we will need; we defer its proof until after the proof of Lemma 2.

Lemma 3: \(\|f\|^2 \leq k^2 \|Bf\|^2.\)

Consider the decomposition of vector \(f\) as \(f=\sum_{n\in N} f^n\), where \(f^n\) is the restriction of \(f\) over the edges incident to nominator \(n\), and define the family of weight vectors \(\{w^n:= w-\frac{1}{k^2} f^n\}_{n\in N}\). We have \(val(w'^n) \leq val(w^n)\) for all \(n\in N\) as desired, because by construction (step 2.b. of the algorithm) \(w'^n\) is precisely the vector of minimum objective value among all tight, non-negative vectors that differ from \(w\) only at the edges incident to \(n\), and \(w^n\) is one such vector: restricted to those edges it is a convex combination of the tight vectors \(w\) and \(w^*\). Hence, it only remains to bound the average progress in objective value with respect to the new family.

For a fixed \(n\in N\), noting that \(\|Bf^n\|=\|f^n\|\) because the edges incident to \(n\) touch pairwise distinct validators, we have

\[\begin{split}\begin{align} val(w) - val(w^n) &= \|Bw\|^2 - \|B(w-\frac{1}{k^2} f^n)\|^2 \\ & = \frac{2}{k^2} (Bw)^\intercal Bf^n - \frac{1}{k^4} \|f^n\|^2. \end{align}\end{split}\]

Thus, the average progress over all \(n\in N\) is

\[\begin{split}\begin{align} \frac{1}{|N|}\sum_{n\in N} [val(w)-val(w^n)] &= \frac{2}{k^2|N|}(Bw)^\intercal B(\sum_{n\in N}f^n) - \frac{1}{k^4|N|}\sum_{n\in N}\|f^n\|^2 \\ &= \frac{1}{k^2|N|}[2(Bw)^\intercal Bf - \frac{1}{k^2} \|f\|^2] \\ &\geq \frac{1}{k^2|N|}[2(Bw)^\intercal Bf - \|Bf\|^2] \\ & = \frac{1}{k^2|N|} (Bf)^\intercal B(2w-f) \\ &= \frac{1}{k^2|N|} [B(w-w^*)]^\intercal B(w+w^*) \\ &= \frac{1}{k^2|N|} [ \|Bw\|^2 - \|Bw^*\|^2] \\ &= \frac{1}{k^2|N|} [val(w) - val(w^*)], \end{align}\end{split}\]

where the inequality comes from Lemma 3. This completes the proof of Lemma 2. \(\square\)

Proof of Lemma 3: We interpret \(f\) as a flow over the network \((N\cup A, E)\). As both \(w\) and \(w^*\) are tight, there is flow preservation over all nominators. Let \(A_s, A_t\subseteq A\) be respectively the sets of sources and sinks, i.e. the sets of validators with net excess and net demand. By the flow decomposition theorem, there exists a decomposition \(f=\sum_{v\in A_s} f^v\) into single-source subflows, where \(f^v\) has \(v\) as its single source. We can assume that this decomposition generates no cycles by adjusting the choice of the optimal solution \(w^*=w-f\).

Consider one of these subflows \(f^v\). Its edge support forms a directed acyclic graph (DAG) with single root \(v\). We arrange the edges of this DAG by levels, where the level of an edge is the length of the longest path from \(v\) containing this edge. These levels start at 1 for the edges incident to \(v\) and go up to at most \(2k\), because any simple path alternates between nominators and validators and there are only \(k\) validators. We now split \(f^v\) by levels, \(f^v=\sum_{i\leq 2k} f^{v,i}\), where \(f^{v,i}\) is the restriction of \(f^v\) to the edges at level \(i\). Since the excess at node \(v\) is \(supp_w(v)-supp_{w^*}(v)=(Bf)_v\) and no other node in the DAG has any excess, the sum of edge weights at each level \(i\) satisfies \(\|f^{v,i}\|_1 \leq (Bf)_v\). Therefore, \(\|f^v\|_2^2 = \sum_{i\leq 2k}\|f^{v,i}\|_2^2 \leq \sum_{i\leq 2k} \|f^{v,i}\|_1^2 \leq 2k\cdot (Bf)^2_v.\)

Putting things together, we get

\[\begin{split}\begin{align} \|f\|^2_2 &= \|\sum_{v\in A_s} f^v\|_2^2 \\ &\leq |A_s|\sum_{v\in A_s} \|f^v\|_2^2 \\ & \leq 2k|A_s|\sum_{v\in A_s} (Bf)_v^2 \\ &\leq 2k|A_s|\cdot \|Bf\|_2^2, \\ \end{align}\end{split}\]

where the first inequality is an application of the Cauchy-Schwarz inequality.

In a similar manner, working with sinks instead of sources, we can obtain the bound \(\|f\|^2 \leq 2k|A_t| \cdot \|Bf\|^2\). Summing up these two bounds and dividing by two, we get \(\|f\|^2 \leq k(|A_s|+|A_t|) \cdot \|Bf\|^2 \leq k^2 \|Bf\|^2,\) which proves the claim. \(\square\)

For each round \(i\leq r\), consider the random variable \(\Delta^i:= val(w^i) - val(w^*)\), which represents how far from optimal the current solution is in terms of objective value. We now use Lemma 2 to show that \(\Delta^i\) decays exponentially fast in expectation.

Lemma 4: For any \(0<i\leq r\), the expected value of \(\Delta^i\) observes \(\mathbb{E}[\Delta^i] \leq k\cdot (1-\frac{1}{k^2|N|})^i val(w^*).\)

Proof: A reformulation of Lemma 2 gives \(\mathbb{E}[\Delta^i]\leq (1-\frac{1}{k^2|N|}) \Delta^{i-1}\). By induction and the law of total expectation, this implies that \(\mathbb{E}[\Delta^i]\leq (1-\frac{1}{k^2|N|})^i \Delta^0\). Finally, \(\Delta^0 = val(w^0) - val(w^*) < k\cdot val(w^*)\) by Lemma 1. \(\square\)

Recall now that we want the value of the output solution \(val(w^r)\) to be within a factor of \((1+\varepsilon)\) from \(val(w^*)\) with probability at least \((1-\delta)\). The next lemma completes the analysis of the algorithm and the proof of the main theorem.

Lemma 5: If \(r=\lceil |N|k^2\ln(k/\varepsilon \delta) \rceil\), then \(\mathbb{P}[val(w^r) > (1+\varepsilon)val(w^*)]\leq \delta\).

Proof: By Lemma 4 and the choice of value \(r\), it follows that \(\mathbb{E}[\Delta^r]\leq \varepsilon\cdot \delta\cdot val(w^*).\)
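In detail, combining Lemma 4 with \(1-x\leq e^{-x}\) and the value of \(r\) gives

\[\mathbb{E}[\Delta^r] \leq k\Big(1-\frac{1}{k^2|N|}\Big)^{r} val(w^*) \leq k\, e^{-r/(k^2|N|)}\, val(w^*) \leq k\, e^{-\ln(k/\varepsilon\delta)}\, val(w^*) = \varepsilon\delta\cdot val(w^*).\]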

As the variable \(\Delta^r\) is non-negative, we can use Markov’s inequality:

\(\delta \geq \mathbb{P}[\Delta^r > \frac{\mathbb{E}[\Delta^r]}{\delta}] \geq \mathbb{P}[\Delta^r > \varepsilon\cdot val(w^*)] = \mathbb{P}[val(w^r) > (1+\varepsilon)\cdot val(w^*)],\) which is the claim. \(\square\)