# 5. A Phragmén-like Heuristic The [sequential Phragmén's method](4. Sequential Phragmén’s method.md) is fast, seems to work well in practice, and gives a solution that satisfies the Proportional Justified Representation (PJR) axiom. However, we identify two problems with it. First, it does not give a constant factor approximation to the [maximin support objective](3. The maximin support problem.md). Second, it lacks versatility, in the sense that it cannot be used to improve upon an arbitrary solution given as input. We describe a heuristic closely related to the sequential Phragmén's method, which takes as input an arbitrary partial solution, defines scores over the validator candidates, and uses them to add a new validator to the solution, removing another validator if necessary. Checking if a committee satisifies the PJR property is NP-hard. However, we define a stronger property called PJR($d$), which depends on a parameter $d$. We show that it implies PJR (for an adecuate value of $d$), and that it can checked efficiently, using our heuristic. We also build from our heuristic a "PJR($d$)-enabler"; that is, an algorithm that takes as input a solution of minimum support $d$, performs some swaps using the heuristic, and returns a solution of minimum support at least $d$ and observing PJR($d$) (and also PJR). Finally, we provide an efficient factor-3.15 approximation algorithm, which starts with an empty set and alternates between using our heuristic to elect a new validator, and running a flow balancing algorithm to improve the support distribution on the current set. ## Notation Recall that an instance of the NPoS election problem is a graph $(N\cup V, E)$ representing the trust relations between nominators and validator candidates, a vector $b\in\mathbb{R}_{\geq 0}^N$ of nominator budgets, and a target number $m$ of validators to elect. A support distribution from nominators to validators is represented by a vector $$w\in\mathbb{R}_{\geq 0}^E$$ of edge weights. Such a vector is called *affordable* if, besides non-negativity constraints, it observes the budget constraints $$\sum_{v\in V: nv\in E} w_{nv}\leq b_n$$ for each nominator $$n\in N$$. Furthermore, it is called *maximally affordable* with respect to a validator set $$S\subseteq V$$ if $$\sum_{v\in S: \ nv\in E} w_{nv} = b_n$$ for each $$n\in N$$ having at least one neighbor in $$S$$. By a *solution* we mean a pair $$(S,w)$$ where $$S\subseteq V$$ with $$|S|\leq m$$ and $$w$$ is maximally affordable for $S$. Our algorithm will return solutions of size $$m$$, but we will often consider solutions of smaller sizes in intermediate steps and in the analyses. Given an edge weight vector $$w$$ and a validator $$v\in V$$, the *support* of $v$ relative to $w$ is $$supp_w(v):=\sum_{n: nv\in E} w_{nv}$$. For a solution $(S,w)$, we extend this definition to $$supp_w(S):=\min_{v\in S} supp_w(v)$$. The *maximin support* objective is maximizing $$supp_w(S)$$ over all feasible solutions $$(S,w)$$ to the given instance. ## The basic heuristic Suppose that $(S,w)$ is a maximally affordable solution with $|S|\leq m$, and we wish to add a new candidate $v'\in V\setminus S$ and give it some support $d$. To do this, in general we will need to reduce the support of any other candidate $v \in S$ who has a neighbor nominator $n$ in common with $v'$. When we do this, we want to ensure that we do not reduce the support of $v$ below $d$ (assuming it was peviously above $d$). A simple way to ensure this is to reduce the weight on edge $nv$ from $w_{nv}$ to $w_{nv}\cdot(d/supp_w(v))$, and assign the difference to edge $nv'$. That way, it is clear that even if all nominators $n$ supporting $v$ are also neighbors of $v'$, the new support of $v$ does not go below $d$. Thus, if for each $n\in N$ and $d\geq 0$ we define the nominator's slack as \begin{align} slack_w(n,d):= & \begin{cases} b_n & \text{if }\nexists v\in S: nv \in E\\ \sum_{v\in S: \ nv\in E, \ supp_w(v)> d} w_{n,v}(1-d/supp_w(v)) & \text{otherwise } \end{cases}\\ = & b_n - \sum_{v\in S: \ nv\in E} w_{nv} \cdot\min\Big\{1, \frac{d}{supp_w(v)}\Big\} \end{align} and for each $v'\in V\setminus S$ and $d\geq 0$ we define that candidate's pre-score as $$prescore_w(v',d) := \sum_{n\in N: \ nv' \in E} slack_w(n,d)$$ then we can add $v'$ to the solution with support $prescore_w(v,d)$, while not making any other validator's support decrease from over $d$ to under $d$. In particular, if $prescore_w(v',d)\geq d$, the new solution $(S \cup \{v'\},w')$ has $$supp_{w'}(S \cup \{v'\}) \geq \min\{supp_w(S),d\}$$. The resulting heuristic, which adds a new candidate to the initial solution, is formalized below. **Algorithm: InsertCandidate$$(S,w,v',d)$$** 1. Let $$S'\leftarrow S \cup \{v\}$$ and $w'\leftarrow w$. 2. For all $n$ with $nv' \in E$: * If $\nexists v\in S$ with $nv\in E$, set $$w'_{nv'}\leftarrow b_n$$, otherwise set $$w'_{nv'}\leftarrow 0$$; * For all $v\in S$ with $w_{nv} > 0$ and $supp_w(v) > d$: increase $$w'_{nv'}$$ by $$w_{nv}(1-d/supp_w(v))$$, and set $$w'_{nv} \leftarrow w_{nv}(d/supp_w(v))$$; 3. Return $$(S',w')$$. The next result follows from the definitions and the discussion above and its proof is skipped. **Lemma 1:** If we run InsertCandidate($S,w,v',d$) for some maximally affordable solution $(S,w)$ with $|S|\leq m$, $v'\in V\setminus S$ and $d\geq 0$ to get $(S',w')$, then * $$(S',w')$$ is maximally affordable, * $$supp_{w'}(v')=prescore_w(v',d)$$, * for all $v\in S$ we have that $$supp_{w'}(v)\geq \min\{d, supp_w(v)\}$$, and consequently if $$prescore_w(v',d)\geq d$$, we obtain $$supp_{w'}(S') \geq \min \{d, supp_{w}(S) \},$$ * the running time of the algorithm is $O(|E|\cdot |S|)=O(|E|\cdot m)$. How high can we make $d$ and have the property $prescore_w(v',d)\geq d$ hold? We define $score_w(v')$ to be the maximum $d$ such that $prescore_w(v',d) \geq d$. Our heuristic now becomes apparent. **Heuristic:** *Given a maximally affordable solution $(S,w)$ with $|S|\leq m$, find the candidate $v'\in V\setminus S$ maximizing $score_w(v')$ and execute InsertCandidate($S,w,v',score_w(v')$), so that the new solution $(S',w')$ observes* $$supp_{w'}(S')\geq \min\{supp_w(S),\max_{v'\in V\setminus S} score_w(v')\}.$$ This is the core idea of our method. In the remainder of the section we establish how to find the candidate with the largest score efficiently. Fix a candidate $v'\in V\setminus S$, and consider the function $f(d):=prescore_w(v',d)-d$ in the interval $$[0,\infty)$$. Notice that this function is continuous and strictly monotone decreasing, and that $score_w(v')$ corresponds to its unique root. We can therefore approximate this root with binary search, as binary search works on any monotonic function. However, we can do better. Sort the set of support values $$\{supp_w(v): \ v\in S\}=\{d_1,d_2,\cdots,d_k\}$$ so that d_1 d^*} w_{nv} /supp_w(v) }.$$The interesting thing about the previous identity is that the right-hand side stays constant if we replace d^*=score_w(v') by any other value d within the same interval, among the above-defined intervals. This motivates us to define the following score function, for any v'\in V\setminus S and d\geq 0:$$score_w(v',d) :=\frac{\sum_{n: \ nv'\in E} (b_n - \sum_{v \in S: \ supp_w(v) \leq d} w_{nv})} {1+\sum_{n: \ nv' \in E} \sum_{v \in S: \ supp_w(v) > d} w_{nv} /supp_w(v) }.Function score(v',d) is very similar to, but algorithmically more convenient than function prescore(v',d). We remark that for d=0, the expression for 1/score_w(v',0) corresponds to the notion of score in sequential Phragmén's method. Hence, the latter can be seen as a special case of our approach. **Lemma 2.** Fix a maximally affordable solution (S,w) and a candidate v'\in V\setminus S: (i) Function score_w(v',d) is piece-wise constant with respect to d, namely it is constant in each of the intervals [0,d_1), \ [d_1, d_2), \cdots, [d_k, \infty), where the values d_1d^*\). (iii) The above defined value $$d^*$$ is equal to $$\max_{d\geq 0} score_w(v',d)$$. *Proof.* Let num(d) and denom(d) be respectively the numerator and the denominator in the definition of score_w(v',d). It is easy to check that if d_id^*\) and $$score_w(v',d)\neq d^*$$, then \begin{align} \frac{num(d^*) - num(d)}{denom(d^*) - denom(d)} = \frac{\sum_{n: \ nv'\in E} \sum_{v\in S: \ d^* d^*. \end{align} Therefore,num(d^*) - num(d) > d^*( denom(d^*) - denom(d)) = num(d^*) - d^*\cdot denom(d),$$and thus we conclude that d^* > num(d)/denom(d) = score(v',d). \square **Corollary 3.** Fix a maximally affordable solution (S,w). Then, (i) Function $$\max_{v'\in V\setminus S} score_w(v',d)$$ is constant in each of the above-defined intervals [0,d_1), \ [d_1, d_2), \cdots, [d_k, \infty). (ii) $$d^*:= \max_{v'\in V\setminus S} score_w(v')$$ is the unique root of function $$h(d):=\max_{v'\in V\setminus S} score_w(v',d) - d$$; moreover, h(d) is strictly positive for each $$dd^*$$. (iii) The above defined point $$d^*$$ is equal to $$\max_{d\geq 0} \max_{v'\in V\setminus S} score_w(v',d)$$. This corollary easily follows from the previous lemma, and its proof is skipped. Notice that the value $$d^*$$ is precisely what we need to find in our heuristic, and point (ii) establishes that we can find it with binary search. We define the explicit algorithm next. **Algorithm. CalculateMaxScore$$(S,w)$$** 1. Compute the values $$0=d_0 d_i} w_{nv} /supp_w(v). 4. For each v'\in V\setminus S, compute score_w(v',d_i)= \frac{\sum_{n\in N: \ nv'\in E} (b_n - \sum_{v \in S: \ supp_w(v) \leq d_i} w_{nv})}{1+\sum_{n\in N: \ nv'\in E} \sum_{v \in S: \ supp_w(v) > d_i} w_{nv} /supp_w(v) }. 5. Let \((d_{\max}, v_{\max})=(\max, \arg\max)_{v'\in V\setminus S} score_w(v',d_i)$$. 6. Let i' be the highest value such that $$d_{max}\geq d_{i'}$$, and set $$i_{\lo}\leftarrow \max\{i_{lo}, i'\}$$. If $$d_{\max < d_i}$$, set $$i_{\hi}\leftarrow i-1$$. 7. If i_{lo}< i_{hi}, set i\leftarrow \lceil (i_{lo}+i_{hi})/2 \rceil and go back to 3.; else, return $$(d_{\max}, v_{\max})$$. **Lemma 4:** CalculateMaxScore$$(S,w)$$ returns $$\max_{v'\in V\setminus S} score_w(v')$$ and a v' that attains that score, in O(\log |S|)=O(\log m) iterations, where each iteration takes time O(|E|). *Proof:* It is easy to verify that each iteration of the algorithm above executes in time O(|E|). Let $$d^*:=\max_{v'\in V\setminus S} score_w(v')=\max_{d\geq 0} \max_{v'\in V\setminus S} score_w(v',d)$$. From point (i) in the corollary, we can reduce our search for $$d^*$$ to only the evaluations of function $$\max_{v'\in V\setminus S} score_w(v',d)$$ over the O(m) points d_i. Moreover, by point (ii), we can perform binary search, so that if $$d_{\max}:=\max_{v'\in V\setminus S} score_w(v',d_i)$$ is larger than d_i then we can restrict our search to values d larger than the current d_i, and otherwise we can restrict our search to values smaller than the current d_i. This shows that only O(\log m) iterations are performed. Finally, by point (iii) we can also restrict our search to values d larger than the current d_{\max}, thus speeding up our search even more. To finish the proof, we remark that we choose to initialize the index i to zero because it seems to speed up the search in many implementations, but it can be initialized to any other value between i_{lo} and i_{hi}. \square When we have the score, we can insert candidate v_{\max} to the current solution (S,w) using InsertCandidate($$S,w,v_{\max},d_{\max}$$), thus obtaining a new solution $$(S',w')$$ with $$supp_{w'}(S')\geq \min\{supp_w(S), \max_{v'\in V\setminus S} score_w(v')\}$$, as desired. ## (Parameterised) Proportional Justified Representation. We can generalise the PJR property to our weighted votes setting and consider adding a parameter. For each nominator n\in N, let V_n\subseteq V be the subset of candidates that are trusted by n, i.e. $$V_n:=\{v\in V: \ nv\in E\}$$. **Definition:** A committee S (of any size) satisfies Proportional Justified Representation with parameter d (PJR(d) for short) if there is no subset N'\subseteq N of nominators and integer t>0 such that: a) \sum_{n\in N'} b_n \geq t\cdot d, b) |\cap_{n\in N'} V_n|\geq t, and c) |S\cap (\cup_{n\in N'} V_n)|0. **Algorithm.** LocalSearchForPJR(S,w, \varepsilon) 1. Let $$(d_{current}, v) \leftarrow (\min, \arg\min)_{v\in S} supp_w(v)$$. 2. Let $$d_{next}\leftarrow \min\{(1+\varepsilon)\cdot d_{current}, \sum_{n\in N} b_n / m\}$$. 3. Let $$(v_{\max}, d_{\max})\leftarrow CalculateMaxScore(S,w)$$. 4. If d_{\max}1, then the algorithm performs m\cdot O(1+ varepsilon^{-1}\log c) iterations, where each iteration executes in time O(|E|\cdot m). *Proof.* By the correctness of algorithm InsertCandidate(S,w,v_{\max}, d_{\max}), at the beginning of each iteration we have a maximally affordable solution of size m, where the new d_{current} is larger than the minimum between d_{current} and d_{next} in the previous iteration. Now, notice that d_{current} can never be greater than \sum_{n\in N} b_n/m, as the sum of supports cannot exceed the sum of budgets, and thus it is always the case that d_{next}\geq d_{current}. This shows that the minimum support of the current solution never decreases throughout the iterations, and proves point i). Next, if the algorithm eventually finalizes and outputs a solution (S',w') at step 4. with minimum support d', then the current value of d_{next} is d'' as defined in ii). By correctness of the algorithm CalculateMaxScore(S,w) we know that \max_{v'\in V\setminus S'} score_{w'}(v')0, an edge weight vector $$w\in\mathbb{R}_{\geq 0}^E$$ is an \varepsilon-MNMF for S if (i) w is maximally affordable, i.e. $$\sum_{v\in S: \ nv\in E} w_{nv}=b_n$$ for each n\in N having at least one neightbor in S, (ii) for any n\in N and any two neighbors v,v'\in S of it, if w_{nv}>0 then supp_w(v)\leq (1+\frac{\varepsilon}{5\cdot|S|})supp_{w}(v'), and (iii) For all affordable w', supp_{w'}(S)\leq (1+\epsilon)supp_w(S). In our note on [the min-norm max-flow problem](2. The min-norm max-flow problem.md), we provide more information about \varepsilon-MNMF vectors and present an algorithm MNMF(S,w,\varepsilon) that returns an \varepsilon-MNMF for S in polynomial time, and where the input vector w is optional. Consider the following algorithm. **Algorithm.** BalanceBetweenHeuristic() 1. Initialise S to the empty set and w to the empty vector; 2. For i from 1 to m: * Let (v,d)\leftarrow CalculateMaxScore(S,w); * Update (S,w)\leftarrow InsertCandidate(S,w,v,d); * Update w \leftarrow MNMF(S,w,\varepsilon); 3. Return (S,w). ### Analysis The main result of the section is showing that the previous algorithm offers a 3.15\cdot (1+\varepsilon)-factor approximation, and satisfies PJR. **Theorem 8:** The procedure BalanceBetweenHeuristic() returns a solution (S, w) for which $$supp_{w}(S) \geq d^*/(3.15\cdot (1+\varepsilon))$$, where $$d^*$$ is the maximin support across all solutions of the given NPoS election instance. Moreover, the solution satisfies PJR($$(1+\varepsilon)\cdot supp_{w}(S)$$) and, if \varepsilon \leq 1/m, PJR. We begin with a couple of needed technical result. **Observation.** For any 0\leq \varepsilon\leq 1 and any m\geq 1, we have the inequality$$\Big(1+\frac{\varepsilon}{5m}\Big)^{m} \leq 1+\frac{\varepsilon}{4}.$$**Proof:** The inequality $$1+x\leq e^x$$ holds for any real x. Replacing x by \varepsilon/(5m) and raising both sides to the power m, we obtain$$\Big(1+\frac{\varepsilon}{5m}\Big)^m \leq (e^{\frac{\varepsilon}{5m}})^m = e^{\frac{\varepsilon}{5}}.$$Finally, since the function $$f(\varepsilon):=e^{\varepsilon/5}$$ on the right-hand side is convex, within the range 0\leq \varepsilon \leq 1 it can be upper bounded by the linearization 1+(e^{1/5} -1)\varepsilon. It can be checked that $$e^{1/5} - 1 \leq 1/4$$, and the claim follows. \square **Proposition 9:** Let (S,w) be a solution where |S|< m and w is an \epsilon-MNMF of S for some 0\leq \varepsilon \leq 1, and let d^* be as in Theorem 8. Then, there exists a non-empty set T\subseteq V\setminus S with the property that for each 0\leq a\leq 1, there is a set of nominators N_a\subseteq N such that a) each n\in N_a has a neighbor in T, b) $$\sum_{n\in N_a}\geq (1-a)\cdot |T|\cdot d^*$$, and c) For each v\in S such that w_{nv}>0 for some n\in N_a, we have that $$supp_{w}(v)\geq \frac{a\cdot d^*}{1+\varepsilon/4}$$. *Proof.* Let $$m':=|S|$$, where m'0: Assuming otherwise that there is such a pair, the fact that n is in N_l implies that it has a neighbor v'\in S_l. But then we can apply the definition of \varepsilon-MNMF to obtain supp_w(v)/supp_w(v')\leq 1+\epsilon/(5m')=d_u/d_l, which contradicts the definitions of S_l and S_u. Next, by the fact that w is maximally affordable, and that each nominator in N_l has neighbors in S_l but gives no support to S_u, we have$$\sum_{n\in N_l} b_n \leq \sum_{v\in S_l} supp_w(v)< |S_l|\cdot d_l.Finally, if we define N_a\subseteq N_u as those n\in N_u that have a neighbor in T, then claim a) becomes evident, and claim c) follows from the fact that N_a has no neighbors in S_l (by def. of N_u) and all of its neighbors in S_u have a support of at least d_u>d_l\geq ad^*/(1+\epsilon/4). Hence, it only remains to prove claim b). Assume without loss of generality that $$w^*$$ provides a support of exactly $$d^*$$ to each $$v\in S^*$$, by possibly capping its edge weights, and define the edge weight vector w' by capping w arbitrarily such that supp_{w'}(v)=d_l if v\in S_u, and 0 otherwise (these vectors are affordable but in general not maximally affordable). Define $$f:=w^*-w'\in\mathbb{R}^E$$, which we consider as a flow over the network induced by $$N\cup S\cup S^*$$. Clearly, the net excess of set N relative to f is md^* - |S_u|d_l, the net excess of set N_l is at most \sum_{n\in N_l} b_n < |S_l|d_l, and the net demand of set S_u is $$|S^*\cap S_u|(d^*-d_l)$$. By subtracting the last two terms from the first one, we obtain that the flow going from N_u to $$S^*\setminus S_u$$ is at least \begin{align} &md^* - |S_u|d_l - |S_l|d_l - |S^*\cap S_u|(d^*-d_l) \\ &=md^* - m'd_l - |S^*\cap S_u|(d^*-d_l) \\ & \geq m(d^*-d_l) - |S^*\cap S_u|(d^*-d_l) \\ &\geq |S^* \setminus S|(d^*-d_l) \\ &\geq |T|d^*(1-a). \end{align} A key observation now is that none of the flow originating at N_u can pass by, or end in, S_l \cup N_l. This is because there are no edges between N_u and S_l \cup N_l, by definition of N_u; and even though the flow can pass by S_u, there is no flow from S_u to S_l\cup N_l in $$f=w^*-w'$$ because w' provides no flow from S_l\cup N_l to S_u. Therefore, the formula above is actually a lower bound on the flow going from N_u to $$T=S^*\setminus S$$. Finally, we notice that if we decompose flow f into simple paths, any path from N_u to T must have the last edge originating in N_u, or more specifically in N_a. This proves that $$\sum_{n\in N_a} b_n \geq |T|d^*(1-a)$$, which is claim b), and concludes the proof of the proposition. \square **Lemma 6:** Given a solution (S,w) with |S| \leq m that satisfies Condition 1, there exists a v \notin S with score_w(v) \geq d^*/4(1+\epsilon/2). **Proof:** Apply Proposition 1 and set a=1/2, then by using that for any a_1,\dots, a_n, there exists an a_i with a_i \geq \sum_i a_i/n, there is a v \in T such that the setA_{v,d^*/2}:=\Big\{n\in N \ | \ n,v \in E \text{ and }\forall v'\in V, w_{n,v'} > 0 \Rightarrow supp_{w}(v') \geq \frac{d^*}{2(1+\epsilon/2)} \Big\}$$has \sum_{n \in A_{v,d^*/2}} b_n \geq d^*/2. Now for any n \in A_{v,d^*/2}, slack(n,d^*/4(1+\epsilon/2))= \sum_{v:(n,v) \in E} w_n,v (1-d^*/4(1+\epsilon/2) supp_w(v)) \geq b_n/2 and so prescore(v,d^*/4(1+\epsilon/2)) \geq \sum_{n \in A_{v,d^*/2}} slack(n, d^*/4(1+\epsilon/2)) \geq d^*/4 > d^*/4(1+\epsilon/2). Thus score_w(v) \geq d^*/4(1+\epsilon/2). We can do better than this by using different a. **Lemma 7** Given a solution (S,w) with |S| \leq m that satisfis Condition 1, there exists a v \notin S with score_w(v) \geq d^*/3.15(1+epsilon/4). **Proof:** The following will be crucial: **Lemma 8** Consider a finite sum \sum_i f(x_i) a_i, where f is strictly increasing with derivative f'(x), a_i \geq 0 for all i and for some y \leq \min_i x_i, f(y) = 0, then$$\sum_i f(x_i) a_i = \int_{y}^infty f'(x) (\sum_{i:x_i \geq x} a_i) dx$$**Proof:** We can write the sum as a Lebesgue integral over the measure with weights a_i, obtaining that:$$\sum_i f(x_i) a_i = \int_0^\infty (\sum_{i:f(x_i) >t} a_i ) dtThe conditions on f are enough for it to be invertible with derivative df^{-1}/dt=1/f'(f^{-1}(t)) and f^{-1}(0)=y, so we can substiture x=f^{-1} t into the above to obtain: \begin{align*} \sum_i f(x_i) a_i & = \int_0^\infty (\sum_{i:f(x_i)} >t a_i ) dt \\ &= \int_y^\infty f'(x) (\sum_{i:f(x_i) >f(x)} a_i ) dx \\ &= \int_y^\infty f'(x) (\sum_{i:x_i > x} a_i ) dx \\ &= \int_y^\infty f'(x) (\sum_{i:x_i \geq x} a_i ) dx \end{align*} For some 0 \leq b \leq 1, to be determined later, we consider \sum_{v \in T} prescore(w,b d^*), we have\sum_{v \in T} prescore(w,b d^*) \geq \sum_{n: \exists v \in T, (n,v) \in E} slack(n, b d^*) \geq \sum_{n \in N_b} slack(n,b d^*) \; .$$Now for n \in N_b, let supp(n)= \max_{v':w_{n,v'} > 0} supp_w(n) and supp(n)=\infty if w_{n,v'}=0 for all v'. We certainly have supp(n) \geq b d^* from the definition of N_b. More generally for n \in N_b, n \in N_a if and only if supp(n) \geq a d^*. We have$$slack(n, b d^*) = b_n - \sum_{v' \in S} w_n,v bd^*/ supp_w(v') \geq b_n (1- bd^*/supp(n))and so using Lemma 8, \begin{align*} \sum_{v \in T} prescore(w,b d^*) & \geq \sum_{n \in N_b} slack(n,b d^*) \\ & \geq \sum_{n \in N_b} b_n (1- bd^*/supp(n)) \\ &= \int_{b d^*}^{infty} (bd^*/x^2) (\sum_{n \in N_{x/d^*}} b_n) dx \\ & \geq \int_{b}^{1} (b/a^2) (1-a)|T| d^*/(1+\epsilon/2) da \\ & = (bd^*|T|/(1+\epsilon/2) \int_{b}^1 1/a^2 - 1/a da \\ &= (bd^*|T|/(1+\epsilon/2) (1/b-1+ln b) \\ &= (d^*|T|/(1+\epsilon/3) (1-b+bln b) \end{align*} So for any b with b(2-ln b) \leq 1, we have that there is an v \in T with \begin{align*} prescore_w(b, b d^*/(1+\epsilon/4)) & \geq prescore_w(b, b d^*) \\ & \geq (d^*/(1+\epsilon/4) (1-b+bln b) \\ & \geq b d^*/(1+\epsilon/4) \end{align*} In particular, this holds for b=1/3.15. Thus there exists a v \notin S with score_w(v) \geq d^*/3.15(1+\epsilon/4). \square Now we can prove Theorem 1 **Proof of Theorem 1:** For a set S, we define d^*_S to be the maximum over afforable w of supp_w(S).In order not to lose error with condition 1 (ii), we need: **Lemma 9:** Let (S,w) be a solution with |S| \leq m that satisfies Condition 1. Let (S',w') be a solution of any size with S \subseteq S' and for all v \in S with supp_{w'}(v) < sepp_w(v), we have supp_{w'}(v) \geq d for some d. Then d^*_{S'} defined similarly has d^*_{S'} \geq \min \{d^*_S, d/(1+\epsilon/4)\}. **Proof:** Note that (1+\epsilon/12m)^{m+1} \leq (1+\epsilon/4). Since there are at most m values of supp_w(v) for v \in S, by the pidgeon hole principle, there must be an x=d/(1+\epsilon/12m)^i for 1 \leq i \leq m such that no v \in S has x < supp_w(v) \leq x(1+\epsilon/12m). Let X \subseteq S be the set of v \in S with supp_v(w) \leq x. Let N_X=\{n:\exists v \in X, (n,v) \in E\}. By condition 1, any n \in N_X has w_{n,v'}=0 for v' \notin X. By our construction, we also have w'_{n,v'} =0 for v \notin X. If X is empty, then d^*_S \geq supp_w(S) \geq x \geq d/(1+\epsilon/3) and the construction gives a w' with d^*_{S'} \geq supp_{w'}(S') \geq d/(1+\epsilon/3)=\min \{d^*_S, d/(1+\epsilon/3)\} and we are done. If X is non-empty, then we claim that d^*_S=d^*_{S'}=d^*_X. Let w^*_S,w^*_{S'} and w^*_X be affordable assignments that achieve these. Then by optimality of d^*_X, d^*_S \leq supp_{w^*_S}(X) \leq d^*_X and similarly d^*_{S'} \leq d^*_X. On the other hand, we have that|X| d^*_X \leq \sum_{v \in X} supp_{w^*_X} \leq \sum_{n \in N_X} b_n = \sum_{v \in X} supp_{w} (X) \leq |X| x$Now consider modifying$w$or$w'$by setting$w_{n,v}$for$v \in X, $n \in N_X$ to be $(w^*_X)_{n,v}$. Since $n \in n_X$ had $w_{n,v}=w'_{n,v} =0$ for $v \notin X$, this is still affordable and the supports for $v \notin X$ remain the same, that is $> x$. So these have minimum support $d^*_X$, and we have $d^*_{S'}=d^*_S$, which gives the the lemma. Now we claim inductively that $d^*_{S_i} \geq d^*/3.15(1+\epsilon/4)^2$. Lemma 7 implies that there is a $v \notin S_{i-1}$ with $score_w(v) \geq d^*/3.15(1+\epsilon/4)$. We add this to $w$, while not reducing the support of any $v$ to below $d^*/3.15(1+\epsilon/4)$. Lemma 9 gives that $d_{S_i} \geq d^*/3.15(1+\epsilon/4)^2$ The induction gives that $d^*_{S_m} \geq d^*/3.15(1+\epsilon/4)^2$. Stare balances gives a $w_m$ that satiusfies condtion 1 (ii) and so $supp_{w_m}(S_m) \geq d^*_{S_m}/(1+\epsilon/4) \geq d^*/3.15(1+\epsilon/4)^3 \geq d^*/3.15(1+\epsilon)$. For the PJR claim, suppose that $S_m$ does not satify PJR($d$) for some $d$. Then nor do any $S_i$, since they are subsets of $S_m$. So by Lemma 5, for every $0 \leq i \leq m-1$, there is a $v_i \notin S_i$ with $score_{w_i}(v_i) \geq d$. The same argument as above gives that $supp_{w_m}(S_m) \geq d/(1+\epsilon/4)^2 < d/(1+\epsilon)$. It follows that $S_m$ does satisfy PJR($supp_{w_m}(S_m)(1+epsilon)$). Suppose that $S_m$ does not satisfy PJR but $\epsilon \leq 1/m$. Then by Lemma 5, there exists a $v \notin S$ with $score_{w_n}(v) \geq \sum_n b_n/m$. By Lemmas 2 and 1, we can run InsertNewCandidate($w_n, score_{w_n}(v)$), to obatian a $w'$ such that $\supp_{w'}(S_m \cup \{v\}) = \min \{ score_{w_n}(v), supp_{w_m}(S_m) \}$. But since $|S_m \cup \{v\}|=m+1$, we must have $\supp_{w'}(S_m \cup \{v\}) \leq \sum_n b_n/(m+1)$. Thus we get that $supp_{w_m}(S_m) \leq \sum_n b_n/(m+1)$. So if $\epsilon \leq 1/m$, then $supp_{w_m}(S_m) (1+\epsilon) \leq supp_{w_m(S_m) (m+1)/m \leq sum_n bn /m$ and since PJR ($d$) implies PJR($d'$) for $d' > d$, we have PJR. This means that $\epsilon \leq 1/m$ implies PJR.