
The satisficing secretary problem: when closed-form solutions meet simulated annealing

Roberto Brera [email protected] Feng Fu [email protected] Department of Mathematics, Dartmouth College, Hanover, NH 03755, USA Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Lebanon, NH 03756, USA
Abstract

The secretary problem has been the focus of extensive study, with a variety of extensions that offer useful insights into the theory of optimal stopping. The original solution is to set a single stopping threshold that gives rise to an immediately rejected sample $r$ out of the candidate pool of size $n$, and to accept the first subsequently interviewed candidate who bests the prior $r$ rejected. In reality, it is not uncommon to draw a line between job candidates that distinguishes those above the line from those below it. Here we consider such a satisficing secretary problem, which also counts suboptimal choices (finding any of the top $d$ candidates) as hiring success. We use multiple stopping criteria $(r_1, r_2, \cdots, r_d)$ that sequentially lower the expectation whenever the prior selection criterion yields no choice. We calculate the probability of securing any of the top $d$ candidates in closed-form solutions, which are in excellent agreement with computer simulations. An exhaustive search for the optimal stopping thresholds must cover a large parameter space and can thus quickly incur astronomically large computational times. In light of this, we apply the effective computational method of simulated annealing to find optimal $r_i$ values; it performs comparably to the exhaustive search while requiring significantly less computing time. Our work sheds light on maximizing the likelihood of securing satisficing (suboptimal) outcomes using an adaptive search strategy in combination with computational methods.

keywords:
optimal stopping, optimization, adaptive decision-making, applied probability
journal: arXiv

1 Introduction

The secretary problem presents a conceptually simple yet mathematically elegant model for studying optimal decision-making strategies that maximize the probability of finding the best choice (or match) in a range of real-world scenarios, from getting the best job candidate to finding the best partner. Since this now well-known problem first appeared in the February 1960 issue of Scientific American, substantial progress concerning the solution and its different variations has been made [18, 17, 39]. Over the past decades, systematic efforts on the secretary problem have greatly helped spur interest from diverse fields, with broad implications for informing practices aimed at finding “the right one(s)” under various constraints [19, 15, 45, 1, 37, 41, 29, 7, 32, 46, 9, 16, 13, 6]. Moreover, the secretary problem has found wide application in online trading [27], online auctions [24], and customer behavior [48].

In 1961, Lindley published the first solution to the standard problem [30]. Since then, the secretary problem has been extended and generalized in numerous ways, including, among others, interview costs [4, 31], finite memory length [38, 44], uncertainty in employment [43], observation errors [42], and the repeated secretary problem [22]. The problem has also been studied from various mathematical perspectives, such as linear programming [10], submodular functions [5], random walks [25], and a graph-theoretic approach [28].

More specifically, one important line of extensions allows multiple choices [34, 36], or in general $d$ choices [21]. Chow et al. studied a selection model minimizing the expected rank of the selected candidate, which approaches $\approx 3.87$ [14], and Smith elaborated on the case of uncertain employment [43], where each candidate accepts an offer only with a fixed probability $p$. Furthermore, Buchbinder, Jain, and Singh worked on the $J$-choice $K$-best secretary problem [10], where the decision maker is allowed $J$ selections and needs to choose a candidate amongst the top $K$ to achieve success. Other multiple-selection variations have been explored by Albers [2] (where the decision maker must maximize the qualitative sum of a given number of selections) and Glasser [21], who computes the probabilities of selecting all of the $d$ best candidates when given $d$ possibilities. Smith relaxed different criteria by allowing, at any time, the availability of the last $m\geq 1$ interviewed candidates for selection, whilst maintaining all other standard rules [44]. These variations of the problem, together with several others, have also been studied by Freeman [18] and Samuels [40].

Building on prior studies, the present work concerns a variation of the secretary problem. In the original problem, we are required to choose the best candidate out of a pool of $n$ candidates, where each can be interviewed only once and must be immediately accepted or rejected. Our variation of the problem takes as additional input a tolerance level $d$, and success is defined as choosing any of the top $d$ candidates out of the total $n$ candidates.

To inform our approach to the generalized secretary problem, we first look to the standard problem's solution, also known as the optimal stopping problem. In order to maximize the likelihood of choosing the best candidate, we implement a strategy where the first $r-1$ candidates are automatically rejected (to obtain a sample of the pool), and then the first candidate who bests the rejected group is accepted. We can compute the value of $r$ that maximizes the probability of selecting the best candidate in the pool.

For a specific value of $r$, this probability is given by:

$$
\begin{aligned}
P(r) &= \sum_{i=r}^{n} P(i \text{ is the best candidate}) \cdot P(\text{success at selecting } i \mid i \text{ is the best candidate}) \\
&= \sum_{i=r}^{n} \frac{1}{n} \cdot P(\text{best candidate before } i \text{ lies in the } r-1 \text{ group}) \\
&= \frac{1}{n} \sum_{i=r}^{n} \frac{r-1}{i-1} \\
&= \frac{r-1}{n} \sum_{i=r}^{n} \frac{1}{i-1}.
\end{aligned}
$$

As $n\to\infty$, we can approximate the summation by an integral, with $x=\lim_{n\to\infty}\frac{r-1}{n}$, $t=\lim_{n\to\infty}\frac{i-1}{n}$, and $dt=\lim_{n\to\infty}\frac{1}{n}$. Then, we can take the derivative with respect to $x$ to find the approximate value of $x\approx\frac{r-1}{n}$ for which $P(x)$ is maximized:

P(x)=xx11t𝑑t=xln(x)P(x)=x\int_{x}^{1}\frac{1}{t}dt=x\ln(x)
ddxP(x)=ln(x)+1=0.\frac{d}{dx}P(x)=\ln(x)+1=0.

Thus, $x=e^{-1}$, and the optimal choosing strategy succeeds with probability about $e^{-1}$ for large $n$, achieved when $r\approx\left\lfloor\frac{n}{e}\right\rfloor+1$.
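This $1/e$ rule is easy to check numerically. Below is a minimal simulation sketch (the function name and trial count are our own choices):

```python
import random

def classic_secretary(n, r, trials=100_000):
    # Estimate the success probability of the stop-at-r rule by simulation.
    # Ranks are a random permutation of 1..n, with n denoting the best
    # candidate. The first r - 1 candidates are rejected; the first later
    # candidate who beats all of them is selected. Success means that the
    # selected candidate is the overall best.
    successes = 0
    for _ in range(trials):
        ranks = random.sample(range(1, n + 1), n)
        threshold = max(ranks[: r - 1])  # best among the rejected sample
        for rank in ranks[r - 1:]:
            if rank > threshold:
                successes += rank == n
                break
    return successes / trials

# For n = 100 and r = floor(n/e) + 1 = 37, this prints roughly 0.37, close to 1/e.
print(classic_secretary(100, 37))
```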

In contrast to this standard case with its strict criterion of success (finding only the best candidate), it is not uncommon that a job search is considered a success as long as any of the top $d$ candidates is found. We focus our present study on this “satisficing” secretary problem. We propose an adaptive search strategy with multiple stopping criteria $(r_1, r_2, \cdots, r_d)$. For each interval $[r_s, r_{s+1})$, the interviewer (decision-maker) lowers their expectation, now aiming to accept any candidate ranking at least $s$th-best relative to those previously seen, whenever the search throughout the prior interval $[r_{s-1}, r_s)$ yields no selection. We are able to derive the probability of success, i.e., finding any of the top $d$ admissible candidates, in closed form for any given parameter choices of the $r_i$ values, $i=1,\cdots,d$.

The paper is organized as follows. Section 2 presents the model details of the satisficing secretary problem. We first present the simple case $d=2$, which provides heuristics for deriving closed-form solutions for the general case $d\geq 2$. We also compare analytical results with computer simulations. Section 3 presents specific results on the optimization of multiple stopping criteria $(r_1,\cdots,r_d)$ using the effective computational method of simulated annealing, as compared to the brute force method. We conclude and discuss the work in Section 4.

2 Model and methods

2.1 Satisficing secretary problem with top $d$ admissible candidates

When success is relaxed to the selection of any one amongst a set of $d$ equally admissible candidates, correspondingly relaxed selection criteria yield a greater probability of success. Thus, we implement a strategy where the first $r_1-1$ candidates are automatically rejected, and then, for a set of stopping points $(r_1, r_2, \ldots, r_d)$, between positions $r_i$ and $r_{i+1}-1$ inclusive we are willing to accept any candidate ranking amongst the best $i$ relative to the currently rejected pool.

In this section, we show that the probability of selecting one amongst the top $d$ candidates out of a pool of $n$ total candidates, using this strategy with a set of stopping points $(r_1, r_2, \ldots, r_d)$, is given by

$$P(r_{1},r_{2},\ldots,r_{d})=\sum_{s=1}^{d}\,\prod_{i=1}^{s-1}\frac{r_{i}-i}{r_{s}-i}\cdot P(s,r_{s},r_{s+1}),$$

where

$$P(s,r,r^{\prime})=d!\prod_{l=0}^{d-1}\frac{1}{n-l}\left(\sum_{k=0}^{s-1}\binom{r-1}{k}\sum_{i=r}^{r^{\prime}-1}\binom{n-i}{d-k-1}\prod_{j=k+1}^{s}\frac{r-j}{i-j}+\sum_{k=s}^{d-1}\binom{r-1}{k}\sum_{i=1}^{d-k}\binom{r^{\prime}-r}{i}\binom{n-r^{\prime}+1}{d-k-i}S_{s}(k+i,i)\right)$$

and

$$S_{s}(k,m)=1-\prod_{i=0}^{s-1}\frac{k-m-i}{k-i}.$$

For heuristic reasons, we begin this section by first considering the simplified case where $d=2$.

2.2 The two admissible candidates case: probability computations

As a first step towards generalization, we consider the first case with weaker success criteria, where we are required to select one of the best two candidates in the pool, with the same rules as the standard case. In order to maximize the success probability, we naturally extend the standard strategy to two strategies, whose probability distributions will motivate the implementation of a third, adaptive search strategy, which can lead to even better success probabilities.

2.2.1 First strategy: stopping at the best candidate interviewed so far

At first glance, we may want to use the same exact strategy as the standard case, and benefit from the weaker success criteria:


Strategy 1: Reject the first $r-1$ candidates, for $r\geq 2$, and then select the first candidate who bests the pool of rejected candidates.


Defining “success” as the selection of the best or second-best candidate, and assuming that these two candidates are located at positions $n_1$ and $n_2$, respectively, there are only three subcases where success is possible:

Subcase 1: $r\leq n_{1}<n_{2}\leq n$
Subcase 2: $r\leq n_{2}<n_{1}\leq n$
Subcase 3: $n_{2}<r\leq n_{1}\leq n$

We can compute $P(\text{Success}\cap\text{Subcase 1})$ by considering the probability of success when the two candidates of interest are at some fixed positions $n_1$ and $n_2$, and then letting $n_1$ and $n_2$ iterate through all their possible positions within Subcase 1:

$$
\begin{aligned}
P(\text{Success}\cap\text{Subcase 1}) &= \sum_{n_{1}=r}^{n-1}\;\sum_{n_{2}=n_{1}+1}^{n}P(\text{Success}\cap\text{best two candidates at } n_{1} \text{ and } n_{2}) \\
&= \sum_{n_{1}=r}^{n-1}\;\sum_{n_{2}=n_{1}+1}^{n}P(\text{best two candidates at } n_{1} \text{ and } n_{2})\cdot P(\text{Success}\mid\text{best two candidates at } n_{1} \text{ and } n_{2}) \\
&= \sum_{n_{1}=r}^{n-1}(n-n_{1})\cdot\frac{1}{n}\cdot\frac{1}{n-1}\cdot P(\text{best candidate before } n_{1} \text{ lies in the } r-1 \text{ group}) \\
&= \frac{1}{n(n-1)}\sum_{n_{1}=r}^{n-1}(n-n_{1})\cdot\frac{r-1}{n_{1}-1}.
\end{aligned}
$$

By similar reasoning, we obtain the same expression for $P(\text{Success}\cap\text{Subcase 2})$. Moreover, because no candidate between $r$ and $n_1$ can best $n_2$, it follows that

$$
\begin{aligned}
P(\text{Success}\cap\text{Subcase 3}) &= P(\text{Subcase 3})\cdot P(\text{Success}\mid\text{Subcase 3}) \\
&= (r-1)(n-r+1)\cdot\frac{1}{n}\cdot\frac{1}{n-1}.
\end{aligned}
$$

Because success can only occur within these three subcases, the total probability of success for Strategy 1 is given by

$$
\begin{aligned}
P(r) &= P(\text{Success}\cap\text{Subcase 1})+P(\text{Success}\cap\text{Subcase 2})+P(\text{Success}\cap\text{Subcase 3}) \\
&= \frac{r-1}{n(n-1)}\left((n-r+1)+2\sum_{i=r}^{n-1}\frac{n-i}{i-1}\right) \\
&= \frac{r-1}{n}\left(2\sum_{i=r}^{n}\frac{1}{i-1}-\frac{n-r+1}{n-1}\right) \\
&\approx 2\left(\frac{r-1}{n}\right)\sum_{i=r}^{n}\frac{1}{i-1}.
\end{aligned}
$$

Notice that this success probability is roughly twice the one computed for the strict case in the Introduction.

2.2.2 Second strategy (satisficing): stopping at best or second-to-best candidates interviewed so far

Alternatively, we could also implement the following strategy:


Strategy 2: Reject the first $r-1$ candidates, for $r\geq 3$, and then select the first candidate who is the best or second-best relative to them.


Using this strategy, there is one further subcase where success is possible that we need to consider:

Subcase 4: $n_{1}<r\leq n_{2}\leq n$

$P(\text{Success}\cap\text{Subcase 1})$ can be computed using a method similar to the one used for Strategy 1:

$$
\begin{aligned}
P(\text{Success}\cap\text{Subcase 1}) &= \sum_{n_{1}=r}^{n-1}\;\sum_{n_{2}=n_{1}+1}^{n}P(\text{best two candidates at } n_{1} \text{ and } n_{2})\cdot P(\text{Success}\mid\text{best two candidates at } n_{1} \text{ and } n_{2}) \\
&= \sum_{n_{1}=r}^{n-1}(n-n_{1})\cdot\frac{1}{n}\cdot\frac{1}{n-1}\cdot P(\text{best two candidates before } n_{1} \text{ lie in the } r-1 \text{ group}) \\
&= \frac{1}{n(n-1)}\sum_{n_{1}=r}^{n-1}(n-n_{1})\cdot\frac{r-1}{n_{1}-1}\cdot\frac{r-2}{n_{1}-2}.
\end{aligned}
$$

By similar reasoning, we obtain the same expression for $P(\text{Success}\cap\text{Subcase 2})$.

Moreover, by the same method:

$$
\begin{aligned}
P(\text{Success}\cap\text{Subcase 3}) &= \sum_{n_{2}=1}^{r-1}\;\sum_{n_{1}=r}^{n}\frac{1}{n}\cdot\frac{1}{n-1}\cdot P(\text{second-best candidate before } n_{1} \text{ lies in the } r-1 \text{ group}\mid n_{2} \text{ lies in the } r-1 \text{ group}) \\
&= \frac{r-1}{n(n-1)}\sum_{n_{1}=r}^{n}\frac{r-2}{n_{1}-2}.
\end{aligned}
$$

And, again, by similar reasoning, we obtain the same expression for $P(\text{Success}\cap\text{Subcase 4})$.

Hence, the total probability of success for Strategy 2 is given by

$$P(r)=\frac{2(r-1)(r-2)}{n(n-1)}\left(\sum_{i=r}^{n}\frac{1}{i-2}+\sum_{i=r}^{n-1}\frac{n-i}{(i-1)(i-2)}\right).$$

2.2.3 Comparing the two strategies

Using the formulae derived in sections 2.2.1 and 2.2.2, we have graphed the probability of success against the $r$-value for both strategies and compared them with computer simulations averaged over $100{,}000$ trials (see the Appendix for more details regarding the code implemented), as shown in Figure 1.

Figure 1: Probability of success for Strategies 1 and 2 for $d=2$ and $n=100$. Analytical results (solid curves) agree well with computer simulations (symbols).
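The analytical curves in Figure 1 come from evaluating the two closed forms directly; a minimal sketch (the function names are ours):

```python
def P1(r, n):
    # Strategy 1 closed form (section 2.2.1); requires r >= 2.
    return ((r - 1) / n) * (
        2 * sum(1 / (i - 1) for i in range(r, n + 1)) - (n - r + 1) / (n - 1)
    )

def P2(r, n):
    # Strategy 2 closed form (section 2.2.2); requires r >= 3.
    return (2 * (r - 1) * (r - 2) / (n * (n - 1))) * (
        sum(1 / (i - 2) for i in range(r, n + 1))
        + sum((n - i) / ((i - 1) * (i - 2)) for i in range(r, n))
    )

n = 100
best1 = max(range(2, n + 1), key=lambda r: P1(r, n))
best2 = max(range(3, n + 1), key=lambda r: P2(r, n))
print(best1, P1(best1, n))  # Strategy 1 peaks at a lower r ...
print(best2, P2(best2, n))  # ... than Strategy 2, as seen in Figure 1.
```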

2.2.4 A third strategy (adaptive search strategy)

From the data shown in Figure 1, it is clear that Strategy 1 significantly outperforms Strategy 2 at low $r$-values, while the opposite holds at high $r$-values. This motivates us to merge the two strategies to further optimize performance:


Adaptive search strategy (Strategy 3): Implement Strategy 1, rejecting the first $r_1-1$ candidates. If no selection is made by the time the $(r_2-1)$th candidate has been reached, switch to Strategy 2 for the remaining candidates.


The probability of success with this strategy is given by

$$
\begin{aligned}
&P(\text{Success with Strategy 1 on first } r_{2}-1 \text{ candidates}) \\
&\quad+P(\text{no selection made with Strategy 1}\cap\text{Success with Strategy 2}).
\end{aligned}
$$

For the second term, we note:

$$
\begin{aligned}
P(\text{no selection made with Strategy 1}&\cap\text{Success with Strategy 2}) \\
&= P(\text{no selection made with Strategy 1})\cdot P(\text{Success with Strategy 2}) \\
&= P(\text{best candidate before } r_{2} \text{ lies before } r_{1})\cdot P(\text{Success with Strategy 2}) \\
&= \frac{r_{1}-1}{r_{2}-1}\cdot P(\text{Success with Strategy 2}),
\end{aligned}
$$

where, to evaluate $P(\text{Success with Strategy 2})$, it suffices to set $r=r_{2}$ in the formula at the end of section 2.2.2.

For the first term, we need a slight generalization of the formula in section 2.2.1.

2.2.5 Strategy 1 on a restricted sample

In this section, our aim is to compute the probability of achieving success with Strategy 1 on a restricted sample of $r_{2}-1$ candidates within the total sample of $n$ candidates. Analogously to section 2.2.1, we tackle the problem by first identifying all subcases where success is possible, considering the respective positions $n_1$ and $n_2$ of the best and second-best candidates:

Subcase 1: $r_{1}\leq n_{1}<n_{2}\leq r_{2}-1$
Subcase 2: $r_{1}\leq n_{2}<n_{1}\leq r_{2}-1$
Subcase 3: $n_{2}<r_{1}\leq n_{1}\leq r_{2}-1$
Subcase 4: $r_{1}\leq n_{2}\leq r_{2}-1<n_{1}$
Subcase 5: $r_{1}\leq n_{1}\leq r_{2}-1<n_{2}$

For the first two subcases, it suffices to repeat a proof similar to the one for Subcases 1 and 2 of section 2.2.1, with restricted ranges for $n_1$ and $n_2$:

$$
\begin{aligned}
P(\text{Success}\cap\text{Subcase 1 or 2}) &= 2\sum_{n_{1}=r_{1}}^{r_{2}-2}\;\sum_{n_{2}=n_{1}+1}^{r_{2}-1}P(\text{Success}\cap\text{best two candidates at } n_{1} \text{ and } n_{2}) \\
&= \frac{2}{n(n-1)}\sum_{n_{1}=r_{1}}^{r_{2}-2}(r_{2}-1-n_{1})\cdot\frac{r_{1}-1}{n_{1}-1}.
\end{aligned}
$$

Similarly, $P(\text{Success}\cap\text{Subcase 3})=\frac{(r_{1}-1)(r_{2}-r_{1})}{n(n-1)}$, and, by similar reasoning:

$$P(\text{Success}\cap\text{Subcase 4 or 5})=\frac{2(n-r_{2}+1)}{n(n-1)}\sum_{n_{1}=r_{1}}^{r_{2}-1}\frac{r_{1}-1}{n_{1}-1}.$$

Hence, after simplifying:

$$P(\text{Success with Strategy 1 on first } r_{2}-1 \text{ candidates})=\frac{r_{1}-1}{n(n-1)}\left((r_{2}-r_{1})+2\sum_{i=r_{1}}^{r_{2}-1}\frac{n-i}{i-1}\right).$$

2.2.6 Strategy 3 probability computation

We then compute the total probability of success $P(r_{1},r_{2})$ for Strategy 3:

$$
\begin{aligned}
P(r_{1},r_{2}) &= \frac{r_{1}-1}{n(n-1)}\left(2(r_{2}-2)\left(\sum_{i=r_{2}}^{n}\frac{1}{i-2}+\sum_{i=r_{2}}^{n-1}\frac{n-i}{(i-1)(i-2)}\right)+(r_{2}-r_{1})+2\sum_{i=r_{1}}^{r_{2}-1}\frac{n-i}{i-1}\right) \\
&= \frac{r_{1}-1}{n(n-1)}\left(2(r_{2}-2)\sum_{i=r_{2}}^{n}\frac{n-1}{(i-1)(i-2)}+(r_{2}-r_{1})+2\sum_{i=r_{1}}^{r_{2}-1}\left(\frac{n-1}{i-1}-1\right)\right) \\
&= \frac{r_{1}-1}{n(n-1)}\left(2(r_{2}-2)(n-1)\left(\sum_{i=r_{2}}^{n}\frac{1}{i-2}-\sum_{i=r_{2}}^{n}\frac{1}{i-1}\right)+(r_{1}-r_{2})+2\sum_{i=r_{1}}^{r_{2}-1}\frac{n-1}{i-1}\right) \\
&= \frac{2(r_{1}-1)}{n(n-1)}\left((n-r_{2}+1)+\frac{r_{1}-r_{2}}{2}+\sum_{i=r_{1}}^{r_{2}-1}\frac{n-1}{i-1}\right).
\end{aligned}
$$
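A direct transcription of this closed form (the function name is ours) reproduces the entries of Table 1 below; for example:

```python
def P_strategy3(r1, r2, n):
    # Strategy 3 closed form (section 2.2.6); requires 2 <= r1 < r2 <= n.
    tail = sum((n - 1) / (i - 1) for i in range(r1, r2))
    return (2 * (r1 - 1) / (n * (n - 1))) * ((n - r2 + 1) + (r1 - r2) / 2 + tail)

print(P_strategy3(30, 70, 100))  # ≈ 0.57304, the (30, 70) entry of Table 1
```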

In Figure 2, we plot a heat map illustrating the probability of success against $(r_{1},r_{2})$ values for this adaptive search strategy, with contour lines showing the maximum values achieved by the two strategies discussed so far:

Figure 2: Probability of success as a function of stopping thresholds $(r_{1},r_{2})$ using the adaptive search strategy, for $d=2$ and $n=100$. The contour lines correspond to the maximum probabilities of success given by the prior Strategies 1 and 2.

Table 1 compares the corresponding results obtained through the formula derived in section 2.2.6 with computer simulations averaged over 100,000 trials (see the Appendix for more details regarding the code implemented):

$(r_{1},r_{2})$   Success probability by closed-form solutions (Strategy 3)   Success probability by simulations (Strategy 3)
$(20,50)$   0.50439   0.50497
$(20,60)$   0.51804   0.52063
$(20,70)$   0.52043   0.51970
$(30,50)$   0.54855   0.54934
$(30,60)$   0.56939   0.56910
$(30,70)$   0.57304   0.57300
$(40,50)$   0.54252   0.54046
$(40,60)$   0.57056   0.57028
$(40,70)$   0.57546   0.57539
Table 1: Comparison of closed-form success probabilities and simulations

2.3 The $d$ admissible candidates case: probability computations

In this section, we aim to generalize the results of the previous section, establishing a formula for the total probability of success $P(r_{1},r_{2},\ldots,r_{d})$ when required to select one of the best $d$ candidates within the pool of $n$ candidates. Analogously to section 2.2, we will first determine, in a stepwise manner, a formula for the total probability of success for each of the $d$ standard strategies on a restricted sample of the candidate pool. This will then allow their combination, based on mutually exclusive events, into a final strategy maximizing the overall probability of success.

2.3.1 General standard strategy on a restricted sample of candidates

When required to select one amongst the best $d$ candidates, there are $d$ possible strategies one could implement. Hence, for each integer $2\leq s\leq d$, we can define:


Strategy s: Reject the first $r-1$ candidates and then select the first candidate who ranks at least $s$th relative to the pool of rejected candidates.


The aim of this subsection is to determine, for each $s\leq d$, the total probability of success $P(s,r,r^{\prime})$ when implementing Strategy $s$ on a restricted sample of $r^{\prime}-1\leq n$ candidates, rejecting the first $r-1$ candidates.


Assuming that the best $d$ candidates, ignoring rank, hold positions $n_{1}<n_{2}<\cdots<n_{d}\leq n$, it is convenient to categorize the subcases where success is possible according to the number of the best $d$ candidates in the rejected pool, leading us to consider the following $d$ general subcases:

Subcase 0: $r\leq n_{1}<n_{2}<\cdots<n_{d}$
Subcase 1: $n_{1}<r\leq n_{2}<n_{3}<\cdots<n_{d}$
Subcase 2: $n_{1}<n_{2}<r\leq n_{3}<n_{4}<\cdots<n_{d}$
$\vdots$
Subcase $k$: $n_{1}<n_{2}<\cdots<n_{k}<r\leq n_{k+1}<\cdots<n_{d}$
$\vdots$
Subcase $d-1$: $n_{1}<n_{2}<\cdots<n_{d-1}<r\leq n_{d}$

Before proceeding, it is helpful to define two pertinent expressions:

2.3.2 Permutations of consecutive candidates

Given $n,x\in\mathbb{Z}^{+}$ with $x\geq n$, notice that

$$\frac{1}{n!}\prod_{i=0}^{n-1}(x-i)=\binom{x}{n}.$$

Then, the total number of arrangements of $k\in\mathbb{Z}^{+}$ consecutive candidates between positions $a$ and $b$, where the $i$th candidate holds position $n_{k-i+1}$ and $a\leq n_{k}<n_{k-1}<\cdots<n_{1}\leq b$, is given by

$$\sum_{n_{k}=a}^{b-k+1}\sum_{n_{k-1}=n_{k}+1}^{b-k+2}\cdots\sum_{n_{i}=n_{i+1}+1}^{b-k+i}\cdots\sum_{n_{1}=n_{2}+1}^{b}1\;=\;\frac{1}{k!}\prod_{i=-1}^{k-2}(b-a-i)\;=\;\frac{1}{k!}\prod_{i=0}^{k-1}\bigl((b-a+1)-i\bigr)\;=\;\binom{b-a+1}{k}.$$

The equality above can be shown by induction on $k$ and $b$ respectively, for $b\geq a+k-1$:

$$
\begin{aligned}
\sum_{n_{k}=a}^{b-k+1}\sum_{n_{k-1}=n_{k}+1}^{b-k+2}\cdots\sum_{n_{1}=n_{2}+1}^{b}1&=\sum_{n_{k}=a}^{b-k+1}\frac{1}{(k-1)!}\prod_{i=-1}^{k-3}(b-(n_{k}+1)-i)&&[\text{induction hypothesis on } k-1]\\
&=\frac{1}{(k-1)!}\prod_{i=-1}^{k-3}(b-a-i-1)+\frac{1}{(k-1)!}\sum_{n_{k}=a}^{(b-1)-k+1}\prod_{i=-1}^{k-3}((b-1)-(n_{k}+1)-i)\\
&=\frac{1}{(k-1)!}\prod_{i=-1}^{k-3}(b-a-i-1)+\frac{1}{k!}\prod_{i=-1}^{k-2}((b-1)-a-i)&&[\text{induction hypothesis on } b-1]\\
&=\frac{1}{k!}\left(((b-1)-a+2)\prod_{i=-1}^{k-3}((b-1)-a-i)\right)\\
&=\frac{1}{k!}\prod_{i=-1}^{k-2}(b-a-i).\qquad\blacksquare
\end{aligned}
$$

2.3.3 Selections

Given some $k\in\mathbb{Z}^{+}$, suppose that $k$ candidates hold positions strictly below $r^{\prime}$, and that $m$ of these hold positions greater than or equal to $r$, where $m\leq k\leq r^{\prime}-1$. In this subsubsection, we compute the probability $S_{s}(k,m)$ that, given some $s\leq k-m$, at least one of the $m$ candidates ranks at least $s$th relative to the $k$ candidates.

Notice that this does not occur if and only if the best $s$ candidates all lie within the $k-m$ positions strictly below $r$, while the remaining $k-s$ candidates can be arranged without restriction. Hence, by counting permutations:

$$1-S_{s}(k,m)=\frac{\frac{(k-m)!}{(k-m-s)!}\cdot(k-s)!}{k!},$$

so that

$$S_{s}(k,m)=1-\prod_{i=0}^{s-1}\frac{k-m-i}{k-i}.$$

Having defined the expressions above, we now proceed towards computing $P(s,r,r^{\prime})$ by partitioning the $d$ subcases into two types (defined in terms of the general Subcase $k$), whose respective probabilities we compute in the following two subsubsections.

2.3.4 Subcases where $0\leq k\leq s-1$

Note that, for this type of subcase, the probability of success is independent of the specific ranks of the best $d$ candidates (since the strategy implemented guarantees the selection of any of the best $d$ candidates whenever one is interviewed), but it does depend on the ranks of the candidates before position $n_{k+1}$ (since Strategy $s$ does not require a candidate to be amongst the best $d$ candidates for selection).

Thus, using an analogous approach to Section 2.2.1:

$P(\text{Success}\cap\text{Subcase } k\leq s-1)$

$$
\begin{aligned}
&=\sum_{n_{1}=1}^{r-k}\;\sum_{n_{2}=n_{1}+1}^{r-k+1}\cdots\sum_{n_{k}=n_{k-1}+1}^{r-1}\;\sum_{n_{k+1}=r}^{r^{\prime}-1}\;\sum_{n_{k+2}=n_{k+1}+1}^{n-(d-(k+2))}\cdots\sum_{n_{d}=n_{d-1}+1}^{n}P(\text{Success}\cap\text{best } d \text{ candidates at } n_{1},\ldots,n_{d})\\
&=\binom{r-1}{k}\sum_{n_{k+1}=r}^{r^{\prime}-1}\binom{n-n_{k+1}}{d-k-1}\cdot P(\text{best } d \text{ candidates at } n_{1},\ldots,n_{d})\cdot P(\text{Success}\mid\text{best } d \text{ candidates at } n_{1},\ldots,n_{d})\\
&=\binom{r-1}{k}\cdot d!\prod_{i=0}^{d-1}\frac{1}{n-i}\sum_{n_{k+1}=r}^{r^{\prime}-1}\binom{n-n_{k+1}}{d-k-1}\cdot P(\text{best } s \text{ candidates before } n_{k+1} \text{ lie in the } r-1 \text{ group}\mid\text{best } d \text{ candidates at } n_{1},\ldots,n_{d})\\
&=\binom{r-1}{k}\cdot d!\prod_{i=0}^{d-1}\frac{1}{n-i}\sum_{n_{k+1}=r}^{r^{\prime}-1}\binom{n-n_{k+1}}{d-k-1}\prod_{j=k+1}^{s}\frac{r-j}{n_{k+1}-j}.
\end{aligned}
$$

2.3.5 Subcases where $s\leq k\leq d-1$

In contrast to the previous subsubsection, notice that for this type of subcase the probability of success depends on the specific ranks of the best $d$ candidates (since the more selective strategy may reject even such candidates unless they rank sufficiently high relative to the rejected pool), but is independent of the ranks of the candidates before position $n_{k+1}$ (since Strategy $s$, in this case, requires a candidate to rank higher than at least one of the best $k$ rejected candidates for selection, and hence to be one of the best $d$ candidates overall).

Therefore, due to the rank-dependence on the best $d$ candidates, we need a slightly different approach:

$P(\text{Success}\cap\text{Subcase } k\geq s)$

$$
\begin{aligned}
&=\sum_{i=1}^{d-k}P\left((\text{Success}\cap\text{Subcase } k\geq s)\cap n_{1}<\cdots<n_{k}<r\leq n_{k+1}<\cdots<n_{k+i}<r^{\prime}\leq n_{k+i+1}<\cdots<n_{d}\leq n\right)\\
&=\sum_{i=1}^{d-k}P(n_{1}<\cdots<n_{k}<r\leq n_{k+1}<\cdots<n_{k+i}<r^{\prime}\leq n_{k+i+1}<\cdots<n_{d}\leq n)\\
&\qquad\cdot P\left((\text{Success}\cap\text{Subcase } k\geq s)\mid n_{1}<\cdots<n_{k}<r\leq n_{k+1}<\cdots<n_{k+i}<r^{\prime}\leq n_{k+i+1}<\cdots<n_{d}\leq n\right)\\
&=\sum_{i=1}^{d-k}\frac{\binom{r-1}{k}\binom{r^{\prime}-r}{i}\binom{n-r^{\prime}+1}{d-k-i}}{\binom{n}{d}}\cdot P(\text{at least one candidate in } n_{k+1},\ldots,n_{k+i} \text{ ranks at least } s\text{th relative to the candidates in } n_{1},\ldots,n_{k+i})\\
&=\binom{r-1}{k}\cdot d!\prod_{j=0}^{d-1}\frac{1}{n-j}\sum_{i=1}^{d-k}\binom{r^{\prime}-r}{i}\binom{n-r^{\prime}+1}{d-k-i}\cdot S_{s}(k+i,i).
\end{aligned}
$$

2.3.6 Computing the probability of success for any given Strategy $s$ on a restricted sample of candidates

Using the results of the two previous subsubsections, and the fact that success can only occur within the $d$ subcases listed in section 2.3.1, we can finally compute the probability of success for any given Strategy $s$ on a restricted sample of candidates between positions $r$ and $r^{\prime}-1$:

$$
\begin{aligned}
P(s,r,r^{\prime})&=\sum_{k=0}^{d-1}P(\text{Success}\cap\text{Subcase } k)\\
&=\sum_{k=0}^{s-1}P(\text{Success}\cap\text{Subcase } k\leq s-1)+\sum_{k=s}^{d-1}P(\text{Success}\cap\text{Subcase } k\geq s)\\
&=d!\prod_{l=0}^{d-1}\frac{1}{n-l}\left(\sum_{k=0}^{s-1}\binom{r-1}{k}\sum_{i=r}^{r^{\prime}-1}\binom{n-i}{d-k-1}\prod_{j=k+1}^{s}\frac{r-j}{i-j}+\sum_{k=s}^{d-1}\binom{r-1}{k}\sum_{i=1}^{d-k}\binom{r^{\prime}-r}{i}\binom{n-r^{\prime}+1}{d-k-i}S_{s}(k+i,i)\right).
\end{aligned}
$$
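This formula translates directly into code. The following minimal, unoptimized sketch (function and variable names are ours; rp plays the role of $r^{\prime}$) is reused in the next subsection:

```python
from math import comb, factorial, prod

def S(s, k, m):
    # S_s(k, m) from section 2.3.3.
    return 1 - prod((k - m - i) / (k - i) for i in range(s))

def P_strategy(s, r, rp, n, d):
    # P(s, r, r') from section 2.3.6: success probability of Strategy s
    # applied to candidates r .. rp - 1 within a pool of n candidates.
    prefactor = factorial(d) * prod(1 / (n - l) for l in range(d))
    low = sum(  # subcases 0 <= k <= s - 1 (section 2.3.4)
        comb(r - 1, k)
        * sum(
            comb(n - i, d - k - 1)
            * prod((r - j) / (i - j) for j in range(k + 1, s + 1))
            for i in range(r, rp)
        )
        for k in range(s)
    )
    high = sum(  # subcases s <= k <= d - 1 (section 2.3.5)
        comb(r - 1, k)
        * sum(
            comb(rp - r, i) * comb(n - rp + 1, d - k - i) * S(s, k + i, i)
            for i in range(1, d - k + 1)
        )
        for k in range(s, d)
    )
    return prefactor * (low + high)
```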

2.3.7 General adaptive search strategy

Following the motivation of Section 2.2.4, we can now generally define:


Strategy d+1: Reject the first $r_1-1$ candidates. For each $s\leq d$, as long as no selection has been made, implement Strategy $s$ between the $r_s$th and the $(r_{s+1}-1)$th candidate, where $r_{d+1}=n+1$.


Armed with the results of the previous subsection, we can now compute the total probability of success for this adaptive search strategy. Note that, for $s=1$, the product operator below evaluates to $1$, giving the probability of success for Strategy 1 between $r_1$ and $r_2$, as desired.

$$
\begin{aligned}
P(r_{1},r_{2},\ldots,r_{d})&=\sum_{s=1}^{d}\,\prod_{i=1}^{s-1}P(\text{Strategy } i \text{ made no selection})\cdot P(s,r_{s},r_{s+1})\\
&=\sum_{s=1}^{d}\,\prod_{i=1}^{s-1}P(\text{best } i \text{ candidates before position } r_{i+1}-1 \text{ lie in the first } r_{i}-1 \text{ positions})\cdot P(s,r_{s},r_{s+1})\\
&=\sum_{s=1}^{d}\,\prod_{i=1}^{s-1}\,\prod_{j=1}^{i}\frac{r_{i}-j}{r_{i+1}-j}\cdot P(s,r_{s},r_{s+1})\\
&=\sum_{s=1}^{d}\,\prod_{i=1}^{s-1}\frac{r_{i}-i}{r_{s}-i}\cdot P(s,r_{s},r_{s+1}),
\end{aligned}
$$

where the final step can be shown by induction (the statement holds trivially for $s=1$):

$$
\begin{aligned}
\prod_{i=1}^{s-1}\,\prod_{j=1}^{i}\frac{r_{i}-j}{r_{i+1}-j}&=\left(\prod_{j=1}^{s-1}\frac{r_{s-1}-j}{r_{s}-j}\right)\cdot\left(\prod_{i=1}^{s-2}\,\prod_{j=1}^{i}\frac{r_{i}-j}{r_{i+1}-j}\right)\\
&=\left(\frac{r_{s-1}-(s-1)}{r_{s}-(s-1)}\cdot\prod_{j=1}^{s-2}\frac{r_{s-1}-j}{r_{s}-j}\right)\cdot\left(\prod_{i=1}^{s-2}\frac{r_{i}-i}{r_{s-1}-i}\right)\\
&=\frac{r_{s-1}-(s-1)}{r_{s}-(s-1)}\cdot\prod_{i=1}^{s-2}\frac{r_{i}-i}{r_{s}-i}=\prod_{i=1}^{s-1}\frac{r_{i}-i}{r_{s}-i}.
\end{aligned}
$$

Hence, with each added threshold $r_s$, this strategy increases the overall success probability, which grows monotonically with respect to $d$. Though not surprising, this ensures a greater probability of finding suboptimal choices that still meet the selection criteria.
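Combining the weights above with the P_strategy sketch from section 2.3.6 gives the overall success probability in a few lines (names are again ours):

```python
from math import prod

def P_adaptive(thresholds, n, d):
    # P(r1, ..., rd) from section 2.3.7; thresholds = (r1, ..., rd),
    # with the convention r_{d+1} = n + 1. Requires r_s > s for each s.
    rs = list(thresholds) + [n + 1]
    total = 0.0
    for s in range(1, d + 1):
        weight = prod((rs[i - 1] - i) / (rs[s - 1] - i) for i in range(1, s))
        total += weight * P_strategy(s, rs[s - 1], rs[s], n, d)
    return total

print(P_adaptive((50, 60, 65, 75, 80), 100, 5))  # ≈ 0.829427 (cf. section 3.1)
```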

3 Computational results on optimization of search success

In this section, we verify the results derived in the previous section using computer simulations, and propose an algorithm to maximize success probability within a reasonable computation time.

3.1 Simulating the outcomes of an arbitrary number of trials

The algorithm below simulates the problem $numtrials$ times for finding any of the best $d$ candidates out of $n$ total candidates, given a list of selection thresholds $(r_1, r_2, \ldots, r_d)$. It can be used to verify the analytical formula derived in section 2.3.7 against simulations. The full Python code for the algorithm can be found in the Appendix.

Parameters: $n$, $d$, $numtrials$, selection thresholds $(r_1, r_2, \ldots, r_d)$, where $r_{d+1}=n+1$
for $trial=1,\cdots,numtrials$ do
      Generate a list of $n$ candidates with random ranking orders;
      Set $leastAcceptable$ to the $d$th highest ranking in the list;
      for $s=1$ to $d$ do
            for $i=r_s$ to $r_{s+1}-1$ do
                  if the $i$th candidate has a greater ranking than the $s$th-best candidate seen so far then
                        if the $i$th candidate has a greater ranking than $leastAcceptable$ then
                              Record successful trial and move on to the next trial
                        else
                              Record failed trial and move on to the next trial
      If no selection was made, record failed trial;
Compute the percentage of successful trials out of $numtrials$ total trials.
Algorithm 1: Simulating the selection process
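A compact Python rendering of Algorithm 1 (a sketch of ours; the full implementation in the Appendix may differ in detail):

```python
import random

def simulate(n, d, thresholds, trials=10_000):
    # Monte Carlo estimate of the adaptive strategy's success probability;
    # thresholds = (r1, ..., rd) with r_s > s, and r_{d+1} = n + 1.
    rs = list(thresholds) + [n + 1]
    least_acceptable = n - d + 1  # rank of the d-th best candidate
    successes = 0
    for _ in range(trials):
        ranks = random.sample(range(1, n + 1), n)  # position -> rank
        for s in range(1, d + 1):
            chosen = None
            for i in range(rs[s - 1], rs[s]):
                # s-th highest rank among the i - 1 candidates already seen
                sth_best = sorted(ranks[: i - 1])[-s]
                if ranks[i - 1] > sth_best:
                    chosen = ranks[i - 1]
                    break
            if chosen is not None:  # a selection ends the trial
                successes += chosen >= least_acceptable
                break
    return successes / trials

print(simulate(100, 5, (50, 60, 65, 75, 80)))  # ≈ 0.8294
```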

For instance, we simulate the problem one million times for the best $5$ candidates, out of $100$ total candidates, with selection thresholds $(50, 60, 65, 75, 80)$ (see the Appendix for more details on the specific code implemented).

In this case, the final formula derived in section 2.3.7 gives a success probability of $0.829427$ (6 d.p.), and further similar tests show excellent agreement, with discrepancies between formula and simulation generally below $10^{-3}$.


3.2 Maximizing success probability: the brute force method

In order to determine the values of $r_1, r_2, \ldots, r_d$ yielding the maximum success probability, we can use the following direct procedure. The full Mathematica code is provided in the Appendix.


1) Generate a list $rlist[d,n]$ containing all possible combinations of $r_1, r_2, \ldots, r_d$ for given values of $d,n$.

2) Use the formula derived at the end of Section 2 to compute the probability of success for given $d,n$ values and a list $rlist$ containing the current choices of $r_1, r_2, \ldots, r_d$.

3) Compute the maximum probability and the optimal choices of $r_1, r_2, \ldots, r_d$ for given values of $d,n$ by exhaustively iterating through $rlist[d,n]$.

Although this method is guaranteed to find the optimal solution to the problem, it is very inefficient due to high computational run-times. In fact, as the formula below shows, the total number of lists $L(d,n)$ the algorithm must iterate through ramps up extremely quickly as the values of $d,n$ increase:

$$L(d,n)=\frac{1}{d!}\prod_{i=1}^{d}(n-i)=\frac{n-d}{n}\binom{n}{d}.$$

For instance, if we wanted to select one amongst the best $d=6$ candidates out of a total of $n=100$ candidates, we would already need to iterate over more than one billion possible combinations of $r_1, r_2, \ldots, r_6$, making the algorithm very inconvenient on a practical level.
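A quick check of this count (the helper name is ours):

```python
from math import comb

def L(d, n):
    # Number of threshold lists the brute force search must examine.
    return (n - d) * comb(n, d) // n

print(L(6, 100))  # 1120529256: over a billion combinations for d = 6, n = 100
```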

This issue motivates the method of simulated annealing in the following section.


3.3 Maximizing success probability: Simulated annealing

This method aims to find the greatest possible success probability for given $d,n$ values within an arbitrary number $numIterations$ of searches in the total search space $rlist[d,n]$.


We begin at an initial state $rlist0[d,n]$ (a best guess for the optimal choices of $r_1, r_2, \ldots, r_d$) and then, given a current state $rlistCurrent$, the acceptance probability function $P$ determines the probability of moving to a random neighboring state $rlistCandidate$, based on the relative difference in the states' success probabilities and the annealing schedule temperature $T$. The probability of success $p$ for a given state is determined by the formula derived at the end of Section 2 (and computed in the exact same way as in Step 2 of Section 3.2). $T$ depends on the number of current iterations $i$ relative to the total number of iterations $numIterations$ allowed, and on a $RescaleFactor$ needed to amplify the relatively small probability discrepancies between neighboring states. In general, as the algorithm approaches the maximum number of iterations, $T$ gradually approaches zero, reducing the probability of a change of state.


The pseudocode below illustrates the algorithm. The full Mathematica code implementing the algorithm, our full definitions of $rlist0[d,n]$, $T$, and $P$, and the neighbour-generating procedure are shown in the Appendix.

Parameters: $d$, $n$, $numIterations$
Set current state $s\leftarrow rlist0[d,n]$;
for $i=1$ to $numIterations$ do
      Set temperature $T\leftarrow\frac{numIterations-i}{numIterations\cdot RescaleFactor}$;
      Pick a random neighbour $s_{new}$;
      Set acceptance probability $P(s,s_{new})\leftarrow\frac{1}{1+e^{(p(s)-p(s_{new}))/T}}$;
      if $P(s,s_{new})>random(0,1)$ then
            Set $s\leftarrow s_{new}$;
Output final state $s$;
Algorithm 2: Simulated annealing for optimal solutions. Here $random(0,1)$ generates a random number uniformly distributed over $(0,1)$.
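A Python sketch of Algorithm 2 follows; the initial state, neighbour move, and rescale factor are illustrative choices of ours, and the objective $p$ is passed in (e.g., P_adaptive from section 2.3.7):

```python
import math
import random

def anneal(p, n, d, num_iterations=100, rescale=1000):
    # p(thresholds) must return the success probability of a threshold list.
    # Initial state: d distinct thresholds drawn from (d, n], kept sorted.
    s = sorted(random.sample(range(d + 1, n + 1), d))
    for i in range(num_iterations):
        T = (num_iterations - i) / (num_iterations * rescale)
        # Random neighbour: nudge one threshold, keeping 1 < r1 < ... < rd <= n.
        j = random.randrange(d)
        lo = s[j - 1] + 1 if j > 0 else 2
        hi = s[j + 1] - 1 if j < d - 1 else n
        s_new = s[:]
        s_new[j] = min(max(s[j] + random.choice((-1, 1)), lo), hi)
        # Logistic acceptance rule of Algorithm 2 (exponent clamped for safety).
        z = max(min((p(s) - p(s_new)) / T, 700.0), -700.0)
        if 1 / (1 + math.exp(z)) > random.random():
            s = s_new
    return s, p(s)

best_rs, best_p = anneal(lambda rs: P_adaptive(rs, 20, 2), 20, 2)
print(best_rs, best_p)  # should approach (8, 14) with 0.604616 (cf. Table 2)
```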

Although simulated annealing is not guaranteed to find the optimal solution to the problem, it can find extremely close estimates by sampling only a tiny fraction of the total search space.

Table 2 illustrates the efficiency of this method by comparing the results it produces within a negligible computation time, involving merely $100$ iterations, to the maximum probabilities computed by brute force over a considerably longer period (proportional to the formula for $L(d,n)$ shown above):

Parameters $(d,n)$   Simulated annealing   Optimal $r_i$ values   Brute force   Optimal $r_i$ values   Discrepancy
$(2,20)$   0.604616   $(8,14)$   0.604616   $(8,14)$   0.000000
$(3,20)$   0.747476   $(7,13,16)$   0.747476   $(7,13,16)$   0.000000
$(4,20)$   0.842694   $(7,12,15,17)$   0.842694   $(7,12,15,17)$   0.000000
$(2,30)$   0.594123   $(11,21)$   0.594123   $(11,21)$   0.000000
$(3,30)$   0.734922   $(11,18,24)$   0.734922   $(11,18,24)$   0.000000
$(2,50)$   0.585779   $(18,34)$   0.585779   $(18,34)$   0.000000
$(2,100)$   0.579554   $(36,68)$   0.579561   $(35,67)$   0.000007
Table 2: Comparison of simulated annealing vs. the direct brute force method

Although the runtime complexity of the brute force method does not currently allow us to verify whether deviations from the maximum probability remain as small at higher $d,n$ values, these initial results demonstrate how efficient simulated annealing can be at rapidly finding near-optimal solutions when equipped with an appropriate selection of parameters.

Though not surprising, it is worth noting that the probability of success for a fixed $n$ increases monotonically with $d$. This was illustrated analytically in section 2.2.1, where it was shown that, for a fixed $n$ and a single stopping point $r$, the probability of success almost doubles for $d=2$ compared to the standard case $d=1$. Similarly, for any $d>1$, it suffices to apply the strict selection criteria of Strategy 1 (which yields lower success probabilities than the optimal adaptive search strategy) to see that we are guaranteed a higher success probability than in the standard case, since the success criteria are more relaxed under identical selection criteria.

4 Discussion and conclusion

The present work centers on the extension of the classical secretary problem by considering suboptimal choices as successful hires, as well as the use of an adaptive search strategy with multiple stopping criteria that allows closed-form calculation of the probability of hiring any of the top $d$ candidates. Our approach provides a more comprehensive and practical solution to the original secretary problem, as we take into account real-world scenarios where hiring the best candidate may not always be possible. Furthermore, applying the simulated annealing method as an alternative computational tool offers faster and more effective optimization compared to the brute force exhaustive search method. Our results have implications for various fields where decision-making under uncertainty is a common challenge [1, 41, 43], including personnel selection and resource allocation.

In order to further develop the solution to the problem presented in this paper, a crucial step is to determine whether it is possible to algebraically derive a closed expression that directly evaluates the optimal selection thresholds, and hence the maximum success probability, for arbitrary values of $d,n$. The derivation of such a formula would drastically simplify the computational aspect of the problem introduced in section 3, and potentially provide a more complete understanding of its solution. However, the presence of an arbitrary number of variables (due to the weaker selection criteria) and the complexity of the general probability formula computed in section 2.3.7 have made it difficult to follow an approach comparable to the one used to find optimal strategies in the standard case [26].

Moreover, a related secondary aspect involves carrying out a more rigorous verification of the effectiveness of the probability-maximizing method proposed in section 3.3. This can be done either by solving the more fundamental issue described above, providing a method to directly compute optimal selection thresholds and the maximum success probability for arbitrary parameters, or, alternatively, by using greater computational power to evaluate discrepancies in less trivial cases. If such verification finds the current method unsuitable at greater values of $d,n$, the latter can be improved by optimizing the temperature function of simulated annealing (e.g., through $RescaleFactor$ as shown in Algorithm 2) and the total number of iterations. Alternatively, we could adopt particle swarm optimization to enhance the coverage of the search space [26].

Inspired by the comparison of Strategies 1 and 2 presented in Section 2, our present work investigates an adaptive search strategy with arbitrarily many stopping criteria $d$. Previously, Gusein-Zade [23] and Gilbert and Mosteller [20], respectively, explored the base case $d=2$, computing its limiting probability of success as $n\to\infty$ ($\approx 0.574$), as well as the asymptotic behavior for general values of $d$. More importantly, we are able to derive closed-form solutions for the probability of success in securing any of the top $d$ candidates. Unlike Ref. [8], our work does not give preference to any candidate within the top $d$ over another: choosing the best candidate, the 2nd-best candidate, or the $d$th-best candidate are all seen as equal successes, and not choosing a candidate within the top $d$, no matter their final rank, is seen as an equal failure. Moreover, unlike Ref. [47], our work treats all the candidates, including the best $d$, as directly comparable items.

One promising extension for future work is to incorporate game theory [35] into the secretary problem to account for multiple decision makers [12] with idiosyncratic preferences within the selection committee [3]. Extensions like this will help inform fair and efficient selection practices while mitigating the impact of human biases such as in-group favoritism [33, 11].

In sum, we calculate the probability of hiring any of the top $d$ candidates using multiple stopping criteria, with closed-form solutions. To avoid extensive computational time, we use the simulated annealing method for optimization, which performs reasonably well and saves significant computing time compared to exhaustive search. Our findings highlight the importance of using adaptive search strategies and computational methods for maximizing the likelihood of satisficing (suboptimal) outcomes in the satisficing secretary problem.

Acknowledgments

F.F. gratefully acknowledges support from the Bill & Melinda Gates Foundation (award no. OPP1217336), the NIH COBRE Program (grant no.1P20GM130454), and the Neukom CompX Faculty Grant. Roberto Brera acknowledges Ian Gill’s significant contributions in the project’s initial phase.

References

  • [1] AR Abdel-Hamid, JA Bather, and GB Trustrum. The secretary problem with an unknown number of candidates. Journal of Applied Probability, 19(3):619–630, 1982.
  • [2] Susanne Albers and Leon Ladewig. New results for the k-secretary problem. Theoretical Computer Science, 863:102–119, 2021.
  • [3] Steve Alpern and Vic Baston. The secretary problem with a selection committee: do conformist committees hire better secretaries? Management Science, 63(4):1184–1197, 2017.
  • [4] R Bartoszyński and Z Govindarajulu. The secretary problem with interview cost. Sankhyā: The Indian Journal of Statistics, Series B, pages 11–28, 1978.
  • [5] MohammadHossein Bateni, Mohammadtaghi Hajiaghayi, and Morteza Zadimoghaddam. Submodular secretary problem and extensions. ACM Transactions on Algorithms (TALG), 9(4):1–23, 2013.
  • [6] L Bayón, P Fortuny Ayuso, JM Grau, AM Oller-Marcén, and MM Ruiz. The multi-returning secretary problem. Discrete Applied Mathematics, 320:33–46, 2022.
  • [7] J Neil Bearden. A new secretary problem with rank-based selection and cardinal payoffs. Journal of Mathematical Psychology, 50(1):58–59, 2006.
  • [8] J Neil Bearden, Amnon Rapoport, and Ryan O Murphy. Sequential observation and selection with rank-dependent payoffs: An experimental study. Management Science, 52(9):1437–1449, 2006.
  • [9] Domagoj Bradac, Anupam Gupta, Sahil Singla, and Goran Zuzic. Robust algorithms for the secretary problem. arXiv preprint arXiv:1911.07352, 2019.
  • [10] Niv Buchbinder, Kamal Jain, and Mohit Singh. Secretary problems via linear programming. Mathematics of Operations Research, 39(1):190–206, 2014.
  • [11] Ho-Chun Herbert Chang and Feng Fu. Elitism in mathematics and inequality. Humanities and Social Sciences Communications, 8(1):1–8, 2021.
  • [12] Robert W Chen, Burton Rosenberg, and Larry A Shepp. A secretary problem with two decision makers. Journal of Applied Probability, 34(4):1068–1074, 1997.
  • [13] Robert Chin, Jonathan E Rowe, Iman Shames, Chris Manzie, and Dragan Nešić. Ordinal optimisation and the offline multiple noisy secretary problem. arXiv preprint arXiv:2106.01185, 2021.
  • [14] YS Chow, Sigaiti Moriguti, Herbert Robbins, and SM Samuels. Optimal selection based on relative rank (the “secretary problem”). Israel Journal of Mathematics, 2(2):81–90, 1964.
  • [15] Ruth M Corbin. The secretary problem as a model of choice. Journal of Mathematical Psychology, 21(1):1–29, 1980.
  • [16] José Correa, Andrés Cristi, Laurent Feuilloley, Tim Oosterwijk, and Alexandros Tsigonias-Dimitriadis. The secretary problem with independent sampling. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2047–2058. SIAM, 2021.
  • [17] Thomas S Ferguson. Who solved the secretary problem? Statistical science, 4(3):282–289, 1989.
  • [18] PR Freeman. The secretary problem and its extensions: A review. International Statistical Review/Revue Internationale de Statistique, pages 189–206, 1983.
  • [19] Jacqueline Gianini and Stephen M Samuels. The infinite secretary problem. The Annals of Probability, 4(3):418–432, 1976.
  • [20] John P Gilbert and Frederick Mosteller. Recognizing the maximum of a sequence. Selected Papers of Frederick Mosteller, pages 355–398, 2006.
  • [21] Kenneth S Glasser, Richard Holzsager, and Austin Barron. The d choice secretary problem. Sequential Analysis, 2(3):177–199, 1983.
  • [22] Daniel G Goldstein, R Preston McAfee, Siddharth Suri, and James R Wright. Learning in the repeated secretary problem. arXiv preprint arXiv:1708.08831, 2017.
  • [23] SM Gusein-Zade. The problem of choice and the optimal stopping rule for a sequence of independent trials. Theory of Probability & Its Applications, 11(3):472–476, 1966.
  • [24] Greg Harrell, Josh Harrison, Guifen Mao, and Jin Wang. Online auction and secretary problem. In Proceedings of the International Conference on Scientific Computing (CSC), page 241. The Steering Committee of The World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp), 2015.
  • [25] M Hlynka and JN Sheahan. The secretary problem for a random walk. Stochastic Processes and their applications, 28(2):317–325, 1988.
  • [26] James Kennedy and Russell Eberhart. Particle swarm optimization. In Proceedings of ICNN’95-international conference on neural networks, volume 4, pages 1942–1948. IEEE, 1995.
  • [27] Elias Koutsoupias and Philip Lazos. Online trading as a secretary problem. In International Symposium on Algorithmic Game Theory, pages 201–212. Springer, 2018.
  • [28] Grzegorz Kubicki and Michal Morayne. Graph-theoretic generalization of the secretary problem: the directed path case. SIAM Journal on Discrete Mathematics, 19(3):622–632, 2005.
  • [29] M Lee, T Gregory, and Matthew Welsh. Decision-making on the full information secretary problem. Proceedings of the Twenty-Sixth Conference of the Cognitive Science Society, pages 819–824, 2005.
  • [30] Denis V Lindley. Dynamic programming and decision theory. Journal of the Royal Statistical Society: Series C (Applied Statistics), 10(1):39–51, 1961.
  • [31] Thomas J Lorenzen. Optimal stopping with sampling cost: the secretary problem. The Annals of Probability, 9(1):167–172, 1981.
  • [32] Mohammad Mahdian, Randolph Preston McAfee, and David Pennock. The secretary problem with a hazard rate condition. In International Workshop on Internet and Network Economics, pages 708–715. Springer, 2008.
  • [33] Naoki Masuda and Feng Fu. Evolutionary models of in-group favoritism. F1000prime reports, 7, 2015.
  • [34] Tamás F Móri. The random secretary problem with multiple choice. Annales Univ. Sci. Budapestinensis de Rolando Eötvös Nominatae V, pages 91–102, 1984.
  • [35] Matjaž Perc and Attila Szolnoki. Coevolutionary games – a mini review. BioSystems, 99(2):109–125, 2010.
  • [36] J Preater. On multiple choice secretary problems. Mathematics of Operations Research, 19(3):597–602, 1994.
  • [37] Peter A Rogerson. Probabilities of choosing applicants of arbitrary rank in the secretary problem. Journal of applied probability, 24(2):527–533, 1987.
  • [38] H Rubin and SM Samuels. The finite-memory secretary problem. The Annals of Probability, pages 627–635, 1977.
  • [39] Stephen M Samuels. [Who solved the secretary problem?]: Comment: Who will solve the secretary problem? Statistical Science, 4(3):289–291, 1989.
  • [40] Stephen M Samuels. Secretary problems as a source of benchmark bounds. Lecture Notes-Monograph Series, pages 371–387, 1992.
  • [41] Darryl A Seale and Amnon Rapoport. Optimal stopping behavior with relative ranks: The secretary problem with unknown population size. Journal of Behavioral Decision Making, 13(4):391–411, 2000.
  • [42] Marek Skarupski. Secretary problem with possible errors in observation. Mathematics, 8(10):1639, 2020.
  • [43] MH Smith. A secretary problem with uncertain employment. Journal of applied probability, 12(3):620–624, 1975.
  • [44] MH Smith and JJ Deely. A secretary problem with finite memory. Journal of the American Statistical Association, 70(350):357–361, 1975.
  • [45] Theodor J Stewart. The secretary problem with an unknown number of options. Operations Research, 29(1):130–145, 1981.
  • [46] Krzysztof Szajowski and Mitsushi Tamaki. Shelf life of candidates in the generalized secretary problem. Operations Research Letters, 44(4):498–502, 2016.
  • [47] Xinlin Wu and Haixiong Li. A satisficing policy of the secretary problem: theory and simulation. Communications in Statistics-Theory and Methods, pages 1–14, 2020.
  • [48] Tong Zhao, Mandy Hu, Razieh Rahimi, and Irwin King. It’s about time! modeling customer behaviors as the secretary problem in daily deal websites. In 2017 International Joint Conference on Neural Networks (IJCNN), pages 3670–3679. IEEE, 2017.