
Hiring good people is hard.
No, actually, hiring is easy. It's finding the right people for the job that's hard.
Suppose you are the hiring manager in charge of filling an open position. You have written an exciting and enticing job ad, posted it in all of the right places, and now, you sit in front of a stack of 50 resumes. In the early stage of the hiring process, your aim was to cast as wide a net as possible, in hopes of finding the absolute best fit. You've done that, and now you find yourself at the inflection point, where you need to work in the opposite direction, and narrow the field of consideration down to one.
In an ideal world, you would hire each of these applicants for a year, and see who does the job best, like some reality show competition. Realistically, however, you don't have time to interview 50 people, let alone coach them through a first year on the job.
What you -- and most people -- would do to speed up the process is to define some criteria that would let you start sorting the applicants into "will" and "won't" interview piles. What you need are heuristics.
Two Kinds of Heuristics
The word heuristics (hyoo-RIS-tics) refers to a kind of filter that comes before a difficult or laborious decision, and which reduces the options by quickly disqualifying those that seem like a waste of time or energy. Using a heuristic, you can efficiently narrow down the field of choices, and spend more time meeting with and getting to know the candidates who seem most promising.
How can you tell which applicants to immediately discard? That question lies at the heart of this discussion, and to address it we need to talk about two different kinds of heuristics.
Loose/Casual Heuristics
Think of all of the decisions that consumers make at the grocery store. Few consumers make decisions based solely on objective product details (e.g. price, expiration dates, quantities, nutritional details). Most consumers also include subjective criteria, and might therefore opt for the brand of eggs with the cuddly mascot on the carton, or pay for the more expensive brand of razors because of the catchy jingle in their ads.
Most of the time, humans make decisions using fuzzy, ill-defined emotional judgments, not clear, strictly-defined objective criteria. For that reason, sorting "good" choices from "bad" choices tends to be highly subjective, and that's why we talk first about casual heuristics.
Returning to the interviewing scenario, there are countless heuristics you could use to quickly discard the "bad" candidates. You could start by tossing the obviously-unprofessional resumes: those with spelling mistakes, those with bad grammar or swearing, those written in fanciful fonts, or those written by hand, in pink, on the back of a Big Mac wrapper.
This is your first heuristic; call it the Unprofessional Heuristic. You are effectively saying that the ideal candidate for the job will already be fully versed in the codes of professionalism that you expect from job-seekers.
This is a "loose" heuristic because it cuts corners: it trades some accuracy for speed. You may be right 99.9% of the time that an unprofessional-looking resume indicates a candidate who would not fit the corporate culture, but you can probably also picture a Hollywood-worthy scenario where a prodigious but eccentric savant is disqualified, because of the Unprofessional Heuristic, from a job he could do in his sleep. Hollywood notwithstanding, the Unprofessional Heuristic is a useful way of reducing the candidate pool to a manageable size.
Strict/Technical Heuristics
On the other hand, certain problems -- especially in the quantitative and scientific disciplines -- come with an "objective" definition of what makes a bad or a good candidate.
To illustrate this, consider the problem of verifying if a number is prime or not. Determining whether a small number is prime is easy, but as numbers get increasingly long, the difficulty of the problem increases. In cryptography, for instance, knowing whether very large numbers (containing hundreds of digits) are prime or not is critical to keeping information secure. Finding ways to make the prime/composite determination about large numbers faster and more efficient is essential to the widespread use of cryptography.
Heuristics can help greatly accelerate the process. If you spot a number greater than 2 that ends in an even digit (0, 2, 4, 6, or 8), you can quickly tell that it is divisible by 2, and therefore could not possibly be prime. The beauty of this approach is that no matter how many digits a number has, looking to see if the last digit is even is always going to take the exact same amount of time.
We'll call this the Obviously-Even Heuristic. It alone saves you from checking a whopping 50 percent of candidates!
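As a rough sketch, the Obviously-Even Heuristic might look like this in Python. The function body and the choice to pass the number as a string of decimal digits are assumptions made for illustration; the string form simply makes it visible that only the last character is ever read, however long the number is.

```python
EVEN_DIGITS = set("02468")


def obviously_even_heuristic(digits: str) -> bool:
    """Return True if the number can be ruled out as prime from its last digit.

    `digits` is a non-negative integer written in decimal (an assumed input
    format for this sketch). The number 2 is the only even prime, so it is
    the one even number this check must not rule out.
    """
    return digits != "2" and digits[-1] in EVEN_DIGITS


# Only the final character is inspected, so a 500-digit number costs
# the same as a 2-digit one.
print(obviously_even_heuristic("9375292394"))     # True: ruled out
print(obviously_even_heuristic("1" * 499 + "8"))  # True: ruled out
print(obviously_even_heuristic("97"))             # False: still a candidate
print(obviously_even_heuristic("2"))              # False: still a candidate
```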
Unlike the Unprofessional Heuristic above (which was a casual/loose type of heuristic), the Obviously-Even Heuristic will never be wrong, in the sense that it never rules out a number that is actually prime. This is the key difference between the two types of heuristics: technical heuristics must never, ever disqualify a good candidate.
Whereas a "loose" heuristic is like a "rule of thumb", a "strict" heuristic is more like a hint in a puzzle: the hint will never be wrong, but it won't tell you the answer either. A hint will narrow down your range of choices and provide a condition which must necessarily be true in order to find the answer.
Necessary, not Sufficient
Technical heuristics ensure that they never disqualify a good candidate by isolating a condition that is necessary, but not sufficient on its own, and that can be easily tested (like the absence of an even last digit in numbers greater than 2).
Notice that not ending in an even digit does not guarantee that a number is prime. A number like 25 does not end in an even digit, but is not prime, either.
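One way to convince yourself of both claims is to check them by brute force over a small range. The sketch below is illustrative only: `is_prime_slow` and `ends_in_even_digit` are hypothetical helper names, and the range of 2 to 999 is arbitrary.

```python
from math import isqrt


def is_prime_slow(n: int) -> bool:
    """Reference test by trial division; only practical for small n."""
    return n >= 2 and all(n % d != 0 for d in range(2, isqrt(n) + 1))


def ends_in_even_digit(n: int) -> bool:
    """The Obviously-Even Heuristic: True means 'ruled out'."""
    return n > 2 and n % 10 in (0, 2, 4, 6, 8)


numbers = range(2, 1000)

# Necessary: no prime is ever ruled out by the heuristic.
wrongly_rejected = [
    n for n in numbers if ends_in_even_digit(n) and is_prime_slow(n)
]

# Not sufficient: some composites slip through and still need the full test.
slipped_through = [
    n for n in numbers if not ends_in_even_digit(n) and not is_prime_slow(n)
]

print(wrongly_rejected)     # []
print(slipped_through[:4])  # [9, 15, 21, 25]
```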
Like sifting pans with holes that get progressively smaller down the stack, you could pile heuristics on top of one another to reduce the number of candidates even further, e.g.:
- First, we apply our Obviously-Even Heuristic, which will filter out any candidate greater than 2 that ends in an even digit.
- Then, we apply the Obviously-Divisible-by-Five Heuristic, which will filter out any candidate greater than 5 that ends in 5 or 0.
- Finally, we'll apply the Obviously-Divisible-by-Three Heuristic, which will filter out any candidate greater than 3 whose digits add up to a multiple of 3.
A number like 8 will get filtered out by the first heuristic, a number like 15 will be filtered out by the second one, and a number like 63 will be filtered out by the third.
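The stack of filters above can be sketched as an ordered list of checks, tried one at a time. This is an illustration under assumptions: the function bodies and the `first_rejecting_filter` helper are hypothetical, not taken from any library.

```python
def obviously_even(n: int) -> bool:
    return n > 2 and n % 10 in (0, 2, 4, 6, 8)


def obviously_divisible_by_five(n: int) -> bool:
    return n > 5 and n % 10 in (0, 5)


def obviously_divisible_by_three(n: int) -> bool:
    # A number is divisible by 3 exactly when its digit sum is.
    return n > 3 and sum(int(d) for d in str(n)) % 3 == 0


# Ordered like the stack of sifting pans: each filter only sees
# what the previous one let through.
FILTERS = [
    ("Obviously-Even", obviously_even),
    ("Obviously-Divisible-by-Five", obviously_divisible_by_five),
    ("Obviously-Divisible-by-Three", obviously_divisible_by_three),
]


def first_rejecting_filter(n: int):
    """Return the name of the first filter that rules n out, or None."""
    for name, rules_out in FILTERS:
        if rules_out(n):
            return name
    return None


print(first_rejecting_filter(8))   # Obviously-Even
print(first_rejecting_filter(15))  # Obviously-Divisible-by-Five
print(first_rejecting_filter(63))  # Obviously-Divisible-by-Three
print(first_rejecting_filter(49))  # None
```

Note that 49 passes all three filters even though it is 7 × 7, so passing the filters is not the same as being prime.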
If a number manages to get through all three filters, it has satisfied three necessary conditions of primality, but that is still not sufficient. The number still needs to be tested for being prime or not. Heuristics don't help speed up the final test; they just ensure that you don't waste time running it on numbers like 9,375,292,394, which are clearly not prime.
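To get a feel for how much work the filters save, the sketch below counts how often the expensive test actually runs, with and without them. Everything here is assumed for illustration: `is_number_prime` is a simple trial-division stand-in for a real primality test, the filters are written with the modulo operator for brevity, and the limit of 3,000 is arbitrary.

```python
from math import isqrt

expensive_calls = 0


def is_number_prime(n: int) -> bool:
    """Stand-in for the expensive final test; counts how often it runs."""
    global expensive_calls
    expensive_calls += 1
    return n >= 2 and all(n % d != 0 for d in range(2, isqrt(n) + 1))


def passes_all_filters(n: int) -> bool:
    """True if none of the three cheap heuristics rules n out."""
    return not (
        (n > 2 and n % 2 == 0)
        or (n > 5 and n % 5 == 0)
        or (n > 3 and n % 3 == 0)
    )


def count_primes(limit: int, use_filters: bool):
    """Return (primes found, number of expensive tests run) for 1..limit."""
    global expensive_calls
    expensive_calls = 0
    found = sum(
        1
        for n in range(1, limit + 1)
        if (not use_filters or passes_all_filters(n)) and is_number_prime(n)
    )
    return found, expensive_calls


primes_filtered, calls_filtered = count_primes(3000, use_filters=True)
primes_plain, calls_plain = count_primes(3000, use_filters=False)

print(primes_filtered == primes_plain)  # True: the filters lose no primes
print(calls_filtered, calls_plain)      # 803 3000
```

The expensive test itself is no faster; it simply runs on roughly a quarter of the candidates.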
Programming with Costs
This entire discussion has been about the cost of making decisions. These costs come in the form of time, money, effort, or any other scarce resource (e.g. computing power, hard disk space, bandwidth).
Casual Heuristics sacrifice some of the accuracy of the decision-making process to gain time. Technical Heuristics find necessary but insufficient conditions that can be tested at low cost.
A skillful programmer should always bear the question of costs in mind while writing code. Sending a request out over the internet may take a lot of time, and should be considered much more expensive than retrieving the same information from the hard drive, or, even better, from an in-memory cache.
A language feature called "short-circuiting" makes it very easy to implement technical heuristics in code.
#!/usr/bin/env python3
from .heuristics import (
    # Runs quickly, only checks the last digit
    obviously_even_heuristic,
    # Runs quickly, only checks the last digit
    obviously_divisible_by_five_heuristic,
    # Runs a tiny bit slower, requires summing up
    # all of the digits in a number recursively
    obviously_divisible_by_three_heuristic,
)
from .deciders import (
    # Very expensive to run on large numbers
    is_number_prime
)

def determine_if_prime(n):
    return (
        not obviously_even_heuristic(n) and
        not obviously_divisible_by_five_heuristic(n) and
        not obviously_divisible_by_three_heuristic(n) and
        is_number_prime(n)
    )
"Short-circuiting" means that in the return statement at the end of the code above, the second condition will not be evaluated if the first one is false, the third will not be evaluated if the second one is false, and so on. In essence, with the "and" operator, Python knows that it can stop evaluating the expression as soon as the end result cannot possibly be true.
(Short-circuiting is essentially a technical heuristic built into the programming language itself!)
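The effect is easy to observe directly. In this minimal sketch, both function names are hypothetical, and each one records when it runs so that we can see which operands Python actually evaluated.

```python
calls = []


def cheap_filter_passes(n: int) -> bool:
    calls.append("cheap")
    return n % 2 == 1  # odd numbers stay in the running


def expensive_test(n: int) -> bool:
    calls.append("expensive")
    return True  # placeholder result; the real work would happen here


# 10 fails the cheap filter, so "and" never calls expensive_test().
result = cheap_filter_passes(10) and expensive_test(10)

print(result)  # False
print(calls)   # ['cheap']
```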
The same behavior applies to the "or" operator, which stops evaluating as soon as one condition is true. This is useful for chaining unrelated tests, as in this excerpt:
def transfer_money(amount, from_account, to_account):
    if (
        amount > 10000 or
        from_account.is_frozen() or
        to_account.is_frozen() or
        account_is_on_government_watchlist(from_account) or
        account_is_on_government_watchlist(to_account)
    ):
        raise TransferDeniedException
    # proceed with the rest of the transfer code here
    ...
Before we proceed to the bulk of the work inside of this transfer_money() function, we first test whether this transfer raises any red flags. We do this with the "or" operator, which means that as soon as any one of these conditions is true, we stop testing the rest.
This is why we start with a very inexpensive test (whether the amount being transferred exceeds $10,000), then proceed with somewhat more expensive tests (we query our own database to see if the accounts are frozen), and finally, we perform the most expensive tests (querying a government system).
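When the list of checks grows, the same cheapest-first ordering can be expressed with Python's built-in `any()`, which also stops at the first true result. The sketch below is one possible arrangement, not the only one; `make_check` and the check names are hypothetical, and each check merely records that it ran.

```python
evaluated = []


def make_check(name: str, raises_flag: bool):
    """Build a hypothetical check that records when it runs."""
    def check() -> bool:
        evaluated.append(name)
        return raises_flag
    return check


# Ordered from cheapest to most expensive, as in transfer_money().
checks = [
    make_check("amount over limit", False),
    make_check("account frozen", True),
    make_check("government watchlist", False),
]

# any() consumes the generator lazily and stops at the first True.
denied = any(check() for check in checks)

print(denied)     # True
print(evaluated)  # ['amount over limit', 'account frozen']
```

The most expensive check never runs, because a cheaper one had already raised a red flag.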
Conclusion
Technical heuristics are a very useful tool for any programmer who wishes to build at scale. A well-crafted technical heuristic may permit the same program to run faster, on smaller devices, across more users simultaneously, etc.
More broadly, programming while keeping in mind the costs associated with each statement is a good habit to adopt. Higher resource usage tends to bring higher fragility as well, as there is more that can go wrong. All other things being equal, code that uses fewer resources is usually less fragile.
If you'd like to get started experimenting with heuristics in a hands-on way, there is no better place to start than Project Euler. There, you will find hundreds of puzzles involving number theory and geometry. For each one, you write a piece of code that finds the answer; the problems are designed so that a well-crafted solution runs in about a minute. You will quickly discover that each puzzle has a "naive" solution that will require hours, days, or even months of computing time to find. The art in solving these puzzles lies in finding the right set of heuristics to speed up your code.