Week 1: Algorithms 1

“Introduction to Algorithms current section … Tools to Analyze Algorithms … Algorithmic Technique: Divide and Conquer … Divide and Conquer Example: Investing … Randomization in Algorithms … Application Area Scheduling 1 … Application Area Scheduling 2”

(Source URL)

Summaries

Week 1: Algorithms 1 > 1a Introduction to Algorithms > 1a Video
Week 1: Algorithms 1 > 1b Tools to Analyze Algorithms > 1b Video
Week 1: Algorithms 1 > 1c Algorithmic Technique: Divide and Conquer > 1c Video
Week 1: Algorithms 1 > 1d Divide and Conquer Example: Investing > 1d Video
Week 1: Algorithms 1 > 1e Randomization in Algorithms > 1e Video
Week 1: Algorithms 1 > 1f Application Area Scheduling 1 > 1f Video
Week 1: Algorithms 1 > 1g Application Area Scheduling 2 > 1g Video

Week 1: Algorithms 1 > 1a Introduction to Algorithms > 1a Video

I’m a Professor of Industrial Engineering and Operations Research, and also of Computer Science, and I’m going to be presenting to you some of the basics of algorithms.
We will begin with some motivation about why algorithms is such an important topic to be studying right now, especially in this era of big data and in the current world with the internet and all the different uses that are happening, and then we’ll learn some of the basics of the techniques that you need to understand algorithms- how to understand the running time of an algorithm, how to design algorithms, looking at some basic techniques.
We’ll look at a few different examples, including some basic algorithms on data, such as sorting and searching for items.
So let’s begin just with some motivation about why we’re studying algorithms.
Algorithms, which may have been a somewhat obscure science going back some years, are now everywhere, and they are used everywhere in your life and in applications that we’re all aware of.
Again, there’s core underlying algorithms that explain how to do that.
A company like FedEx or UPS, in order to be able to visit all the sites that they want to visit, is constantly using algorithms and repeatedly solving them in real time.
Biology, which used to be considered a science which had no math in it, to oversimplify things a little bit- people are now realizing that by looking at data, with all this data in the human genome and in all the other scientific applications, we can make amazing biological and medical discoveries via algorithms.
More maybe examples that are more classical- in a computer operating system there’s constant algorithms going on to keep your computers running.
It might actually be running a program which is running an algorithm to try to figure out how to slow the car down and stop it in the best possible way.
You can see the importance of getting algorithms correct and get them to work at the right time.
Why is this the right time to learn about algorithms? So many, many, many things we’ll look at, the problems may have been around for a long time.
In these days, several phenomena have come together to make this really the time when learning algorithms is an extremely powerful tool for your life and for your work.
So we have some mathematical understanding of what’s going on with algorithms right now.
So that ability makes the mathematical understanding of the algorithms even more important.
Algorithms are not things that run on computers in some big warehouse.
Now, with the large amounts of data that are available on the internet, we can actually make more meaningful predictions and make better use of the algorithms.
Without- we’re going to talk about how to design good algorithms.
Without designing good algorithms, it’s not enough to just have the fast computers and the data.
OK, so what do we want to be able to do? What we’re going to want to be able to do here, and this is the framework we’re working in, is given some kind of problem that arises, we want to find the right algorithm.
Then we also want to understand that we need to verify that the algorithm is correct.
OK? And the correctness, which may be something that might seem tedious at first, is extremely important because if you have the wrong algorithm in your brakes in your car, you’re in trouble.
So let me just use this as an example of what does it mean to have an algorithm for this problem.
So to give an example of a simple algorithm, one algorithm that you might be able to communicate just at a high level is what’s an algorithm? Well, we’re going to repeatedly scan the list of numbers, and each time we’re going to take out the smallest remaining number and write it down.
So let me write down the set of numbers and just demonstrate an algorithm.
Maybe the simplest algorithm you might think of, is to just scan the list of numbers and each time pull out the smallest remaining number, cross that out, and write it down at the bottom.
I would cross it out, write it down here, scan the numbers, 3, write it down here, scan the numbers, 5, write it down here.
So now that we have given an algorithm to sort these numbers, then what do we want to know about this algorithm? So some of the questions we might ask is first, how do we know if our algorithm is correct? That’s not something we will get into too much here, but we should at least think about it to realize that this is sorting correctly.
OK, how do we know if our algorithm is good? In that sense we want to know, is it fast enough? So we could say I took eight numbers and I sorted them in 20 seconds, but that’s not particularly useful beyond the confines of these eight numbers.
What we really want to know is, have I designed a method that, given any set of numbers, will be able to sort them in enough time? So in that, we would say we would like an algorithm that, given some number n of numbers as input, can sort them in some time that is some function of n. OK? And we’ll learn how to do that, but that’s the most important thing in understanding and analyzing an algorithm.
So we want to understand exactly how general our algorithm is.
That is, what types of data will it work on and work well on? And maybe there are other types of data which our algorithm will not work on, because we may very well want to design algorithms that only work on certain types of data, to take advantage of structure in various data.
How would you answer a question like this? And of course, this is just a toy question, because in general if you were investing and you wanted algorithms for investing, you’re going to answer much more complicated problems.

Week 1: Algorithms 1 > 1b Tools to Analyze Algorithms > 1b Video

So what we want when we analyze algorithms, is we want to have a predictive science.
That is, we want to be able to measure the running time as a function of n, which is the size of the input.
What’s useful is a general science that says, if I have n numbers, I can sort them in 20 times n seconds, or something like that.
What we want to do is be able to bound or approximate the time that we’re using.
You want your running time to scale as a function of that.
We want to work in a simple model that allows us, as a first pass, to not worry about the details of exactly- is it is in our phone? Is in our Apple laptop? Is it our PC laptop? You know, is one of them a little bit faster, or a little bit slower, than the other? We want to be able to sort of work in a model that captures enough of the reality that we can reason about what are the good algorithms.
You can think of all these basic operations as just taking one unit of time.
Again, they may not all really take one unit of time, but as a first approximation this has served us very well to understand algorithms.
OK. The other point about an algorithm is that we’re going to talk about the running time of an algorithm.
In reality, you can’t talk about the running time an algorithm divorced from the data.
If it happens to find the value 7, then it’s going to happen to print the word hello many, many times.
This is going through two loops, each going up to n. OK. And when you do that, when you go through a loop n times, it executes n things.
When you have two nested loops like this- so this is n times and within each time it’s going another n, those multiply.
So this part here is actually going n squared times.
If it sees a 7, it goes off, and n squared times prints the word hello.
OK. So to answer these questions- what are the worst case running time? What is the best case running time? What is the average? So the important one here- again, the most important one is the worst case running time.
The point about this is that, this program is running at a time that has something to do with n cubed.
In the worst case, if you wanted to think about what’s the worst case input for this? It would just be a sequence of 7s. And so every time as you go through, every time you would see a 7.
You would go off and print hello many, many times.
Just to dismiss the best case- the best case running time of this would be n. Because if you had array that had no 7s in it, all that this program does is it scans the array, see’s there’s no 7’s, and terminates.
So the running time is growing like n. Again, in this example, somehow you wrote this program because you were interested in 7.
So how do we measure the running time? So let me come back to a little more detail about why I wrote that n and that n cubed there.
OK. You can count very carefully get more complicated expressions that exactly characterize the running time.
OK. So you might get something 5n cubed plus m minus 6, or 8 times n log n minus 60n. And so you might get different kinds of functions that have many different terms in them.
The second part- this alternatively is a more useful way of thinking about it- is we’re going to say that some function f of n is O of g of n, if there are positive constants, c and n0, such that f of n is upper bounded by c times g of n for all n greater than or equal to 0.
So what we want is, we want to have a g of n. OK. There’s lots of functions that would upper bound f of n. In fact, there is an infinite number of them.
When we classify functions, f of n, which are our running times of algorithms, by one these g of n. So I’ve listed some of the typical ones here.
That means that things run in a time that’s a constant.
So these are- we’ll see things whose running times are proportional to the log of the data size.
We’re just saying, the running time of the algorithm is in one of these classes here.
Small meaning, smaller running time, to the left in this chart.
We can ask, how long does it take to run data, like say something the input of size 100, or an input of size 10,000, on an algorithm whose running time is in one of these classes across the top, like n, n log n, et cetera.
Even on 1,000,000 items, you’re at 6 times 10 to the minus 6 seconds- again, instantaneous.
This is like the time it takes to scan your data once, or do a few simple things.
Maybe our computers are going to get- you know, in a few years our computer’s going to be 10 times faster.
N cubed is actually, as we’ll see, that’s the time to multiply 2n by n matrices.
Again, even if your computers get 10 times faster, that’s 3,100 years.
2 to the n is the time to look at all subsets of a set of numbers.
OK. So when we’re talking about polynomial time, you may have even heard this.
So we say an algorithm runs on polynomial time if it’s running time is upper bounded by a polynomial.

Week 1: Algorithms 1 > 1c Algorithmic Technique: Divide and Conquer > 1c Video

So just to start with a very simple example of recursion, something you may have seen before are the Fibonacci numbers.
Now, the Fibonacci numbers, if you see the sequence at the bottom of the screen, are a set of numbers where you just add the previous two to get the next one.
You can express this in this procedural way as I did it, take each two numbers and add them to get the next one.
We can say f of n, that is the nth Fibonacci number, is found by summing f of n minus 1 plus f of n minus 2, the n minus first plus the n minus second Fibonacci number, as long as n is greater than 2.
In order to find the nth, we use the solution to the n minus first and the n minus second or in general some other Fibonacci number that is smaller, or in general some other problem of smaller size.
For the Fibonacci numbers, we’ve written here f of 1 equals f of 2 equals 1.
That is, we start with the base case, that the first two numbers are 1, and then we build from there.
If you played this at some point in your life, you probably realize the idea is you want to keep eliminating half the numbers with one question.
If we were thinking of integers between 1 and 100, with one question, is it less than 50, you can narrow it down to is it in the first 50, zero to 50, or is it in the second 50, 51 to 100? And so with one question, you’ve halved the number of possibilities.
In other words, you started by playing a game of numbers between 1 and 100 and now you’re playing the same game, but now it’s either between a 1 and 50 or 51 and 100.
If the range just has one number, if low equals, if I’m guessing a number between 6 and 6, it’s clear that the answer is 6, so you just return that.
So this general technique is not just for guessing numbers, but for any ordered set that you’re searching, you can zero in on one half or the other.
You can see that when we ask is the number bigger than the middle number, is it bigger than 50, that’s how we’re dividing.
Now we can analyze this, right? We want to know, how many guesses do you have to make in the worst case? Now again, in the best case you just guess the number maybe on the first try.
In the worst case, what you do is you halve the numbers each time.
That’s the idea is that you’re guaranteeing that each time you have the set of possible numbers.
So what you’re asking is, how many times can you have a number before you reach 1? In other words, you start with 100 and then after one around you have 50 possibilities and after another round you have 25 possibilities and after another around you have 13 possibilities, et cetera.
How many times can you keep halving a number before you reach 1? And that is basically the definition of the logarithm of a number, the logarithm base 2.
So that’s a basic fact that we use frequently in thinking about algorithms, that when you can be halving things repeatedly, the number of times you halve before you get down to 1 is the logarithm.
So you can say that if you play this game of trying to guess a number, the number of questions you have to ask is the logarithm of n, where n is the maximum possible number.
Instead of guessing numbers, you would guess now names and is it before or after in alphabetical order.
Each time we want to select, we have to look over an entire list of n numbers.
Or perhaps we’ve eliminated something, but still it’s a list of at most n numbers.
So we’re passing over a list of n numbers n times, so that’s n times n. And we get a time that is something like n squared.
So let’s see how we can use divide and conquer to sort numbers much faster.
We’re going to take a list of numbers and we’re going to divide it in half.
So maybe one of the lists is the following set of numbers, 2, 5, 6, 8; and the other list is maybe the numbers 1, 3, 4, 9.
So I have eight numbers, but I have them in two lists of four.
So we’re going to start with a set of numbers that are not in sorted order.
So I’m going to put up eight numbers as an example of such a list.
Now, how are we going to sort these eight numbers using MergeSort? So remember, we’re going to divide and conquer.
We know how to sort a list of two numbers probably already.
Now we combine these two sorted lists, the sorted list 3 and the sorted list 9 into one list.
You can see I’ve merged the sorted list 1, 2 with the sorted list 3, 9.
If you look at this- so one level, it just looks like jumbled numbers.
We just used the fact that we were dividing and conquering and then combining to sort our number, to say that we don’t know how to sort but we can divide things up into smaller pieces, and we also know how to merge sorted lists.
The number of levels I’ve done here is actually logarithmic in the problem size, because what have I done? I keep splitting in half, splitting in half, splitting in half.
So in fact, from here to here is a logarithmic number.
Then from here to here is a logarithmic number again.
The number of levels of this is proportional to the log.

Week 1: Algorithms 1 > 1d Divide and Conquer Example: Investing > 1d Video

Recall that we assume we have the price of a stock in each of the next days, in each of the next n days.
We’re going to buy it once and sell it once, and we want to maximize our profit.
OK. So I’ve put up there the most straightforward algorithm, which is just to say that there’s only the n possibilities for when you buy, and there is only n possibilities for when you sell.
You have to check pairs where you buy before you sell.
We say, what if I bought on day one and sold on day eight? Well, then I would lose $20. What if I bought on day three and sold on day ten? Then that would be 75 minus 40, or $35. So I just keep computing those for all pairs.
So think about the first eight days and the second eight days.
Either I buy and sell in the first eight days, n over 2 days, or I buy and sell in the second eight days.
Or I buy in the first eight days and sell in the second eight days.
Those are the only possibilities, right? Either I buy and sell in the beginning, buy and sell in the end, or buy in the first half and sell in the second half, right? Remember, I’m going to buy before I sell.
Now, what you’ll notice is that these first two possibilities, buying and selling in the first n over 2 days and buying and selling in the second n over two days, these are just recursive problems.
I buy in the first n over 2 days and sell in the second n over two days.
So if I told you that, that you had to buy in the first eight days and sell in the second eight days, what would you do? Well, if you think about it for a minute, you would buy at the cheapest possible price in the first eight days, which in this example is 13.
You would sell in the most expensive price in the second eight days, which in this case is 75.
So if I told you you were buying in the first eight days and selling in the second eight days, you would buy at 13, and you would sell at 75.
OK. So and what we do is if we just have- if you look at the code written up there- if you just have two days, then you’re just going to buy on the first and sell on the second.
Otherwise, you’re going to split it in half, and you’re going to either recurse on the first piece, recurse on the second piece, that’s line six and seven.
So just like I made an easier problem merging, I’m going to make an easier problem here, which is not, how do you solve this general problem? But how do you solve the problem where I already have told you, you buy in the first half, and you sell in the second half? So by the same paradigm, we can put it together, and we get a more efficient algorithm.
That’s what we did in step eight by just looking for the largest in the first half and looking for the smallest in the second half.
Then in line eight, we do the non-recursive part, where we look for the largest number in the second half of the numbers, and the smallest number in the first half the numbers, and subtract them.

Week 1: Algorithms 1 > 1e Randomization in Algorithms > 1e Video

In this lecture, we’re going to talk about the use of randomization in algorithms.
OK, so we’re now going to be talking about algorithms, where we make random choices at various points.
We’re not actually going to see details of that in this short course, but I wanted to make you aware of that possibility.
The key thing randomization is, it’s going to simplify our computational task, many times.
So the problem is a very basic problem when you’re looking at data, which I call selection, and this is finding the i’th smallest from a set of numbers.
So I give you a set of n numbers, which I denote by A, and I want to find the i’th smallest.
The median, or the middle one of that set of numbers in this case, is 10.
Or you may want to find the seventh smallest of the set of numbers, which in this case is 23.
We’ve spoken about that- you can just scan the numbers and keep track of the minimum, that’s easy.
The median on the other hand- you can’t just scan the numbers and find the median, because I don’t know how to do that, but there is a straightforward way to do this, combining things we’ve already seen.
So if I sorted those numbers, 1, 3, 4, 8, 10, et cetera- 10 would be the fifth one, I would return 10, that’s the median.
Even without that big question, something like finding the median or finding some percentile- how do we do this quickly and simply? And that’s what we’re going to see, and randomization is going to play a key role in letting us do this quickly.
OK, so we’re going to give a recursive randomized strategy, which sounds maybe a little scary, but it’s actually a very natural thing to do.
We’re going to take our numbers, and we’re going to pick some element to be what we’re going to call the pivot element.
So maybe I should put up here, just a small example of a set of numbers.
Again, imagine- this is a small example, but imagine I have a billion numbers, and I want to find the half a billion smallest numbers.
We’re going to use this idea of recursion.
So how am I going to do this? I’m going to just pick a number randomly, and then split into two sets, and then figure out what to do.
So let’s say I just pick a number randomly, and in this case, I’m going to pick the number seven randomly.
So then what am I going to do? I’m going to split the items into two sets- one that’s less than or equal to this pivot number, the random number, and one that’s greater than.
So I’m just going to split them into numbers less than 7, which I’ll just write down over here- 2 3 5 7 1 4, which I call L. And then numbers greater than 7, which is just 9 and 8.
Now, the idea’s I’m going to recurse on one or the other of these sets.
What do I know? I know that these are the six smallest numbers, and these are the two biggest.
Now I would recurse on this set, but now I’m looking for the third smallest, and I’ve crossed out two numbers.
If you just think about picking a random number, a random number is evenly distributed amongst the possibilities.
One can analyze this in the following- I’m going to do an informal analysis, which can be made formal, which we’re not going to do.
OK, so how do we do this? We’re going to use the similar ideas we used for finding the median, except in the median-finding, in the selection, we were able to throw away half the elements.
Here we can’t throw them away, but we’re going to use this to do divide and conquer, where we work on both halves.
So we’re going to do a similar idea here, and that is, at each step we’re going to- you might think of it is approximately sort.
We’re going to use this idea of picking a random element and partitioning to move the smaller elements to the left, the larger elements to the right, and then we’ll just keep repeating and recursing, and then in the end we’ll be sorted.
So I’m just going to go through this by means of example, to show you how this works.
So the same thing- we’re going to pick a random pivot element, let’s just say I pick 6 this time.
Then what I’m going to do is split into two sets.
I’m not going to change the order of the elements while I do that.
So in other words, I’m going to pick out the ones that are less than or equal to 6.
Now I’m going to pick out the numbers that are bigger than 6- 8, 9, 7.
This time, maybe I’ll pick 3 as my pivot, and I’ll divide into the numbers that are less than or equal to 3- that’s 3, 1, 2.
OK, so what did I do here? Again, the interesting thing is I just kept picking numbers randomly, and splitting the set by the smaller ones and the larger ones.
Again, by an analysis which is more complicated than we’re going to go into, we’re roughly cutting the problem in half each time.
Then we’re going to get the similar kind of logarithmic behavior.

Week 1: Algorithms 1 > 1f Application Area Scheduling 1 > 1f Video

You can think of it as allocating resources, which are often called machines, to tasks, which are also called jobs, over time.
Things like scheduling computational jobs in a big data center is a scheduling issue.
There’s all these different machines, which are computers, and you have to decide which job is running on which computer, at which time.
It’s running multiple apps, and in there is a scheduler that is deciding which apps should be running at which time, should you be going to the network to try to get some information? Should something be happening with the lights, et cetera.
Scheduling happens in health care- scheduling nurses, or scheduling machines in hospital.
So if you’re modeling scheduling problems- again as I said in the beginning, you have a machine, you have some kind of jobs, and you have different objectives.
You can have different kinds of jobs, you can have jobs that are all the same and just need one unit of processing, you can have jobs that arrive over the course of a day with different processing times.
You can have precedence constraints between jobs- sometimes you have situations where Job A has to be done before Job B, or at the same time as Job B. You can have jobs that can be preempted- in other words, can be stopped and started, or jobs that can be run continuously.
A computational job may be able to be stopped and started.
A job that has a physical process, like heating some kind of element.
You might want to finish your jobs as early as possible, you might want to finish jobs by their deadlines, you might want to minimize some notion of response time.
That is, from when I submit a job to when it comes back, how long does that take? And again, there are many, many objectives, and many problems actually have multiple objectives, and competing objectives.
So there’s this whole vast world of machines and jobs and objectives and many different problems in the world of scheduling that arise every day.
You just have one machine, you have a set of jobs that have time- so I’ve listed five jobs, each of which take some amount of time.
So the first job takes five units of time, the second job takes three units of time, et cetera.
What is response time? Response time is the time between when a job arrives, or is submitted- which in this case is time 0 for all these jobs- until the time when it completes.
Which a schedule is going to put these jobs to time, it is going to give each job a completion time, and that’s denoted by Cj. And what do we want to do? We want to minimize the average of these.
So that’s what our problem is, so we have these jobs.
So suppose we just want to schedule these jobs in this order.
So Job 1 would schedule for five units of time, Job 2 would schedule for three units of, time so we finish at time 8.
Job 3 would then schedule for 10 units of time, so we’d finish at time 18.
Job 4 for eight units of time, so it would finish at time 26.
Then Job 5 for four units of time, so it would finish at time 30.
So this schedule has a value of 87, and we may ask, is this the best possible, and what would be the algorithm to do better? So how do we reason about this? How do we decide what’s the right algorithm for this problem? So one way to do this- and I think the best way to think about this is to think about, what if you just had two jobs, what would you do? And if you can reason about what to do with two jobs, that may or may not be the right thing to do in general, but certainly you have to be consistent with what you would do for two jobs for a more general set of jobs.
So let’s just look at- suppose we had two jobs.
We could run the first job followed by the second job, which would finish at time 5 and 8, and that would be 5 plus 8, or 13.
Or we can run the second job, followed by the first job.
Now, we could ask, what is it that made us prefer this schedule to this schedule? And in this case, the only thing we’re working with is the processing times of the jobs.
You’ll see in this one, we put the smaller processing time first, and in this one, we put the larger processing time firsh- which are essentially our only two choices.
What happens here? You can see actually by looking at these two pictures that the second job to complete always completes at time 8, and the first job to complete completes at its processing time.
We might as well choose the smaller job first, and we can reason that this is the better schedule.
The reason this is the better schedule is because it ran the shorter job first.
Let’s look at this algorithm, where we run shortest job first.
We’ve sorted the jobs by size and run them in that order, which you can see here.
So we ran the smallest one, Job 2, first, which takes three units of time.
Then Job 5 next, which takes four units of time, et cetera.
What we did is, we can formalize this thought process with two jobs, to actually prove that this algorithm is optimal.
With two jobs, we argued that the smaller one has to come first.
We can do the same thing with multiple jobs, arguing that given a set of jobs, it must be the case that the smallest one runs first.
In other words, I can decide which job is going to run first by looking at the other jobs- I have to know the processing times of the other jobs- but without knowing how these other jobs are going to be scheduled.

Week 1: Algorithms 1 > 1g Application Area Scheduling 2 > 1g Video

This is a class of problems and scheduling that arise often, and these are problems that have deadlines.
Many times when you’re scheduling you have a deadline, a time by which something needs to be done.
You can actually talk about two different kinds of deadlines- hard deadlines, which [INAUDIBLE] called deadlines, or soft deadlines.
A deadline can mean really, you have to finish by this time or else it’s useless, or you have to finish by a time or there’s a penalty.
Things have to fire by certain times, and these are real deadlines.
These aren’t suggestions, these are real deadlines.
So in a simple problem with deadlines, you might have a set of jobs like before, and they have processing times like before.
Now we can ask, how do we schedule these jobs so that they can all meet their deadlines, if such a thing is possible? Realize when we’re asking a problem where either you meet your deadlines or you don’t, it’s possible to have a set of jobs where you don’t meet your deadlines.
Do we use processing times to order them? Do we use deadlines to order them? Do we use both? There are a lot of possibilities, and we need to reason which is correct.
So I can put job 2, follow by job 3, or I could put job 3 followed by job 2.
So we can see, job 2 in both examples finishes before its deadline.
Its deadline is 6 and we’re finishing it at time 10.
So it’s not smallest job first now, because here we put job 2 before job 3, we put the smaller job first, but the schedule didn’t work.
If we compare these two, we see the one feature of this example, that is not a feature of this example, is that we ordered them by their deadlines.
If you have jobs with deadlines, order them by deadline.
Just from this example, it might be that it’s actually some function of deadline and processing time together.
It turns out that the right thing to do is to order the jobs by their deadlines.
So now we’ve taken this algorithm, earliest deadline first, and we’ve done these two examples that I’ve put up here.
So here’s the schedule for the first example, where we sorted the jobs in deadline order.
So that’s job 3, followed by job 2, followed by job 4, job 1, job 5.
If you look at each one, it is indeed less than its deadline.
So job 3 has a deadline of 6, and it finishes at time 6.
So this schedule is good, it meets all its deadlines.
So I’ve done the same thing- I sorted the jobs by deadline, which is just 1, 2, 3, 4.
I scheduled them, and if we look, job 1 means its deadline of 5, job 2 meets its deadline of 6.
Job 3 has a deadline of 7, but it actually finishes at time 10, so this is violating its deadline.
Now, we just said that earliest deadline first is the right algorithm, so what happened here? Well, this instance was an impossible instance to solve.
If you just look at the first three jobs, they have deadlines of 5, 6, and 7.
So there’s no way- one of them has to finish at time 10, and so one of them is going to violate its deadline.
Sometimes you get an instance, and the answer is you can’t meet all the deadlines.
The nice thing is that earliest deadline first not only solves instances where you can meet all deadlines, but if you can’t, it identifies it for you.
In other words, if you run earliest deadline first, and you violate a deadline, you can say with confidence, it was impossible to meet all the deadlines.
One could formalize this, that if there is a schedule meaning all deadlines, than this earliest deadline first algorithm will do so.
If someone is asking you to schedule all these jobs by their deadline, and it’s impossible, than it’s not an algorithmic mistake.
So in this case, by using earliest deadline first, you can identify when this happens.
So I’ve put an example up here, and now when a machine becomes available, you just keep scheduling a job with the earliest deadline, so they’re in earliest deadline order in this example- 1, 2, 3, 4, 5.
You’ll notice that in this schedule, job 5 does not meet its deadline because it finishes a time 12, and it has a deadline of 11.
So in this case, EDF does not meet all the deadlines.
It turns out that for this problem, there’s another schedule that’s not EDF, that does meet all its deadlines.

Week 1: Algorithms 1

Week 1: Algorithms 1

“Introduction to Algorithms current section … Tools to Analyze Algorithms … Algorithmic Technique: Divide and Conquer … Divide and Conquer Example: Investing … Randomization in Algorithms … Application Area Scheduling 1 … Application Area Scheduling 2”

(Source URL)

Summaries

Week 1: Algorithms 1 > 1a Introduction to Algorithms > 1a Video

Week 1: Algorithms 1 > 1b Tools to Analyze Algorithms > 1b Video

Week 1: Algorithms 1 > 1c Algorithmic Technique: Divide and Conquer > 1c Video

Week 1: Algorithms 1 > 1d Divide and Conquer Example: Investing > 1d Video

Week 1: Algorithms 1 > 1e Randomization in Algorithms > 1e Video

Week 1: Algorithms 1 > 1f Application Area Scheduling 1 > 1f Video

Week 1: Algorithms 1 > 1g Application Area Scheduling 2 > 1g Video

Return to Summaries List.

(image source)