I. Warm-up: Binary Search
1. Binary search repeatedly divides the search space (the values you are searching through) in half.
2. In the best case scenario, the first value that you look at is the optimal braking coefficient.
II. Introduction: Algorithms and Data Structures
1. Algorithms & Data Structures
(1) Algorithm: a set of steps to solve a computational problem
- We need to choose the correct algorithm to solve the problem
- Many algorithms have already been written to solve common problems. We can use existing algorithms in many scenarios.
(2) We care about how efficient our algorithms are (how fast they run)
- How can we quantify efficiency?
(3) Data structures: ways to organize and store data
- Using the right data structure can make your program more efficient.
2. Managing a fleet of rovers
(1) Background
- Our fleet of rovers has grown.
- You have been appointed Chief Rover Engineer
- Part of your job is to charge up the rovers to full battery power. After you charge up a rover, you want to record that it now has full battery power.
- In code, we’ll need to look up a particular rover and edit the information about that rover.
3. Finding and charging a rover
III. Time Complexity
1. Warm Up
we wrote this code for chargeRover:
==============================================================
// Find a particular rover (using the id of the rover)
// Set charge = 1 for that rover
// return true if we find the rover, false otherwise
bool chargeRover(vector<Rover> &fleet,
string roverId) {
for(int i = 0; i < fleet.size(); ++i) {
if (fleet[i].id == roverId) {
fleet[i].charge = 1;
return true;
}
}
return false;
}
==============================================================
In a real-world context, we often need to run our functions on huge datasets, or run them many times. For example, what if we had 1,000 rovers in our fleet, and we needed to run chargeRover hundreds of times? If we are only running chargeRover a few times, it doesn’t really matter how fast it takes to run. But, if we are running it hundreds of times, it matters a lot! Even making chargeRover a little bit more efficient would save us a lot of time!
When we are thinking about a function or an algorithm, we want to be able to quantify how efficient it is. This allows us to compare it to other algorithms, to see which one is most efficient. To start thinking about how to quantify efficiency, consider the following scenarios.
Let’s suppose we have 20 rovers in our fleet, and our vector of rovers is in a random order (e.g., the rovers aren’t sorted in any way). We are looking for a particular rover to charge.
EX) In the best case scenario, how many rovers will we have to look at before we find the rover we are looking for?
A: 1 rover
=> The first rover that we look at is the rover that we’re looking for.
EX) In the worst case scenario, how many rovers will we have to look at before we find the rover we are looking for?
A: 20 rover
=> We’ll have to look at every single rover before we find the rover that we’re looking for.
EX) On average, how many rovers will we have to look at before we find the rover we are looking for?
A: 10 rover
=> We will need to look at half of the rovers.
2. How efficient is our function?
=> This searching algorithm is called linear search.
3. Time complexity
(1) We need a way to talk about how efficient our function is.
- Time complexity = how long an algorithm takes to run
- We care about best-case time complexity, worst-case time complexity, and average-case time complexity.
(2) The number of times that our loop runs depends on the size of the vector. Since the size of the vector can change, call the size n.
- Best case for chargeRover = 1 time through the loop
- Worst case for chargeRover = n time through the loop
- Average case for chargeRover = n/2 time through the loop
4. Linear time
(1) Linear time: the time the algorithm takes to run grows linearly with n
- ex) if n doubles, the time the algorithm takes would double
- Typically, if you have a single loop iterating through a vector, the algorithm is linear time
- Sometimes, you might see this written with big-oh notation: O(n)
(2) Big-oh notation: provides strict mathematical definitions for the concepts that we’re describing more generally here.
5. Polynomial time
(1) Polynomial time: the time the algorithm takes to run grows polynomially with n
- ex) if n doubles, the time the algorithm takes would 4 times as long
- Typically, if you nested loops iterating through vectors, the algorithm is polynomial time
- Big-oh notation: O(n^2), O(n^3), O(n^4), etc.
6. Constant time
(1) Constant time: the time the algorithm takes to run doesn’t change based on n
- ex) if n doubles, the time the algorithm takes would stay the same.
- Big-oh notation: O(1)
7. Logarithmic time
(1) Logarithmic time: the time the algorithm takes to run grows logarithmically with n
- Big-oh notation: O(log(n))
EX) Assume fleet is a vector of Rovers. What is the worst-case time complexity of the following snippets of code? (Here, n = size of fleet.)
==============================================================
// finds the largest amount that TWO rovers can carry together
double biggestTwoRoverCapacity = -1;
for (int i = 0; i < fleet.size(); ++i) {
for(int j = 0; j < fleet.size(); ++j) {
double twoRoverCapacity = fleet[i].capacity + fleet[j].capacity;
if (biggestTwoRoverCapacity == -1 ||
twoRoverCapacity > biggestTwoRoverCapacity) {
biggestTwoRoverCapacity = twoRoverCapacity;
}
}
}
==============================================================
A: Polynomial time
=> Anytime we see nested loops iterating through vectors, it’s pretty likely that the algorithm will be at least polynomial time.
In this case, if the size of the fleet is five, the outer loop will run five times, and the inner loop will run five times for each rover - this means that the code in the middle of the nested loops will run twenty-five times total.
If the size of the fleet is ten, then the code in the middle of the nested loops will run one hundred times total. This is growing polynomially.
EX)
==============================================================
// Print out the id the first five rovers (if they exist)
for(int i = 0; i < 5; ++i) {
if (i < fleet.size()) {
cout << fleet[i].id << endl;
}
}
==============================================================
A: Constant time
=> No matter how big our fleet is, the loop will only run five times.
EX)
==============================================================
// How many rovers are selected for the mission?
int numSelected = 0;
for(int i = 0; i < fleet.size(); ++i) {
if (fleet[i].isSelected) {
numSelected += 1;
}
}
==============================================================
A: Linear time
IV. Searching Algorithms
1. How can we improve chargeRover?
(1) We have been using linear search
(2) To improve this, make a key assumption: the rovers are sorted by id.
(3) If we make this assumption, we can do binary search.
(4) Instead of searching for braking coefficients, we’re going to search for rover ids.
2. How efficient is binary search?
(1) For each step of binary search, we cut our search space in half.
(2) Binary search is logarithmic time.
- Binary search is especially efficient when the vector is large.
3. Searching algorithms
(1) Linear search and binary search are a couple of algorithms designed specifically for searching .
(2) Why are search algorithms important? They come up in many context.
- Different algorithms work better in different settings.
- ex) binary search only works when the items are sorted.
(3) ex) real-life problems that use searching algorithms.
- Autocomplete. As you are typing out a word on your phone, your phone is searching through all possible words and trying to guess which word you meant to type. There are thousands of possible words that you could be trying to type - this has to be a very efficient search!
- Keyword search. Suppose that you want to write a program to help doctors access relevant medical information. One part of this program could be searching for particular keywords in medical literature, and identifying which articles have these keywords.
- Searching through simulation results. It’s common to run complicated simulations of fluid dynamics, to understand how air or water interacts with certain physical objects. Suppose that you ran hundreds of these simulations with different parameters. You would need to search through the simulation results in order to identify which parameters are optimal for this particular set-up.
V. A Sorting Algorithm: Selection Sort
1. Sorting algorithms
(1) Binary search only works when the items are sorted.
(2) Selection sort: a basic sorting algorithm.
2. Selection sort
(1) Find the smallest element in the vector that has not been sorted yet.
(2) Swap the smallest element with the element in the current position.
EX) Suppose we are sorting a vector of size n. What is the maximum number of swaps that selection sort will take?
A: n-1
=> In the worst case, you will have to do a swap for every single element in the list, except for the last element. Once you’ve sorted n-1 elements, the last element will necessarily be the largest element and won’t need to be swapped.
EX) Suppose that we have a vector of size n, and all of the elements are already sorted. If we run selection sort on this vector, how many swaps will selection sort do?
A: 0
=> In selection sort, swaps only happen if there is an element in the rest of the vector smaller than the element at the current position. Because the elements of the vector are already sorted, this will never happen. The element at the current position will be smaller than all of the elements in the rest of the vector.
EXERCISE) Implementing Selection Sort
Write a program that implements selection sort on a vector of Rovers. Assume that the fleet is a vector of Rovers, and that we have access to a helper function swap that swaps two Rovers.
EX) What is the worst-case time complexity of selection sort? (Here, n = size of fleet.)
A: Polynomial time
=> If the size of the fleet is five, the outer loop will run five times, and the inner loop will run between one and four times for each rover.
If the size of the fleet is ten, the outer loop will run ten times, and the inner loop will run between one and nine times for each rover.
Because of the inner loop, we know that this runs slower than linear time (if there were no inner loop, this would be a linear time algorithm). Even though the number of iterations that the inner loop goes through changes, this is still polynomial time.
IV. Other Sorting Algorithms
There are lots of sorting algorithms! Depending on what data you are trying to sort, different algorithms are more or less efficient in different contexts.
Here are a few real-world sorting algorithms applications:
1. Showing product results
When you search for a product on an online store (like Amazon), the store sorts the results in order of relevance, putting the products that they think will be most helpful at the top of the list.
2. Object detection.
A self-driving car has cameras that take images of the scenes around it. After getting these images, an algorithm parses the images to identify possible objects (e.g,. other cars, pedestrians, trees). For a single object, the algorithm generates several possibilities for what that object could be, and then it sorts them in order of which label is most likely. In the example below, the algorithm identifies people and airplanes with high probability.
Searching and sorting algorithms are two categories of algorithms that are already developed. So, if you find yourself writing a computer program where you need to search for a particular item or sort a collection of items, you don’t need to reinvent the wheel! You can use an algorithm that someone else has already written. There are many other algorithms out there that solve common computational problems.
IIV. Data Structures
1. Data structures
(1) Data structures: ways to organize and store data
(2) ex)
- Vectors: sequential list of elements
- Strings: sequential list of characters
- Structs: create our own data types with different member variables.
2. Dictionary (std::map in C++)
(1) Retrieving information. We have student information for every student in our class, and we want to be able to give our data structure a student ID, and access all of the information about that student ID.
(2) Maps keys to values
(3) Keys must be unique
3. Dictionaries in C++
4. Trees
(1) File organization. The files on your computer are organized hierarchically (ex. A folder might have additional folders inside of it). We want to store the file structure of your computer, and be able to access different folders in that structure.
(2) Collection of nodes
(3) Often used for hierarchical data or sorted data
5. Stacks
(1) Reversing a word. We want to insert all of the characters of a word into a data structure, and remove them in reverse order.
(2) Similar to vectors, but with a special rule:
(3) Stack: insert & remove elements from only one end
- Last-in, first-out(LIFO): when you remove an element, you remove the most recently inserted element.
6. Queues
(1) Scheduling programs on your computer. Suppose that you have multiple programs that you want your computer to run. As soon as a program is ready to run, it is inserted into the data structure. When the computer is ready to run a program, it removes that program from the data structure and runs it. The first program that is ready to run should be the first program that is actually run.
(2) Queue: insert elements on one end; remove from the other end.
- First-in, first-out (FIFO): when you remove an element, you remove the element that has been in the queue for the longest time.
IIIV. Summary
1. There are many algorithms written for common computational problems. Two categories of algorithms are searching algorithms and sorting algorithms.
2. Time complexity : how long an algorithm takes to run. ex) constant time, logarithmic time, linear time, and polynomial time. Sometimes, time complexity is written using big-oh notation.
3. Linear search and binary search are two searching algorithms. Binary search is more efficient, but requires the items to be sorted first.
4. There are many different sorting algorithms. We looked closely at the selection sort.
5. Data structures are ways to organize and store data.
(1) A dictionary maps keys to values.
(2) A tree is a collection of nodes often used for hierarchical or sorted data.
(3) A stack is similar to a vector, but requires elements to be inserted and removed from only the end.
(4) A queue is similar to a vector, but requires elements to be inserted on one end and removed on the other end.
Different data structures are more efficient for different contexts.
'[Umich] COE Core > ENGR 101 (Matlab, C++)' 카테고리의 다른 글
[Notes] Ch.20 More Data Structures (Runestone) (0) | 2022.12.06 |
---|---|
[Notes] Ch.19 Structs (Runestone) (1) | 2022.12.06 |
[Notes] Ch.18 Program Design in C++ (Runestone) (0) | 2022.12.05 |
Umich ENGR 101 Project 4 Overview (Summary) (0) | 2022.12.01 |
[Notes] Ch.17 Vectors in C++ (Runestone) (0) | 2022.11.18 |
댓글