# CS 124: Data Structures and Algorithms

- Instructor: Michael Mitzenmacher
- Email: michaelm@eecs.harvard.edu
- Course Website: http://fas.harvard.edu/~libcs124

### Objectives

This course covers the modern theory of algorithms, focusing on the themes of efficient algorithms and intractable problems. The course goal is to provide a solid background in algorithms for computer science students, in preparation either for a job in industry or for more advanced courses at the graduate level. I strongly encourage mathematicians, biologists, physicists, and people from other concentrations to take the course as well.

Besides introducing the basic language and tools for algorithm analysis, we will also cover several specific problems and general design paradigms. Toward the end of the quarter, we will also examine heuristic techniques often used in practice, even though in many cases formal theoretical results are not known.

We will focus on the theoretical and mathematical aspects in class and on the homework assignments. But because one gains a deeper understanding of algorithms from actually implementing them, the course will include a substantial programming component. Large programming assignments can be done (but do not have to be done!) in pairs. More details will be available when the first programming assignment is given.

As you can see from the preliminary list of topics (included below), we will be covering a great deal. I expect the course to be challenging, both in terms of the workload and the difficulty of the material. You should be prepared to do a lot of work outside of class. The payoff will be that you will learn a lot of both useful and interesting things.

### Prerequisites

Students should be able to program in a standard programming language; C or C++ is preferred (but not mandatory). Some mathematical maturity also will be expected; students should have some idea of what constitutes a mathematical proof and how to write one. Some knowledge of basic probability will also be helpful.

### Assessment

Your performance will be measured in four ways. (The percentage contributions to your grade given below are approximate and subject to change.)

- **Problem sets (30%)**: There will be some number of problem sets. [Last year there were 7; this is subject to change according to how we distribute assignments.] They will generally be due one week after they are given out. These sets will primarily be mathematical and/or theoretical in nature. These assignments are governed by the collaboration policy, given below.
- **Programming assignments (20%)**: There will be 3 programming assignments. For these assignments you may work with another student if you choose. You may not work with the same partner on all 3 assignments. Note that working in pairs is not mandatory. Generally you will have two weeks for programming assignments. You must electronically submit your code and your write ups. Beyond the fact that you can work with a partner, programming assignments are also governed by the collaboration policy, given below.
- **Midterm Exam (20%)**: There will be one exam approximately 1/2 of the way through the course.
- **Final Exam (30%)**: There will be a final exam.

Students often ask for some sort of calendar. While I will work with the staff to try to provide a calendar of upcoming assignments, specific assignments are subject to change depending on how the class proceeds, my own scheduling, and external forces. Quite simply, you should expect that you'll have homework assignments every week, and plan accordingly.

All assignments will be turned in via Canvas, and are due according to assignment instructions, which will be marked accordingly on Canvas. Assignments will not be accepted late, with the exception of medical emergencies or similar exceptional circumstances that must be discussed in advance with the instructor. Please remember that it is better to turn in an incomplete assignment than no assignment at all. (Some students would apparently rather turn in nothing than an incomplete assignment, which makes no sense.) Since all assignments should be turned in electronically, typically as PDFs, you should have a scanner available, become familiar with LaTeX, or otherwise be ready to produce PDFs of mathematical work. (LaTeX is highly, highly recommended.)

### Collaboration Policy

I would like to emphasize the rules on working with others on homework assignments. For the programming assignments, you may work in pairs. It is assumed you will not work with anyone besides your partner. For problem sets, limited collaboration in planning and thinking through solutions to homework problems is allowed, but no collaboration is allowed in writing up solutions. You are allowed to work with one or two other students currently taking CS 124 in discussing, brainstorming, and verbally walking through solutions to homework problems. But when you are through talking, you must write up your solutions independently and may not check them against each other. Note that writing up solutions independently does not allow for you to contact another student while you are writing up your solution to ask them how to word or describe something that you are supposed to be writing. Of course, there may be no passing of homework papers between collaborators; nor is it permissible for one person simply to tell another the answer.

If you collaborate with one or two other students in the course in the planning and design of solutions to homework problems, then you should give their names on your homework papers.

Under no circumstances may you use solution sets to problems that may have been distributed by the course in past years, or the homework papers of students who have taken the course in past years. Nor should you look up solution sets from other similar courses.

Violation of these rules may be grounds for giving no credit for a homework paper and also for serious disciplinary action, including but not limited to having your case sent to the appropriate Harvard disciplinary body. Severe punishments will apply, so please do not violate these rules. If you have any questions about these rules, please ask an instructor.

### Required Text

While officially there is no required text, I very strongly recommend you purchase one of the following books:

- *Introduction to Algorithms*, by Cormen, Leiserson, Rivest, and Stein. This book is probably worth buying if you are going to study algorithms beyond this course. It is primarily a theoretical text, and it is quite encyclopedic in nature. If you are looking for help with the proofs and mathematics, this is a good book to purchase.
- *Algorithm Design*, by Kleinberg and Tardos. This is also an excellent book, with a different style. It follows the course quite closely, but it is not as encyclopedic as the other book, and in particular assumes a lot more background.

In place of a book, class notes will be regularly made available. Generally these notes will not be available until a few days after the corresponding lecture.

In a typical year, some small number of students will actually complain that I do not require a textbook, because they think a textbook would have been helpful. Please buy a textbook if you think it will be helpful for you. Many online sources, such as Wikipedia, are also available if you find an alternative source helpful. Remember, however, that you are not to use outside sources in developing solutions to your own homework problems.

### Class Information/Notes

Class notes, homework assignments, and other information will be made available on the Web when possible. For access, go to the class web site. Generally this information will be available in PDF. In many cases, the class web site may be the only location where information is posted or available, so look in from time to time!

### Incomplete List of Topics

- Fundamentals
  - Induction
  - Recurrence relations
  - Big-Oh and little-Oh notation
  - Merge sort
- Graph Algorithms
  - Depth-first search, strongly connected components
  - Breadth-first search, Dijkstra's algorithm
- Greedy Algorithms
  - Minimum spanning tree
  - Union find
  - Set cover
  - Huffman coding
- Dynamic Programming
  - Longest common subsequence
  - Traveling salesman
- Divide and Conquer
  - Integer multiplication
  - Matrix multiplication
- Hashing
  - Balls into bins problems
  - Bloom filters
  - Document similarity
- Linear Programming
  - Problem definitions and solution techniques
  - Reductions
  - Maximum matching
- Randomized Algorithms
  - Primality testing and factoring, RSA
  - Random walks and 2-SAT
- NP-completeness review
  - Basic NP-complete problems
- Novel approaches to NP-complete problems
  - Approximation algorithms
  - Heuristic algorithms