# CS 124: Data Structures and Algorithms

- Instructors:
  - Michael Mitzenmacher
  - Adam Hesterberg
- Email:
  - michaelm@eecs.harvard.edu
  - ahesterberg@g.harvard.edu
- Course Website:
  - http://fas.harvard.edu/~libcs124/cs124/index.html

### Objectives

This course covers the modern theory of algorithms, focusing on the themes of efficient algorithms and intractable problems. The course goal is to provide a solid background in algorithms for computer science students, in preparation either for a job in industry or for more advanced courses at the graduate level. I strongly encourage mathematicians, biologists, physicists, and people from other concentrations to take the course as well.

Besides introducing the basic language and tools for algorithm analysis, we will cover several specific problems and general design paradigms. Toward the end of the semester, we will also examine heuristic techniques often used in practice, even though in many cases formal theoretical results are not known.

We will focus on the theoretical and mathematical aspects in class and on the homework assignments. But because one gains a deeper understanding of algorithms from actually implementing them, the course will include a substantial programming component. Large programming assignments can be done (but do not have to be done!) in pairs. More details will be available when the first programming assignment is given.

As you can see from the preliminary list of topics (included below), we will be covering a great deal. I expect the course to be challenging, both in terms of the workload and the difficulty of the material. You should be prepared to do a lot of work outside of class. The payoff will be that you will learn a lot of both useful and interesting things.

### Prerequisites

Students should be able to program in a standard programming language; C or C++ is preferred (but not mandatory). Some mathematical maturity also will be expected; students should have some idea of what constitutes a mathematical proof and how to write one. Some knowledge of basic probability will also be helpful.

If you need more mathematical background, I highly recommend the resource list provided by CS 121; note in particular the suggested self-study of 6.042J (for those who have not taken CS 20). We will have math review sessions at the beginning of the semester.

### Assessment

Your performance will be measured in four ways. (The percentage contributions to your grade given below are approximate and subject to change. The class is curved, and we may rebalance scores a bit where appropriate, so if you do much better on the final exam than on the midterm, we may lower the weight of the midterm for your grade.)

- **Problem sets (30%)**: There will be some number of problem sets. [Last year there were 7; this is subject to change according to how we distribute assignments.] They will generally be due one week after they are given out. These sets will primarily be mathematical and/or theoretical in nature. These assignments are governed by the collaboration policy, given below.
- **Programming assignments (20%)**: There will be 3 programming assignments. For these assignments you may work with another student if you choose. You may not work with the same partner on all 3 assignments. Note that working in pairs is not mandatory. Generally you will have two weeks for programming assignments. You must electronically submit your code and your write-ups. Beyond the fact that you can work with a partner, programming assignments are also governed by the collaboration policy, given below.
- **Midterm Exam (20%)**: There will be one exam approximately halfway through the course.
- **Final Exam (30%)**: There will be a final exam.

Students often ask for some sort of calendar. While I will work with the staff to try to provide a calendar of upcoming assignments, specific assignments are subject to change depending on how the class proceeds, my own scheduling, and external forces. Quite simply, you should expect that you'll have homework assignments every week, and plan accordingly.

Also, I cannot recommend enough the importance of starting assignments early. Some problems take significant time, and that time might require thinking about the problem in the background, rather than just writing down an answer.

All assignments will be turned in via Canvas/Gradescope, and are due according to assignment instructions, which will be provided. Please remember that it is better to turn in an incomplete assignment than no assignment. (Some students would somehow rather turn in nothing than an incomplete assignment, which makes no sense.) As all assignments should be turned in electronically, typically as PDF, you should have a scanner available, or become familiar with LaTeX, or otherwise be ready to deal with turning in PDFs of mathematical work. (LaTeX is highly, highly recommended.)

### Collaboration Policy

I would like to emphasize the rules on working with others on homework assignments. For the programming assignments, you may work in pairs. It is assumed you will not work with anyone besides your partner. For problem sets, limited collaboration in planning and thinking through solutions to homework problems is allowed, but no collaboration is allowed in writing up solutions. You are allowed to work with one or two other students currently taking CS 124 in discussing, brainstorming, and verbally walking through solutions to homework problems. But when you are through talking, you must write up your solutions independently and may not check them against each other. Note that writing up solutions independently does not allow for you to contact another student while you are writing up your solution to ask them how to word or describe something that you are supposed to be writing. Of course, there may be no passing of homework papers between collaborators; nor is it permissible for one person simply to tell another the answer.

If you collaborate with one or two other students in the course in the planning and design of solutions to homework problems, then you should give their names on your homework papers.

Under no circumstances may you use solution sets to problems that may have been distributed by the course in past years, or the homework papers of students who have taken the course in past years. Nor should you look up solution sets from other similar courses.

Violation of these rules may be grounds for giving no credit for a homework paper and also for serious disciplinary action, including but not limited to having your case sent to the appropriate Harvard disciplinary body. Severe punishments will apply, so please do not violate these rules. If you have any questions about these rules, please ask an instructor.

### Required Text

While officially there is no required text, I very strongly recommend you obtain/purchase one of the following books:

- *Algorithm Design*, by Kleinberg and Tardos. This is an excellent book, with a different style than many textbooks. It follows the course quite closely, but it is not as encyclopedic as the other book below, and in particular assumes a lot more background.
- *Introduction to Algorithms*, by Cormen, Leiserson, Rivest, and Stein. This book is probably worth buying if you are going to study algorithms beyond this course. It is primarily a theoretical text, and it is quite encyclopedic in nature. If you are looking for help with the proofs and mathematics, this is a good book to purchase.
- Jeff Erickson has released an online textbook; I haven't tried to map it to this course, but it looks pretty good.

In place of a book, class notes will be regularly made available. Generally these notes will not be available until a few days after the corresponding lecture.

In a typical year, some small number of students will actually complain that I do not require a textbook, because they think a textbook would have been helpful. Please buy a textbook if you think it will be helpful for you. There are also many sources available online, such as Wikipedia, for when you might find an alternative source helpful. However, remember that you are not to use outside sources in developing solutions to your own homework problems.

### Class Information/Notes

Class notes, homework assignments, and other information will be made available on the class website when possible, generally as PDF. In many cases, the class website may be the only location where information is posted or available, so check it from time to time!

### Incomplete List of Topics

- Fundamentals
  - Induction
  - Recurrence relations
  - Big-Oh and little-Oh notation
  - Merge sort
- Graph Algorithms
  - Depth-first search, strongly connected components
  - Breadth-first search, Dijkstra's algorithm
- Greedy Algorithms
  - Minimum spanning tree
  - Union find
  - Set cover
  - Huffman coding
- Dynamic Programming
  - Longest common subsequence
  - Traveling salesman
- Divide and Conquer
  - Integer multiplication
  - Matrix multiplication
- Hashing
  - Balls into bins problems
  - Bloom filters
  - Document similarity
- Linear Programming
  - Problem definitions and solution techniques
  - Reductions
  - Maximum matching
- Randomized Algorithms
  - Primality testing and factoring, RSA
  - Random walks and 2-SAT
- NP-completeness review
  - Basic NP-complete problems
- Novel approaches to NP-complete problems
  - Approximation algorithms
  - Heuristic algorithms

Academic Integrity. You are responsible for understanding Harvard Extension School policies on academic integrity (https://www.extension.harvard.edu/resources-policies/student-conduct/academic-integrity) and how to use sources responsibly. Stated most broadly, academic integrity means that all course work submitted, whether a draft or a final version of a paper, project, take-home exam, online exam, computer program, oral presentation, or lab report, must be your own words and ideas, or the sources must be clearly acknowledged. The potential outcomes for violations of academic integrity are serious and ordinarily include all of the following: required withdrawal (RQ), which means a failing grade in the course (with no refund), the suspension of registration privileges, and a notation on your transcript.

Using sources responsibly (https://www.extension.harvard.edu/resources-policies/resources/avoiding-plagiarism) is an essential part of your Harvard education. We provide additional information about our expectations regarding academic integrity on our website. We invite you to review that information and to check your understanding of academic citation rules by completing two free online 15-minute tutorials that are also available on our site. (The tutorials are anonymous open-learning tools.)

Accommodation Requests. Harvard Extension School is committed to providing an inclusive, accessible academic community for students with disabilities and chronic health conditions. The Accessibility Services Office (ASO) (https://www.extension.harvard.edu/resources-policies/accessibility-services-office-aso) offers accommodations and supports to students with documented disabilities. If you have a need for accommodations or adjustments in your course, please contact the Accessibility Services Office by email at accessibility@extension.harvard.edu or by phone at 617-998-9640.