CS4500 Reflections

In the Fall 2023 semester, I was enrolled in CS 4500: Software Development. The final course in Northeastern's sequence of introductory computer science classes, CS 4500 focused on teaching larger-scale software design and implementation. In particular, it preached "socially responsible programming": the idea that it is socially responsible to design and develop software systematically, with future maintainers in mind. It's the culmination of all systematic design frameworks and methodologies taught in previous courses, bundled in an intensive class that provides numerous opportunities to "make a design decision and document it."

To accomplish this, pairs of students work on a single project – in a language of their choice – for the entire semester. The project is designed to be complex enough to require several thousand lines of code in any general-purpose programming language. It is typically a variant of some turn-based multiplayer game; ours was a simplification of the tile-based strategy game Qwirkle, developed by our instructors.

The early stages of the project aim to play and simulate the game locally, which involves creating simple data representations and their corresponding functionality. Later stages accommodate distributed games. Since there's a standard specification for the game, students' implementations should be able to interface with one another, with one acting as a server and others acting as clients.

For a given week, students write a high-level design document for the following week while implementing their proposed design from the previous week.

Reflections

Coming into the course, I heard no shortage of complaints from course alums about its challenging and time-consuming nature.

Why would you take swdev :MonkaCozy:

software dev was hell

I remember asking about it on here before taking it. I wasn’t warned enough, so I thought “ohh, it’s probably fine, people complain about all classes, and I’ve done fine so far”. I was wrong.

you're looking at at least 50 hours a week

Do you have a strong pain tolerance?

Despite the explicitly negative comments, I was still committed to enrolling in the course. The reasons were several:

Software Development is the natural and intended progression for Northeastern's CS curriculum.
I had the chance to take the course with the professor who developed the entire curriculum – I could fully understand the motivation behind the curriculum.
Even better, he led the development of the programming language I use most. I could develop my skills in that language and learn its idioms.
Software Engineering, the alternative course option, was claimed by students to be boring and not particularly rigorous. Why enroll in a course that wouldn't provide the most opportunity for growth?

Having reflected on this course with classmates and, several months later, on my own, I don't regret it. It was challenging and time-consuming, yet it was the most enjoyable and valuable course I took at Northeastern.

I thoroughly recommend the class, especially over the alternative. It provides a fantastic opportunity to perform conscientious, systematic design and implement increasingly larger codebases.

In courses I've taken since, along with projects I've worked on, I've become increasingly conscious of systematic design. I can reason about design at a higher, more abstract level and consider and manage a system's invariants, the relations between its components, ownership, responsibilities, and more.

I attribute a large portion of my growth as a developer during university to this course, along with other classes that encouraged (perhaps more implicitly) the practice of systematic design in their domains.

Assignments

The course assignments are intended to be time-consuming, challenging, and stress-inducing. Since the course's prerequisites are fundies 1, fundies 2, and fundies 3 (object-oriented design), it aims to stress-test the accumulation of ideas students learned in those courses. It asks the question: how well can you apply those lessons you learned in previous semesters under stress?

Assignments are called "milestones". Lasting a week, they represent a sprint with an explicit goal of implementing a single component in the codebase. Components gradually scale in complexity through their relations and dependence on previous components. Bugs of earlier milestones propagate upwards, and flawed design decisions yield under pressure.

A milestone consists of several parts:

Writing a design document detailing your proposed design for the next milestone's component
Implementing this milestone's component, following the specification while referencing your previous design document
Writing integration tests and a test script that tests what you wrote in the previous milestone

Make a Design Decision

One of the professors is notorious for writing the same Piazza response in the several courses he teaches: "make a design decision and document it." Each assignment provides the opportunity to make a conscious design decision: consider the options, weigh their benefits and trade-offs, commit to a choice, and document it.

I believe it's essential to have the opportunity to make numerous design decisions. The industry expects employees to formulate designs and specifications based on potentially vague requirements. This class allows you to practice making design decisions, observe how they perform with increasing complexity, and discuss them in depth.

A decent number of our design decisions stemmed from anticipating future specification requirements. We adapted our design to allow some flexibility to account for specific scenarios that could arise, resulting in a trivial final milestone since our design easily absorbed the requirement.

Testing

Part of a milestone is the testing task, which involves the development of a concentrated test suite and a test script that provides a language-agnostic interface for executing your codebase against tests.

Writing tests is almost a creative task; it requires an intimate understanding of the specification and the ability to deconstruct examples into their interesting parts.

When developing a test, you construct pairs of JSON inputs and expected JSON outputs – the common language of communication. To accommodate any language, the autograder passes inputs via standard input to your testing script, and outputs are collected from standard output and structurally compared against the expected output. A test passes if they are structurally equivalent.
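The input/output flow described above can be sketched roughly as follows. This is a minimal illustration, not the course's actual autograder: `solve`, `run_test_script`, and `structurally_equal` are hypothetical names, and the request shape is made up for the example. It assumes that "structurally equivalent" means the parsed JSON values are equal, so whitespace and object key order do not matter.

```python
import io
import json


def solve(request):
    """Hypothetical stand-in for real game logic: counts the tiles in a request."""
    return {"count": len(request["tiles"])}


def run_test_script(stdin, stdout):
    """Read one JSON value from the input stream, write one JSON value to the output stream."""
    request = json.load(stdin)
    json.dump(solve(request), stdout)


def structurally_equal(actual_text, expected_text):
    """Compare parsed JSON values, so whitespace and object key order are ignored.

    Caveat: Python's == also treats 1 and 1.0 as equal; a real grader may be stricter.
    """
    return json.loads(actual_text) == json.loads(expected_text)


# Simulate the autograder piping an input file into the test script.
fake_stdin = io.StringIO('{"tiles": ["red star", "blue circle"]}')
fake_stdout = io.StringIO()
run_test_script(fake_stdin, fake_stdout)

passed = structurally_equal(fake_stdout.getvalue(), '{ "count" : 2 }')
print("pass" if passed else "fail")
```

In a real test script, `sys.stdin` and `sys.stdout` would take the place of the in-memory streams.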

On submission, each team's test suite is run through the oracle implementation, developed by the instructors. The autograder filters tests based on validity: a test is valid if its expected output file matches the oracle's output. By the end, we have an extensive collection of instructor and student-developed test suites.

Once valid tests are collected, the autograder runs students' implementations through their scripts against the collected test suite. The autograder awards bonus points if a test a student wrote catches another team's incorrect implementation. The opportunity for bonus points provides the extrinsic motivation to develop interesting tests.
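A toy version of this two-step process (filter by the oracle, then run every implementation against the surviving tests) might look like the sketch below. Everything here is an assumption made for illustration: the `oracle`, the team implementations, and especially the scoring rule of one bonus point per valid test per other team caught. The course's real scoring was not necessarily this.

```python
def oracle(n):
    """Hypothetical reference implementation: doubles its input."""
    return 2 * n


def valid_tests(submitted, reference):
    """Keep only the tests whose expected output agrees with the reference."""
    return [(inp, exp) for inp, exp in submitted if reference(inp) == exp]


def bonus_points(tests_by_team, impl_by_team, reference):
    """Assumed rule: one point each time a valid test exposes another team's wrong answer."""
    points = {team: 0 for team in tests_by_team}
    for author, tests in tests_by_team.items():
        for inp, exp in valid_tests(tests, reference):
            for team, impl in impl_by_team.items():
                if team != author and impl(inp) != exp:
                    points[author] += 1
    return points


tests_by_team = {
    "A": [(3, 6), (-1, -2)],
    "B": [(2, 5)],  # invalid: the oracle returns 4, so this test is filtered out
}
impl_by_team = {
    "A": lambda n: 2 * n,
    "B": lambda n: 2 * abs(n),  # buggy for negative inputs
}

print(bonus_points(tests_by_team, impl_by_team, oracle))  # {'A': 1, 'B': 0}
```

Team A's negative-input test is the only one that catches a bug, which mirrors how a single well-chosen test can keep earning points.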

I enjoyed the testing tasks despite almost always completing them within a few hours of the deadline. It provided an opportunity to be creative. One of our tests involved using coordinates with overflowing integers, abusing how other languages would handle them.

For groups using TypeScript, the super large number would be parsed correctly. The problem arose when they passed it into their board, which used a key-value object where the keys were coordinates. The large number was transformed into a string using scientific notation, which would break their implementation when they wanted to access adjacent coordinates.
Groups using Java used JSON parsing libraries, some of which would automatically parse large integers into strings.
A Python group used the standard library's JSON module and had a slight bug with hardcoding the amount of information they were reading.
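The string-key failure described for the TypeScript groups can be mimicked in Python. This is only an analogy: Python integers do not overflow, so the sketch deliberately converts the coordinate to a `float` to stand in for a double-precision number, and the exact magnitude at which each language switches to exponent notation differs. The `key` and `right_neighbor_from_key` helpers are hypothetical.

```python
def key(x, y):
    """Build a board key in the style of a string-keyed object: 'x,y'."""
    return f"{x},{y}"


def right_neighbor_from_key(k):
    """Recover the coordinate from a key and step one column to the right."""
    x_text, y_text = k.split(",")
    return key(int(x_text) + 1, int(y_text))


# Ordinary coordinates round-trip through the string key without trouble.
print(right_neighbor_from_key(key(3, 4)))  # 4,4

# A huge coordinate stored as a double is printed in scientific notation...
huge = float(10**21)
huge_key = key(huge, 0)
print(huge_key)  # 1e+21,0

# ...so parsing the key back into an integer coordinate fails.
try:
    right_neighbor_from_key(huge_key)
    failed = False
except ValueError:
    failed = True
print(failed)  # True

# Even without the string round trip, adding 1 to a double this large is a no-op.
print(huge + 1 == huge)  # True
```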

One of the milestones involved implementing an algorithm that performed an exhaustive search. However, few teams realized that – including my partner and me. We only discovered it while writing tests during the next milestone and consequently wrote tests that required implementations to use an exhaustive search to pass. The same test consistently netted us bonus points for multiple milestones since some teams were too lazy to fix their implementation (or perhaps didn't realize it).

Following the "testfest", test results are delivered to students, allowing us to inspect the tests we failed, challenge any misconceptions we had about the specification, and correct them in the following milestones.

Lessons Learned

From the assignments, I've learned several key lessons:

Documentation is valuable but frequently becomes outdated, especially as the project evolves rapidly. Encode as much of the documentation into code and contracts as possible.
A comprehensive unit and integration test suite allows for more confident refactors. You can fearlessly rewrite components of your codebase, and as long as the tests pass, your logic is more or less solid.
Design decisions are difficult to make. They require a lot of rationalization and discussion with your partner and team.
Follow conscientious, systematic design; the slower initial development velocity can lead to less refactoring and debugging in the future.
Use property-based testing to check invariants.
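As a rough illustration of the last lesson, here is a hand-rolled property-based check using only Python's standard library. It is a sketch under assumptions: `deal` is a hypothetical function, not something from our project, and the invariant checked (dealing never creates or destroys tiles) is just one example. A dedicated library such as Hypothesis would add input shrinking and smarter generation.

```python
import random
from collections import Counter


def deal(bag, n):
    """Hypothetical function under test: take up to n tiles off the front of the bag."""
    return bag[:n], bag[n:]


def check_deal_preserves_tiles(trials=200, seed=0):
    """Generate random bags and hand sizes, asserting the invariants on each one."""
    rng = random.Random(seed)  # fixed seed so any failure is reproducible
    colors = ["red", "green", "blue"]
    for _ in range(trials):
        bag = [rng.choice(colors) for _ in range(rng.randint(0, 30))]
        n = rng.randint(0, 40)  # may exceed the bag size on purpose
        hand, rest = deal(bag, n)
        # Invariant 1: a hand never holds more tiles than requested.
        assert len(hand) <= n
        # Invariant 2: dealing neither creates nor destroys tiles.
        assert Counter(hand) + Counter(rest) == Counter(bag)
    return trials


print(check_deal_preserves_tiles())  # 200
```

Unlike an example-based test, the property states what must hold for every input, which is why it pairs well with invariants written down in design documents.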

Codewalks and Paneling

Perhaps the most rewarding parts of the class are codewalks, opportunities for students to talk and reason about their designs and code in front of a panel. The objective is to have an interesting discussion about architecture and code. At the start of the codewalk, it's helpful to give an overview of the codewalk's goal, transition into a design discussion, and finally show the code of specific components.

For codewalks, I learned the following:

Since an important component of codewalking is a discussion on design and design decisions, an architecture diagram helps panelists visualize the components you developed and their relationships. You can also concretely point to components when explaining them.
You should focus on explaining the purpose of code, rather than what the code does. If you observe yourself explaining line-by-line what the code does, you're not focusing on design. Talking about what the code does impedes proper design discussion.
Skip the uninteresting parts. The interesting parts of a codebase are where there is a greater likelihood of design decisions being made and design issues surfacing. The goal is not to show your codebase's "good" parts; otherwise, you cannot refine any critically flawed design.

Paneling is the other side of the coin: critiquing the design and codebase of the presenters. I learned the following:

Reading code is challenging, particularly in an unfamiliar language.
Understanding the presenters' design is challenging, particularly when they give a poor overview of it.
Formulating questions that challenge or clarify the design is hard. It's easy to phrase suggestions as questions (bad!), focus too much on code (bad!), and ask uninteresting questions when there are more significant problems (bad!).

Code is a symptom of design. If the design is poor, then the code will reflect that. Spend time identifying potential design issues that the students presented, then delve into the possible locations of those issues.

Brief Reflection on Racket

Racket was our language of choice for this course. To keep this section short, here's what I found valuable about it:

Extremely flexible. We wrote the first few milestones in a functional style, transitioning to a mixed object-oriented and functional paradigm in later milestones when we felt it allowed our design to be more flexible.
A very robust, batteries-included standard library. It had reasonable defaults, and we didn't experience any pain using it.
REPL-driven development in Emacs with racket-mode improved productivity since we could play around and interact with data and redefine incorrectly implemented functions. Playing with the REPL acted as a quick sanity check before concretely transforming our examples into tests.
Some cool features that we used:
- Generic interfaces for structs: essentially interfaces for structs that we defined.
- Mixins: essentially a function that takes in a class and produces a class.
- Sandboxing and custodians: A custodian manages resources such as threads and file ports.
- Parameters: thread-local global variables that we used to provide "configuration" for our project.
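For readers who don't know Racket, two of these features have rough Python analogues, sketched below. This is not Racket code and not how our project was written: `logging_mixin`, `PassingPlayer`, `timeout_seconds`, and `parameterize` are hypothetical names, and Python's `contextvars` only approximates the behaviour of Racket parameters.

```python
import contextvars
from contextlib import contextmanager


def logging_mixin(base):
    """Mixin as a function: takes a class and returns a subclass that records each move."""
    class Logged(base):
        def take_turn(self, state):
            move = super().take_turn(state)
            self.log = getattr(self, "log", []) + [move]
            return move
    return Logged


class PassingPlayer:
    """Hypothetical player that always passes."""
    def take_turn(self, state):
        return "pass"


LoggedPlayer = logging_mixin(PassingPlayer)
player = LoggedPlayer()
player.take_turn(state=None)
print(player.log)  # ['pass']

# Parameter-like configuration: a context variable with a default value.
timeout_seconds = contextvars.ContextVar("timeout_seconds", default=6)


@contextmanager
def parameterize(var, value):
    """Temporarily rebind a context variable, restoring the old value on exit."""
    token = var.set(value)
    try:
        yield
    finally:
        var.reset(token)


with parameterize(timeout_seconds, 1):
    print(timeout_seconds.get())  # 1
print(timeout_seconds.get())  # 6
```

The appeal in both cases is the same: behaviour and configuration can be layered on from the outside, without editing the class or threading extra arguments through every call.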

Improvements

The main issue with the course was the number of students enrolled in a single class. More students meant fewer opportunities to participate in codewalks and panels, and therefore fewer opportunities to learn and grow as a presenter and a panelist. Improving at technical conversations is challenging and takes a lot of time. From my perspective, two codewalks and only a few more panelings are too few. Students would likely benefit much more from more frequent feedback.

Advice to Future Students

Do I recommend taking the course? Absolutely. The course is an intensive and enjoyable opportunity to become a better developer and communicator. There are few other courses at Northeastern where you can autonomously make design decisions and defend them to other students. There are few other courses where you receive honest and constructive criticism and feedback from students and instructors. There are few other courses where you can be involved in socially responsible, systematic design and maintenance of larger-scale software.

CS 4500 is a unique academic opportunity to have full ownership of the design and implementation of a project from the very beginning. The project has numerous interesting challenges that force you to slow down and think thoroughly about design. CS 4500 also helps you learn to receive strong, valid criticism of your code and designs and iterate on your project's flaws.

The alternative course, CS 4530, doesn't adequately substitute for the skills you hone in CS 4500. It is certainly the far easier option. But how can you grow as a developer and communicator without challenges?