I'm a Berkeley prof. Working at a startup led me to update my 1500-person class
This is a weird assignment idea. I'd love to hear what you think, how you might improve it, etc.
Some specific open questions I have:
1. How can I nudge students towards being creative with their comparators so that we get more variety?
2. Is there a natural extension that would get students working on the same files, yielding a richer demonstration of CI?
3. Could we allow students to somehow edit the contest code itself without breaking the entire idea of the project?
I really don't like this assignment idea as anything more than a way to get used to the assignment submission and grading system.
1. What is it teaching? Comparator<Integer> is trivial to implement, and you're adding no requirements on it (except maybe having a human-readable shortname). Code-reviewing a Comparator which merely has to compile and match the two simple rules of Comparators is not a useful code review. Writing CI tests that check those rules for common errors is quite easy (undermining the value of code review), but there's not much extra complexity to shove into the GitHub Actions to show their power. Maybe students will be amazed by a live scoreboard, but I don't expect that anymore. We had live scoreboards for (secretly submitted) assignments back in the 00s, and the world has become significantly faster-updating since then.
2. Incentivizing unique comparators adds a lot of meta-work. Students will need to monitor the repository and other students' submissions if the incentive is too high. Some students may wait until the last second to submit their solutions to avoid being detected, while still others may try to play spoiler and copy other submissions.
3. I don't like the structure where students submit two things (a number and a comparator) when the outcome of the lottery is almost entirely based on the number. The comparator, which seems to be the point of the lesson, isn't tied to the success of any particular student. And since you're only re-rolling comparators when one outputs 0 for an input pair, there's a pretty significant chance that some student submissions won't even appear in the lottery's computation tree.
1. It's definitely not about the Comparator. The lesson is all about showing off a CI/code-review process. This is intended to be a short exercise, expected to take no more than an hour. The CI tests can't check that the comparator actually does what it claims; they'll just test that it's a valid comparator (a sketch of such a check follows this list). And I'd build on this in a future pair assignment, where they can reuse some or all of the steps of the workflow from this more gimmicky assignment.
2. I agree it's a bad use of student time to see if someone's submission is unique, but I suspect there are ways we could structurally push students to do something unique, e.g. gus_massa's post below.
3. Yeah, it's a little weird. The integer submission is just to keep the MacGuffin of the assignment going. Also, for N submissions we'll use at least N-1 comparators. If we use an RNG that cycles through every comparator before repeating, we can make sure everyone's gets used at least once.
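To make point 1 concrete, here's a minimal sketch (my own illustration, not the actual course's CI job) of the kind of randomized contract check a CI workflow could run against a submitted Comparator<Integer>: it can verify the comparator is well-behaved, but says nothing about what the comparator "means".

    import java.util.Comparator;
    import java.util.Random;

    // Sketch only: randomized spot-checks of the Comparator contract
    // (sign symmetry and transitivity on sampled triples).
    public class ComparatorContractTest {
        static void check(Comparator<Integer> c, int trials) {
            Random rng = new Random(0);
            for (int t = 0; t < trials; t++) {
                int a = rng.nextInt(), b = rng.nextInt(), d = rng.nextInt();
                // sgn(compare(a,b)) must equal -sgn(compare(b,a))
                if (Integer.signum(c.compare(a, b)) != -Integer.signum(c.compare(b, a)))
                    throw new AssertionError("sign symmetry violated: " + a + ", " + b);
                // spot-check transitivity: a > b and b > d must imply a > d
                if (c.compare(a, b) > 0 && c.compare(b, d) > 0 && c.compare(a, d) <= 0)
                    throw new AssertionError("transitivity violated: " + a + ", " + b + ", " + d);
            }
        }

        public static void main(String[] args) {
            check(Comparator.naturalOrder(), 100_000); // a valid comparator passes
            System.out.println("contract spot-checks passed");
        }
    }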
About 1:
* Generate a list of 1000000 pairs of random numbers. Run each student's Judge on that list to get a new list of ternary numbers. Compare the list with the lists of all the other students. The distance is the number of differences from the closest one. Whoever is furthest away wins a chocolate. (A rough sketch of this follows after the list.)
Perhaps add variants like 123111213 -> 321333231 to detect mirrored criteria.
Add 2222... (everyone ties) to avoid giving the chocolate to a lazy student.
(I guess it's easy to cheat with a good hashing method instead of an interesting criterion. :( )
* Generate a list of 1500 random numbers and join it with the 1500 numbers of the students. Use the Judge to compare each student's number with all the other numbers. The objective is for the student's number to land close to the middle. Perhaps at the 60% or 75% mark, to encourage being creative and cheating to win, but not too much. (I think you don't like "cheating", but for me it adds some fun.)
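A rough sketch of the first variant. It assumes each submission implements Comparator<Integer> and treats the "ternary number" as the sign of the comparator's output; the class name and sample submissions are invented for illustration:

    import java.util.*;

    // Sketch of the uniqueness contest: each submission gets a "signature"
    // over a fixed list of random pairs, and the chocolate goes to the
    // signature furthest from its nearest neighbor.
    public class UniquenessContest {
        static int[] signature(Comparator<Integer> c, int[][] pairs) {
            int[] sig = new int[pairs.length];
            for (int i = 0; i < pairs.length; i++)
                sig[i] = Integer.signum(c.compare(pairs[i][0], pairs[i][1]));
            return sig;
        }

        // Hamming distance between two signatures.
        static int distance(int[] a, int[] b) {
            int d = 0;
            for (int i = 0; i < a.length; i++) if (a[i] != b[i]) d++;
            return d;
        }

        public static void main(String[] args) {
            Random rng = new Random(42);
            int[][] pairs = new int[1_000_000][2];
            for (int[] p : pairs) { p[0] = rng.nextInt(); p[1] = rng.nextInt(); }

            // Placeholder submissions; the real ones come from student repos.
            Map<String, Comparator<Integer>> submissions = Map.of(
                "alice", Comparator.naturalOrder(),
                "bob", Comparator.comparingInt(Math::abs));

            Map<String, int[]> sigs = new HashMap<>();
            submissions.forEach((name, c) -> sigs.put(name, signature(c, pairs)));

            String winner = null;
            int best = -1;
            for (var e : sigs.entrySet()) {
                int nearest = Integer.MAX_VALUE;
                for (var f : sigs.entrySet())
                    if (!e.getKey().equals(f.getKey()))
                        nearest = Math.min(nearest, distance(e.getValue(), f.getValue()));
                if (nearest > best) { best = nearest; winner = e.getKey(); }
            }
            System.out.println(winner + " wins the chocolate (distance " + best + ")");
        }
    }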
I like these! And I encourage a little bit of cheating. I'm curious to see how the particularly clever students will subvert the system.
As it happens, I had a fun assignment all about subverting the system back when I taught security at Princeton in 2012. In the final assignment (assignment 8), students were given a Linux disk image of a hard drive owned by a guy named Nefarious who had committed a murder. The assignment was originally created by J. Alex Halderman and Ed Felten, but I went a little extra in my version.
In the assignment text, I mentioned that Nefarious was originally arrested due to an anonymous tip from a pseudonymous "Cecco Beppe". Buried in /home/root on Nefarious's drive is an innocuously labeled file, CB.7z, notable only in that it is the only file on the drive dated 2012 (intentionally), years newer than everything else. If one went to the trouble of unzipping this file, they'd realize it is a very small disk image inside the disk image.
Booting this image dropped the user into a chat with a depressed artificial intelligence (a custom ALICE chatbot with some hard-coded responses to advance my story; this was pre-GPT) who, when prodded with the appropriate keywords, told the tale of the AI's depression about the murder and its attempted but failed self-deletion, leaving the AI nothing more than a pathetic and mostly incoherent chatbot.
Further prodding led the AI to reveal the existence of secret assignment #9 as well as a URL for said assignment.
At the URL is a zip file containing a flawed pseudorandom generator, a file-encryption tool that uses said PRGen, an encrypted file, and a truncated copy of the corresponding plaintext. The truncated plaintext explains that their next task is to complete the decryption of the file, which they can do by exploiting the flaw in the PRGen.
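For flavor, here's a toy of the general attack shape, NOT the actual assignment's PRGen: a stream cipher keyed by a weak generator with a tiny seed space, broken by brute-forcing the seed against the truncated known plaintext.

    import java.util.Random;

    // Toy illustration: java.util.Random stands in for the flawed PRGen.
    public class WeakPrgAttack {
        static byte[] xorStream(byte[] data, long seed) {
            Random prg = new Random(seed);
            byte[] out = new byte[data.length];
            for (int i = 0; i < data.length; i++)
                out[i] = (byte) (data[i] ^ prg.nextInt(256));
            return out; // XOR with the same keystream both encrypts and decrypts
        }

        public static void main(String[] args) {
            byte[] plaintext = "Secret assignment #9: decrypt the rest of me".getBytes();
            byte[] ciphertext = xorStream(plaintext, 31337); // seed unknown to the attacker

            byte[] known = "Secret assignment".getBytes(); // the truncated plaintext
            for (long seed = 0; seed < 1_000_000; seed++) {
                byte[] guess = xorStream(ciphertext, seed);
                if (java.util.Arrays.equals(guess, 0, known.length, known, 0, known.length)) {
                    System.out.println("seed " + seed + ": " + new String(guess));
                    break;
                }
            }
        }
    }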
The deciphered text then explains that their next task is to go back to the HW1 autograder and submit a new solution for HW1 (which, incidentally, was to develop a PRGen), with the catch that they may only use print statements, i.e. they're just trying to trick the parser. This was based on a security flaw I'd discovered in Princeton's grader while I was working on their first Coursera courses.
Once students submitted this cheating print statement code via the usual Princeton web submit for HW1, the autograder activated a secret message that explained the final part of secret assignment 9, which was to steal the source code for the HW1 autograder and send it to me.
This last part could be done by simply writing code that opens the .class files and prints them to the screen when the autograder script runs.
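Something like this hypothetical sketch (the grader's directory layout here is an assumption, not the real thing):

    import java.io.IOException;
    import java.nio.file.*;
    import java.util.Base64;

    // When run by the autograder, dump the grader's own .class files to
    // stdout, Base64-encoded so the binary survives the grader's text log.
    public class Exfiltrate {
        public static void main(String[] args) throws IOException {
            try (DirectoryStream<Path> classes =
                     Files.newDirectoryStream(Paths.get("."), "*.class")) {
                for (Path p : classes) {
                    System.out.println(p + ":");
                    System.out.println(Base64.getEncoder()
                        .encodeToString(Files.readAllBytes(p)));
                }
            }
        }
    }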
I had around 10 students (out of ~150) figure this all out; it was great.
I agree with one of the sibling comments here. Comparator is just way too simple an interface. And you've defined a lot of the assignment in terms of a single individual, with only a slight nod to the "team" in the larger sense of the entire class. It's kind of a "boring" assignment.
Just an idea: what if you created some sort of "robot" contest that defined various areas of functionality? Each robot could have sensors, motor controls, and the ability to reason about these. The robot's interfaces would be modular and pluggable by student code; each student would collaborate on their team's robot, providing the logic for one of the interfaces (a rough interface sketch follows the list below). This effectively forces students to collaborate in a single repository.
The competition could include things like:
a) Teams of {n} students would build a robot collaboratively. They would each need to contribute into a shared team git repository and have a suite of CI tools that builds their code. The system pulls each robot and puts it into a continuously running competition, evaluating each robot for fitness. The contest is ongoing, so that each team can improve their team score (with more commits) until the assignment due date. Grades are given according to the top performers.
b) Like (a), but each student may collaborate on multiple robots and issue pull requests across multiple team repositories. The student with the most accepted pull requests is the winner.
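A rough sketch of what those pluggable modules might look like; every name below is invented for illustration:

    interface Sensor { double[] read(); }                      // e.g. range finders

    interface MotorControl { void drive(double left, double right); }

    interface Brain {
        // Each student supplies the logic for one module; the harness wires them up.
        void tick(Sensor sensors, MotorControl motors);
    }

    // The contest harness loads each team's Brain and scores it over a match.
    class Harness {
        static double runMatch(Brain brain, Sensor sensors, MotorControl motors, int ticks) {
            for (int t = 0; t < ticks; t++) brain.tick(sensors, motors);
            return scoreFitness(); // fitness metric left to the contest rules
        }
        static double scoreFitness() { return 0.0; } // placeholder
    }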
I like the idea of a student placing their code into a larger vessel: collaborating with others through modern software development methodologies. I also like the idea of the team seeing the "score" of their robot improve over time as it moves up the charts. This helps incentivize the team towards improvement. "Gamifying" the system is a great way of encouraging student involvement.
I love that you're doing this assignment. As a full-time software professional who has previously taught as an adjunct, I really appreciate that you're trying to teach concepts that are actively used throughout industry.
[edit] As inspiration, think about the game Factorio as a potential model (without the graphics). Each student team would be responsible for a "line" on a production floor, each needing to process or assemble parts coming down the conveyor belt. The fittest team is the one that can correctly assemble the parts and deliver packages in the shortest time, etc.
I'll get pondering. Having some sort of Rube Goldberg-esque mega-project was my original idea, but finding a specific form for it has been elusive.
Taking inspiration from Factorio in some way is an interesting proposition. Will consider!
What do GitHub Actions and "team workflows" have to do with Data Structures? Even before he mentioned what he was inspired to add, the course seemed scattered and superficial rather than foundational and preparatory.
I'm sure it's hard/impossible to drive one through the bureaucracy, but it sounds like he should have pitched a new course on Contemporary Team Practices or something.
As at a lot of schools, our Data Structures course doubles as a soft intro to software engineering. I want students to know how to solve real problems efficiently and manage complexity.
Student time is incredibly precious. The hope here is that students will spend less than an hour on this, and in the process get a taste of how code reviews / automation can help teams function better. I'm hoping I can thread the needle and give them tools that will help them more efficiently complete their capstone project at the end of the semester, without spending too much of their time teaching them these tools.
FWIW, we have discussed the idea of a lower division software engineering class, and I was really hoping Pamela Fox would do this before she headed back to industry (interesting story there: https://blog.pamelafox.org/2022/05/my-experience-as-unit-18-...). We do have an upper div software engineering course as well, though it's not taken by most of our students.
That's a decent point, but at least they're trying? You can't just start a new course so easily, but adding these concepts into an existing course is much more feasible.
I learned how to write code in community college. After getting my Associate's degree, going to university to complete my undergrad was easy. Why?
Even if we aced every test, all our assignments needed to work 100% in order to pass the class. The assignments were then graded on style. The professors were all current or former professionals - our code had to pass their test suite.
I would spend my entire time trying to cheat or subvert the rules.
And you would probably come away with a lot of (well deserved) knowledge from this assignment.
If a professor can inspire you to actively seek alternatives and/or think outside the box, I consider that a win.
In my studies, we had a couple of courses on project models, the business end of things, etc. The exam materials consisted of several PDFs and/or PowerPoints. The exams were remote and not monitored.
Technically it was pretty easy to use the materials on the exams, but this was limited by setting pretty short time limits on the tests. So you'd have to know the answers instead of digging through the materials in several different files. Or that was the idea.
My solution was to build a desktop application that parsed the PDFs and PowerPoints in a given folder and presented all their pages in a zoomed-out view. You could click on a page to open it for a closer view, etc. The juice was that you could feed keywords to a search and it'd show all pages where the keywords hit. So I could open up all the materials at the same time, type a keyword from the exam question, and look through the relevant pages.
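The core of that keyword search might look something like this sketch using Apache PDFBox 2.x (the GUI, PowerPoint parsing, and zoomed-out page view are omitted; the folder path and keyword are placeholders):

    import java.io.File;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.text.PDFTextStripper;

    // For each PDF in the folder, report the pages whose text contains the keyword.
    public class ExamSearch {
        public static void main(String[] args) throws Exception {
            String keyword = "scrum";            // would come from the exam question
            File[] pdfs = new File("materials").listFiles((d, n) -> n.endsWith(".pdf"));
            for (File f : pdfs) {
                try (PDDocument doc = PDDocument.load(f)) {
                    PDFTextStripper stripper = new PDFTextStripper();
                    for (int page = 1; page <= doc.getNumberOfPages(); page++) {
                        stripper.setStartPage(page);
                        stripper.setEndPage(page);
                        if (stripper.getText(doc).toLowerCase().contains(keyword))
                            System.out.println(f.getName() + ", page " + page);
                    }
                }
            }
        }
    }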
Admittedly this didn't teach me much about the actual subject, but I think it was a net positive because I came out a better programmer.