Building a bar scheduler for our Hockey club — getting deeper


Bavo

Two weeks ago I ventured into scheduling problems using Timefold and Claude, creating a Claude.md starter that allows me to leverage Timefold without any coding.


check your reality, ai-nception coming in

This starter is:

  • a clone of Timefold.ai code samples for typical scheduling problems
  • a render of relevant Timefold.ai docs in markdown format
  • a Claude.md that maps to these documents so Claude can find them locally when needed, and explains how to create basic Timefold apps

This was verified by creating a proof-of-concept for a trainer scheduler in our field hockey club. Today I needed to solve a real-world problem:
how to assign teams to bar duty 🗓️

The problem domain

That is something we do every season, and of course we try to do it fairly, spreading the volunteering work across the teams. It also attracts a lot of feedback from the teams, both positive and negative, as fairness is a relative concept. We used to do this manually. That takes a lot of trial and error and results in a schedule that is hard to explain, as it is based on intuition rather than raw numbers. And it is really annoying when teams reschedule matches… which they all seem to start doing as soon as you send out a final plan.


a manual plan

The approach was to start from the ground up:

  • discuss why we want to open the bar (we want people to have a good time at our club, and we need the income)
  • discuss what members see as a fair distribution
  • discuss how many people are needed per shift and who they prefer to work with
  • discuss which times they find most convenient for bar duty

As soon as we had that, we could describe it as constraints for the solver to use in its model:

Bar Planning Rules

  1. Playing teams only (strict rule)
    A team can only be assigned to a bar shift during a time slot in which it actually has a match. No team gets assigned to a slot that has nothing to do with them.
  2. Fair share based on team size (preference)
    Bigger teams have more family members, so they take on more bar shifts. We try to keep the number of bar shifts per family roughly equal across all teams: a team with 16 players should do about twice as many shifts as a team with 8 players.
  3. Spread shifts over time (preference)
    If a team has multiple bar shifts to cover, we try to space them out across the season rather than bunching them all together. Two bar shifts for the same team within 4 weeks of each other are considered too close.
  4. Fair share of opening and closing shifts (preference)
    Opening shifts (first shift of the day, extra setup work) and closing shifts (last shift of the day, cleanup) are heavier than regular shifts. We try to spread these across all teams so no single team always gets stuck with the early or late slot.
  5. Mix teams within a slot for small teams (preference)
    For small teams (quarter-field or smaller), we prefer to assign two different teams to the two helper spots in the same slot. This avoids relying too heavily on a single small team’s small pool of parents.
  6. Keep the same team within a slot for large teams (preference)
    For large teams (half-field or bigger), we prefer to assign both helper spots to the same team. Larger teams have enough families to fill both spots, and having a single team contact per slot makes coordination easier.
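
Rules 2 and 3 are the easiest ones to put into numbers. The block below is a minimal plain-Java sketch of that arithmetic, not the Timefold model used in the app. The class name `FairShareSketch`, its method names, and the team names in the usage note are hypothetical. It assumes the target is simply proportional to player count, and that "within 4 weeks" means fewer than 28 days apart.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only (not part of the real app).
final class FairShareSketch {

    /** Rule 2: target number of shifts per team, proportional to player count. */
    static Map<String, Double> targetShifts(Map<String, Integer> playersPerTeam, int totalShifts) {
        int totalPlayers = playersPerTeam.values().stream().mapToInt(Integer::intValue).sum();
        Map<String, Double> targets = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : playersPerTeam.entrySet()) {
            // Each team's share of the shifts equals its share of the players.
            targets.put(entry.getKey(), totalShifts * (double) entry.getValue() / totalPlayers);
        }
        return targets;
    }

    /** Rule 3: two shifts of the same team are "too close" when fewer than 28 days apart. */
    static boolean tooClose(LocalDate first, LocalDate second) {
        return Math.abs(ChronoUnit.DAYS.between(first, second)) < 28;
    }
}
```

With 12 shifts to divide between a team of 8 and a team of 16, `targetShifts` gives 4 and 8 shifts: the larger team does twice as many, as rule 2 asks. Real targets will rarely be whole numbers, which is one reason fairness ends up as a preference rather than a strict rule.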

With these requirements agreed upon we can go and create a model!

Spec driven development

As we use Claude Code, but only have a simple Pro account, we often get halted dead in our tracks when going full steam ahead. To avoid that, we need to avoid long debugging cycles, as well as verbose chats to get to a plan.

Spec-driven development is a thing now. It requires that you hand a full specification to the agent so it can do its work through to completion. You need to be very detailed, thinking like a functional analyst. The interesting thing is that this does not require implementation details, and thus it doesn’t require our Timefold starter template for Claude.

We can create specs at higher level outside that scope.

Why then not let Google Gemini have a go at that, sparing our precious Claude credits?

Creating a spec in Gemini

I asked Gemini (Pro) chat to be my analyst and ask questions until we had a full spec. Gemini brainstormed with me on how to express the rules, how to fine-tune their wording, and what the user experience should be like. How do we input data, how do we export? How do we visualize? How do we define shifts for a match schedule?


This spec.md was then added to our git clone of the starter, and we got Claude going on it.

It delivered an app that worked, as it should: the spec even described the tests to use and the sample data to ingest for verification.


tadaaa!

While it worked great for ingestion and did what we needed it to, the UI was a hot mess. That’s basically our fault: we didn’t specify how to build it at all. Somehow I was hoping it would do the same as for the previous planner app, which had a nice React-based GUI using Tailwind, with many pages. It actually chose to go for vanilla JS in a single page! Clumsy to use.

One gets used to this wizardry… It was a working app from a single command, and it did provide a solution. But we expect more these days. Life is hard for AI agents.

Solving bugs

What followed — sadly — was quite a few hours wasted on getting the UI right. Adding pages, making sure timestamps are in correct format, resizing things, adding calendar and calendar navigation, finding things hidden in invisible divs etc. This still takes time, a lot of it. Back and forth, trial and error.

All this is expected. What I didn’t see coming was how hard it was to get the Timefold model just right. It looked right to me… until I handed the rendered Excel over for review and got it sent back immediately. Parents were spread apart, while we asked to keep large teams together. To solve that little thing (it’s always just a little thing) we actually found multiple issues in the constraint and scoring system:

  • the keep-team-together constraint never fired due to an equals/hashCode issue (we forgot to implement them; if they are missing, don’t use Joiners.equal!)
  • the fairness constraint was multiplied by 1000 to keep precision, and as a result outweighed all others by far
  • the spread-apart constraint was 10 times heavier than the others, even though it is only a minor rule, because every day short of the 28-day gap counted, creating a high penalty each time it was triggered
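
The first bug is easy to reproduce outside the solver. A join on equality can only match two objects that Java itself considers equal, so a class without `equals`/`hashCode` matches only when both sides are the very same instance. The sketch below is plain Java with hypothetical class names (`PlainTeam`, `ValueTeam`) and makes no claim about what the real domain classes look like.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Objects;

// Hypothetical classes, for illustration only.
final class EqualsSketch {

    /** No equals/hashCode: two instances with the same name are still different objects. */
    static final class PlainTeam {
        final String name;
        PlainTeam(String name) { this.name = name; }
    }

    /** A record gets value-based equals/hashCode generated by the compiler. */
    record ValueTeam(String name) {}

    /** The comparison an equality-based join boils down to. */
    static boolean sameTeam(Object left, Object right) {
        return Objects.equals(left, right);
    }

    /** Number of distinct teams according to equals/hashCode. */
    static int distinctCount(List<?> teams) {
        return new HashSet<Object>(teams).size();
    }
}
```

Two `PlainTeam("U14")` objects loaded separately never compare equal, so a rule keyed on "same team" silently never fires. Two `ValueTeam("U14")` records do compare equal. A unit test that builds the two helper spots from separately created team objects would have caught this.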


getting insight into the scoring

We did have unit tests, but did we have enough? Surely not, as this equals bug should never have slipped through. A knowledgeable friend at Timefold even told me: make sure to have them, and lots of them. He was right, and we won’t make that mistake again.

Finetuning is the real work

So that was the real lesson learned: it takes expertise to fine-tune a model. It also needs good data to verify the result, reweigh, and check again until we are all happy. Putting numbers to things is the hard part. Not all things are equal, and not all can be equally compared. How does one compare fairness in load to being together with parents of the same team?
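
To make the weighting problem concrete, here is a small plain-Java sketch of how one rule can quietly drown out the others. The formulas are assumptions for illustration (one point per day short of the 28-day gap, and a fairness deviation scaled by 1000); the class and method names are hypothetical and the real constraint code may differ.

```java
// Hypothetical penalty formulas, for illustration only.
final class WeightSketch {
    static final int MIN_GAP_DAYS = 28;

    /** Per-day variant: every day short of the 28-day gap costs one point. */
    static int perDayPenalty(long gapDays) {
        return (int) Math.max(0, MIN_GAP_DAYS - gapDays);
    }

    /** Flat variant: one point per pair of shifts that is too close, however close. */
    static int flatPenalty(long gapDays) {
        return gapDays < MIN_GAP_DAYS ? 1 : 0;
    }

    /** Fairness deviation scaled by 1000 to keep decimals in an integer score. */
    static int scaledFairness(double actualShifts, double targetShifts) {
        return (int) Math.round(Math.abs(actualShifts - targetShifts) * 1000);
    }
}
```

Under these assumptions, two shifts one week apart cost 21 points in the per-day variant but 1 point in the flat variant, and being half a shift off target costs 500 points. If a "keep the team together" violation costs 1 point, the solver will happily split parents to shave a fraction off the other two.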

Now it helps if you visualize the system. A lot. We discovered that our ruleset could never be fully fair, as our constraints did not allow it by design. If the bar must be staffed by a team that is present, teams playing on a full field always have to attend. Fair or not.


this doesn’t look fair, but life isn’t either
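
The built-in unfairness can be checked before the solver even runs. The sketch below counts, per team, the slots in which it is the only team playing. It assumes that a full-field match leaves no room for other teams in that slot; the class name and the team names in the usage note are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical pre-check, for illustration only.
final class ForcedLoadSketch {

    /** Counts, per team, the slots in which it is the only team playing. */
    static Map<String, Integer> forcedSlots(List<List<String>> teamsPlayingPerSlot) {
        Map<String, Integer> forced = new TreeMap<>();
        for (List<String> playing : teamsPlayingPerSlot) {
            if (playing.size() == 1) {
                // Rule 1 leaves no choice: this team has to staff the bar.
                forced.merge(playing.get(0), 1, Integer::sum);
            }
        }
        return forced;
    }
}
```

For three slots where "Seniors" play alone in the first and third, and three small teams share the second, the result is `{Seniors=2}`. Those two shifts are fixed by the strict rule, whatever the fairness weights say.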

The final product

After tuning, we did get it right enough. The great thing about this process is that it explains itself clearly. We can explain why some teams have a heavier load than others. One reason is having fewer members while hogging a full field: then you will have to show up more.


a plan that works

Job done! If you want to create schedules as well, take a look at the Timefold Claude starter.

This shows another shift in how we do development. We used AI (Gemini) to create the specs for an AI agent to implement it (Claude Code) so an AI planner (Timefold) can execute it. We are now 3 levels deep. That’s becoming ai-inception?

We have many more scheduling problems to solve in our club. Looking forward to those now 🏗️