Ask HN: What algorithms should I research to code a conference scheduling app
I'm interested in writing a utility to assist with scheduling un-conferences. Lets take the following situation for an example:
* 4 conference rooms across 4 time slots, for a total of 16 talks.
* 30 proposed talks
* 60 total participants
Each user would be given 4(?)votes, un-ranked. (collection of the votes is a separate topic) Voting is not secret, and we don't need mathematically precise results. The goal is just to minimize conflicts.
The algorithm would have the following data to work with:
* List of talks with the following properties:
* presenter participant ID
* the participant ID for each user that voted for the talk
I'd like to come up with an algorithm that does the following:* fills all time slots with the highest voted topics
* attempts to avoid overlapping votes for any particular given user in a given time slot
* attempt to not schedule a presenter's talk during a talk they are interested in.
* Sugar on top: implement ranked preferences
My question: where do I start to research the algorithms that will be helpful? I know this is a huge project, but I have a year to work on it. I'm also not overly concerned with performance, but would like to keep it from being exponential.
Thank you for any references you can provide! Your problem is a scheduling problem. It has been well studied in operations research [1] / mathematical optimization [2]. Basically, you formulate your problem as a integer programming model [3] and use a solver [4] to solve it. You should check PuLP [5]. You can also ask your question at OR Exchange [6]. [1] https://en.wikipedia.org/wiki/Operations_research [2] https://en.wikipedia.org/wiki/Mathematical_optimization [3] https://en.wikipedia.org/wiki/Integer_programming [4] https://en.wikipedia.org/wiki/List_of_optimization_software After this, do a bit of research on goal programming or other approaches to multiobjective optimization. It could really help and depending on the approach, it's not that hard. It usually just adds a bunch of constraints or some precompuations to an already existing set of in/equalities for your model. ALLOCATING CONFERENCE DELEGATES TO WORKSHOPS: A SPECIAL TIMETABLING PROBLEM --
"The problem of assigning delegates to workshops at a conference can be formulated as a timetabling problem. Such an assignment must take into account the preferences of the delegates, as well as the number of participants each workshop can accommodate. This paper will report on a heuristic solution technique for a special case of such a problem." Thank you for the detailed response! Resource scheduling, CSP (Constraint Satisfaction programming) CSP: https://en.wikipedia.org/wiki/Constraint_satisfaction_proble... Scheduling (production processes): https://en.wikipedia.org/wiki/Scheduling_(production_process... Scheduling (computing): https://en.wikipedia.org/wiki/Scheduling_(computing) ... To an OS, a process thread has a priority and sometimes a CPU affinity. Thanks for the information! CSP definitely sounds like something to look into. What everyone else said, but if you want a "quick and dirty" way of doing it, you can just use a simple genetic algorithm: - Start with a couple (say 100) completely random schedules - Measure how good they are via a fitness function (say, start at 0 and everytime a constraint is violated add 1 point. You can be fancy and add different amounts depending on the type of constraint violated) - Sort the schedules from lowest to highest - Pick the top 20 or so - Generate other 80 schedules by mutating the 20 you selected before, randomly shuffling the talks, and possibly mixing up different schedules together (crossover). - Repeat for a couple thousands of generations, the population should quickly evolve towards better schedules - You might get stuck on local minima, but for the most part you can just run it again several times from the start and it eventually won't get stuck - Not guaranteed to give optimum results but should give decent enough results I was going to post exactly this. I always solve scheduling problems with GAs and get great results. It's easy to do and does not require any domain knowledge. It would probably take just a few hours to solve this using a GA, maybe even less. Search for "timetabling". Googling for "scheduling" isn't very helpful because it can mean so many different things, but "timetabling" will give you lots of info on a problem very similar to yours. Here's my idea to throw into the ring: You have a graph of 30 nodes. When a user votes for talks A,B,C,D, you add edges for all those 6 pairs with weight 1, or increment the weight if that edge already exists. Find the longest path. This is your track in Room 1. Remove those nodes from the graph.
Find the longest path. This is your track in Room 2. Remove those nodes from the graph. Repeat two more times. The problem you want to avoid is: although talks A and B are popular, they don't share the same audience members. You want to identify cohorts, or cliques, or whatever you want to call it, and keep them in the same room. Call it a "Track" and give it a name. This. I worked with a company that used this method to generate timetables for schools with way more variables and constraints. They often found working plans when other solutions did not. Maybe a little overengineered for your problem though I don't know of the formal way of solving such problems, but I have worked with them in the past. Since the numbers are not large, I would suggest just brute force every possible combination and give each one a score based on your needs. You can then just select the higher ranked ones. An example of assigning scores for this problem could be:
Select any 16 talks and assign them randomly to the 4x4 time slots, and assign the 60 viewers to these as well. Now calculate the score as:
+1 per vote these talks received per viewer that gets to attend them
-10 for every overlapping vote
-20 for a presenter not being able to attend a talk they are interested in Sum up these numbers to get a score for this combination. The actual numbers would depend on how important each criteria is to you. Calculate the scores for all combinations and pick the highest ranked ones. You can then be certain that the solutions you pick are mathematically the best ones. Implementing ranked preferences is also very easy this way. If the numbers could get large and performance is an issue, then a genetic algorithm would be a good bet. It's like a better/more optimised way of moving through the solution space than plain brute forcing. You essentially select 50 good combinations, calculate scores for each and then eliminate the weakest ones while keeping the best ones and merge their good parts to create a new generation of solutions. And you do this till the solution converges. There are libraries in most programming languages that do much of this work for you. Unless I am completely misunderstanding the problem statement, this is not a big project - the algorithm itself could be done in less than a weekend - maybe a few hours even. But yes, brute forcing or using genetic algorithms may not be the best way of solving this problem because of performance concerns. What you need is schedule optimization tool - https://developers.google.com/optimization/ Someone might have already done this before. The apps below are mostly for reserving conference rooms, but there might be one that handles voting and constraint satisfaction. Check out Constraint Optimisation algorithms. Try genetic algorithm. More detail in programming collective intelligence ( it covers group travel).
For constraint optimisation i think drools has a product. I cant find a link but such scheduling problems should be converted into color coding problem.
Look into google scholar for group scheduling Drag and drop. Visualization. Scheduling conferences is as much an optimization problem as it is a political problem. Check this out:
http://sci-hub.ac/10.1007/s10732-007-9066-7 Thanks. Skimmed through it and it definitely seems like it could be some very useful information. I'll be more thoroughly reading it soon. Check HasGeek, they have open source platform for this. hasgeek.com Prolog This looks like a special case of the knapsack problem https://stackoverflow.com/q/7774769/447514