So you want to write some challenges and have decent reviews...

Background

In TAMUctf 2020, we made some fairly major errors. Most notably, we had a challenge whose distributed binary differed from the one actually deployed for the challenge.

This turned into a very embarrassing kerfuffle which, combined with our team's inexperience, made us look very unprofessional. Which we were!

Additionally, we found that a few challenges had unintended solutions which were far simpler than the intended ones.

So let's make our process a little more rigorous and prevent any similar tomfoolery from happening again.

Team-splitting, or: CTF development as a CTF

Ultimately, the approach we've decided to implement this year is about ensuring that challenges are solvable and consistent; that is, we ensure every challenge always has a working solution, and that the solution can be found by competitors (or, more specifically, by us).

The approach is simple: we have more than 4 people on the team, so we divide ourselves in half: a "red" team and a "blue" team, each of which creates challenges independently. Each team has its own private organisation on GitHub, and together they share an organisation which we'll come back to later.

Setting up

The first step is to create the repository that will contain all the challenges. This is initialised in the shared organisation with a README, a .gitignore, and folders for distinguishing challenge categories. Each team leader then clones this repository and changes the remotes so that each team's private repository starts from the same history without being a GitHub fork (something I lovingly call "blind forking"). Because forked projects inherit permissions on GitHub, a real fork would let the owner of the shared organisation see the private changes the other team makes in its own repository; blind forking avoids this.
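As a rough sketch, blind forking might look like the following, assuming hypothetical organisation and repository names and that the default branch is master:

    # Clone the shared repository...
    git clone git@github.com:shared-org/challenges.git
    cd challenges

    # ...then point "origin" at a fresh, empty repository in the team's own
    # private organisation instead of forking on GitHub, keeping the shared
    # repository as a second remote. No fork means no inherited permissions.
    git remote rename origin shared
    git remote add origin git@github.com:blue-team/challenges.git
    git push -u origin master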

Using it

Below is a swimlane diagram which demonstrates the process by which this approach is used. Dark green items represent code updates to the repository, light yellow items represent issue updates, and orange indicates interaction with the challenge as hosted by the developing team. As a reminder, anything in the blue swimlane is not visible to the red team and vice versa; items in the shared swimlane can be seen by both.

[Diagram of the development process]

The swimlanes can also be flipped so that the red team develops challenges and the blue team reviews them.

As a textual description:

  1. The blue team creates a skeleton for their challenge and pushes it to both their private repository and the shared repository (the git mechanics are sketched after this list).
  2. The blue team begins development of the challenge and raises a GitHub issue in the shared repository with the challenge description as it will be presented to competitors.
  3. Once the relevant sources have been pushed, the blue team hosts the challenge (by whatever means appropriate) and updates the issue created in step 2 with the details for connecting to it.
  4. The red team merges the skeleton into their private repository and begins developing a solution.
  5. Once the red team successfully solves the challenge and acquires the flag, they write their solution, push it to their private repository, then push it to the shared repository.
  6. The blue team verifies the solution. If it is an unintended solution, the blue team returns to step 3 to patch it out.
  7. The blue team merges the solution with the challenge sources in their private repository, then pushes that to the shared repository and turns off the hosting.
  8. The issue is closed as completed; the challenge is complete and has been sufficiently reviewed.
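For reference, the remote juggling in steps 1 and 5 might look roughly like this, reusing the remote names from the setup above ("origin" for a team's private repository, "shared" for the shared one) and a hypothetical branch name:

    # Blue team, step 1: push the challenge skeleton to both remotes.
    git checkout -b pwn/example-challenge
    git push origin pwn/example-challenge
    git push shared pwn/example-challenge

    # Red team, steps 4-5: pull the skeleton from the shared repository,
    # commit a solution, then push it to their private repo and the shared repo.
    git fetch shared
    git checkout -b pwn/example-challenge shared/pwn/example-challenge
    # ...write and commit the solve script...
    git push origin pwn/example-challenge
    git push shared pwn/example-challenge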

Some Caveats

  1. It's not the most straightforward approach if you're doing this entirely with git, but it prevents you from needing to maintain additional infrastructure!
  2. It's unclear who should develop automated tests, if your team writes them (though I personally suggest the developing team).
  3. Hosting during review may not perfectly match how the challenge is hosted during the competition. To mitigate this, TAMUctf packages everything in Docker, though this can also be limiting (a rough sketch follows this list).
  4. This is definitely oriented towards teams of 4 or greater. If you're a one-man team, you can't exactly not know how to solve your own challenges, and if there are only two of you, you might not each have the skillset to handle all the challenges the other "team" (person) throws at you.
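To illustrate caveat 3, a Docker-packaged challenge might be hosted for review with something like the following; the image name, port, and path are purely illustrative:

    # Build the challenge image from its directory and run it for review hosting.
    docker build -t blue-team/example-challenge ./pwn/example-challenge
    docker run -d --rm -p 4242:4242 blue-team/example-challenge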

Ultimately, it's up to your team to develop a system that works for you, but this year we're gonna trial this out and see whether it improves our challenge development cycle. I'll update this as we go for future reference.