Secure Code Warrior

Introducing Missions: The next phase of developer-centric security training

Since 2015, we've been engaging developers all over the world with a proactive, positive approach to security, helping them build the skills to secure their code, cut down on rework and remediation, and hopefully see the security team as something other than the fun police.

We're still committed to standing side-by-side with developers as they secure code across the galaxy, but it's time to shake things up and take our battle-hardened, security-aware developers to the next level.

We're thrilled to announce a brand new feature release on the Secure Code Warrior platform: Missions. This all-new challenge category is the next phase in developer-ified security training, moving users from recalling security knowledge to applying it in a real-world simulation environment. This scaffolded, microlearning approach builds potent secure coding skills that are job-relevant, and much more entertaining than (vertically) watching endless training videos in the background of a workday.

Our first playable, public Mission is a simulation of the GitHub Unicode breach. It might seem deceptively simple, but it's a really clever vulnerability that is fun to dissect. Security researcher 0xsha did a comprehensive case study on how this same bug can be used to exploit Django by way of case transformations, while also revealing how the vulnerability behavior can change between programming languages. There's a lot more to discover about this security issue, and our mission is a great place to start.

GitHub's head-on (case mapping) collision

In a blog post from November 28, 2019, security research group Wisdom reported on a security bug they discovered on GitHub. They outlined how they were able to utilize a Case Mapping Collision in Unicode to trigger a password reset email delivery to the wrong email address (or if we were thinking like an attacker, an email address of the threat actor's choosing).

While a security vulnerability is never good news, security researchers who rock a whitehat do provide some mercy -- not to mention the opportunity to avert disaster -- if they discover potentially exploitable errors in a codebase. Their blogs and reports often make for great reading, and it's kind of cool to learn about a new vulnerability and how it works.

To move to the next level of secure coding prowess, it is immensely powerful not just to find common vulnerabilities (especially any cool new ones - we all know that malicious threat actors will be looking for fertile ground to dig up some data with these new techniques), but also to have a safe, hands-on environment in which to understand how they are exploited.

So, let's do just that. Keep reading to discover how a Case Mapping Collision in Unicode can be exploited, how it looks in real-time, and how you can take on the mindset of a security researcher and try it out for yourself.

Unicode: Complex, endlessly customizable, and more than just emojis

"Unicode" may not be in the lexicon of the average person, but chances are good that most people use it in some form every day. If you've used a web browser, any Microsoft software, or sent an emoji, then you've been up close and personal with Unicode. It's a standard for consistent encoding and handling of text from most of the world's writing systems, ensuring that everybody can (digitally) express themselves using a single character set. As it stands, there are over 143,000 characters, so you're covered whether you're using the Icelandic þ, the Turkish dotless ı, or anything in between.

Due to the sheer volume of characters in the Unicode set, a way of converting characters to another "equivalent" character is needed in many cases. For instance, it seems sensible that if you convert a Unicode string containing a dotless "ı" to ASCII, it should simply turn into an "i", right?
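Case conversion is one place where this kind of "equivalence" kicks in. The sketch below is a minimal illustration using only JavaScript's built-in `toUpperCase` and `toLowerCase`; the variable names are ours, chosen for the example.

```javascript
// Characters written as escapes so the code points are unambiguous.
const dotlessI = '\u0131'; // LATIN SMALL LETTER DOTLESS I
const longS = '\u017F';    // LATIN SMALL LETTER LONG S
const kelvin = '\u212A';   // KELVIN SIGN

// Each of these non-ASCII characters case-maps onto a plain ASCII letter.
const collisions = {
  dotlessIUpper: dotlessI.toUpperCase(), // 'I'
  longSUpper: longS.toUpperCase(),       // 'S'
  kelvinLower: kelvin.toLowerCase(),     // 'k'
};

console.log(collisions);
```

Two different inputs that end up as the same string after case conversion are what make a "collision" possible.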

With a great volume of character encoding comes great potential for disaster

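A case mapping collision in Unicode is a business logic flaw that, at its core, can lead to the takeover of accounts not protected by 2FA. To illustrate the vulnerability in question, let's look at an example of this bug in a real code snippet. (The route path and the request field holding the email did not survive in the snippet, so the ones shown here are placeholders.)

app.post('/reset-password', function (req, res) {  // placeholder route path
  var email = req.body.email;                        // assumed: email arrives in the request body
  db.get('SELECT rowid AS id, email FROM users WHERE email = ?', [email.toUpperCase()],
      (err, user) => {
          if (err) {
              // error handling omitted
          } else {
              generateTemporaryPassword((tempPassword) => {
                  accountRepository.resetPassword(user.id, tempPassword, () => {
                      // Flaw: mails the user-supplied address, not user.email
                      messenger.sendPasswordResetEmail(email, tempPassword);
                  });
              });
          }
      });
});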

The logic goes something like this:

  1. It accepts the user-provided email address, and uppercases it for consistency
  2. It checks if the email address already exists in the database
  3. If it does, then it will set a new temporary password (this isn't best practice, by the way. Instead, use a link with a token that enables a password reset)
  4. It then sends an email to the address fetched in step 1, containing the temporary password (this is very poor practice, for so many reasons. Yikes.)

Let's see what happens with the example provided in the original blog post, where a user requests a password reset for the email John@Gıthub.com (note the Turkish dotless ı):

  1. The logic converts John@Gıthub.com to JOHN@GITHUB.COM
  2. It looks that up in the database and finds the user JOHN@GITHUB.COM
  3. It generates a new password and sends it to John@Gıthub.com

Note that this process ends up sending the highly sensitive email to the wrong email address. Oops!
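The walkthrough above can be reproduced in a few lines. This is a minimal sketch, not the original application: the `users` array is a hypothetical in-memory stand-in for the database, and `vulnerableReset` is an illustrative name that returns the delivery address instead of sending mail.

```javascript
// Hypothetical in-memory stand-in for the users table (emails stored uppercased).
const users = [{ id: 1, email: 'JOHN@GITHUB.COM' }];

// Mirrors the vulnerable flow: look up by the uppercased input,
// then "send" to the address the requester typed.
function vulnerableReset(requestedEmail) {
  const user = users.find((u) => u.email === requestedEmail.toUpperCase());
  if (!user) return null;
  return { accountId: user.id, sentTo: requestedEmail };
}

const attackerEmail = 'John@G\u0131thub.com'; // dotless ı in place of i
const result = vulnerableReset(attackerEmail);

console.log(result.accountId);                 // 1 (John's account)
console.log(result.sentTo === users[0].email); // false: the mail goes to the look-alike address
```

The lookup matches John's account, yet the delivery address is still the attacker's input.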

How to cast out this Unicode demon

The interesting aspect of this specific vulnerability is that two factors combine to make the code exploitable:

  1. The actual Unicode case-mapping behavior
  2. The logic determining which email address to use, i.e. the user-provided email address instead of the one that already exists in the database.

In theory, you can fix this specific issue in two ways, as identified in the blog post from Wisdom:

  1. Convert the email to ASCII with Punycode conversion
  2. Use the email address from the database, rather than the one provided by the user

When it comes to hardening software, it's a great idea to leave nothing to chance and put as many layers of defense in place as possible. For all we know, there may be other ways to exploit this encoding - we're just not aware of them yet. Anything you can do to decrease risk and close windows that may be left open for an attacker is valuable.
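Two layers can be combined in one place, as in the rough sketch below. It rests on some assumptions: `accounts` is a hypothetical in-memory stand-in for the database, `saferReset` and `isPrintableAscii` are illustrative names, and the first layer swaps in a simple rejection of non-ASCII input rather than the Punycode conversion suggested in the original write-up.

```javascript
// Hypothetical in-memory stand-in for the users table.
const accounts = [{ id: 1, email: 'JOHN@GITHUB.COM' }];

// Layer 1: refuse any address containing characters outside printable ASCII.
// (A simple rejection rule, used here instead of Punycode conversion.)
function isPrintableAscii(value) {
  return /^[\x21-\x7E]+$/.test(value);
}

// Layer 2: deliver to the address stored for the account, never the raw input.
function saferReset(requestedEmail) {
  if (!isPrintableAscii(requestedEmail)) return null;
  const account = accounts.find((a) => a.email === requestedEmail.toUpperCase());
  if (!account) return null;
  return { accountId: account.id, sentTo: account.email };
}

console.log(saferReset('John@G\u0131thub.com')); // null: rejected at layer 1
console.log(saferReset('john@github.com'));      // { accountId: 1, sentTo: 'JOHN@GITHUB.COM' }
```

Even if the first check were bypassed by some other collision, the second layer means the message could only ever reach the address already on file.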

Ready to try this for yourself?

Most developers are aware that compromised data is bad for business. Security-aware engineers, in turn, are a powerful antidote against growing vulnerabilities, breaches, and cybersecurity woes.

It's time to take your secure coding and awareness skills to the next level. Experience this GitHub vulnerability in an immersive, safe simulation, where you can see the impact of bad code in both frontend and backend contexts. Attackers have an advantage, so let's even the playing field and apply real skills with a whitehat counter-punch.