Security Review: HTML sanitizer in Thunderbird

I spent a few days working on a security review for Thunderbird's HTML sanitizer. Thunderbird has three presets for viewing mail: Original HTML, Simple HTML, and Plain Text. No matter which preset the user prefers, emails should not execute JavaScript. And this is where the HTML sanitizer joins our party.

This security review came up in one of my first weeks at Mozilla. Although it is a very interesting topic, it soon occurred to me that I might have bitten off more than I could chew. So the security review got stuck in my queue, and I finally dared to take a stab at it months later. (Thanks to those fellow Mozillians who helped me get started!)

The key lesson about HTML sanitizers is: Don't even consider writing your own.

So without further ado, I started putting the bits and pieces together. First I needed a recent build of Thunderbird. Then I looked into XPCShell tests (unit tests using Mozilla's privileged JavaScript libraries) and the nsIParserUtils interface. My next step was writing a basic sanitizer call, which turned out to be comparatively easy:

// Cc and Ci are shorthands for Components.classes and Components.interfaces
var ParserUtils = Cc["@mozilla.org/parserutils;1"].getService(Ci.nsIParserUtils);
// combine the sanitizer options with bitwise OR
var sanitizeFlags = ParserUtils.SanitizerCidEmbedsOnly | ParserUtils.SanitizerDropForms | ParserUtils.SanitizerDropNonCSSPresentation;
var output = ParserUtils.sanitize("XXX HTML here", sanitizeFlags);

With this prototype, I could easily loop over a dataset of HTML attack vectors. For this I chose the vectors from the HTML5 security cheat sheet and RSnake's old XSS cheat sheet (thank you guys!).

Thankfully, the HTML5 security cheat sheet keeps its attacks in a JSON file. Extracting them was as easy as taking this dataset and joining the vectors with the file that contains the actual attack payloads (i.e., JavaScript alerts and other triggers in various encodings). XPCShell comes with a load() function, which makes it very easy to include these data files.
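
To make the joining step concrete, here is a minimal sketch with made-up data. The shape (an items array whose data strings contain %name% placeholders, and a payloads object keyed by name) follows the description above. The two sample vectors, the payload names, and the helper name expandPayloads are illustrative assumptions, not entries copied from the cheat sheet.

```javascript
// Hypothetical sample data in the shape described above.
const payloads = {
  js_alert: "alert(1)",
  js_uri: "javascript:alert(1)",
};
const items = [
  { data: '<img src=x onerror="%js_alert%">' },
  { data: '<a href="%js_uri%">click</a>' },
];

// Replace every %name% placeholder with the payload of that name.
function expandPayloads(template, payloadMap) {
  let result = template;
  for (const [name, payload] of Object.entries(payloadMap)) {
    // split/join substitutes literally: the name needs no regex escaping
    // and a "$" inside the payload gets no special treatment.
    result = result.split("%" + name + "%").join(payload);
  }
  return result;
}

const expanded = items.map((item) => expandPayloads(item.data, payloads));
console.log(expanded);
```

Placeholders without a matching payload name are left untouched, so a stray percent sign in a vector does not break the expansion.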

The full test then looks a bit like this:

var Ci = Components.interfaces;
var Cc = Components.classes;

// gives us an items object:
load("html5sec_items.js");
// possible payloads for within those vectors (items[x].data)
load("html5sec_payloads.js");

// from html5sec.org's import.js:
for (var item in items) {
  // replace the payload templates
  for (var payload in payloads) {
    var regex = new RegExp('%' + payload + '%', 'gm');
    items[item].data = items[item].data.replace(regex, payloads[payload]);
    if (items[item].attachment && items[item].attachment.raw) {
      items[item].attachment.raw = items[item].attachment.raw.replace(regex, payloads[payload]);
    }
  }
}
// initialize parser object
var ParserUtils = Cc["@mozilla.org/parserutils;1"].getService(Ci.nsIParserUtils);
var sanitizeFlags = ParserUtils.SanitizerCidEmbedsOnly | ParserUtils.SanitizerDropForms | ParserUtils.SanitizerDropNonCSSPresentation;

for (var item in items) {
  // sanitize vector
  var out = ParserUtils.sanitize(items[item].data, sanitizeFlags);
  items[item].sanitized = out;
}

// results for html5sec cheat sheet
var mini_items = items.map(function(e) { return {data:e.data, sanitized:e.sanitized}; });

load("xss_rsnake.js"); // array of rsnake xss cheat sheet entries
var rsnake_results = [];
for (var i in xss_rsnake) {
  var out = ParserUtils.sanitize(xss_rsnake[i], sanitizeFlags);
  rsnake_results.push({"data": xss_rsnake[i], "sanitized": out});
}
var collected_results = mini_items.concat(rsnake_results);
dump(JSON.stringify(collected_results)); // full output as JSON

// html-strings to stdout:
for (var i of collected_results) {
  dump(i.sanitized);
}

After sanitizing all of these attack vectors, I had to review the results. Since this was my first dive into XPCShell tests, I didn't dare to hook into all the logic behind script parsing, image loading, event handler registration and so forth. Instead I reviewed the sanitized output by hand (a JSON-capable editor helps a lot). After that I also put the combined output into a single HTML file and opened it in the browser. The Firefox Developer Console helped me confirm that no resources were loaded and no scripts executed.
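
A rough first pass over the output could also be scripted before reading it by hand. The following is only a sketch of such a triage helper, not something I used in the review: the pattern list and the names flagSuspicious and triage are assumptions for illustration, the list is far from complete, and a clean result says nothing about safety. It merely points at entries worth reading first.

```javascript
// Heuristic patterns that deserve a closer manual look.
// This list is an illustrative assumption and intentionally incomplete.
const SUSPICIOUS_PATTERNS = [
  { name: "script tag", regex: /<\s*script\b/i },
  { name: "event handler attribute", regex: /\son[a-z]+\s*=/i },
  { name: "javascript: URL", regex: /javascript\s*:/i },
  { name: "form tag", regex: /<\s*form\b/i },
];

// Return the names of all patterns that match the sanitized string.
function flagSuspicious(sanitized) {
  return SUSPICIOUS_PATTERNS
    .filter((pattern) => pattern.regex.test(sanitized))
    .map((pattern) => pattern.name);
}

// Keep only the results that raised at least one flag.
function triage(results) {
  return results
    .map((result) => ({ ...result, flags: flagSuspicious(result.sanitized) }))
    .filter((result) => result.flags.length > 0);
}

// Made-up entries in the {"data", "sanitized"} shape used by the test.
// The second "sanitized" value is deliberately broken and invented;
// it is not real sanitizer output.
const sample = [
  { data: "<script>alert(1)</script>hi", sanitized: "hi" },
  { data: '<img src=x onerror="alert(1)">', sanitized: '<img onerror="alert(1)">' },
];
console.log(triage(sample));
```

Pattern matching on HTML strings is easy to fool, which is exactly why this can only order the manual review and never replace it.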

This means that the sanitizer successfully stripped all the script tags, self-submitting forms and event handlers: Security Review done!

For convenience, I have uploaded my test results as a JSON file. It is an array of objects in the format {"data": "...", "sanitized": "..."}.
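
For anyone who wants to poke at data in that format, here is a small sketch of a summary helper. The function name summarize and the sample entries are assumptions; the sample strings are invented and do not reflect how the real sanitizer serializes its output, so the comparisons may need adjusting for the actual file.

```javascript
// Classify each {"data", "sanitized"} entry by what happened to the input.
function summarize(results) {
  const summary = { total: results.length, emptied: 0, unchanged: 0, modified: 0 };
  for (const { data, sanitized } of results) {
    if (sanitized.trim() === "") {
      summary.emptied += 1; // nothing survived sanitization
    } else if (sanitized === data) {
      summary.unchanged += 1; // input passed through as-is
    } else {
      summary.modified += 1; // some parts were stripped or rewritten
    }
  }
  return summary;
}

// Made-up entries; with a real results file one would JSON.parse its contents.
const exampleResults = [
  { data: "<script>alert(1)</script>", sanitized: "" },
  { data: "<b>bold</b>", sanitized: "<b>bold</b>" },
  { data: '<a href="javascript:alert(1)">x</a>', sanitized: "<a>x</a>" },
];
console.log(summarize(exampleResults));
```

The "unchanged" bucket is the interesting one for a review, since those are the inputs the sanitizer considered harmless.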

