Debunking native in-browser spell-jacking


Thomas Betous


Spell-jacking (Image from author)

If, like me, you’re not particularly skilled at writing, spellcheckers have likely saved you countless times. Spellchecking is now a well-established feature, typically indicated by red underlines that highlight errors in your text. Sometimes you even get suggestions for the correct spelling.


Many browser vendors include their own spellcheckers, and unless you carefully review your settings, you might be surprised by the amount of information your browser sends to its vendor.

A few years ago, concerns arose when it was discovered that Chrome and Edge spellcheckers could potentially leak passwords. Essentially, when passwords become visible, they’re treated as regular text fields, and browsers attempt to spellcheck them. If you’ve allowed your browser to use network-based spellchecking, you could inadvertently send your password to the browser vendor.

This is why it’s now recommended to set spellcheck="false" on password fields, which explicitly tells the browser not to spellcheck them. Although this recommendation emerged in 2022, I wanted to examine current browser behavior. While I found articles explaining the vulnerability, none detailed how it was discovered.

My goal was to verify whether disabling spellcheck on password fields remains relevant today, and I was curious about how to observe this behavior. In this article, I’ll explain how I created a sandbox to monitor Chrome’s spellchecker requests and what I learned from this experiment. More broadly, this article can serve as a guide for anyone interested in analyzing an application’s network behavior.

First try: The dummy test

My initial approach to observing the spellchecker was straightforward. I created a minimal HTML page containing only a textarea input. The purpose was to minimize network traffic, making it easier to identify suspicious requests.

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>Test with text area</title>
  </head>
  <body>
    <h1>Test with text area</h1>
    <p>Enter your comments below:</p>
    <textarea rows="4" cols="50"></textarea>
  </body>
</html>

Then, I enabled “Enhanced spell check” in Chrome. This feature allows Chrome to use Google’s search service for spellchecking. You can enable this setting by navigating to chrome://settings/?search=Enhanced+Spell+Check


Spell check settings in Chrome

Finally, I opened the developer console’s Network tab and typed “Helo world!” in the input text. As expected, “Helo” was underlined with red dots indicating a spelling error. However, surprisingly, nothing appeared in the network tab. Even when right-clicking on “Helo” to view suggestions, no network activity was visible. I tested this behavior across several other pages with the same result — no network activity appeared in the tab.


Chrome browser with network console

When I disabled “Enhanced spell check” and reverted to the basic spell check, I no longer received spelling suggestions.

At this point, I hypothesized that Chrome wasn’t sending spellcheck requests from within the page, but rather directly from the browser itself, bypassing the network tab.

Second try: Build a box around Chrome

Since I couldn’t track spellcheck activity in the Network tab, I needed a way to monitor network traffic around Chrome itself. This reminded me of using Wireshark during my studies. Before diving into that tool, I did a quick search for alternatives that might better suit my needs. Beyond observing network traffic, I needed to examine the actual contents of the requests. Since Google probably uses HTTPS, I would need a proxy that Chrome trusts: one that handles the SSL certificates, reads each request, and then forwards it to its intended destination.

Wait… but that’s a man-in-the-middle attack!

While exploring the web, I came across mitmproxy — short for Man-in-the-Middle Proxy. It’s a free, open-source tool that acts as a powerful Swiss army knife for debugging, testing, privacy analysis, and penetration testing. Mitmproxy allows you to intercept, inspect, modify, and replay web traffic, including protocols like HTTP/1, HTTP/2, HTTP/3, WebSockets, and any other SSL/TLS-encrypted traffic. Since this tool seemed like a perfect fit for my needs, I decided to give it a try.

There are three ways to use mitmproxy:

  • Command line interface for terminal usage
  • Web application for browser-based interaction
  • API for Python code integration

For this article, I’ll focus on the web application approach.

As additional requirements, I wanted to avoid modifying my own Chrome installation and installing fake SSL certificates, and I wanted setup and teardown to be easy. This led me to create a Docker-based setup using two containers: one running mitmproxy, and one running Chrome, reachable over VNC.

The full source for this experiment is available in this GitHub repository.

This is the Docker Compose configuration and my custom Chrome image:

# Docker compose

version: "3.8"
services:
  mitmproxy:
    image: mitmproxy/mitmproxy
    volumes:
      # This is where the image will store certificates for HTTPS decryption
      - ./mitmproxy-cetificates:/home/mitmproxy/.mitmproxy
    ports:
      # 8080 port will be for the intercept requests
      - "8080:8080"
      # 8081 port will be to reach the web interface
      - "127.0.0.1:8081:8081"
    command: mitmweb --showhost --web-host 0.0.0.0 --set web_password="mitm"
    networks:
      - proxy
  chrome-vnc:
    build:
      # Custom chrome image is in the same directory
      # and is called chrome-vnc.Dockerfile
      context: .
      dockerfile: chrome-vnc.Dockerfile
    volumes:
      # This is where I store my pages for experimentation
      - ./html-pages:/root/html-pages
    platform: linux/amd64
    ports:
      # This is the port for VNC connection
      - "5900:5900"
    depends_on:
      - mitmproxy
    networks:
      - proxy
    environment:
      # Set proxy configuration when starting container
      - HTTP_PROXY=http://mitmproxy:8080
      - HTTPS_PROXY=http://mitmproxy:8080

networks:
  proxy:
    driver: bridge

# Custom Chrome image 

# Use a lightweight base image
FROM --platform=linux/amd64 ubuntu:24.04

# Set the virtual display number and the Chrome package to install
ENV DISPLAY=:99
ENV CHROME_VERSION="google-chrome-stable"

# Update and install necessary packages
RUN apt-get update && \
    apt-get install -y wget gnupg x11vnc xvfb fluxbox libnss3-tools && \
    apt-get clean

# Set the Chrome repo.
RUN wget -q -O - https://dl.google.com/linux/linux_signing_key.pub | apt-key add - && \
    sh -c 'echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list'

# Install Chrome.
RUN apt-get update && apt-get install -y $CHROME_VERSION

# Set up VNC server with a fixed password
RUN mkdir -p ~/.vnc && \
    x11vnc -storepasswd 'password' ~/.vnc/passwd

# Install certificates
RUN mkdir -p $HOME/.pki/nssdb
RUN certutil -N -d $HOME/.pki/nssdb --empty-password
COPY mitmproxy-cetificates/mitmproxy-ca.pem /usr/local/share/ca-certificates/mitmproxy-ca.pem
RUN certutil -d $HOME/.pki/nssdb -A -t "C,," -n mitmproxy -i /usr/local/share/ca-certificates/mitmproxy-ca.pem

# Expose the port for VNC
EXPOSE 5900
CMD ["sh", "-c", "Xvfb :99 -screen 0 1024x768x16 & x11vnc -display :99 -forever -usepw \
& fluxbox \
& DISPLAY=:99 google-chrome"]

Once everything is in place, running docker-compose up launches both containers. This setup provides three things:

  • Access Chrome through a VNC client via port 5900.
  • Access the mitmproxy web interface via localhost:8081.
  • All traffic from the container with Chrome will go through mitmproxy on port 8080.

With everything configured, I repeated the test from my first attempt. This time, I could observe the spellchecker requests in mitmproxy. This confirms what I suspected: spellcheck requests are sent by the browser itself rather than by the page, which is why they never show up in the Network tab.

What about password fields?

Having confirmed that regular text inputs were being spellchecked, I then examined whether the same happened with password fields. Again, I created a minimal test page to reduce network traffic and better isolate suspicious requests. According to a 2022 article, password fields are spellchecked only when their content is made visible (un-hidden). To be clear, un-hiding a password means turning the password field into a text field.

<!DOCTYPE html>
<html>
  <head>
    <title>Login</title>
  </head>
  <body>
    <h1>Login</h1>
    <input type="text" name="username" placeholder="Username" />
    <div style="display: flex">
      <input type="password" name="password" placeholder="Password" />
      <button>Unhide</button>
    </div>
    <script>
      // Toggle the password field between hidden (password) and visible (text)
      document.querySelector("button").addEventListener("click", function () {
        var password = document.querySelector("input[name='password']");
        var button = document.querySelector("button");
        if (password.getAttribute("type") === "password") {
          password.setAttribute("type", "text");
          button.textContent = "Hide";
        } else {
          password.setAttribute("type", "password");
          button.textContent = "Unhide";
        }
      });
    </script>
  </body>
</html>

After conducting multiple tests with my setup, I couldn’t detect any suspicious requests. Upon investigating the Chromium repository, I found an issue and the related PR indicating that this behavior previously existed but has since been patched. Available from Chromium version 106, this patch ensures that any field previously of type password remains non-spell-checkable even if it is converted to a text field. Since Chrome and Edge are Chromium-based browsers, this explains why I couldn’t reproduce the behavior.
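To make the patched behavior concrete, here is a toy model in JavaScript. It is not Chromium’s code, and the class and method names are invented for illustration. It only captures the rule described above: a field that has ever been a password field stays excluded from spellchecking.

```javascript
// Toy model, NOT Chromium's implementation. All names are hypothetical.
class FieldModel {
  constructor(type) {
    this.type = type;
    // Sticky flag: remembers whether the field was ever a password field.
    this.wasPassword = type === "password";
  }

  // Mimics a script changing the input's type (for example on "Unhide").
  setType(type) {
    this.type = type;
    if (type === "password") {
      this.wasPassword = true;
    }
  }

  // A field is spell-checkable only if it is a text field now
  // and has never been a password field before.
  isSpellCheckable() {
    return this.type === "text" && !this.wasPassword;
  }
}

const unhidden = new FieldModel("password");
unhidden.setType("text"); // the user clicks "Unhide"
const plain = new FieldModel("text");
```

In this model, `unhidden.isSpellCheckable()` stays false after the type change, while `plain.isSpellCheckable()` is true, which matches what I observed in my tests.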

The claim that Chrome and Edge send passwords through their spellcheckers is no longer correct.

Does the spellcheck attribute still matter?

Even though the risk of inadvertently sending passwords to spellcheck services has been mitigated, the spellcheck attribute continues to play a significant role in safeguarding user confidentiality.

Other types of sensitive data can still be at risk. Personal information such as names, addresses, or any proprietary data entered into forms is often subject to spellchecking. Making sure these inputs are not spellchecked prevents potential leaks of sensitive information. These fields do not need spellchecking anyway, and the user experience even improves because their content is no longer underlined with red dots. You win on all counts.

As a developer, you must decide whether to disable the spellchecker based on the nature of each field. This decision means carefully weighing improved privacy against a degraded user experience, since preventing potential data leaks means giving up a widely used browser convenience. That is why you should not disable spellcheck everywhere.

Sign-up and sign-in forms are a good starting point, but you should evaluate each form field individually. Ask yourself whether the field will contain Personally Identifiable Information or Protected Health Information that could directly identify an individual. This analysis will help determine which fields require additional privacy protection measures.
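One way to make this field-by-field evaluation repeatable is to encode it in a small helper. The following is a minimal sketch, not an established rule set: the function name, the shape of the `field` descriptor, and the choice of `autocomplete` tokens treated as sensitive are all assumptions to adapt to your own data classification.

```javascript
// Standard HTML autocomplete tokens, but treating exactly this set
// as "sensitive" is an illustrative assumption.
const SENSITIVE_AUTOCOMPLETE = new Set([
  "name", "given-name", "family-name", "email", "tel",
  "street-address", "postal-code", "bday",
  "current-password", "new-password", "one-time-code", "cc-number",
]);

// `field` is a hypothetical plain descriptor: { type, autocomplete, sensitive }.
// Returns true when spellcheck should be turned off for that field.
function shouldDisableSpellcheck(field) {
  if (field.sensitive === true) return true; // explicit opt-in by the developer
  if (field.type === "password") return true;
  // autocomplete may hold several space-separated tokens, e.g. "shipping street-address"
  const tokens = (field.autocomplete || "").toLowerCase().split(/\s+/);
  return tokens.some((token) => SENSITIVE_AUTOCOMPLETE.has(token));
}
```

With this sketch, a field described as `{ type: "text", autocomplete: "shipping street-address" }` is flagged, while a plain comment box such as `{ type: "textarea" }` keeps its spellchecker.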

How can I protect my users against spell-jacking?

As a developer, protecting users against spell-jacking is challenging, since most potential leaks originate from the client’s browser configuration (settings, extensions, plugins, etc.). However, there are a few things you can do:

  1. Configure Spellcheck Attributes: For input fields that handle sensitive information, such as passwords, personal details, or proprietary data, set the spellcheck attribute to false. This prevents the browser spellchecker from inspecting the contents of these fields and potentially sending them to external spellchecking services.
    <input type="text" spellcheck="false" />
  2. Use Secure Input Types: For password fields, use type="password", which inherently prevents spellchecking and also masks the input for added security.
  3. Inform Users: Where possible, inform users about the importance of checking browser settings related to spellchecking and privacy. Encourage them to disable network spellchecking, especially when entering sensitive data, to enhance their security and privacy.
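The first recommendation can also be applied from script, for instance inside a shared form component. The sketch below is only an illustration: the function name `disableSpellcheck`, the selector list, and the `data-sensitive` marker are assumptions, not part of any library or standard, and the selectors should be adapted to the fields your forms actually contain.

```javascript
// Hypothetical selector list: adjust it to the sensitive fields of your own forms.
const SENSITIVE_SELECTOR = [
  'input[type="password"]',
  'input[autocomplete="name"]',
  'input[autocomplete="street-address"]',
  "textarea[data-sensitive]", // assumed opt-in marker, not a standard attribute
].join(", ");

// Sets spellcheck="false" on every matching element under `root`
// (a Document or an Element) and returns how many elements were updated.
function disableSpellcheck(root, selector = SENSITIVE_SELECTOR) {
  const fields = root.querySelectorAll(selector);
  for (const field of fields) {
    field.setAttribute("spellcheck", "false");
  }
  return fields.length;
}
```

In a page, you would call `disableSpellcheck(document)` once the form has been rendered. Writing the attribute directly in the markup remains the simplest option when you control the HTML.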

The next time you create an input field on your website, consider all these implications. You might prevent valuable data from being exposed! At Doctolib, we started to protect our password fields directly in our design system. Since our company works with Personally Identifiable Information and Protected Health Information, this is at the heart of our shared commitment to our users.

Thank you for reading. If you have any questions or insights to share, feel free to reach out or leave a comment.