A developer's perspective: the problem with screen reader testing
From jaketracey.com:

> Sadly, there's not currently any way for a developer to identify the type or version of a screen reader that is being used
That's at least partly because there are some vocal blind people who don't want websites to be able to know that they're running a screen reader at all, for fear of discrimination. I believe that stance is misguided, and I'll illustrate why with a story. A few years ago, my best friend, who is blind, was trying to do something on PayPal and couldn't complete the task with his screen reader. I tried to do the same task, with the same screen reader, and didn't have any problem. So I figured we'd been caught in an A/B test or phased rollout. And it occurred to me that PayPal would never know that he failed to complete the process because he was using a screen reader.

If we allowed websites to know what screen reader a user is running, they could collect useful data that could help them improve. And frankly, the problem we actually have with accessibility is not willful discrimination, but indifference.
P.S. It was a weird feeling to hear the name of a product that I developed from the ground up in the "What about ..." section heading. Yeah, I'm talking about System Access, the most obscure (and perhaps poorly named) screen reader mentioned in the article. No offense taken though; I understand where the author is coming from.
I can't imagine websites doing anything but dropping screen reader users into a separate experience, rather than doing the work to design for accessibility in all their experiences and then auditing and testing to confirm it.
Separate, but equal isn't equal.
> I can't imagine websites doing anything but dropping screen reader users into a separate experience; rather than doing the work to design for accessibility in all their experiences
I agree with this. I will say that developers who are 100% unwilling to entertain changes to their "main" UI to accommodate accessibility are in the clear minority. But even the most well-meaning devs and designers will ask questions like, "could we just do this for screen reader users...?". It's a slippery slope from there, with technical debt, legacy implementations and second-class user flows all the way down.
I wouldn't be surprised if some sighted users turn on the screen reader to get the 'worse' (read: superior) user experience. In some cases, the screen reader supported experience would be faster, easier to use, and more reliable.
It might well be superior when released, but will it continue to support everything, or will it sit around with no updates and miss out on new features, or simply stop working and have no one notice?
By and large, the reason the screen reader versions would exist is the threat of lawsuits under the ADA. I work professionally in this realm, and accessibility is a huge part of our development and QA process. A release doesn't go out unless it's been thoroughly tested for accessibility. I presume that would just map onto the screen reader site forks.
Still, mainstream developers make decisions based on data. So if we want them to serve our community, shouldn't we help them by giving them data on how we use their sites and apps?
> Still, mainstream developers make decisions based on data. So if we want them to serve our community, shouldn't we help them by giving them data on how we use their sites and apps?
Quite possibly. But there is an argument for that being voluntary rather than mandatory, even if that decision is provided via a switch in OS-or-screen-reader-level settings. But then there are questions such as:
- Should such a setting be switched on by default?
- What sort of users are likely to enable/disable it?
- If there is any correlation between level of technical skill and privacy awareness, will results be skewed towards the users with lower levels of technical knowledge?
- If a product team uses the data to solicit feedback from users who are detected to be running an assistive technology, but "power users" have turned off that detection, again, will that feedback be representative?
Anecdotally, I will say that after working with some iOS development teams where this behaviour is natively available, cases which rely on an explicit detection of VoiceOver still seem rare. Whereas, use of features like accessibility label overrides without an explicit check are used a lot. On the other hand, it's becoming much more common to perform explicit checks for more visually-oriented accessibility features, like reduced motion or high contrast, some of which can even be carried out on the web now.
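To make that last point concrete, here's a minimal sketch of one such web check. `window.matchMedia` and the `prefers-reduced-motion` media feature are real browser APIs; the injectable `mm` parameter is only an illustrative convenience so the sketch also degrades gracefully outside a browser.

```javascript
// Sketch: checking a visually-oriented accessibility preference on the web.
// Unlike screen reader detection, this preference is deliberately exposed to
// sites, because responding to it doesn't require a separate experience.
function prefersReducedMotion(mm = globalThis.matchMedia) {
  if (typeof mm !== "function") return false; // non-browser: assume no preference
  return mm("(prefers-reduced-motion: reduce)").matches;
}
```

The same pattern works for CSS alone, e.g. a `@media (prefers-reduced-motion: reduce)` block that disables animations, with no script involved at all.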
We also need to take into account the biases in this data. If a given website isn't screen-reader accessible, the screen reader usage statistics are going to be very low. This shouldn't be surprising to anyone, after all, why would blind people go to a website they can't comfortably use? Some developers would probably still use that data to explain why accessibility isn't important, though.
Regardless of that, as a blind person, I'm all for a flag that lets you detect screen reader usage. I believe a good compromise here is disabling this flag in private/incognito mode.
What if screen readers randomly switched between revealing and hiding themselves? Developers would get statistics but couldn't deliver a different UI to screen reader users.
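The idea resembles randomized response from survey statistics. A hypothetical sketch (no screen reader exposes such a flag today; the function names are invented): reveal the flag on only a random fraction of page loads, so sites can estimate aggregate usage but can't trust the flag on any individual visit.

```javascript
// Hypothetical: the screen reader reveals itself on only a random fraction p
// of page loads. `rng` is injected for testability; a real implementation
// would use a secure random source.
function maybeReveal(p, rng = Math.random) {
  return rng() < p;
}

// With reveal probability p known, a site can estimate total screen reader
// visits from the count of visits where the flag was actually revealed.
function estimateVisits(revealedCount, p) {
  return Math.round(revealedCount / p);
}
```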
> Separate, but equal isn't equal.
The world of mobile web sites demonstrates that this swings both ways. Some web sites have mobile-specific interfaces that greatly enhance the experience when using the site on a small-screen touch device compared to a large-screen keyboard/mouse device. Wikipedia, for example: the desktop interface is terrible on a phone and the mobile interface is terrible on a desktop. Obviously, other sites are well known to have swung in the opposite direction, and whichever variant is not their primary target is significantly worse.
I think that in a similar way the ability to serve specific content optimized for screen readers would allow those who care to deliver a much better experience, but likewise those who don't might make something worse.
---
Of course the simple web site purist in me wants to say that every web site should just stick to basic formatting and text-primary designs that work well in any browsers, but that idea is not just unrealistic thanks to marketing people but it would simply not work for so many modern sites and web applications.
I say the solution is the same as it's always been for when bad web sites do stupid things based on user agent or whatever else. Lie to them. As long as the screen reader can turn off the identifier tag it seems like the worst case outcome is a minor annoyance to have to blacklist that web site from receiving the tag.
--- edit: also a side thought, the things that make a screen reader work well also make it easier to separate content from ads, so there is an inherent commercial incentive for ad-supported content providers to only do the minimum legally required.
I'm genuinely asking here out of lack of experience: is it separate but equal, or is it customizing the user experience? I'm not blind, but it always seemed weird to me to try to retrofit visual abstractions for someone who may not be comfortable with those abstractions, rather than to build an experience directly for that user. Mobile sites often have different experiences than desktop sites, and we would say that's an experience customized for a small screen, not separate but equal. Wouldn't designing a separate experience from the ground up, built for someone with a visual impairment, be better than fitting them into a system that probably wasn't designed for them in the first place? You're hampering the experience because you'll always be biasing for the sighted, unless your UX lead is visually impaired.
> I’m genuinely asking here out of lack of experience — Is it separate but equal, or is it customizing the user experience?
It's separation, because it introduces the chances of an inequitable experience being created as the "customised" version for screen reader users drifts out of sync. This may not even be down to anything malicious; plenty of companies create mobile-friendly versions of websites and apps, only to find that as years go by, they no longer have the budget to facilitate their upkeep. Far from assigning them to the trash heap of history, these often continue to be provided, offering a substandard experience, and this is despite companies having so much data available about high mobile usage. The chances of it happening to an accessibility-specific version are higher because the likely user take-up is lower, and as this post and thread indicate, speciality skills are often required to make a good job of it. Not like responsive web design, where experts are ten a penny.
There are other reasons this approach is flawed, of course:
1. Screen reader users aren't the only disabled people out there. Heck, even within that group, there are those who use the software as an absolute necessity, and others who use it in addition to other assistive tech like speech recognition or a magnifier. Nobody in their right mind is going to create a separate experience for every subset of disabled users. Even if they did, how would they be surfaced?
2. Some people are only temporarily disabled, e.g. because of an injury. They don't have the lifelong feel for how to look for accessibility settings or separate modes, so they aren't likely to benefit from a shiny accessible version. Making your main product inclusive prepares for this.
3. The aim should be to allow customisations to be included which aren't intrusive to those who don't need them, e.g. via techniques like WAI-ARIA[1] which allow additional context to be added for screen reader users while remaining invisible to everyone else.
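As a concrete sketch of that third point, extra context can be attached to a control with no visual change at all. This is plain illustrative markup; the `visually-hidden` class is an assumed CSS utility (content kept in the accessibility tree but positioned off-screen), not anything built in.

```html
<!-- Visually the button shows only an "X"; aria-label gives screen reader
     users a meaningful accessible name without altering the design. -->
<button aria-label="Close payment dialog">X</button>

<!-- Visually hidden text adds context that only assistive tech announces. -->
<p>Delivery: 3-5 days <span class="visually-hidden">(estimated, not guaranteed)</span></p>
```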
Do you also take exception to braille translations replacing "red button" with "third button down"?
> Do you also take exception to braille translations replacing "red button" with "third button down"?
This is a different case for two significant reasons:
1. Utilising purely sensory references in web materials is a violation of WCAG success criterion 1.3.3, Sensory Characteristics (level A)[1]. Naturally, if you're physically brailling something, it's probably not web content, but the point still stands: if your web page tells people to press the red button on the left, it's not accessible to some.
2. By including alternative instructions in some form/format (or just making them inclusive from the start), you're not asking anybody to out themselves as having a permanent or circumstantial disability. You're just preparing for the case that they do. This is assuming the alternative option is available to everyone (e.g. on a web page or as a physical braille sign), rather than as something that is locked away and must be explicitly requested.
[1] https://www.w3.org/WAI/WCAG21/Understanding/sensory-characte...
Well, the fact is that the use of a screen reader (or at least most of them) can be detected relatively easily by having an element in the screen reader flow that is hidden from sighted users and, if you wanted extra surety, an element that will be interacted with by anyone accessing the page but that is hidden from screen readers and available to sighted users.
Detecting interaction with the visually hidden element, but no interaction with the visually shown element, can be registered as screen reader usage. Obviously this can also happen with bots, but you would probably prefer to register a bot as a screen reader than to try to filter them out and inadvertently filter out a screen reader (which can often happen with bot-filtering strategies anyway).
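For illustration only, the classification logic of that heuristic might look like this. The probe element ids and the event bookkeeping are invented; note that it deliberately lumps bots in with screen readers, per the preference above, and the reply below explains why the heuristic is less reliable than it looks.

```javascript
// Hypothetical sketch: assume the page logs which probe elements received
// focus or interaction during a session, then classifies the session.
function classifySession(interactedIds) {
  const hitHiddenProbe = interactedIds.includes("sr-only-probe");   // hidden visually, exposed to screen readers
  const hitVisibleProbe = interactedIds.includes("visible-probe");  // shown visually, hidden from screen readers
  if (hitHiddenProbe && !hitVisibleProbe) return "screen-reader-or-bot";
  if (hitVisibleProbe && !hitHiddenProbe) return "sighted-user";
  return "inconclusive";
}
```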
> well, the fact is the use of a screen reader (or at least most) can be detected relatively easily by having an element in the screen reader flow that is hidden from sighted users
This isn't true, at least not enough to be useful. The majority of screen readers present web pages in a kind of virtual buffer, which users can browse with the arrow keys and other shortcuts (e.g. H to jump by heading). In this mode, the majority of screen readers don't fire focus events to let you know, "hey, your screen-reader-only element has been hit". You would be limited to:
1. users operating a screen reader which does fire focus events;
2. trying to hijack scroll events for this purpose; and
3. less technically-inclined users (read: beginners) who move through the page with Tab/Shift+Tab before they've learned the other keystrokes their screen reader offers.
I guess #3 explains my confusion, as whenever I use a screenreader it's for testing and I just tab through.
I suppose each screen reader then has its own hotkey shortcuts? Do you know of any resource that puts shortcuts of most popular screen readers together?
If you happen to be using Windows, Narrator has a well-written tutorial (no, I didn't write it) that goes into the most common shortcuts, and you can get a complete list with Narrator+F1 (the "Narrator" key is either Caps Lock or Insert). I believe VoiceOver on Mac also has a tutorial.
> there are some vocal blind people who don't want websites to be able to know that they're running a screen reader at all, for fear of discrimination
This makes sense; it's probably not a good idea to give that information to the website. It would only make people using screen readers more vulnerable to scams.
It's similar to how scam callers work. If a scam caller rings someone and an older person answers the phone, the fact that they can hear an older person means that they have found an easy target, and they can use specific tactics to take advantage of them (techno-jargon, etc.). If they don't have that information, then it's much harder for them to use specialised tactics to manipulate people.
I doubt that anyone would take advantage of the ability to automatically identify a screen reader visiting a website to scam blind people. In any case, I think that lack of usage data is a much bigger problem.
I certainly had to tweak HTTP user agent headers in the past to get content out of some websites. IIRC Twitter (or possibly some other website(s)?), until recently, seemed to be somewhat usable without JS, but the content was covered and made unusable with a message about JS being required -- which isn't behaviour based on reported user agent capabilities, but still intentional breaking depending on them. It's not hard to imagine that some websites will decide to not support screen readers, and to block them completely and explicitly, to avoid related issue reports and complaints (again, as they do with JS, cookies, CSS). Or at least messages like "you need to upgrade to JAWS in order to access this website", since the website will be tested just with that.
Actually I think varying behaviour depending on reported user agent was a source of frustration to users from the beginning. Maybe it'll turn out fine this time or in this case, but those concerns sound valid to me.
To say that it's fine to disclose the use of a screen reader because willful discrimination is not a problem we actually have -- that, if I'm using the phrase properly, begs the question.
I think this lack of ability to distinguish screen reader users also comes with some serious drawbacks. I've been in situations (I do a lot of heavy data viz work) where providing alternative content for impaired users as well as graphical content for sighted users is a performance no-go. If I could reliably detect which kind of content is preferred, it would not be a problem. But if >90% of your audience is to suffer choppy frame rates (and 100% of your bosses) just to generate invisible content, the end result is that there is no alternative content (or a crappy one: "A chart showing the relationship between instance count and cost" <- this leaves out all the interesting stuff).
I completely agree with you. At the very least it could be a feature you opt in to. Would help a ton with the issue of fragmentation as well.
One interesting fact to note, which WebAIM doesn't reflect at all, is the recent rise in usage of Chinese screen readers. ZDSR for Windows still has an insignificant share, but I'm not sure how long it's going to remain that way, considering it hasn't been available outside of China for very long. However, Commentary for Android is getting some significant usage, particularly in poorer countries where Android is the only thing most people can afford. It offers a superior experience and performance compared to TalkBack, and isn't prohibitively expensive, so it's gaining popularity.
I wouldn't worry about testing with those screen readers for now, as there still aren't that many people using them, but it's something worth looking out for in the future.
Thanks for this info, it was the first time I’d run across either of them.
> According to the latest WebAIM Screen Reader User Survey, when it comes to desktop screen reader usage, JAWS and NVDA are practically equal in usage, with around 40% of respondents reporting that they use one or the other.
I wouldn't take this data too seriously. The WebAIM user survey is only available in English, and is usually filled out by tech-savvy blind users who are part of the blind community and are told about it.
At this point, JAWS is mostly used in corporate environments in the United States, mostly due to the number of scripts already written for it, a business-friendly (non GPL) license, and easy enforcement of restrictions given by IT, which are features that NVDA doesn't provide. Some countries give out JAWS for free to their blind residents, so the number of JAWS users there is probably going to be pretty significant too. However, in most parts of the world, NVDA is the screen reader most people use. As a person living in eastern Europe, with many friends from around the world (including the U.S.), I know exactly two people using JAWS as a daily driver.
My biggest problem when using a screen reader for testing is that my usage isn't the same as a blind person's. I mostly press "next sentence" or Tab, and am really slow. My testing is also biased by the fact that I know how the site looks (what is on each page, which part I'm trying to reach).
When I visited a blind person to test for us at a previous workplace, I was astonished about what we found. It was very different from our own attempts. His voice-speed and navigation was so fast that the parts we felt were sluggish just took him a second to navigate through. He had other issues, however.
FYI: There is a small, low traffic group for blind developers: https://groups.google.com/g/blind-dev-works
I'm one of the co-owners.
I’m absolutely certain there are plenty of NFB members who will gladly provide free screen reader accessibility testing for apps they use if the developer is responsive to feedback.
Yes, developers should be responsive to feedback, and especially so when it comes to accessibility, but I'm not sure I'm entirely comfortable with the idea of relying entirely or primarily on unpaid work for an entire area of testing.
Of course, it always depends on the overall funding situation of the app, but if funding exists, then I think AX testing should be paid like the highly qualified work it is.
> I’m absolutely certain there are plenty of NFB members who will gladly provide free screen reader accessibility testing for apps they use if the developer is responsive to feedback.
You could be right, but keep in mind that this may not significantly lower the amount of time required for developers to understand and remediate problems. Users have wildly differing levels of technical experience, so you may end up with plenty of feedback that you then have to spend hours understanding, sorting, de-duplicating and following up on.
Not to mention the fact that, bluntly, users who aren't being paid as "experts" just may not be that willing to shit all over your product. I have encountered more than one case of a limited subset of screen reader users reporting a positive experience with a component which broke every rule in the book, and caused very real problems for users outside of that core group.
As a totally blind back-end developer, I'm sure I'd be one of those people. If I'm doing accessibility testing for development tools, my thoughts are probably worthwhile. If I'm doing accessibility testing for a bank, my thoughts are probably less valuable. Unless it's horribly broken, I'm tech-savvy enough to usually get by. I don't expect all blind people to have 20 years of programming experience and the general technical aptitude that comes along with that.
Before doing screen reader testing on complex web components, which I see as a kind of black-box testing where you test your whole screen reader + browser stack, it is useful to have a look at what the browser passes to a screen reader. Firefox especially has a very nice accessibility tree panel in the devtools these days. In my experience, the more visual tree shown there is also easier and faster to read for users who are not blind and are not that quick when using screen readers.
Also, keep in mind that something that technically works correctly with screen readers is just the beginning. User testing might reveal lots of issues you wouldn't think of yourself. And yes, I know that resources are usually limited and there is not much room for user testing, especially testing with screen reader users and other groups that have some kind of disability. I recently worked as the accessibility lead of a mobile COVID exposure notification app that had a very simple UI and a hard accessibility requirement. We had the luxury to do extensive user testing and even in this simple interface we found lots of small changes that improved the experience for screen reader users.
Would it be possible for you to do a writeup on what you found that might be transferrable?
Yes, I would like to publish some lessons in the future somewhere. However, a few quick takeaways:
* The microcopy matters, a lot. We had a button stating "I've got a notification: read what you should do after getting a notification" (from the top of my head, and freely translated from Dutch; we didn't have an English translation back then). This was part of a bunch of buttons on the main screen that all gave information. Some screen reader users got confused and thought that they had a notification. If you don't see the visual layout, it is not obvious that this is just a plain button and not a bold text in red that is giving you a warning.

* In the same category: the app has a status text that says "The app is working fine" or "The app is not working fine". Visually, the error state is signified by an exclamation mark and styling that makes clear that this is a serious issue. However, in text there is just one word, "not", to signify that there is a serious issue. Following WCAG, the info signified by the exclamation mark icon was available in text, so no text alternative was required. However, we gave it a text alternative anyway to ensure screen reader users were also clearly alerted that something is wrong. Same goes for the "all is OK" icon; we gave that one a text alternative as well, to reassure users that all is fine.
In addition, I wish screenreaders had a text mode, where they print what they say and maybe provide cues on possible actions. Actual screenreader users work with astonishing speaking speeds and great memorization of commands, but without the experience a bare-bones but more visual interface would likely be easier to use.
They do! Not fully optimized for developers perhaps, but check out "Speech Viewer" in NVDA and the "Braille Viewer" and "Speech History" in JAWS. This guide has some screenshots:
https://www.accessibility-developer-guide.com/setup/screen-r...
Also Narrator's developer mode, which you can toggle with Control+Narrator+F12 (where the "Narrator" key is either Caps Lock or Insert).
Disclosure: I used to work on the Narrator team at Microsoft.
And, in addition to those JAWS tools, there is also JAWS Inspect (https://www.paciellogroup.com/products/jaws-inspect/)
> In addition, I wish screenreaders had a text mode, where they print what they say and maybe provide cues on possible actions.
NVDA, mentioned prominently in the article as a free and open source screen reader, has a floating-window-style speech viewer. I sometimes use it when demonstrating a screen reader user's experience of a particular component when asked to share my audio, because slowing down the screen reader to a rate that everyone on the call will understand will also make the meeting much longer.
JAWS also has a speech history viewer, and there are keystrokes to dump VoiceOver speech as text and audio files.
BTW, NVDA are hiring (in Australia) at the moment:
https://www.nvaccess.org/category/careers/ https://news.ycombinator.com/item?id=25639590
> When it comes to screen reader version fragmentation, there is very little in the way of either documentation or support for developers. Fixing issues often comes down to a case of trial and error, retesting and hoping for the best.
The author may be interested in the ARIA-AT project[1], which aims to thoroughly test assistive technology support for WAI-ARIA and HTML constructs. It's still a relatively young effort, but the community group is open and always happy for participation.
There are a number of browsers that run in Docker and that I can remote control (using Selenium, Playwright, Puppeteer or whatever) in my CI systems to run at least some smoke tests, making sure that basic features are available.
Does anyone know if something like that is available for screen readers - at least for a free and open source one?
> There are a number of browsers that run in Docker and that I can remote control (using Selenium, Playwright, Puppeteer or whatever) ...
>
> Does anyone know if something like that is available for screen readers - at least for a free and open source one?
Nothing in this space is really mature yet, but there are some efforts to make it a reality. The ARIA-AT project[1], which aims to test assistive technology support for various WAI-ARIA and HTML constructs, is aiming to automate its testing across multiple screen readers [2]. NVDA, the free and open source screen reader mentioned in the article, also includes some integration-style tests[3].
[1] https://github.com/w3c/aria-at

[2] https://github.com/w3c/aria-at/issues/349

[3] https://github.com/nvaccess/nvda/tree/master/tests/system
Great to hear that people are working on this, thanks for sharing. Looking forward to seeing that work mature.
Forgive the second top-level comment, but I have some thoughts on Narrator and Edge. Disclosure: I worked on the Windows accessibility team at Microsoft during the transition from EdgeHTML to Chromium, and as a third-party screen reader developer before that. But I won't divulge anything confidential here.
It probably comes as no surprise that EdgeHTML and Chromium have completely different accessibility implementations. Narrator always had the best support for EdgeHTML. I was a third-party screen reader developer when EdgeHTML first came out, and for us third-party developers, EdgeHTML was a drastic change from IE. For over a decade, we had provided access to IE by injecting code into the IE process (yes, Windows lets you do that) and accessing the IE DOM in-process using COM. We did something similar for Firefox and Chromium, but using the IAccessible2 API (also COM-based). To improve security, old Edge disallowed this kind of injection; it could only be accessed through the UI Automation API. Narrator was built for this; the rest of us had to adapt after the fact. And since we could only access UIA through inter-process communication, not in-process like we did with the IE DOM and IAccessible2, there were performance problems, even with Narrator. (Luckily, I got to help solve those problems during my time on the Windows accessibility team.)
With Chromium (in both Google Chrome and the new Edge), screen readers can still inject code in-process and use the legacy IAccessible2 API. And NVDA, JAWS, and System Access (which I developed before joining Microsoft) do that. These third-party screen readers access Chrome and new Edge in the same way, at least inside the web content area, so if you're testing with one of these screen readers, it probably doesn't matter which browser you use. The situation with Narrator and Chromium-based browsers is more interesting. Narrator uses the UI Automation API to access all applications. Chromium has a native UIA implementation, largely contributed by the Edge team, but while that implementation is enabled by default in the new Edge, it isn't yet in Chrome. So Narrator accesses Edge using UIA. But for Chrome, and other Chromium-based apps (e.g. Electron apps), Narrator uses a bridge from IAccessible2 to UIA that's built into the UIA core module. So in corner cases, there may be differences in how Narrator behaves in Chrome and Edge.
So, should developers test with Narrator and/or Edge? Well, I may be too biased to answer that. But I think it's likely that Narrator usage is on the rise. While I was on the Narrator team at Microsoft, we heard from time to time about praise that Narrator was getting in the blind community. (Naturally I can't take full credit for that; it was a team effort.) Moreover, since Narrator is the option built into Windows, there will come a point (if it hasn't come already) when it's good enough for many users and they have no reason to seek a third-party alternative. Also, there are some PCs where Narrator is the only fully functional screen reader, specifically those running Windows 10 S (the variant that doesn't allow traditional side-loaded Win32 apps). I'd guess that an increasing number of students and users of corporate PCs are saddled with that variant of Windows. And while I can't say anything about future versions of Windows, one can make an educated guess based on the broader trajectory of the industry.
As for whether it's worth testing with Edge as opposed to Chrome, I don't know. Fortunately, browser usage data is readily available.
I did a bit more reading and noticed in the WebAIM stats that Narrator usage is indeed on the rise. Given that it is included in the OS, I would hope that it ends up as the de facto standard, in much the same way as VoiceOver is for macOS.
Really interesting to hear some of the technical details from someone that worked on it - thank you!
Is there some software library that turns websites into text, that these screen readers all use, or do they implement their own?
> Is there some software library that turns websites into text, that these screen readers all use, or do they implement their own?
They implement their own. But the browser also has responsibilities in this area, to construct an accessibility tree from the DOM which screen readers can parse.
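As a toy illustration of the browser's side of that contract (the node shape and the rules here are invented for the sketch; real engines follow the ARIA and accessibility API mapping specifications, with much richer name computation and role mapping):

```javascript
// Toy sketch: prune hidden nodes and compute a role plus accessible name for
// the rest, yielding a simplified "accessibility tree" from a DOM-like object.
function buildAccTree(node) {
  if (node.hidden || node.ariaHidden) return null; // pruned from the tree
  return {
    role: node.role || "generic",
    // crude stand-in for accessible-name computation: label wins over text
    name: node.ariaLabel || node.text || "",
    children: (node.children || []).map(buildAccTree).filter(Boolean),
  };
}
```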
If you design your website simply, you don't need to know that the user is using a screen-reader, nor what other strange situation that you'd never even imagined or accounted for may be happening. They'll be able to use it either way, because you made it universally accessible, rather than covering certain classes or cases individually.