Next.js and the corrupt middleware: the authorizing artifact

zhero-web-sec.github.io

94 points by ash a year ago · 37 comments

kfarr a year ago

Timeline is interesting

Timeline:

02/27/2025: vulnerability reported to the maintainers (specifying that only versions between 12.0.0 and 12.0.7 were vulnerable, which was our understanding at the time)

03/01/2025: second email sent explaining that all versions were ultimately vulnerable, including the latest stable releases

03/05/2025: initial response received from the Vercel team explaining that versions 12.x were no longer supported/maintained (probably hadn’t read the second email/security advisory template indicating that all were vulnerable)

03/05/2025: another email sent so that the team could quickly take a look at the second email/security advisory template

03/11/2025: another email sent to find out whether or not the new information had been taken into account

03/17/2025: email received from the Vercel team confirming that the information had been taken into account

03/18/2025: email received from the Vercel team: the report had been accepted, and the patch was implemented. Version 15.2.3 was released a few hours later, containing the fix (+backports)

03/21/2025: publication of the security advisory

simonw a year ago

"initial response received from the Vercel team explaining that versions 12.x were no longer supported/maintained"
That doesn't mean they shouldn't issue an alert to developers still running those versions advising them to upgrade ASAP.
nine_k a year ago

OK, tangentially: let's assume that Next is poorly maintained; what are some good alternatives? Of course everything that Next does can be assembled by hand from various smaller modules on top of Express, or similar. What are some more cohesive sets?
- solardev a year ago
  
  I don't think there's anything quite as featureful. For basic sites, Astro is fine, but it's not as powerful. Vite can be used for basic client or server side renders. Nuxt and SvelteKit have some of the basic features.
  But I don't think there is a drop-in replacement for ALL that Next does. The strength of Next is in packaging together what would otherwise be like twenty different packages and servers (especially if you make use of all the Vercel-specific features). And then it adds incredibly powerful (but often complex) hybrid caching strategies that combine what would traditionally be done by different daemons altogether (a KV store, a memory cache, an HTTP cache, a CDN, etc.). And then it adds a bunch of additional features like the middleware layer, image processing and caching, etc. I don't know of any other frontend-focused JS framework with such features in one package.
  These are more common in the full-stack world. Next takes some of those traditional backend concerns and puts them in the hands of frontend devs, for better or worse. If you know a bit of both, it can be a great shortcut. If you overestimate your ability/knowledge, it can be a great footgun.
  - the_mitsuhiko a year ago
    
    > But I don't think there is a drop-in replacement for ALL that Next does.
    The entirety of vite + tanstack (in particular the upcoming tanstack-start) is getting quite close. For quite a few of the uses that folks currently use Next for, I would argue that much of what tanstack does is a better fit. E.g. not marketing sites, but SaaS-style dashboards.
    
    solardev a year ago
    
    Does that put you in a situation where the builder/bundler is made by a different vendor than the router & cache management layer?
    That was one of the nice benefits of Next when it first came out, vs Frankensteining these basic concerns together on top of React with a bunch of different libs that don't always track each other in terms of upgrade compatibility, often resulting in dependency hell.
    Is that still the case today?
    
    the_mitsuhiko a year ago
    
    > Does that put you in a situation where the builder/bundler is made by a different vendor than the router & cache management layer?
    Like in next? Next uses webpack (at least for the most part, there is now also turbo support but it's limited to dev builds for now) which is built by other people. Tanstack is intentionally building on vite and from what I can tell there is quite a deep cooperation going on. Most frameworks outside of the Vercel sphere have all put themselves on vite and started embracing it. Solid, Vue, Remix and Tanstack are all on vite and leveraging that rather than building their own infrastructure.
    I think next.js is a terrific project for the record, but I happen to mostly sit in the space where it doesn't quite play out its strengths. [1] So I'm quite used to frankensteining over the years and Vite has made my life much more pleasant in that regard. It feels quite cohesive and it's so damn quick compared to the status quo ante.
    [1]: That thing is highly interactive SaaS software with backends that are not written in JavaScript.
johnnyAghands a year ago

What is interesting?
- edoceo a year ago
  
  The lag & missing key details
  - johnnyAghands a year ago
    
    Ah ok, yeah.. unfortunately this type of lag/mismanagement is pretty common once a company gets big enough. Oftentimes the right people don't get involved on the first pass... even at tech-first companies like this, though at that point perhaps you're no longer tech-first :/
    
    simonw a year ago
    
    Companies that get large enough to have a dedicated security team can often reverse this trend.

iterateoften a year ago

Maybe I'm misunderstanding how people are building endpoints these days, but every post I see about this talks about how it can bypass auth.

Wouldn’t this bypass auth only for sites where auth is true/false?

I’ve never worked on a site where auth is a boolean. Auth is always relative. The middleware is only there to identify the user. Then when querying for objects, you query objects related to that user.

Or if you are serving an admin page you check that the user is an admin.

I honestly find it more astounding that people put an admin security check to check the url of a page and redirect away in a middleware and no security on the views themselves.

Is this form of checking paths in middleware officially from NextJS or did people just get lazy? Seems like the worst way to build auth I could ever dream up across any framework or language.

If a middleware is bypassed, all endpoints should return empty responses. In my Next.js apps the middleware is simply a convenience for the user: if they are logged out, they get redirected to the login page. But all API endpoints check for the active user and serve objects relative to that user.
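A minimal sketch of that last point, assuming a handler that resolves the user itself rather than trusting that middleware ran. Every name here (`Doc`, `resolveUser`, `listDocs`, the session tokens) is hypothetical and not a Next.js API:

```typescript
// Hypothetical types and data: nothing here is a Next.js API.
interface Doc { id: number; ownerId: string; title: string }

const docs: Doc[] = [
  { id: 1, ownerId: "alice", title: "Alice's notes" },
  { id: 2, ownerId: "bob", title: "Bob's notes" },
];

// Stand-in for real session verification (signed cookie, token lookup, ...).
const sessions = [
  { token: "tok-alice", userId: "alice" },
  { token: "tok-bob", userId: "bob" },
];

function resolveUser(token: string | undefined): string | null {
  for (const s of sessions) {
    if (s.token === token) return s.userId;
  }
  return null;
}

// The handler checks the user itself and only returns that user's objects,
// so skipping an upstream redirect middleware exposes nothing.
function listDocs(token: string | undefined): { statusCode: number; body: Doc[] } {
  const userId = resolveUser(token);
  if (userId === null) return { statusCode: 401, body: [] };
  return { statusCode: 200, body: docs.filter((d) => d.ownerId === userId) };
}
```

With this shape, an anonymous request gets a 401 and an empty body whether or not any middleware ran in front of it.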

flufluflufluffy a year ago

> “Is this form of checking paths in middleware officially from NextJS or did people just get lazy?”
This is a common way to implement auth in many frameworks, not just JS ones. Off the top of my head I know that Laravel (PHP) does it this way. I guess you could call it “being lazy” but most people would refer to it using the age-old software engineering terms “don’t repeat yourself” and “separation of concerns.”
The authentication middleware can choose to redirect or not based on many things: did the user provide the correct credentials, does the user have the necessary permissions to access the requested resource, etc… And you can put all the logic for determining those many things in a single place, so that it can easily be updated. Individual routes can remain as they are, and you don’t have to worry about forgetting to implement some part of the auth checking logic on one of them.
Etheryte a year ago

This attack isn't only about auth, auth is just the most drastic obvious example. Middleware is often used for a wide range of sanity checks. You could bypass, say, limit checks and ask the server to return an infinite number of items per page, quickly overloading the server and resulting in DoS.
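A hedged sketch of one way to make that particular check survive a middleware bypass: clamp the page size inside the handler as well. The limits and the `clampPageSize` name are assumptions for illustration:

```typescript
// Hypothetical limits; pick values that fit the application.
const DEFAULT_PAGE_SIZE = 20;
const MAX_PAGE_SIZE = 100;

// Turns a raw query-string value into a safe page size.
// Missing, non-numeric, non-finite, or non-positive input falls back to the default.
function clampPageSize(raw: string | undefined): number {
  const parsed = Number(raw);
  if (raw === undefined || !isFinite(parsed) || parsed < 1) {
    return DEFAULT_PAGE_SIZE;
  }
  return Math.min(Math.floor(parsed), MAX_PAGE_SIZE);
}
```

Calling this in the handler means the cap holds even if a limit-checking middleware never ran.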
ljm a year ago

I don’t know how to put this any other way but my experience with NextJS or just JS-first full stack is that they are still first and foremost a tech stack for frontend devs and the backend piece in that context is an afterthought.
I’ve worked on a few in my time and ‘API routes’ were rarely, if ever, authenticated, and there wouldn’t be a consistent strategy for data access. If anything, everything was built in the context of satisfying a react hook and getting on with the UI.
But I don’t squarely blame developers for it, it’s more like an inverted full stack where the browser is first class and the serverless edge SSR ISG SSG app router component craziness does not help you build out a stable API. Does sell a hell of a lot of SaaS though.
simonw a year ago

How serious this vulnerability is depends entirely on how the site that's being attacked uses middleware. The auth thing is just the most obvious example of how an attacker can do bad things if they have the ability to selectively disable middleware by passing names as a colon separated list in an HTTP header.
(I've built sites that would have been affected by this in the past, had I used Next and middleware for auth. I've worked on plenty of systems where there are only a small set of users each with the same level of permissions - gating private documentation for example.)
fabian2k a year ago

Probably depends on the complexity of the permissions in the application. I'd also expect something more along the lines you described for more complex applications. The middleware would do authentication, but then just attach that information to the request. Later parts would then use the attached information to make decisions about permissions/authorization.
In more complex cases this would be outside middleware, so it should fail as no authentication/authorization information is attached to the request if you skip that middleware.
But putting the security checks into middleware could easily make sense for more rigid or simple cases. In C# for example I can add attributes to the methods that handle each endpoint. So if you need a basic admin/no-admin check you could add a [RequireAdmin] attribute on the relevant endpoints and use a middleware to check that.
I would agree that checking the URL in middleware to make decisions about permissions would be a bad idea, it moves this important check to a mostly invisible place.
This probably also allows different attacks, e.g. skipping middleware that does other security-relevant checks (maybe anti-CSRF mechanisms could be vulnerable here).
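A rough sketch of the "attach in middleware, decide later" split described above, assuming hypothetical `AuthContext` and `AppRequest` shapes and made-up tokens. The part that matters is that a missing context is treated as a denial:

```typescript
// Hypothetical shapes; the point is the fail-closed default, not the names.
interface AuthContext { userId: string; isAdmin: boolean }
interface AppRequest { path: string; auth?: AuthContext }

// Authentication step: attaches who the caller is, makes no access decision.
function attachAuth(req: AppRequest, token: string | undefined): AppRequest {
  if (token === "admin-token") {
    return { path: req.path, auth: { userId: "root", isAdmin: true } };
  }
  if (token === "user-token") {
    return { path: req.path, auth: { userId: "sam", isAdmin: false } };
  }
  return req; // unknown or missing token: nothing attached
}

// Authorization step, run next to the handler. Missing context means deny.
function adminStatus(req: AppRequest): number {
  if (req.auth === undefined) return 401;
  return req.auth.isAdmin ? 200 : 403;
}
```

If the authentication step is skipped entirely, `adminStatus` sees no `auth` field and returns 401, which is the failure mode described above.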
hombre_fatal a year ago

I don’t think it really matters. All you have to do is write middleware or handlers that assume upstream middleware have run, and then that’s vulnerable to this attack.
For example it’s common to write middleware on /admin so that all of your /admin/* handlers don’t have to repeat the same authz logic. And the platform breaking invariants that you should be able to depend on is why it’s a security bug.

nine_k a year ago

What surprises me here is that the client side of the request / response is not considered a cunning, bitter enemy, as it should be. Why is x-middleware-subrequest even accepted in production? Why is x-middleware-rewrite even returned? They are instrumental to the attack, and the client has no business accessing them, ever, in my book.

If these headers are only expected to be available within a trusted zone, and some fronting HTTP server should strip them from incoming requests and outgoing responses, why are they named like regular HTTP headers, and not in some scary, easy-to-filter way, like x-INTERNAL-ONLY-middleware-something?

To my mind, the server should accept the bare minimum of headers needed to serve the request, and issue the minimum number of headers to provide a well-formed response, while being completely opaque to the client. Any nifty diagnostics like x-middleware-rewrite belong to the logs; correlate by request ID. Any nifty internal processing tweaks in plain text, like x-middleware-subrequest, are, to my mind, bad architecture. If you need to pass such info between HTTP endpoints internally, use something like a JWT.
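As a sketch of the "fronting server strips them" idea, assuming headers arrive as a plain object: the `x-middleware-` prefix is taken from the two header names mentioned in this thread, and `stripInternalHeaders` is a hypothetical helper, not part of any framework.

```typescript
// Prefix shared by the two header names discussed in the thread.
const INTERNAL_HEADER_PREFIX = "x-middleware-";

// Returns a copy of the headers without anything that looks internal.
// Intended to run at the trust boundary, on incoming requests and outgoing responses.
function stripInternalHeaders(incoming: Record<string, string>): Record<string, string> {
  const cleaned: Record<string, string> = {};
  for (const key of Object.keys(incoming)) {
    // Header names are case-insensitive, so compare in lowercase.
    if (key.toLowerCase().indexOf(INTERNAL_HEADER_PREFIX) === 0) continue;
    cleaned[key] = incoming[key];
  }
  return cleaned;
}
```

This is a denylist, which is the weaker of the two options raised above. An allowlist of the headers the application actually needs would be closer to the "bare minimum" approach.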

simonw a year ago

The vulnerability can be understood through this code snippet:

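  // Read the header from the incoming request; as quoted, nothing here checks who set it.
  const subreq = params.request.headers['x-middleware-subrequest'];
  // Split the colon-separated value into a list of middleware names.
  const subrequests = typeof subreq === 'string' ? subreq.split(':') : [];
  // ...
  for (const middleware of this.middleware || []) {
    // ...
    // A middleware named in the header is skipped: the request continues as if it had passed.
    if (subrequests.includes(middlewareInfo.name)) {
      result = {
        response: NextResponse.next(),
        waitUntil: Promise.resolve(),
      };
      continue;
    }
  }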

Pass an x-middleware-subrequest HTTP header with a colon-separated list of middleware names to skip.

https://github.com/vercel/next.js/blob/v12.0.7/packages/next...
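To make the effect concrete, here is a toy model of the same skip logic. It is not Next.js code: `HeaderMap`, `Middleware`, `runMiddleware`, and the `requireLogin` middleware are all hypothetical, and only the header name and the colon-splitting come from the snippet above.

```typescript
// Toy model only: these names are hypothetical, not Next.js internals.
type HeaderMap = Record<string, string | undefined>;

interface Middleware {
  name: string;
  // Returns true to let the request continue, false to block it.
  handle: (headers: HeaderMap) => boolean;
}

function runMiddleware(chain: Middleware[], headers: HeaderMap): "allowed" | "blocked" {
  const subreq = headers["x-middleware-subrequest"];
  const skipped = typeof subreq === "string" ? subreq.split(":") : [];
  for (const mw of chain) {
    // Any middleware the client names in the header never runs.
    if (skipped.indexOf(mw.name) !== -1) continue;
    if (!mw.handle(headers)) return "blocked";
  }
  return "allowed";
}

const chain: Middleware[] = [
  { name: "requireLogin", handle: (h) => h["cookie"] === "session=valid" },
];

// No session cookie in either request; only the extra header differs.
const withoutHeader = runMiddleware(chain, {});
const withHeader = runMiddleware(chain, { "x-middleware-subrequest": "requireLogin" });
```

In this model the first request is blocked and the second is allowed, even though neither carries a valid session.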

solid_fuel a year ago

What would this feature ever be used for? I'm surprised such a thing exists, instead of simply defining a different set of routes for a different set of middleware

kawsper a year ago

Does anyone know which versions of Next.js are supported?

I don't seem to be able to find a promise from Vercel, but https://endoflife.date/nextjs mentions that 15 and 14 get security support.

ldjkfkdsjnv a year ago

The culture of security within FAANG could not be more opposite than the way that vercel handled this. In big tech, this would have been looked at in 48 hours, and across thousands of systems all oncalls would have been paged to do an emergency deploy. Probably within 5 days, almost the whole company would have deployed the patch.

Vercel to me seems like it is run by hype men, and the CEO is certainly technical, but these people are not in the weeds in the way they come off.

czk a year ago

Also worth noting that this commit in Dec 2024 previously added a bunch of internal headers (aside from this one) to a restricted external access list (one of them was vulnerable to SSRF) and there was never a CVE for it.
https://github.com/vercel/next.js/pull/73482/files
Source: https://news.ycombinator.com/item?id=43449986
tengbretson a year ago

Maybe they based their on-call protocol on what people say they want in hn threads.

soulchild77 a year ago

This very recent PR updates the docs to basically remove all common (and previously recommended) middleware use-cases, rendering them almost completely useless:

https://github.com/vercel/next.js/pull/77438

rohan_ a year ago

Most don't understand this issue:

Auth middleware is used for _routing_ (e.g. if you're not signed-in, you'll be redirected to the sign-in page).

This just means a 500 is thrown due to the auth() call returning null on the server.

simonw a year ago

That depends entirely on how you implemented your middleware.
This vulnerability also isn't explicitly about auth: it's about attackers being able to send a colon separated list of middleware to skip. That could affect applications in all kinds of unexpected ways depending on what they are using middleware for and how they designed their application.

johnnyAghands a year ago

Can someone tl;dr why there is even logic to bypass middleware in the first place? I feel like I'm missing something obvious here...

simonw a year ago

It's quite common for server-side web frameworks to send a single request through their stack multiple times, especially when there is any form of "middleware" concept involved.
Often there's a need to skip some middleware on the second or third time through.
I've built systems in the past that do all sorts of re-dispatching.
One example: in development my API might live at /api/... but in production I might use api.my.site - with middleware that detects that host, rewrites the incoming request to add that /api/ prefix and then runs it through the stack again.
Authentication is a very common way this pattern is applied - check cookies / authorization headers / whatever, then add the authenticated user to the request somehow and re-dispatch the request through the stack so other layers can see who the user is.
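A small sketch of the host-rewrite example, assuming the "already been through once" marker is kept in server-side state (here a function argument) rather than in a request header. `IncomingReq`, `rewriteApiHost`, and `dispatch` are hypothetical names:

```typescript
interface IncomingReq { host: string; path: string }

// Rewrite step: requests to the API host get the /api prefix added.
// Returns null when no rewrite applies.
function rewriteApiHost(req: IncomingReq): IncomingReq | null {
  if (req.host === "api.my.site" && req.path.indexOf("/api/") !== 0) {
    return { host: req.host, path: "/api" + req.path };
  }
  return null;
}

// The "skip the rewrite on the second pass" flag is an argument set by the
// server itself, so a client has no way to supply it.
function dispatch(req: IncomingReq, alreadyRewritten = false): string {
  if (!alreadyRewritten) {
    const rewritten = rewriteApiHost(req);
    if (rewritten !== null) return dispatch(rewritten, true);
  }
  return "handled " + req.path;
}
```

The same re-dispatch happens, but nothing the client sends can influence which steps are skipped.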
- the_mitsuhiko a year ago
  
  While you are generally right here I wonder how common this is with middlewares. Many have order dependencies and there are normally no loops involved. I don’t think I have come across this for middlewares at least. Kinda curious about the particular motivation here.
  - simonw a year ago
    
    Yeah I've been contemplating this with my own Datasette project recently: it doesn't have an official mechanism for "redispatch this request from the root again" but I've been tempted to add one.
    My GraphQL plugin for example works by firing off internal requests against Datasette's REST API and I ended up needing some gnarly hacks to get authentication to work with that.
    
    the_mitsuhiko a year ago
    
    From my experience there are dragons if you try to make this work in general through a mechanism like that. I have only ever regretted this kind of stuff later when the interactions were not entirely clear.
    I like this kind of mechanism (circuit-breaker) as a last-ditch effort to prevent failures by erroring out before it does more damage. I have never had good experiences with silently disabling stuff.
  - johnnyAghands a year ago
    
    Yeah that was my understanding as well, but I’m not a framework author so wasn’t sure if this was a common practice.
    Trade-offs aside, I personally find the idea of re-running the request through the stack a bit hacky.
- ljm a year ago
  
  NextJS however is likely constrained by its architecture and the decision to use serverless and edge compute for the backend.
  Relying on obscure headers for conditional logic this way is certainly one way to avoid bringing in an extra dependency. And the middleware concept itself is fairly primitive compared to what you could do in any server-side API.
  Arguably, though, the middleware itself is being trusted as the entry-point to the API when it’s barely more than a reverse proxy. It’s not really a vulnerability if you only auth’d the middleware and not your actual routes.
teaearlgraycold a year ago

To prevent infinite loops internally.
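For comparison, a loop guard can also fail loudly instead of silently skipping steps. This is only a sketch under assumed names (`MAX_REDISPATCH_DEPTH`, `redispatch`), with the depth tracked server-side rather than read from a header:

```typescript
// Hypothetical limit on how many times one request may be re-dispatched.
const MAX_REDISPATCH_DEPTH = 5;

// Follows rewrites until none applies, and throws if the chain gets too long.
// `rewrite` returns the next path, or null when the request is final.
function redispatch(
  path: string,
  rewrite: (p: string) => string | null,
  depth = 0,
): string {
  if (depth > MAX_REDISPATCH_DEPTH) {
    throw new Error("redispatch loop detected at " + path);
  }
  const next = rewrite(path);
  return next === null ? path : redispatch(next, rewrite, depth + 1);
}
```

A rewrite rule that bounces between two paths then produces an error rather than a request that quietly bypasses part of the stack.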
