Exploiting CSRF against search with Lucene

23 points by bobedybobbob 10 years ago · 10 comments

So the article suggests using a timing attack on a Lucene search box to determine whether an item exists (at least that's what I gather).

Considering that the search box will most likely already tell you if something exists, what's the purpose?

I think I'm missing something here.

breatheoften 10 years ago

This is a CSRF attack -- it allows a malicious site B to exploit the trust relationship that exists between site A (running some Lucene index behind a web service -- perhaps Elasticsearch?) and user C's web browser. Suppose site A allows logged-in users to search an index, and user C visits site B while logged into A. Site B sends down JavaScript telling C's browser to send requests to site A; because the user is logged into A, A sends back responses containing sensitive data. The JavaScript from B has no access to the data sent from A because of the same-origin policy, but it can time how long the response took -- hence the mechanics of this attack.
As mentioned in the article, A can protect against this by requiring a CSRF token in all requests sent to it (on top of the authentication/session cookie that establishes the trust relationship between A and C's browser). A CSRF token is a random, unguessable token that the server sends to C's browser in a form that B cannot access; the JavaScript from A is expected to retrieve this token and send it along with future requests to A. Server A then needs to validate that any request from C's browser contains the CSRF token. This lets A distinguish requests that C's browser generated on behalf of code from A (should be allowed) from requests it generated on behalf of code from some other domain (potentially malicious).
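
A rough sketch of the token check described above, in Python. The `session` dict and both function names are made up for illustration (a real framework would supply its own session store and middleware); the point is only that the server issues a random value per session and rejects any request that does not echo it back.

```python
import hmac
import secrets
from typing import Optional


def issue_csrf_token(session: dict) -> str:
    """Create a random, unguessable token and remember it server-side.

    `session` is a hypothetical per-user session store (a plain dict here).
    """
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_request_allowed(session: dict, submitted_token: Optional[str]) -> bool:
    """Allow the request only if it carries the token issued to this session."""
    expected = session.get("csrf_token")
    if not expected or not submitted_token:
        return False
    # Compare as bytes in constant time so the check itself leaks no timing.
    return hmac.compare_digest(expected.encode(), submitted_token.encode())
```

A page served by A can read the token and attach it to its search requests; script from B cannot read it, so its forged requests fail the check before any search runs.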
- Illniyar 10 years ago
  
  Thank you, I missed that.
jimrandomh 10 years ago

Many deployments of Lucene will restrict what results users can see based on who they're logged in as. For example, consider a webmail implementation which lets users search their own emails, stored in Lucene, with an index on the subject. This attack would allow someone to extract the subject lines from someone else's inbox.
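
To make the inference step concrete, here is a small simulation sketch in Python. It does not send any requests: `measure` is a hypothetical callable standing in for "time a forged search for this candidate", and the baseline is assumed to come from a query known to match nothing. Whether a hit is slower or faster than a miss depends on the target, so the check only looks at the size of the gap, and the 20 ms threshold is an arbitrary assumption.

```python
from statistics import median
from typing import Callable, Optional, Sequence


def differs_from_baseline(
    probe_ms: Sequence[float],
    baseline_ms: Sequence[float],
    min_gap_ms: float = 20.0,  # assumed threshold; would need tuning per target
) -> bool:
    """Guess 'hit' when the median probe time is far from the median baseline."""
    return abs(median(probe_ms) - median(baseline_ms)) > min_gap_ms


def extend_prefix(
    prefix: str,
    alphabet: str,
    measure: Callable[[str], Sequence[float]],  # hypothetical timing source
    baseline_ms: Sequence[float],
    min_gap_ms: float = 20.0,
) -> Optional[str]:
    """Try each character and keep the first one whose timing looks like a hit."""
    for ch in alphabet:
        candidate = prefix + ch
        if differs_from_baseline(measure(candidate), baseline_ms, min_gap_ms):
            return candidate
    return None  # no candidate stood out from the baseline
```

Repeating `extend_prefix` one character at a time is how a yes/no timing signal could turn into a recovered subject line, which is why the response body being unreadable does not help.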
- merb 10 years ago
  
  Not if you have a server that handles the search box and is CSRF safe. And if it isn't CSRF safe, you don't need the timing attack, since you already have the content or not.
  - tshadwell 10 years ago
    
    This is incorrect. As I believe the author notes, the Same Origin Policy prevents you from accessing the results of endpoints you can CSRF, with at least one exception (JSONP). The author uses the timing of the forged response to determine if the value was cached. Again, the attacker cannot access any information from a cross-site forged request in this case other than timing data.
    
    - merb 10 years ago
    
    It's not. The web server will MOSTLY handle authentication and CORS BEFORE sending requests to Lucene / ES. Everything else is just dumb, and a waste of resources. You could even use Lucene's query engine; you just need to proxy everything.
    User Input -> (CSRF / Auth) from Your Server -> Your Server -> Lucene
    Most implementations will do it like that since everything else is unsafe by design, so the article is pointless.

chatman 10 years ago

Lucene has no HTTP interface of its own. This is not a Lucene security issue.

100k 10 years ago

Another excellent reason to write your own query parser instead of using Lucene's. Lucene's query parser is way too powerful to expose to end users.
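
A minimal sketch of that idea in Python: rather than passing raw input to Lucene's query parser, accept plain words only and build the query string yourself. The escaped characters are the classic query parser's special characters; the `subject` field name and the term limit are assumptions for illustration.

```python
# Characters with special meaning in Lucene's classic query syntax.
LUCENE_SPECIAL = set('\\+-!():^[]"{}~*?|&/')


def escape_term(term: str) -> str:
    """Backslash-escape every query-syntax character in a single term."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in term)


def build_subject_query(user_input: str, max_terms: int = 8) -> str:
    """Turn free text into a fixed-shape query: quoted terms joined by AND.

    `subject` is a hypothetical field name. Quoting each term means words
    such as OR or NOT are treated as literals, and wildcards are escaped.
    Returns an empty string for empty input; the caller should skip the search.
    """
    terms = user_input.split()[:max_terms]
    return " AND ".join(f'subject:"{escape_term(t)}"' for t in terms)
```

With this shape, `foo* bar` becomes `subject:"foo\*" AND subject:"bar"`, so users cannot reach wildcards, fuzzy operators, ranges, or other fields.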

isoos 10 years ago

One should say: never expose your backend queries to the public. This goes for SQL, Lucene, or any other database or search technology, as they will reveal much more than you'd like.
