Exploiting CSRF against search with Lucene
idontplaydarts.com
So the article suggests using a timing attack on a Lucene search box to determine whether an item exists or not (at least that's what I gather).
Considering that the search box will most likely already tell you if something exists, what's the purpose?
I think I'm missing something here.
This is a CSRF attack -- it allows a malicious site B to exploit the trust relationship that exists between site A (running some Lucene index behind a web service -- perhaps Elasticsearch?) and user C's web browser. Suppose site A allows logged-in users to search an index, and user C visits site B while logged into A. Site B sends down JavaScript telling C's browser to send requests to site A. Because the user is logged into A, A sends back responses containing sensitive data. The JavaScript from B has no access to the data that was sent from A, due to the same-origin policy. However, it can time how long each response took -- hence the mechanics of this attack.
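To make the "it can only time the response" part concrete, here is a rough sketch of the inference step alone. It is not the article's code: the function name `timing_differs`, the 1.5 ratio, and the idea of comparing against a baseline query are all assumptions for illustration. Collecting the timings would happen in the victim's browser and is left out.

```python
from statistics import median
from typing import Sequence


def timing_differs(
    baseline_ms: Sequence[float],
    probe_ms: Sequence[float],
    min_ratio: float = 1.5,  # assumed threshold, would need tuning per target
) -> bool:
    """Guess whether a probe query behaved differently from a baseline query.

    baseline_ms: response times for a query whose outcome is already known.
    probe_ms:    response times for the query being tested.
    Medians are used so that a few slow outliers do not dominate the result.
    """
    if not baseline_ms or not probe_ms:
        raise ValueError("need at least one sample per query")
    base = median(baseline_ms)
    probe = median(probe_ms)
    if base <= 0 or probe <= 0:
        raise ValueError("timings must be positive")
    ratio = probe / base
    # Direction-agnostic: flag the probe if it is clearly slower OR clearly faster.
    return ratio >= min_ratio or ratio <= 1 / min_ratio
```

The attacker never sees a search result here, only two lists of durations, which is all the same-origin policy leaves observable.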
As mentioned in the article, A can protect against this by requiring a CSRF token to be included in all the requests sent to it (this is on top of the authentication/session cookie which establishes the trust relationship between A and C's browser). A CSRF token is a random, unguessable token that the server sends to C's browser in a form that cannot be accessed by B -- the JavaScript from A is expected to retrieve this token and send it along with future requests to A. Server A then needs to validate that any request from browser C contains the CSRF token. This allows A to distinguish between requests that browser C generated on behalf of code from A (should be allowed) and requests that browser C generated on behalf of code from some other domain (potentially malicious).
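A minimal sketch of that token check, assuming a server-side store keyed by session ID. The names `issue_csrf_token`, `is_request_allowed` and the in-memory dict are hypothetical; a real site would normally rely on its web framework's CSRF support.

```python
import hmac
import secrets
from typing import Dict, Optional

# Hypothetical in-memory store: session ID -> CSRF token issued for that session.
_csrf_tokens: Dict[str, str] = {}


def issue_csrf_token(session_id: str) -> str:
    """Create a random, unguessable token and remember it for this session."""
    token = secrets.token_urlsafe(32)
    _csrf_tokens[session_id] = token
    return token


def is_request_allowed(session_id: str, submitted_token: Optional[str]) -> bool:
    """Accept a request only if it carries the token issued for its session."""
    expected = _csrf_tokens.get(session_id)
    if expected is None or submitted_token is None:
        return False
    # Compare as bytes in constant time so the check itself leaks no timing signal.
    return hmac.compare_digest(expected.encode(), submitted_token.encode())
```

A forged request from site B still arrives with C's session cookie, but B's script cannot read the token, so `is_request_allowed` returns False and the search never runs.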
Thank you, I missed that.
Many deployments of Lucene will restrict what results users can see based on who they're logged in as. For example, consider a webmail implementation which lets users search their own emails, stored in Lucene, with an index on the subject. This attack would allow someone to extract the subject lines from someone else's inbox.
Not if you have a server that handles the search box and is CSRF safe. And if it isn't CSRF safe, you don't need the timing attack, since you can already see whether the content is there or not.
This is incorrect. As I believe the author notes, the same-origin policy prevents you from accessing the results of endpoints you can CSRF, with at least one exception (JSONP). The author uses the timing of the forged response to determine if the value was cached. Again, the attacker cannot access any information from a cross-site forged request in this case other than timing data.
It's not. The web server will MOSTLY handle authentication and CORS BEFORE sending requests to Lucene / ES. Everything else is just dumb, and a waste of resources. You could even use Lucene's query engine; you just need to proxy everything.
User Input -> (CSRF / Auth) from Your Server -> Your Server -> Lucene
Most implementations will do it like that since everything else is unsafe by design, so the article is pointless.
Lucene has no HTTP interface of its own. This is not a Lucene security issue.
Another excellent reason to write your own query parser instead of using Lucene's. Lucene's query parser is way too powerful to expose to end users.
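Short of writing a full parser, a lighter option is to escape the query syntax before user input reaches Lucene. The sketch below is that swapped-in technique, not a custom parser. The function name `escape_lucene_term` is made up, and the character set is the one the classic Lucene query parser treats as special.

```python
import re

# Characters the classic Lucene query parser treats as syntax:
# \ + - ! ( ) : ^ [ ] " { } ~ * ? | & /
_LUCENE_SPECIAL = re.compile(r'[\\+\-!():^\[\]"{}~*?|&/]')


def escape_lucene_term(user_input: str) -> str:
    """Prefix every special character with a backslash so it is matched literally."""
    return _LUCENE_SPECIAL.sub(r'\\\g<0>', user_input)
```

For example, `escape_lucene_term('subject:secret*')` yields `subject\:secret\*`, so the input can no longer pick a field or run a wildcard search. This only neutralises punctuation; the uppercase keywords AND, OR and NOT would need separate handling.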
One should say: never expose your backend queries to the public. This goes for SQL, Lucene, or any other database or search technology, as they will reveal much more than you'd like.