Windows file systems support scatter/gather I/O (as do other platforms, of course).
But I don't know when I should use this I/O mechanism.

Could you describe a case where it is the right choice?

And what benefit do we get from using it? (Just fewer I/O requests?)

asked Aug 22, 2010 at 9:07

Benjamin

You use scatter/gather I/O when you are doing lots of random (i.e. non-sequential) reads and writes and you want to save on context switches and syscalls. In this sense, scatter/gather is a form of batching. However, unless you've got a very fast disk (or, more likely, a large array of disks), the syscall cost is negligible.

If you were writing a database server, you might care about this, but anything less than a big-iron machine handling thousands or millions of requests a second won't see any benefit.
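To make the "batching" point concrete, here is a minimal sketch of a gather write. It assumes a POSIX-like system where Python exposes `os.writev`; it is not the Windows `WriteFileGather` API. The buffer names and contents are made up for illustration.

```python
import os
import tempfile

def gather_write(fd, chunks):
    """Write several separate buffers with a single writev() call."""
    return os.writev(fd, chunks)

# Three buffers that live in different places in memory (illustrative data).
header = b"HDR:"
payload = b"hello scatter/gather"
trailer = b":END"

with tempfile.TemporaryFile() as f:
    fd = f.fileno()
    # One syscall instead of three separate os.write() calls.
    written = gather_write(fd, [header, payload, trailer])
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, written)
```

The saving here is two syscalls, which only matters when the device is fast enough for syscall overhead to show up at all.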

answered Aug 22, 2010 at 22:51

Ana Betts

4 Comments

Now in 2017 it’s not uncommon to see 100k IOPS SSD in a mid-range laptop. Does it mean we’re effectively using the big machines you’re talking about, and should therefore implement vectorized IO for random reads?

It only seems appropriate to answer a comment posted 7 years after the original answer with a comment posted 7 years after the original comment :-) The answer is NO. Scatter/gather largely took advantage of the mechanics of HDDs (rotating disks). SSDs work differently and don't generally benefit from this technique.

Actually, they do benefit. You are right that there is no seek time, but SSDs are so fast that keeping them saturated requires you to keep their command queue full. The only way to do that is via asynchronous I/O, and syscall costs do start to add up there. In 2024, we do indeed have incredibly fast I/O.

And this is why I love StackOverflow. ChatGPT can never replace experience. Now I am going to google what a command queue is in the context of SSDs or disks.

Paul, one extra note: an additional advantage is that you hand multiple requests to the disk driver at the same time. The driver can then sort the requests and issue them in the optimal order. While syscall time is small, seek time (many milliseconds) can be punitive: it limits you to fewer than 1000 I/Os per second.

Chris's comment about demonstrating the efficiency is pragmatic. Mother nature never lies. Well, almost never.

answered Sep 17, 2010 at 21:15

MJZ

2 Comments

Currently scattered I/O in NT doesn't actually do anything special besides map in different pages in one contiguous segment, and drivers don't know about it. So no, the drivers don't "sort the requests and issue them in the optimal order".

Any async I/O will also do this already. The only time you won't get this is if you're a single process doing sync I/O to random pages, since the kernel has no information about which page you'll ask for next.

I would imagine that you would use scatter/gather I/O when (a) you suspected your application had a performance bottleneck, and (b) you had built a performance analysis framework that could show a significant improvement from using it.

Unless you can show a provable improvement, the additional code complexity is just a risk, and there's no magic recipe that says that, when some condition is met, an application will automatically benefit in a significant way from some programming cleverness.

Or - to put it another way - don't base major architectural decisions on the statements of 'some guy on an internet forum'. Create a test, and find out.
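As a starting point for such a test, here is a rough timing sketch. It assumes a POSIX-like system with `os.writev`; the chunk size, chunk count, and batch size are arbitrary choices for illustration. It writes to a temporary file, so it mostly measures syscall overhead against the page cache, not the disk itself.

```python
import os
import tempfile
import time

def write_individually(fd, chunks):
    # One write() syscall per chunk.
    for chunk in chunks:
        os.write(fd, chunk)

def write_gathered(fd, chunks, batch=512):
    # One writev() syscall per batch; 512 is an assumed batch size chosen
    # to stay below the per-call buffer limit (IOV_MAX, 1024 on Linux).
    for i in range(0, len(chunks), batch):
        os.writev(fd, chunks[i:i + batch])

def timed(writer, chunks):
    """Return (elapsed seconds, resulting file size) for one writer."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        start = time.perf_counter()
        writer(fd, chunks)
        elapsed = time.perf_counter() - start
        size = os.fstat(fd).st_size
    return elapsed, size

chunks = [bytes([i % 256]) * 64 for i in range(10_000)]
t_single, size_single = timed(write_individually, chunks)
t_gather, size_gather = timed(write_gathered, chunks)
print(f"write:  {t_single:.4f}s for {size_single} bytes")
print(f"writev: {t_gather:.4f}s for {size_gather} bytes")
```

The numbers will vary by machine and file system, which is the point: measure on the hardware and workload you actually care about before adopting the extra complexity.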

answered Aug 22, 2010 at 20:23

Chris Becke


In POSIX, readv and writev read from or write to discontiguous memory. To read and write discontiguous file ranges from discontiguous memory in one go, you would want readx and writex, which were among the proposed POSIX additions.

Doing one readx is faster than doing a lot of reads, as it's only one system call and it gives the disk scheduler the largest set of I/Os to reorder. I remember someone saying that they wanted this for the ext2/3/... fsck program, since it knows in advance which ranges it wants.
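The distinction can be sketched as follows, assuming a POSIX-like system where Python exposes `os.readv`. The helper name `scatter_read` and the file contents are hypothetical. A single readv fills several memory buffers from one contiguous file range, while two separate file ranges still take two calls, since nothing like readx is available here.

```python
import os
import tempfile

def scatter_read(fd, offset, sizes):
    """Read one contiguous file range into several separate buffers."""
    buffers = [bytearray(n) for n in sizes]
    os.lseek(fd, offset, os.SEEK_SET)
    total = os.readv(fd, buffers)
    return total, buffers

with tempfile.TemporaryFile() as f:
    fd = f.fileno()
    os.write(fd, b"0123456789ABCDEFGHIJ")

    # One call: 8 bytes starting at offset 2 land in two distinct buffers.
    total, (first, second) = scatter_read(fd, 2, [3, 5])

    # Two discontiguous file ranges: one call per range.
    ranges = [(0, 4), (16, 4)]
    pieces = [scatter_read(fd, off, [n])[1][0] for off, n in ranges]
```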

answered Nov 29, 2010 at 20:23

Dan D.

