-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RandomAccess.FlushToDisk #86836
Comments
Tagging subscribers to this area: @dotnet/area-system-io Issue DetailsFollowing the handy guidance by @adamsitnik on how to do fast file I/O on .NET 6, I decided to take advantage of the RandomAccess class in .NET 6 and so far it has all the methods I need except one: the equivalent for the FileStream.Flush(bool) function. Basically, I am working on a database-like project where multiple threads can write to random 4KB pages in the same file (i.e. exactly the kind of thing This leads to two questions:
I know there are several
|
I think this might be an oversight. @adamsitnik ? |
Thanks @danmoseley for following up! 👍 While waiting for @adamsitnik to provide some more insight, I was checking out the docs. For Windows, I found this sentence about the flags we can pass to the CreateFile function:
For Linux, I found this sentence about the
The Windows In any case, I think adding |
As @danmoseley wrote, it's an oversight ;) Based on your feedback I believe that we should add following method: namespace System.IO;
public static class RandomAccess
{
+ public static void FlushToDisk(SafeFileHandle handle);
} For now you can use this code as a workaround, but please keep in mind that it won't work on WASM and for non-regular files like Pipes & Sockets. |
Background and motivation
We don't need API Proposalnamespace System.IO;
public static class RandomAccess
{
+ public static void FlushToDisk(SafeFileHandle handle);
} API Usageusing SafeFileHandle sfh = File.OpenHandle("test.txt", FileMode.CreateNew, FileAccess.Write, FileShare.Read);
RandomAccess.Write(sfh, new byte[1024], fileOffset: 0);
RandomAccess.FlushToDisk(sfh); Alternative DesignsNone of the supported OSes exposes a way to perform an async flush to disk, but anyway we could include an async overload too. RisksI can't see any. |
I wonder whether Flush() would be a better name. It's consistent with FileStream (albeit with true). FlushToDisk() makes me stop and think what other kind of flush there is here (is there a buffer?). |
Thanks @adamsitnik for following up and for the workaround code 👍 As far as naming goes, using When someone wants to flush, they type So I vote Incidentally, when writing the docs for this function it may be worth advising users to check for errors (especially if they are following the "do many writes then issue one fsync at the end" approach). On Linux, according to this issue encountered by the Postgres guys, if PS: thanks to both of you and the rest of the team on the amazing work you are doing with .NET! The |
presumably it would throw IOException in such a case, just like FileStream.Flush() does. |
Thank you for your kind words!
My motivation was to avoid this confusion. I am sure that the API Review Board is going to choose the best name.
We are definitely going to throw on error |
Remember that you need to ensure sufficient buffer alignment (and proper endianness and padding) when using Cast. When reading from a file, I'd recommend using a correctly typed array and AsBytes instead. |
I am doing something like this: // allocate 4KB page buffer.
var buffer = new byte[4096];
// read page from disk into buffer using RandomAccess.ReadAsync().
ReadPageFromDisk(buffer);
// access the buffer as if it was an array of 64-bit elements.
var elements = MemoryMarshal.Cast<byte, ulong>(buffer);
elements[0] = ulong.MinValue;
elements[1] = ulong.MaxValue; Presumably the CLR takes care of aligning the buffer allocated via Speaking of buffer alignment issues while we are on the topic of enhancing the |
No, if you want to have your code be portable and work on platforms like S390X you'd need to do: // allocate 4KB page buffer.
var elements = new ulong[4096 / sizeof(ulong)];
// read page from disk into buffer using RandomAccess.ReadAsync().
ReadPageFromDisk(MemoryMarshal.AsBytes(buffer));
if (!BitConverter.IsLittleEndian)
for (int i = 0; i < elements.Length; i++)
elements[i] = BinaryPrimitives.ReverseEndianness(elements[i]);
// access the buffer as if it was an array of 64-bit elements.
elements[0] = ulong.MinValue;
elements[1] = ulong.MaxValue; With custom structs you'd also probably need to use |
Many thanks for the tip, I will research this some more to make sure I am considering all the issues 👍
Awesome, I have been toying around with some ideas today, will share them there 👍 |
@adamsitnik Did you intend to add a label to send this to API review? |
I thought I already did ;) Thanks! |
namespace System.IO;
public static class RandomAccess
{
public static void FlushToDisk(SafeFileHandle handle);
} |
@ericmutta now this is approved, any interest in adding it ? |
@danmoseley yes, I am interested in this one too because it relates directly to that database-like project I mentioned earlier. But before I ask for the issue to be assigned to me, I just need to sort out a couple of other tasks on my plate over the next week or two, then I will be back 👍 PS: I believe the plan is to ship this with .NET 8 in November this year. Is there a cut-off date for implementation of features planned for .NET 8? I would like to plan accordingly so that if I take this on, I can do it well and wrap up everything in time for .NET 8. |
Not announced (@jeffschwMSFT @jeffhandley do you plan to put up the usual sticky issue with the shutdown schedule?). My guess is for community contributions like this, merged no later than mid August - possibly earlier. Allow plenty of time for the PR process. But it's not a days away thing yet. |
@danmoseley I am beginning to get the hang of this (e.g. I have finally managed to open the relevant code in Visual Studio and have the full editing experience). Please assign the task to me since I have already started working on it (I know I said I would be back in a week or two but I couldn't resist 😁) |
@ericmutta I've assigned you! Here are some tips that you may find useful:
.\build.cmd -c Release -subset clr+libs+libs.tests
.\build.cmd -c Release -subset clr.corelib+clr.nativecorelib+libs.PreTest && .\dotnet.cmd build -c Release .\src\libraries\System.IO.FileSystem\System.IO.FileSystem.sln && .\dotnet.cmd build -c Release .\src\libraries\System.IO.FileSystem\tests\System.IO.FileSystem.Tests.csproj /t:Test Good luck! |
Many thanks @adamsitnik, the tips you gave were a huuuge time-saver 🚀 For the implementation I am thinking of simply calling My prototype looks like this: public static void FlushToDisk(SafeFileHandle handle)
{
ValidateInput(handle, fileOffset: 0);
FileStreamHelpers.FlushToDisk(handle);
} I have also managed to setup a dedicated test class with a failing test, which looks like this: using Microsoft.Win32.SafeHandles;
using Xunit;
namespace System.IO.Tests
{
public class RandomAccess_FlushToDisk : RandomAccess_Base<long>
{
protected override long MethodUnderTest(SafeFileHandle handle, byte[] bytes, long fileOffset)
{
RandomAccess.FlushToDisk(handle);
return 0;
}
protected override bool UsesOffsets => false;
[Fact]
public void FailingTest()
{
Assert.True(false, "This test is failing because it is not implemented yet.");
}
}
} Everything builds and runs as expected so far, so assuming I am heading in the right direction, I am ready to start the official implementation (tests, XML docs, etc) 👍 |
Sounds great to me! Regarding the tests and the implementation the only open question I have is what should be the behavior for
On Windows, FlushFileBuffers does the following:
The simplest solution would be to unify the behavior and just throw for non-seekable files before calling the sys-call. But by doing that, we would loose the extra capability provided by Windows. @stephentoub what is your opinion on this? |
It looks like the current Unix implementation already does this by ignoring errors returned by |
@ericmutta let's keep the current behavior then 👍 |
Thanks @adamsitnik 👍 One more thing...how can we really test that
The challenge is in step 3: when reading the data back, that read is most likely going to be served from the file system cache. For the test to really prove that the data was written to disk, we need the read to be performed directly from disk, bypassing the file cache entirely. This implies the need for the PS: this being the age of AI and all, I asked ChatGPT how to test fsync and it had several suggestions. One easy one was to check the file's timestamp after calling |
This is tricky indeed. We actually support runtime/src/libraries/System.Private.CoreLib/src/System/IO/Strategies/FileStreamHelpers.cs Line 81 in f25bc7b
We have some tests that use it: runtime/src/libraries/System.IO.FileSystem/tests/RandomAccess/NoBuffering.Windows.cs Line 13 in f25bc7b
You can extend these tests and test it properly on Windows. When it comes to Unix my best idea is to simply ensure that it does not throw. |
Fix #86836 Co-authored-by: Dan Moseley <[email protected]>
Following the handy guidance by @adamsitnik on how to do fast file I/O on .NET 6, I decided to take advantage of the RandomAccess class in .NET 6 and so far it has all the methods I need except one: the equivalent for the FileStream.Flush(bool) function.
Basically, I am working on a database-like project where multiple threads can write to random 4KB pages in the same file (i.e. exactly the kind of thing
RandomAccess
was designed to handle). At some point I need to make a call tofsync()
(on Linux) orFlushFileBuffers()
(on Windows). If I had aFileStream
on hand I could just callmyFileStream.Flush(true)
and that would flush changes to disk usingfsync
in an OS-independent way. However, I do NOT have aFileStream
object, only aSafeFileHandle
because I use the new File.OpenHandle() function to open my files.This leads to two questions:
RandomAccess
class does not have aFlush(bool)
function that does an fsync under the covers?SafeFileHandle
and I want to fsync any writes made via that handle?I know there are several
FileStream
constructors that take aSafeFileHandle
parameter, but creating aFileStream
object only so I can callFlush(bool)
doesn't feel quite right and I suspect I am missing something!The text was updated successfully, but these errors were encountered: