
Maxim Dounin

March 02, 2018 11:08AM

Hello!

On Fri, Mar 02, 2018 at 10:12:02AM +0100, Nagy, Attila wrote:

> On 02/28/2018 03:08 PM, Maxim Dounin wrote:
> > The question here is - why you want the file to be on disk, and
> > not just in a buffer? Because you expect the server to die in a
> > few seconds without flushing the file to disk? How probable it
> > is, compared to the probability of the disk to die? A more
> > reliable server can make this probability negligible, hence the
> > suggestion.
> Because the files I upload to nginx servers are important to me. Please
> step back a little and forget that we are talking about nginx or an HTTP
> server.

If the files are indeed important to you, you have to keep a second
copy in a different location, or even in multiple different
locations. Trying to do fsync() won't save your data in a lot of
quite realistic scenarios, but it will certainly carry costs, both
in performance and in the complexity of the nginx code.

> We have data which we want to write to somewhere.
> Check any of the database servers. Would you accept a DB server which
> can lose confirmed data or couldn't be configured that way that a
> write/insert/update/commit/whatever you use to modify or put data into
> it operation is reliably written by the time you receive acknowledgement?

The "can lose confirmed data" claim applies to all database
servers, all installations in the world. There is no such thing
as 100% reliability. The question is how probable data loss is,
and whether we can ignore a particular probability or not.

> Now try to use this example. I would like to use nginx to store files.
> That's what HTTP PUT is for.
> Of course I'm not expecting that the server will die every day. But when
> that happens, I want to make sure that the confirmed data is there.
> Let's take a look at various object storage systems, like ceph. Would
> you accept a confirmed write to be lost there? They make a great deal of
> work to make that impossible.
> Now try to imagine that somebody doesn't need the complexity of -for
> example- ceph, but wants to store data with plain HTTP. And you got
> there. If you store data, then you want to make sure the data is there.
> If you don't, why do you store it anyways?

So, given the fact that there is no such thing as 100%
reliability, are you suggesting not storing files at all? I don't
think that's a viable approach, and clearly you are already doing
the opposite. Rather, you want to consider various scenarios and
their probabilities, and minimize the probability of losing data
where it is possible and makes sense.

And that's why I asked you to compare the probability you are
trying to avoid with the other probabilities which can cause data
loss, for example the probability of the disk dying. For
reference, assuming you are using a commodity HDD to store your
files, the probability that it will fail within a year is about 2%
(see, for example, the Backblaze data; recent stats are available at
https://www.backblaze.com/blog/hard-drive-stats-for-2017/).
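
To make the comparison concrete, here is a rough back-of-the-envelope
sketch. Only the 2% annual disk failure rate comes from the Backblaze
data above. The crash rate (one unclean crash per year) and the 30
second window during which a confirmed file may still sit only in
memory are assumptions picked purely for illustration, and the
function name is hypothetical. Plug in your own numbers.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def crash_loss_probability(crashes_per_year, window_seconds):
    """Approximate chance that a given confirmed file is still unflushed
    when the server crashes (linear approximation, valid for small values)."""
    return crashes_per_year * window_seconds / SECONDS_PER_YEAR

# From the Backblaze stats cited above: about 2% of disks fail per year.
disk_failure_per_year = 0.02

# ASSUMPTIONS for illustration only: one unclean crash per year, and
# up to 30 seconds before dirty data reaches the disk.
p_crash = crash_loss_probability(crashes_per_year=1, window_seconds=30)

ratio = disk_failure_per_year / p_crash
print(f"per-file loss due to a crash:   {p_crash:.2e}")
print(f"per-file loss due to disk death: {disk_failure_per_year:.2e}")
print(f"disk death is about {ratio:.0f} times more likely")
```

Under these assumed numbers, losing a file to a dead disk over a
year is roughly four orders of magnitude more likely than losing it
to a crash before it is flushed. Different assumptions will give a
different ratio, which is exactly why the numbers need to be checked.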

Moreover, even if you have numbers on hand and these numbers show
that you indeed need to ensure files are synced to disk to reach
greater reliability, doing fsync() might not be the best way to
achieve this. For example, doing sync() once after loading
multiple files might be a better solution, due to both lower
complexity and higher performance.
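
As a minimal sketch of the difference, assuming a client-side script
that writes a batch of files on a Unix system: os.fsync() and
os.sync() are the standard Python wrappers around the system calls,
while the two function names and the file layout are hypothetical.
This is not how nginx handles PUT, only an illustration of one sync
per batch versus one fsync per file.

```python
import os
import tempfile

def store_with_fsync(directory, files):
    """Write each file and fsync() it individually (one flush per file)."""
    for name, data in files.items():
        with open(os.path.join(directory, name), "wb") as f:
            f.write(data)
            f.flush()             # push Python's buffer to the kernel
            os.fsync(f.fileno())  # ask the kernel to flush this file
    # Also fsync the directory so the new directory entries are flushed.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)

def store_with_sync(directory, files):
    """Write all files first, then issue a single system-wide sync()."""
    for name, data in files.items():
        with open(os.path.join(directory, name), "wb") as f:
            f.write(data)
    os.sync()  # one call covers every file written above

if __name__ == "__main__":
    batch = {"a.bin": b"first", "b.bin": b"second"}
    with tempfile.TemporaryDirectory() as d:
        store_with_sync(d, batch)
        print(sorted(os.listdir(d)))
```

One caveat: POSIX allows sync() to return before the writes have
actually completed, while Linux waits for them, so check the
behaviour of your own system before relying on this.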

--
Maxim Dounin
http://mdounin.ru/

