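Redis is an open-source, in-memory data structure store that can be used as a database, a cache, and a message broker.
Redis is a key-value storage system. It is similar to Memcached but supports more value types, including string, list (linked list), set, zset (sorted set), and hash. These types support push/pop, add/remove, intersection, union, difference, and many richer operations, all of which are atomic. On top of this, Redis supports sorting in various ways. As with Memcached, the data is kept in memory for efficiency.
Redis is an in-memory database: the data lives in memory, which changes quickly and is easily lost. Fortunately, Redis also provides persistence mechanisms that either periodically write the dataset to disk or append every modifying operation to a log file on disk. These are RDB (Redis DataBase) and AOF (Append Only File), respectively.
This section focuses on these two persistence mechanisms.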
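RDB is the default persistence method. It writes a snapshot of the in-memory dataset to disk at specified intervals. At a given point in time Redis writes the data to a temporary file and, once persistence has finished, replaces the previous snapshot file with the temporary one; this file is what is later used to restore the data. In other words, the in-memory data is written as a snapshot to a binary file, whose default name is dump.rdb.
During persistence Redis forks a child process, which walks the hash table and, relying on copy-on-write, dumps the entire DB.
The save and shutdown commands, as well as slave (replica) synchronization, trigger this operation.
The granularity is coarse: if Redis crashes before the next save, shutdown, or slave synchronization, the operations performed since the last snapshot cannot be recovered.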
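The "write to a temporary file, then replace the previous file" step can be sketched in a few lines of Python. This is only an illustration of the pattern: the function names are hypothetical, and JSON stands in for the binary RDB encoding, which is not reproduced here.

```python
import json
import os
import tempfile


def write_snapshot(data, path):
    """Dump `data` to a temp file in the same directory, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".rdb")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)        # assumption: JSON instead of the real RDB format
            tmp.flush()
            os.fsync(tmp.fileno())      # push the bytes to disk before the rename
        os.replace(tmp_path, path)      # atomic rename over the previous snapshot
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)         # never leave a half-written temp file behind
        raise


def load_snapshot(path):
    """Read a snapshot written by write_snapshot back into memory."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Because the rename happens only after the temporary file is complete, a reader sees either the old snapshot or the new one, never a partially written file.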
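Since RDB saves the data by generating a snapshot of everything at a certain moment, there must be some mechanism that triggers the process. RDB provides three: save, bgsave, and automatic triggering. Let us look at each in turn.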
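The save command blocks the Redis server: while save is running, Redis cannot process any other command until the RDB process has completed. The flow is as follows:
The client issues the command to persist the data: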
./redis-cli -h ip -p port save
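Because Redis handles all client requests on its main thread, this approach blocks every request. With tens or even hundreds of thousands of clients, it is clearly not acceptable.
When the bgsave command is executed, Redis takes the snapshot asynchronously in the background and can still respond to client requests while the snapshot is in progress (this is the default way Redis performs RDB persistence). The flow is as follows:
The client issues the command to persist the data in the background: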
./redis-cli -h ip -p port bgsave
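Specifically, the Redis process calls fork to create a child process; the child handles the RDB persistence and exits automatically when it is done. Blocking occurs only during the fork, which is usually very short. Essentially all RDB operations inside Redis use bgsave.
Automatic triggering is driven by the configuration file. In redis.conf we can tell Redis to perform an RDB save whenever at least m keys have been modified within n seconds. This saves all of Redis's data at that point in time, which amounts to a snapshot, so this persistence method is also commonly called snapshotting.
Default configuration file explained: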
################################ SNAPSHOTTING ################################
#
# Save the DB on disk:
#
# save <seconds> <changes>
#
# Will save the DB if both the given number of seconds and the given
# number of write operations against the DB occurred.
#
# In the example below the behaviour will be to save:
# after 900 sec (15 min) if at least 1 key changed
# after 300 sec (5 min) if at least 10 keys changed
# after 60 sec if at least 10000 keys changed
#
# Note: you can disable saving completely by commenting out all "save" lines.
#
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# like in the following example:
#
# save ""

# Note: if you do not need persistence, comment out all the save lines and configure save "" to disable saving.
# Note: save if at least 1 key changed within 900 seconds.
save 900 1
# Note: save if at least 10 keys changed within 300 seconds.
save 300 10
# Note: save if at least 10000 keys changed within 60 seconds.
save 60 10000

# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
#
# Note: the default is yes. It controls whether client write operations are rejected when a snapshot fails.
# The failure may be caused by a full disk, a disk fault, an OS-level error, and so on.
# Once a background save succeeds again, Redis accepts writes again.
stop-writes-on-bgsave-error yes

# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
#
# Note: the default is yes. It controls whether snapshots stored on disk are compressed.
rdbcompression yes

# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
#
# Note: the default is yes. Redis verifies the snapshot with CRC64, at a cost of roughly 10% in performance;
# disable it if you want maximum performance.
rdbchecksum yes

# The filename where to dump the DB
# Note: sets the snapshot file name; the default is dump.rdb.
dbfilename dump.rdb

# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
#
# Note: sets where the snapshot file is stored; the value must be a directory, not a file name.
dir /home/cmhy/redis-5.0.5
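The save points configured above can be read as a list of (seconds, changes) pairs, any one of which is enough to trigger a background save. The following Python sketch shows that reading; the function name is hypothetical, and the exact boundary handling (>= versus >) is an assumption rather than a statement about the Redis source.

```python
# (seconds, changes) pairs, mirroring "save 900 1", "save 300 10", "save 60 10000"
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def should_save(seconds_since_last_save, changes_since_last_save,
                save_points=SAVE_POINTS):
    """Return True if any save point is satisfied.

    A save point needs BOTH enough elapsed time AND enough changes.
    An empty list (the effect of `save ""`) never triggers a save.
    """
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )
```

For example, 5 changes after 100 seconds satisfy none of the three rules, while the same 5 changes after 901 seconds satisfy the "900 1" rule.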
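We can adjust these settings to get the behaviour we want. Because the third method is driven by configuration, we compare only the first two: save is synchronous and blocks the server for the whole dump, whereas bgsave forks a child process and blocks only during the fork, at the cost of extra memory for copy-on-write.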
7639:M 22 May 2020 10:12:54.048 * 10000 changes in 60 seconds. Saving... // persistence triggered by the "save 60 10000" setting in redis.conf
7639:M 22 May 2020 10:12:54.240 * Background saving started by pid 28173 // a background child process is forked
28173:C 22 May 2020 10:13:41.728 * DB saved on disk // saved to disk
28173:C 22 May 2020 10:13:41.876 * RDB: 585 MB of memory used by copy-on-write
7639:M 22 May 2020 10:13:42.221 * Background saving terminated with success
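A full dump is always time-consuming, so Redis also offers a more efficient mechanism, AOF. It works very simply: Redis appends every write command it receives to a file using the write function. Put plainly, it is a log.
If you are familiar with Oracle's redo log this is easy to understand: Redis AOF persistence is similar to it.
Write commands are continuously appended to a log-like file (similar to exporting SQL from a database such as PostgreSQL, except that only write operations are recorded).
The granularity is fine: after a crash, only the operations that had not yet been logged before the crash cannot be recovered.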
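Advantages:
AOF protects better against data loss. Typically a background thread runs fsync once every second, so at most one second of data is lost;
The AOF log is written in append-only fashion, so there is no disk seek overhead, write performance is very high, and the file is not easily corrupted;
Even when the AOF log grows too large and a background rewrite kicks in, client reads and writes are not affected;
Commands in the AOF log are recorded in a very readable format, which makes it well suited to emergency recovery from a catastrophic accidental deletion. For example, if someone accidentally empties all data with the flushall command, then as long as no background rewrite has happened yet, you can immediately copy the AOF file, delete the final flushall command, put the file back, and let the recovery mechanism restore all the data automatically.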
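The flushall recovery described above can be sketched in Python. The sketch assumes the AOF is a plain sequence of RESP-encoded commands (for example `*1\r\n$8\r\nflushall\r\n`), with no RDB preamble at the start of the file; the helper names are hypothetical, and the function should only ever be run on a copy of the file.

```python
def parse_commands(data):
    """Split a RESP-encoded AOF byte string into a list of commands (lists of bytes)."""
    commands, pos = [], 0
    while pos < len(data):
        if data[pos:pos + 1] != b"*":
            raise ValueError("expected '*' at offset %d" % pos)
        end = data.index(b"\r\n", pos)
        argc = int(data[pos + 1:end])           # number of arguments in this command
        pos = end + 2
        args = []
        for _ in range(argc):
            if data[pos:pos + 1] != b"$":
                raise ValueError("expected '$' at offset %d" % pos)
            end = data.index(b"\r\n", pos)
            length = int(data[pos + 1:end])     # byte length of the next argument
            start = end + 2
            args.append(data[start:start + length])
            pos = start + length + 2            # skip the argument and its trailing CRLF
        commands.append(args)
    return commands


def encode_command(args):
    """Encode one command (a list of bytes) back into RESP."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        parts.append(b"$%d\r\n%s\r\n" % (len(arg), arg))
    return b"".join(parts)


def drop_trailing_flushall(data):
    """Return the AOF content without its final command if that command is FLUSHALL."""
    commands = parse_commands(data)
    if commands and commands[-1][0].upper() == b"FLUSHALL":
        commands.pop()
    return b"".join(encode_command(c) for c in commands)
```

Only a trailing flushall is removed; if other writes were logged after it, the file is returned unchanged and manual inspection is needed.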
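Disadvantages:
For the same dataset, the AOF log is usually larger than the RDB snapshot file;
With AOF enabled, the write QPS Redis can sustain is lower than with RDB, because AOF is usually configured to fsync the log once per second; that said, performance with one fsync per second is still very high;
AOF has had bugs in the past where restoring from the AOF log did not reproduce exactly the same data.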
Its principle is shown in the figure below:
Whenever a write command arrives, it is saved directly to the AOF file.
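As a rough illustration of "log every write, replay the log to recover", here is a toy key-value store in Python. The class and its one-JSON-line-per-command log format are hypothetical and far simpler than the real AOF; only the append-and-replay idea is taken from the text.

```python
import json


class TinyStore:
    """A toy in-memory store that logs every write to an append-only file."""

    def __init__(self, aof_path):
        self.data = {}
        self.aof_path = aof_path

    def _append(self, *command):
        # Append one command per line; existing content is never rewritten.
        with open(self.aof_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(command) + "\n")

    def set(self, key, value):
        self.data[key] = value
        self._append("SET", key, value)

    def delete(self, key):
        self.data.pop(key, None)
        self._append("DEL", key)

    @classmethod
    def recover(cls, aof_path):
        """Rebuild the in-memory state by replaying the log from the start."""
        store = cls(aof_path)
        with open(aof_path, encoding="utf-8") as log:
            for line in log:
                name, *args = json.loads(line)
                if name == "SET":
                    store.data[args[0]] = args[1]
                elif name == "DEL":
                    store.data.pop(args[0], None)
        return store
```

Read operations never touch the log, which is why only write commands need to be recorded.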
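Why rewrite?
AOF also brings another problem: the persistence file keeps growing. To compact the AOF file, Redis provides the bgrewriteaof command, which forks a new process that saves the in-memory data, expressed as commands, to a temporary file and thereby rewrites the file.
The rewrite does not read the old AOF file. Instead, it writes the entire in-memory database contents out as commands into a new AOF file, which is somewhat similar to a snapshot.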
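The idea that a rewrite ignores the old log and regenerates it from the current in-memory state can be sketched as follows. The function name and the one-JSON-line-per-command format are hypothetical; the sketch also leaves out the fork and the handling of writes that arrive while the rewrite is running.

```python
import json
import os
import tempfile


def rewrite_aof(data, aof_path):
    """Replace the log at `aof_path` with one SET per key currently in `data`.

    The old log is never read: only the current in-memory state matters.
    """
    directory = os.path.dirname(os.path.abspath(aof_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-rewriteaof-")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        for key, value in data.items():
            tmp.write(json.dumps(["SET", key, value]) + "\n")
        tmp.flush()
        os.fsync(tmp.fileno())
    os.replace(tmp_path, aof_path)   # swap the compact log in atomically
```

A key that was overwritten a hundred times occupies a hundred lines in the old log but a single line after the rewrite, which is where the space saving comes from.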
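The configuration file shows three AOF fsync policies:
appendfsync always // fsync on every modification; each change is written to disk immediately. Slower, but the best data integrity
appendfsync everysec // fsync once per second, asynchronously; up to one second of data is lost if the server goes down
appendfsync no // never fsync explicitly; the operating system decides when to flush
In comparison: always is the safest and the slowest, no is the fastest and the least safe, and everysec is the compromise between them.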
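The difference between the three policies comes down to when fsync is called after a write. The Python sketch below makes that explicit; the class is hypothetical, and for simplicity it checks the one-second interval on each append, whereas the text describes Redis doing the periodic fsync from a background thread.

```python
import os
import time


class AofWriter:
    """Append bytes to a log and fsync according to an appendfsync-like policy."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown policy: %r" % policy)
        self.file = open(path, "ab")
        self.policy = policy
        self.clock = clock              # injectable so the timing can be tested
        self.last_fsync = clock()
        self.fsync_count = 0

    def append(self, payload):
        self.file.write(payload)
        self.file.flush()               # hand the bytes to the operating system
        if self.policy == "always":
            self._fsync()               # durable after every single write
        elif self.policy == "everysec" and self.clock() - self.last_fsync >= 1.0:
            self._fsync()               # at most about one second of writes at risk
        # policy "no": never fsync here; the OS flushes when it decides to

    def _fsync(self):
        os.fsync(self.file.fileno())
        self.last_fsync = self.clock()
        self.fsync_count += 1

    def close(self):
        self.file.close()
```

Counting the fsync calls for the same sequence of writes under each policy shows the trade-off directly: always pays for one fsync per write, everysec for at most one per second, and no for none.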
Default configuration explained:
############################## APPEND ONLY MODE ###############################

# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.
#
# Note: whether AOF persistence is enabled.
appendonly no

# The name of the append only file (default: "appendonly.aof")
# Note: the name of the AOF persistence file.
appendfilename "appendonly.aof"

# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".
#
# Note: appendfsync sets the AOF fsync policy: always, everysec, or no. The default, everysec, calls fsync
# once per second to flush the buffered data to disk. If an fsync call takes longer than 1s, however, Redis
# postpones the next fsync and waits one more second, so that fsync happens after 2s and is carried out
# however long it takes. Because the file descriptor is blocked during that fsync, the current write
# operation is blocked as well.
# Conclusion: in the vast majority of cases Redis fsyncs every 1s; in the worst case, once every 2s.
#
# Note on "no": Redis never calls fsync itself to sync the AOF log to disk, so flushing is left entirely to
# the operating system's scheduling; on most systems the buffered data reaches the disk roughly every 30 seconds.

# appendfsync always
appendfsync everysec
# appendfsync no

# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.
#
# Note: in short, set this to "yes" only if you run into latency problems during BGSAVE or BGREWRITEAOF;
# otherwise keep "no", which is the safest choice for durability.
no-appendfsync-on-rewrite no

# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.
#
# Note: auto-aof-rewrite-percentage is the growth, relative to the size after the last rewrite, that triggers
# a rewrite; auto-aof-rewrite-min-size is the minimum file size below which no automatic rewrite happens.
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# An AOF file may be found to be truncated at the end during the Redis
# startup process, when the AOF data gets loaded back into memory.
# This may happen when the system where Redis is running
# crashes, especially when an ext4 filesystem is mounted without the
# data=ordered option (however this can't happen when Redis itself
# crashes or aborts but the operating system still works correctly).
#
# Redis can either exit with an error when this happens, or load as much
# data as possible (the default now) and start if the AOF file is found
# to be truncated at the end. The following option controls this behavior.
#
# If aof-load-truncated is set to yes, a truncated AOF file is loaded and
# the Redis server starts emitting a log to inform the user of the event.
# Otherwise if the option is set to no, the server aborts with an error
# and refuses to start. When the option is set to no, the user requires
# to fix the AOF file using the "redis-check-aof" utility before to restart
# the server.
#
# Note that if the AOF file will be found to be corrupted in the middle
# the server will still exit with an error. This option only applies when
# Redis will try to read more data from the AOF file but not enough bytes
# will be found.
#
# Note: with yes, a truncated AOF file is still loaded at startup and Redis logs a message to inform the user.
# With no, Redis refuses to start, and the file must first be repaired with the "redis-check-aof" tool.
aof-load-truncated yes

# When rewriting the AOF file, Redis is able to use an RDB preamble in the
# AOF file for faster rewrites and recoveries. When this option is turned
# on the rewritten AOF file is composed of two different stanzas:
#
# [RDB file][AOF tail]
#
# When loading Redis recognizes that the AOF file starts with the "REDIS"
# string and loads the prefixed RDB file, and continues loading the AOF
# tail.
#
# Note: on load, Redis recognizes that the AOF file starts with the string "REDIS", loads the RDB part
# first, and then continues with the AOF tail.
aof-use-rdb-preamble yes
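The interaction between auto-aof-rewrite-percentage and auto-aof-rewrite-min-size can be written down as a small predicate. This is a sketch of the rule as the configuration comments describe it, with a hypothetical function name; the integer arithmetic and the strict comparison against the minimum size are assumptions.

```python
MB = 1024 * 1024


def should_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """Decide whether an automatic AOF rewrite should be triggered.

    current_size: current AOF size in bytes
    base_size:    AOF size after the last rewrite (or at startup)
    percentage:   auto-aof-rewrite-percentage (0 disables the feature)
    min_size:     auto-aof-rewrite-min-size in bytes
    """
    if percentage == 0:
        return False                      # automatic rewrite disabled
    if current_size <= min_size:
        return False                      # file still too small to bother
    base = base_size if base_size > 0 else 1
    growth = current_size * 100 // base - 100   # growth over the base, in percent
    return growth >= percentage
```

With the defaults, a file that was 64 MB after the last rewrite is rewritten again once it reaches 128 MB, while a 10 MB file is left alone no matter how fast it has grown.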
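Generally speaking, if you want a degree of data safety comparable to what PostgreSQL can provide, you should use both persistence methods.
If you care a lot about your data but can still live with a few minutes of data loss, you can use RDB alone.
Many users use AOF alone, but we do not recommend it: periodic RDB snapshots are very convenient for database backups, RDB restores the dataset faster than AOF, and using RDB also avoids the AOF bugs mentioned earlier. For all these reasons, AOF and RDB may be unified into a single persistence model in the future (this is a long-term plan).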
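Persistence can be disabled in two ways: by editing the configuration file or from the command line.
3.1 Configuration file (requires a restart)
1) Disable RDB
In the configuration file, comment out the following lines:
save 900 1
save 300 10
save 60 10000
Then uncomment save "" so that it takes effect; this disables RDB.
2) Disable AOF
In the configuration file, set appendonly to no (the default is appendonly no).
3.2 Command line (no restart required)
1) Command to disable RDB: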
config set save ""
2) Command to disable AOF:
config set appendonly no
To check whether the two settings have taken effect, run config get save and config get appendonly, respectively.
References:
Configuration file explained: https://www.cnblogs.com/pyng/p/11959018.html
https://www.cnblogs.com/shizhengwen/p/9283973.html
https://baijiahao.baidu.com/s?id=1654694618189745916&wfr=spider&for=pc
Original article: https://www.cnblogs.com/-abm/p/12923796.html