本文介绍两种clickhouse的数据备份方式,一种是官方提供的备份策略,一种是通过开源插件实现的备份策略
官方备份策略
clickhouse官方也提供了数据备份的功能,支持将数据备份到本地磁盘或者S3存储。
官方的备份策略,本质上是通过BACKUP 和 RESTORE 两个sql关键字来实现的, 这就导致,我们在配置clickhouse备份任务的时候,需要登录clickhouse 才行。
备份可以是完整的或增量的,可以包含表(包括物化视图、投影和字典)和数据库。备份可以是同步的(默认)或异步的。可以压缩备份。备份可以设有密码保护。
备份到本地硬盘
- 创建数据备份存储目录,比如
/backup
- 添加
/etc/clickhouse-server/config.d/backup_disk.xml
文件,内容如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14
| <clickhouse> <storage_configuration> <disks> <backups> <type>local</type> <path>/backups/</path> </backups> </disks> </storage_configuration> <backups> <allowed_disk>backups</allowed_disk> <allowed_path>/backups/</allowed_path> </backups> </clickhouse>
|
- sql命令
语法:
1 2 3 4 5 6 7 8 9 10 11 12 13
| BACKUP|RESTORE TABLE [db.]table_name [AS [db.]table_name_in_backup] [PARTITION[S] partition_expr [,...]] | DICTIONARY [db.]dictionary_name [AS [db.]name_in_backup] | DATABASE database_name [AS database_name_in_backup] [EXCEPT TABLES ...] | TEMPORARY TABLE table_name [AS table_name_in_backup] | VIEW view_name [AS view_name_in_backup] | ALL [EXCEPT {TABLES|DATABASES}...] } [,...] [ON CLUSTER 'cluster_name'] TO|FROM File('<path>/<filename>') | Disk('<disk_name>', '<path>/') | S3('<S3 endpoint>/<path>', '<Access key ID>', '<Secret access key>') [SETTINGS base_backup = File('<path>/<filename>') | Disk(...) | S3('<S3 endpoint>/<path>', '<Access key ID>', '<Secret access key>')]
|
- BACKUP|RESTORE : 关键字
- TABLE|DICTIONARY|DATABASE|TEMPORARY TABLE|VIEW|ALL: 备份的对象类型, 表|字典|数据库|临时表|视图|所有
- 在 ClickHouse 的 23.4 版本之前,ALL 只能适用于 RESTORE 命令。
- SETTINGS:
- id:备份或恢复操作的 id,随机生成 UUID,若未手动指定则使用。若已有同一 id 的操作在运行,则抛出异常。
- compression_method 和 compression_level
- base_backup:此源的前一个备份的目标。例如,Disk(‘backups’, ‘1.zip’)
- use_same_s3_credentials_for_base_backup:基础备份是否应从查询中继承凭据。仅对 S3 有效。
- use_same_password_for_base_backup:基础备份存档是否应从查询中继承密码。
- structure_only:如果启用,仅允许备份或恢复 CREATE 语句,而不包括表的数据
- s3_storage_class:用于 S3 备份的存储类。例如,STANDARD
- demo
1 2 3 4 5 6 7
| BACKUP TABLE test.table TO Disk('backups', '1.zip');
RESTORE TABLE test.table FROM Disk('backups', '1.zip')
RESTORE TABLE test.table FROM Disk('backups', '1.zip') SETTINGS allow_non_empty_tables=true
|
1 2 3 4 5
| BACKUP TABLE test.table3 AS test.table4 TO Disk('backups', '2.zip')
- table 重命名为 table2 并恢复 RESTORE TABLE test.table AS test.table2 FROM Disk('backups', '1.zip')
|
1 2 3 4
| BACKUP TABLE test.table TO Disk('backups', 'incremental-a.zip') SETTINGS base_backup = Disk('backups', 'd.zip')
- 从增量备份和 base_backup 恢复所有数据到新表 test.table2 RESTORE TABLE test.table AS test.table2 FROM Disk('backups', 'incremental-a.zip');
|
1 2 3 4 5 6 7
| BACKUP TABLE test.table TO Disk('backups', 'password-protected.zip') SETTINGS password='qwerty'; RESTORE TABLE test.table FROM Disk('backups', 'password-protected.zip') SETTINGS password='qwerty'
|
1 2 3
| BACKUP TABLE test.table TO Disk('backups', 'filename.zip') SETTINGS compression_method='lzma', compression_level=3
|
- 恢复特定分区
如果需要恢复与表相关的特定分区,可以指定这些分区。要从备份恢复分区 1 和 4:
1 2
| RESTORE TABLE test.table PARTITIONS '2', '3' FROM Disk('backups', 'filename.zip')
|
1 2 3 4 5 6 7
| BACKUP TABLE test.table TO Disk('backups', '1.tar')
RESTORE TABLE test.table FROM Disk('backups', '1.tar')
BACKUP TABLE test.table TO Disk('backups', '1.tar.gz')
|
1
| BACKUP TABLE test.table TO Disk('backups', '1.tar') ASYNC
|
1 2 3
| SELECT * FROM system.backups where id='7678b0b3-f519-4e6e-811f-5a0781a4eb52' FORMAT Vertical
SELECT * FROM system.backup_log WHERE id = '7678b0b3-f519-4e6e-811f-5a0781a4eb52' ORDER BY event_time_microseconds ASC FORMAT Vertical
|
备份到S3
- 获得S3 key和密钥
- demo
1
| BACKUP TABLE test.table TO S3('https://xxx.com/backup-S3/my_backup', 'ABC123', 'Abc+123')
|
1 2 3 4
| BACKUP TABLE test.table TO S3('https://xxx.com/backup-S3/my_incremental', 'ABC123', 'Abc+123') SETTINGS base_backup = S3('https://xxx.com/backup-S3/my_backup', 'ABC123', 'Abc+123')
RESTORE TABLE test.table AS data3 FROM S3('https://xxx.com/backup-S3/my_incremental', 'ABC123', 'Abc+123')
|
- 配置文件
如果觉得每次都在sql里写s3的配置,既繁琐,还不安全,可以参考上面配置文件,将s3的配置写到配置文件里,然后通过命令行参数来指定配置文件。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
| <clickhouse> <storage_configuration> <disks> <s3_plain> <type>s3_plain</type> <endpoint>https://xxx.com/backup-S3</endpoint> <access_key_id>ABC123</access_key_id> <secret_access_key>Abc+123</secret_access_key> </s3_plain> </disks> <policies> <s3> <volumes> <main> <disk>s3_plain</disk> </main> </volumes> </s3> </policies> </storage_configuration>
<backups> <allowed_disk>s3_plain</allowed_disk> </backups> </clickhouse>
|
然后就可以通过s3_plain 来备份了
1 2 3
| BACKUP TABLE test.table TO Disk('s3_plain', 'cloud_backup');
RESTORE TABLE test.table FROM Disk('s3_plain', 'cloud_backup');
|
clickhouse-backup
如果觉得clickhouse官方的sql备份方式不方便,或者配置定时备份机制不容易,可以考虑使用clickhouse-backup,它是一个开源的备份工具,可以自动备份和恢复ClickHouse数据,且同样支持本地备份和S3备份,除此之外还支持ftp/sftp、gcs、cos、api等备份方式
- 安装clickhouse-backup
从官方仓库下载并安装clickhouse-backup
如果github连不上或者比较慢的话,可以从这里下载,这是我用过的2.6.39版本
下载完成后,解压到你希望安装的目录下
1
| tar -zxvf clickhouse-backup.tar.gz
|
然后添加配置文件/etc/clickhouse-backup/config.yml(默认路径),内容如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
| clickhouse: host: localhost port: 9000 user: default password: "" database: ""
backup: dir: "/tmp/clickhouse-backup" compression: true
|
具体参考官方配置
然后就可以使用clickhouse-backup进行备份了
1 2 3 4 5 6 7 8 9 10
| clickhouse-backup create my_backup
clickhouse-backup create --config ./config.yml my_backup
clickhouse-backup list
clickhouse-backup restore my_backup
|
参考
clickhouse官方备份策略
clickhouse-backup官方配置