经过一番谷歌搜索后,我发现:
来自mysql 文档的注释:
MySQL Cluster 自动跨节点对表进行分片(分区),使数据库能够在低成本的商品硬件上水平扩展,以服务于读取和写入密集型工作负载,既可以从 SQL 访问,也可以直接通过 NoSQL API 访问。
关系数据库可以横向扩展吗?它会以某种方式基于 NoSQL 数据库吗?
有人有任何现实世界的例子吗?
如何在这样的数据库中管理 sql 请求、事务等?
经过一番谷歌搜索后,我发现:
来自mysql 文档的注释:
MySQL Cluster 自动跨节点对表进行分片(分区),使数据库能够在低成本的商品硬件上水平扩展,以服务于读取和写入密集型工作负载,既可以从 SQL 访问,也可以直接通过 NoSQL API 访问。
关系数据库可以横向扩展吗?它会以某种方式基于 NoSQL 数据库吗?
有人有任何现实世界的例子吗?
如何在这样的数据库中管理 sql 请求、事务等?
这是可能的,但需要大量的维护工作,解释 -
数据的垂直缩放(与 SQL 数据库中的规范化同义)被称为将数据列明智地拆分为多个表,以减少空间冗余。用户表示例 -
数据的水平缩放(与分片同义)被称为将行明智地拆分为多个表,以减少获取数据所花费的时间。用户表示例 -
这里要注意的关键点是,我们可以看到 SQL 数据库中的表被规范化为多个相关数据表。为了在多台机器上对此类表的数据进行分片,您需要相应地对相关的规范化数据进行分片,这反过来会增加维护工作。就像上面介绍的 SQL 数据库示例一样,
与 Order 表关联为一对多关系的 Customer 表
如果您将一些客户数据行移动到其他机器上(称为分片),您还需要将其相关的订单数据移动到同一台机器上,如果有多个相关表,这将是一项麻烦的任务。
由于 NOSQL 数据库遵循平面表结构(数据以聚合形式而不是规范化形式存储),因此便于分片。
I think the answer is, unequivocally, yes. You have to keep in mind that SQL is simply a data access language. There is absolutely no reason why it can't be extended across multiple computers and network partitions. Is it a challenging problem? Most certainly, and that's why software that does it is in its infancy.
Now, I think what you are trying to ask is "Can all features that I am familiar with and that arrive in a standard SQL-type relational database management system be developed to work with multiple servers in this manner?" While I admit I haven't studied the problem in depth, there are theorems out there that say "No, it cannot." Consistency-Availability-Partition Theorem posits that we cannot have all three qualities at the same level.
Now, for all practical purposes, "sharding" or "partitioning" or whatever you want to call it is not going away; to the contrary. This means that, given the degree to which CAP theorem holds, we are going to have to shift the way we think about databases, and how we interact with them (at least, to an extent). Many developers have already made the shift necessary to be successful on a No-SQL platform, but many more have not. Ultimately, sufficient maturity of the model and effective enough workarounds will be developed that traditional SQL databases, in the sense you refer, will be more or less practical across multiple machines. This is already starting to pan out, and I would say give it a few more years and we'll be to that point. Or we'll have collectively shifted thinking to the point where it is no longer necessary, and the world will be a better place. :)
感谢您的提问和回答。我试图向这样的人解释这一点:
就CAP
定理而言,你不能同时拥有这三个。因此,当发生分区(网络或服务器故障)时:
relational database
单个服务器上的A为您提供C
(一致性)。因此,当发生
P
(分区 - 服务器/网络故障)时,您不能拥有A
(可用性 - db down)
Anosql datastore
如果你想在发生A
aP
时,你不能拥有C
(一个或多个复制的分区将不同步,直到 n/w 回来并且它们都同步)。所以只会eventually consistent
编辑#2:根据 Manish 下面的评论提供更多视角。我的意图是通过示例来解释为什么你不能拥有所有 3 个。如下面的评论中所述,当 P 以牺牲 A 为代价时,还有其他数据库可以拥有 C。
Google Spanner 是一个可以水平扩展的关系数据库示例。分片和复制是自动完成的,因此无需担心。欲了解更多信息,请查看这篇论文。
是的,但是当存储增加时它需要迁移。
一些开源工具可以支持该功能,例如:Vitess或Apache ShardingSphere。