
I have implemented the following ways of storing relational topology:

1. A general junction relation table:

Table: Relation

Columns: id parent_type parent_id parent_prop child_type child_id child_prop

Most SQL engines generally cannot execute joins against this table.

2. Relation-specific junction tables:

Table: Class2Student

Columns: id parent_id parent_prop child_id child_prop

SQL engines can execute joins against this table.

3. Storing lists/string maps of related objects in a text field on both sides of a bidirectional relation:

Class: Class

Class properties: id name students

Table columns: id name students_keys

Rows: 1 "history" [{type:Basic_student,id:1},{type:Advanced_student,id:3}]

To let the SQL engine join on this field, it would be possible to write a custom module. That would be even easier if the contents of students_keys were simply [1,3], i.e. if the relation were to the explicit Student type.

My questions arise in the following context:

I fail to see what the point of a junction table is. In particular, I fail to see that the problems a junction table is claimed to relieve actually exist:

  • Inability to save a bidirectional relation in a logically correct way (e.g. there is no data orphaning in bidirectional relations, or in any relation with a keys field, because one saves recursively and can enforce the other operations (delete, update) quite easily)
  • Inability to join effectively

I am not soliciting personal opinions on best practices or any cult-like statements on normalization.

The explicit question(s) are the following:

  1. In what instances would one want to query a junction table for something that querying an owning object's keys field does not provide?
  2. What are the logical implementation problems, in the context of computation provided by the SQL engine, where the junction table is preferable?
  3. The only implementation difference I can see between a junction table and a keys field is the following:

For a query of the following nature, you would need to match against the keys field with either a custom indexing implementation or some other reasonable implementation:

class_dao.search({students:advanced_student_3,name:"history"});

That is, search for Classes that have a particular student and the name "history".

As opposed to searching the indexed columns of the junction table and then selecting the appropriate Classes.
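The two search paths described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in SQL engine, stores students_keys in the simplified [1,3] form, and trims Class2Student down to its two id columns. The table name ClassWithKeys, the index name, and both helper functions are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Option 3: related ids stored as text on the owning row (hypothetical table name).
    CREATE TABLE ClassWithKeys (id INTEGER PRIMARY KEY, name TEXT, students_keys TEXT);
    INSERT INTO ClassWithKeys VALUES (1, 'history', '[1, 3]'), (2, 'history', '[2]');

    -- Option 2: a relation-specific junction table (trimmed to the id columns).
    CREATE TABLE Class (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Class2Student (
        parent_id INTEGER REFERENCES Class(id),
        child_id  INTEGER,
        PRIMARY KEY (parent_id, child_id)
    );
    CREATE INDEX IX_Class2Student_child ON Class2Student (child_id);
    INSERT INTO Class VALUES (1, 'history'), (2, 'history');
    INSERT INTO Class2Student VALUES (1, 1), (1, 3), (2, 2);
""")


def search_keys_field(student_id, name):
    # The engine can only filter on name; membership is decided in application code
    # after parsing every candidate row's text field.
    rows = conn.execute(
        "SELECT id, students_keys FROM ClassWithKeys WHERE name = ? ORDER BY id",
        (name,),
    )
    return [class_id for class_id, keys in rows if student_id in json.loads(keys)]


def search_junction(student_id, name):
    # The engine resolves membership itself and can use the index on child_id.
    rows = conn.execute(
        """SELECT c.id
           FROM Class AS c
           JOIN Class2Student AS cs ON cs.parent_id = c.id
           WHERE cs.child_id = ? AND c.name = ?
           ORDER BY c.id""",
        (student_id, name),
    )
    return [class_id for (class_id,) in rows]
```

Both functions return the same ids for the same data; the difference is where the membership test runs and whether an ordinary index can serve it.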

I have been unable to identify why a junction table is logically preferable, for quite literally any reason. I am not claiming that it is not preferable, nor do I have a religious preference one way or another, as evidenced by the fact that I implemented multiple ways of achieving this. My problem is that I do not know what the reasons are.


1 Answer


It seems to me that you have several entities:

CREATE TABLE StudentType
(
    Id Int PRIMARY KEY,
    Name NVarChar(50) 
);

-- Each row is its own parenthesized tuple; no outer parentheses around the list.
INSERT INTO StudentType VALUES
    (1, 'Basic'),
    (2, 'Advanced'),
    (3, 'SomeOtherCategory');

CREATE TABLE Student
(
    Id Int PRIMARY KEY,
    Name NVarChar(200),
    OtherAttributeCommonToAllStudents Int,
    Type Int,
    CONSTRAINT FK_Student_StudentType
        FOREIGN KEY (Type) REFERENCES StudentType(Id)
)

CREATE TABLE StudentAdvanced
(
    Id Int PRIMARY KEY,
    AdvancedOnlyAttribute Int,
    CONSTRAINT FK_StudentAdvanced_Student
        FOREIGN KEY (Id) REFERENCES Student(Id)
)

CREATE TABLE StudentSomeOtherCategory
(
    Id Int PRIMARY KEY,
    SomeOtherCategoryOnlyAttribute Int,
    CONSTRAINT FK_StudentSomeOtherCategory_Student
        FOREIGN KEY (Id) REFERENCES Student(Id)
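)

  1. Any attribute common to all students has a column in the Student table.
  2. Student types that have extra attributes are added to the StudentType table.
  3. Each extra student type gets a Student<TypeName> table to store its specific attributes. These tables have an optional one-to-one relationship with Student.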
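As a rough sketch of how such a layout is read back, the following uses Python's built-in sqlite3 module with SQLite types in place of the T-SQL ones above. The sample rows (Ann, Bob, the value 42) are made up for illustration, and only the StudentAdvanced subtype table is included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StudentType (Id INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO StudentType VALUES (1, 'Basic'), (2, 'Advanced'), (3, 'SomeOtherCategory');

    CREATE TABLE Student (
        Id   INTEGER PRIMARY KEY,
        Name TEXT,
        Type INTEGER REFERENCES StudentType(Id)
    );

    -- Optional one-to-one extension: the primary key is also the foreign key.
    CREATE TABLE StudentAdvanced (
        Id INTEGER PRIMARY KEY REFERENCES Student(Id),
        AdvancedOnlyAttribute INTEGER
    );

    -- Made-up sample data.
    INSERT INTO Student VALUES (1, 'Ann', 1), (3, 'Bob', 2);
    INSERT INTO StudentAdvanced VALUES (3, 42);
""")

# LEFT JOIN keeps basic students; their subtype column simply comes back as NULL.
rows = conn.execute("""
    SELECT s.Id, s.Name, t.Name, a.AdvancedOnlyAttribute
    FROM Student AS s
    JOIN StudentType AS t ON t.Id = s.Type
    LEFT JOIN StudentAdvanced AS a ON a.Id = s.Id
    ORDER BY s.Id
""").fetchall()
```

A relation to "a student" can then always target Student(Id), regardless of subtype, which removes the need for a type column in the relation itself.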

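I think your "straw man" junction table is a partial implementation of the EAV anti-pattern. That is only sensible when you do not know which attributes you need to model, i.e. when your data will be completely unstructured. When that is a genuine requirement, relational databases start to look less desirable. In those cases, consider a NoSQL/document database alternative.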


Junction tables are useful in the following scenario.

Suppose we add a Class entity to the model.

CREATE TABLE Class
(
    Id Int PRIMARY KEY,
    ...
)

Conceivably, we want to store a many-to-many relationship between students and classes.

CREATE TABLE Registration
(
    Id Int PRIMARY KEY,
    StudentId Int,
    ClassId Int,
    CONSTRAINT FK_Registration_Student
        FOREIGN KEY (StudentId) REFERENCES Student(Id),
    CONSTRAINT FK_Registration_Class
        FOREIGN KEY (ClassId) REFERENCES Class(Id)
)

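This entity would be the right place to store attributes that relate specifically to a student's registration for a class, perhaps a completion flag. Other data would naturally relate to this junction too, such as class-specific attendance records or grade history.

If you do not relate Class and Student in this way, how would you select both all the students in a class and all the classes read by a student? Performance-wise, this is easily optimized with indexes on the key columns.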
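To make the point about relationship attributes concrete, here is a minimal sketch using Python's built-in sqlite3 module. The Completed column, the Name columns, and the sample rows are assumptions added for illustration; they are not part of the Registration definition above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Class   (Id INTEGER PRIMARY KEY, Name TEXT);

    CREATE TABLE Registration (
        Id        INTEGER PRIMARY KEY,
        StudentId INTEGER NOT NULL REFERENCES Student(Id),
        ClassId   INTEGER NOT NULL REFERENCES Class(Id),
        Completed INTEGER NOT NULL DEFAULT 0  -- hypothetical completion flag
    );

    -- Made-up sample data.
    INSERT INTO Student VALUES (1, 'Ann'), (3, 'Bob');
    INSERT INTO Class VALUES (1, 'history');
    INSERT INTO Registration (StudentId, ClassId, Completed) VALUES (1, 1, 1), (3, 1, 0);
""")

# The flag belongs to the (student, class) pair, so it is filtered on the junction row.
completed = [
    name
    for (name,) in conn.execute(
        """SELECT s.Name
           FROM Registration AS r
           JOIN Student AS s ON s.Id = r.StudentId
           WHERE r.ClassId = ? AND r.Completed = 1
           ORDER BY s.Name""",
        (1,),
    )
]
```

With a keys field on each side, a per-pair attribute like this would have to be duplicated in both lists and kept in sync by application code.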


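When a many-to-many relationship has no attributes of its own, I agree that, logically, the junction table does not need to exist. However, in a relational database a junction table is still a useful physical implementation, perhaps like this: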

CREATE TABLE StudentClass
(
    StudentId Int,
    ClassId Int,
    CONSTRAINT PK_StudentClass PRIMARY KEY (ClassId, StudentId),
    -- Constraint names renamed so they do not clash with those on Registration.
    CONSTRAINT FK_StudentClass_Student
        FOREIGN KEY (StudentId) REFERENCES Student(Id),
    CONSTRAINT FK_StudentClass_Class
        FOREIGN KEY (ClassId) REFERENCES Class(Id)
)

This allows simple queries like:

-- students in a class?
SELECT StudentId
FROM StudentClass
WHERE ClassId = @classId

-- classes read by a student?
SELECT ClassId
FROM StudentClass
WHERE StudentId = @studentId

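Furthermore, this provides a simple way to manage the relationship, partially or completely, from either side. It is familiar to relational database developers, and the query optimizer can take advantage of it.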
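A small sketch of what managing the relationship from either side can look like, again using Python's built-in sqlite3 module as a stand-in engine. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; the `try_enroll` helper and the sample ids are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE Student (Id INTEGER PRIMARY KEY);
    CREATE TABLE Class   (Id INTEGER PRIMARY KEY);
    CREATE TABLE StudentClass (
        StudentId INTEGER NOT NULL REFERENCES Student(Id),
        ClassId   INTEGER NOT NULL REFERENCES Class(Id),
        PRIMARY KEY (ClassId, StudentId)
    );
    -- Made-up sample data.
    INSERT INTO Student VALUES (1), (3);
    INSERT INTO Class VALUES (1), (2);
    INSERT INTO StudentClass VALUES (1, 1), (3, 1), (3, 2);
""")


def try_enroll(student_id, class_id):
    """Insert one relation row; the engine rejects references to missing rows."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO StudentClass (StudentId, ClassId) VALUES (?, ?)",
                (student_id, class_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


rejected = not try_enroll(99, 1)  # student 99 does not exist

# Managed from the student side: withdraw student 3 from every class in one statement.
with conn:
    conn.execute("DELETE FROM StudentClass WHERE StudentId = ?", (3,))

remaining = conn.execute(
    "SELECT StudentId, ClassId FROM StudentClass ORDER BY StudentId, ClassId"
).fetchall()
```

Each relation lives in exactly one row, so there is no second copy on the other entity to update or to drift out of sync.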

answered 2014-03-13T09:54:20.707