
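I have a dataset of hotel reviews. Each file in the dataset covers a different hotel and contains many reviews written by guests of that hotel. I have been given 5 tasks, listed below:

1) The relation in my dataset: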

HotelReview(int: OverallRating, int: AveragePrice, url: URL, string: Author, string: Content, date: Date, int: No. Reader, int: No. Helpful, int: Overall, int: Value, int: Rooms, int: Location, int: Cleanliness, int: Checkin / front desk, int: Service, int: Business Service)

2) The primary key of my dataset:

Author and URL (**Composite Key**)

3) Functional dependencies:

•   Content -> OverallRating, AveragePrice, URL, Author, Date, No. Reader, No. Helpful, Overall, Value, Rooms, Location, Cleanliness, Checkin / front desk, Service, Business Service

•   Author, URL -> Content -> OverallRating, AveragePrice, URL, Content, Date, No. Reader, No. Helpful, Overall, Value, Rooms, Location, Cleanliness, Checkin / front desk, Service, Business Service 

•   Author, Date -> OverallRating, AveragePrice, URL, Author, Date, No. Reader, No. Helpful, Overall, Value, Rooms, Location, Cleanliness, Checkin / front desk, Service, Business Service

4) Potential candidate keys:

Content

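But now I am struggling with the fifth task. I have been asked to normalize my relation to BCNF (3.5NF). I have researched how to do this, but it doesn't make sense to me, and I can't reproduce the normalization on my own relation. Any help and advice would be much appreciated.

Here is a sample file from the hotel dataset: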

<Overall Rating>4
<Avg. Price>$173
<URL>http://...

<Author>everywhereman2
<Content>Old seattle getaway...
<Date>Jan 6, 2009
<img src="http://cdn.tripadvisor.com/img2/new.gif" alt="New"/>
<No. Reader>-1
<No. Helpful>-1
<Overall>5
<Value>5
<Rooms>5
<Location>5
<Cleanliness>5
<Check in / front desk>5
<Service>5
<Business service>5

<Author>RW53
<Content>Location! Location?       view from room of nearby freeway 
<Date>Dec 26, 2008
<No. Reader>-1
<No. Helpful>-1
<Overall>3
<Value>4
<Rooms>3
<Location>2
<Cleanliness>4
<Check in / front desk>3
<Service>-1
<Business service>-1

...new review e.t.c

Here is a sample hotel review in tabular form:

(Image: sample review in tabular form)

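The blue-shaded columns identify the hotel that the review is about, while the yellow columns represent my composite primary key (Author and

Thank you for your time.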


1 Answer


Given your functional dependencies, there are three candidate keys:

{ (Author, Date) (Author, URL) (Content) }

You can easily verify this by computing the closure of each of them: each closure contains every attribute of the relation.
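The closure check can be sketched in a few lines of Python. This is only an illustration under some assumptions: the attribute names are shortened versions of those in the question (for example `NoReader` for "No. Reader"), the chained dependency `Author, URL -> Content -> ...` is read as `Author, URL` determining everything listed after it, and `closure` and `is_candidate_key` are helper names made up for this example.

```python
from itertools import combinations

# Attribute names are shortened from the relation in the question (assumption).
ATTRS = frozenset({
    "OverallRating", "AveragePrice", "URL", "Author", "Content", "Date",
    "NoReader", "NoHelpful", "Overall", "Value", "Rooms", "Location",
    "Cleanliness", "Checkin", "Service", "BusinessService",
})

# (determinant, dependents) pairs, following the three dependencies as listed.
FDS = [
    (frozenset({"Content"}), ATTRS - {"Content"}),
    (frozenset({"Author", "URL"}), ATTRS - {"Author"}),
    (frozenset({"Author", "Date"}), ATTRS - {"Content"}),
]


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply a dependency once its whole determinant is in the result.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def is_candidate_key(attrs, fds, all_attrs):
    """A candidate key is a superkey with no proper subset that is a superkey."""
    if closure(attrs, fds) != all_attrs:
        return False
    for size in range(len(attrs)):
        for subset in combinations(attrs, size):
            if closure(subset, fds) == all_attrs:
                return False
    return True


for key in ({"Author", "Date"}, {"Author", "URL"}, {"Content"}):
    print(sorted(key), is_candidate_key(key, FDS, ATTRS))
```

Note that `{Author, Date}` only reaches `Content` on a second pass, through `Author, URL -> Content`, which is why the closure has to be computed iteratively rather than read off a single dependency.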

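For this reason, the relation is already in Boyce-Codd Normal Form (BCNF): for every dependency, the determinant (the left-hand side) is a (candidate) key, which is the definition of BCNF.

answered 2016-04-09T13:36:43.007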
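As a rough sketch, the BCNF test itself can be written as a loop over the dependencies that flags any non-trivial one whose determinant is not a superkey. The attribute names are shortened from the question, the helper names `closure` and `bcnf_violations` are invented for this example, and the dependency `URL -> AveragePrice` in the second call is hypothetical: it is not among the dependencies given in the question and is there only to show what a violation would look like.

```python
# Attribute names are shortened from the relation in the question (assumption).
ATTRS = frozenset({
    "OverallRating", "AveragePrice", "URL", "Author", "Content", "Date",
    "NoReader", "NoHelpful", "Overall", "Value", "Rooms", "Location",
    "Cleanliness", "Checkin", "Service", "BusinessService",
})

# The three dependencies from the question as (determinant, dependents) pairs.
FDS = [
    (frozenset({"Content"}), ATTRS - {"Content"}),
    (frozenset({"Author", "URL"}), ATTRS - {"Author"}),
    (frozenset({"Author", "Date"}), ATTRS - {"Content"}),
]


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_violations(all_attrs, fds):
    """List non-trivial dependencies whose determinant is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependencies never violate BCNF
        if closure(lhs, fds) != all_attrs:
            violations.append((lhs, rhs))
    return violations


# With only the dependencies from the question, nothing is flagged.
print(bcnf_violations(ATTRS, FDS))

# Hypothetical extra dependency, NOT from the question: URL alone is not a key.
extra = FDS + [(frozenset({"URL"}), frozenset({"AveragePrice"}))]
print(bcnf_violations(ATTRS, extra))
```

An empty list from the first call matches the conclusion above. If a dependency like the hypothetical one did hold in the real data, the relation would no longer be in BCNF and would need to be decomposed.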