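I have a dataset of hotel reviews. Each file in the dataset covers a different hotel and contains many reviews written by guests of that hotel. I have been given 5 tasks, listed below:
1) The relation in my dataset: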
HotelReview(int: OverallRating, int: AveragePrice, url: URL, string: Author, string: Content, date: Date, int: No. Reader, int: No. Helpful, int: Overall, int: Value, int: Rooms, int: Location, int: Cleanliness, int: Checkin / front desk, int: Service, int: Business Service)
2) The primary key of my dataset:
Author and URL (**Composite Key**)
3) Functional dependencies:
• Content -> OverallRating, AveragePrice, URL, Author, Date, No. Reader, No. Helpful, Overall, Value, Rooms, Location, Cleanliness, Checkin / front desk, Service, Business Service
• Author, URL -> Content -> OverallRating, AveragePrice, URL, Content, Date, No. Reader, No. Helpful, Overall, Value, Rooms, Location, Cleanliness, Checkin / front desk, Service, Business Service
• Author, Date -> OverallRating, AveragePrice, URL, Author, Date, No. Reader, No. Helpful, Overall, Value, Rooms, Location, Cleanliness, Checkin / front desk, Service, Business Service
4) Potential candidate keys:
Content
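But now I am struggling with the fifth task. I have been asked to normalise my relation to BCNF (3.5NF). I have researched how to do this, but it doesn't make sense to me, and I can't reproduce the normalisation on my own relation. Any help and advice would be greatly appreciated.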
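For reference, here is a minimal sketch of how the BCNF condition could be checked mechanically: compute the attribute closure of each left-hand side and flag any non-trivial dependency whose left-hand side is not a superkey. The attribute names are shortened to identifiers for convenience, and `HOTEL_FD` (URL -> OverallRating, AveragePrice) is purely an assumption added for illustration; it is not one of the dependencies listed above.

```python
from typing import FrozenSet, Iterable, List, Tuple

FD = Tuple[FrozenSet[str], FrozenSet[str]]

# Attribute names shortened to identifiers (an assumption for this sketch).
RELATION = frozenset({
    "OverallRating", "AveragePrice", "URL", "Author", "Content", "Date",
    "NoReader", "NoHelpful", "Overall", "Value", "Rooms", "Location",
    "Cleanliness", "Checkin", "Service", "BusinessService",
})


def fd(lhs: Iterable[str], rhs: Iterable[str]) -> FD:
    return frozenset(lhs), frozenset(rhs)


# The three dependencies listed in task 3.
LISTED_FDS: List[FD] = [
    fd({"Content"}, RELATION - {"Content"}),
    fd({"Author", "URL"}, RELATION - {"Author"}),
    fd({"Author", "Date"}, RELATION - {"Content"}),
]

# Hypothetical extra dependency: hotel-level fields depend on the URL alone.
HOTEL_FD: FD = fd({"URL"}, {"OverallRating", "AveragePrice"})


def closure(attrs: Iterable[str], fds: List[FD]) -> FrozenSet[str]:
    """Return every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_violations(relation: FrozenSet[str], fds: List[FD]) -> List[FD]:
    """Non-trivial dependencies whose left-hand side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != relation
    ]


def split_on(relation: FrozenSet[str], violation: FD, fds: List[FD]):
    """One BCNF decomposition step on a violating dependency X -> Y."""
    lhs, _ = violation
    determined = closure(lhs, fds) & relation
    return determined, relation - (determined - lhs)


print(bcnf_violations(RELATION, LISTED_FDS))
for bad in bcnf_violations(RELATION, LISTED_FDS + [HOTEL_FD]):
    print(split_on(RELATION, bad, LISTED_FDS + [HOTEL_FD]))
```

Under the three listed dependencies alone, every left-hand side is a superkey, so the check reports no violation. A split only appears once a dependency such as the assumed `HOTEL_FD` is added, in which case the hotel-level attributes move into their own relation keyed by URL.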
Here is an example file from the hotel dataset:
<Overall Rating>4
<Avg. Price>$173
<URL>http://...
<Author>everywhereman2
<Content>Old seattle getaway...
<Date>Jan 6, 2009
<img src="http://cdn.tripadvisor.com/img2/new.gif" alt="New"/>
<No. Reader>-1
<No. Helpful>-1
<Overall>5
<Value>5
<Rooms>5
<Location>5
<Cleanliness>5
<Check in / front desk>5
<Service>5
<Business service>5
<Author>RW53
<Content>Location! Location? view from room of nearby freeway
<Date>Dec 26, 2008
<No. Reader>-1
<No. Helpful>-1
<Overall>3
<Value>4
<Rooms>3
<Location>2
<Cleanliness>4
<Check in / front desk>3
<Service>-1
<Business service>-1
...next review, etc.
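To make the file layout concrete, here is a rough sketch of how such a file might be read into one hotel record plus a list of review records. It assumes, based only on the sample above, that `Overall Rating`, `Avg. Price` and `URL` appear once per file and that each `<Author>` line starts a new review. The names `parse_hotel_file`, `HOTEL_TAGS` and `TAG_LINE` are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Matches lines such as "<No. Reader>-1"; HTML residue like <img .../> does
# not match because its tag contains characters outside this class.
TAG_LINE = re.compile(r"^<([A-Za-z. /]+)>(.*)$")

# Fields assumed to occur once per file (hotel level), judging by the sample.
HOTEL_TAGS = {"Overall Rating", "Avg. Price", "URL"}


def parse_hotel_file(text: str) -> Tuple[Dict[str, str], List[Dict[str, str]]]:
    """Split one hotel file into hotel-level fields and per-review fields."""
    hotel: Dict[str, str] = {}
    reviews: List[Dict[str, str]] = []
    for raw in text.splitlines():
        match = TAG_LINE.match(raw.strip())
        if match is None:
            continue  # skip anything that is not a "<Tag>value" line
        tag, value = match.group(1).strip(), match.group(2).strip()
        if tag in HOTEL_TAGS:
            hotel[tag] = value
        elif tag == "Author":
            reviews.append({"Author": value})  # a new review starts here
        elif reviews:
            reviews[-1][tag] = value
    return hotel, reviews


SAMPLE = """<Overall Rating>4
<Avg. Price>$173
<URL>http://...
<Author>everywhereman2
<Content>Old seattle getaway...
<Date>Jan 6, 2009
<img src="http://cdn.tripadvisor.com/img2/new.gif" alt="New"/>
<No. Reader>-1
<Overall>5
<Author>RW53
<Content>Location! Location? view from room of nearby freeway
<Date>Dec 26, 2008
<Overall>3
<Location>2
"""

hotel, reviews = parse_hotel_file(SAMPLE)
print(hotel)
print(len(reviews), [r["Author"] for r in reviews])
```

Read this way, the hotel-level fields are stored once per file while the review-level fields repeat, which mirrors the distinction between the hotel columns and the review columns in the table.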
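Here is a sample hotel review in tabular form:
The blue shading marks the columns that identify the hotel a review is about, while the yellow columns represent my composite primary key (Author and URL).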
Thank you for your time.