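I'm implementing a tagging system on my website, similar to the one Stack Overflow uses. My question is: what is the most efficient way to store tags so that they can be searched and filtered?
My idea is this:
Table: Items
Columns: Item_ID, Title, Content
Table: Tags
Columns: Title, Item_ID
Is this too slow? Is there a better way?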
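An item will have many tags, and a tag will belong to many items. To me that implies you will quite likely need an intermediary table to overcome the many-to-many obstacle.
Something like:
Table: Items
Columns: Item_ID, Item_Title, Content
Table: Tags
Columns: Tag_ID, Tag_Title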
Table: Items_Tags
Columns: Item_ID, Tag_ID
It might be that your web app becomes very, very popular and needs de-normalizing down the road, but it's pointless muddying the waters too early.
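As a rough illustration of the three-table layout above, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows, the UNIQUE constraint on Tag_Title, and the helper name items_with_tag are assumptions made for the example, not part of the schema described above.

```python
import sqlite3

# In-memory database, just for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (Item_ID INTEGER PRIMARY KEY, Item_Title TEXT, Content TEXT);
    -- UNIQUE on Tag_Title is an assumption: one row per distinct tag.
    CREATE TABLE Tags (Tag_ID INTEGER PRIMARY KEY, Tag_Title TEXT UNIQUE);
    CREATE TABLE Items_Tags (
        Item_ID INTEGER REFERENCES Items(Item_ID),
        Tag_ID  INTEGER REFERENCES Tags(Tag_ID)
    );
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO Items VALUES (?, ?, ?)",
                 [(1, "First post", "..."), (2, "Second post", "...")])
conn.executemany("INSERT INTO Tags VALUES (?, ?)", [(1, "sql"), (2, "python")])
conn.executemany("INSERT INTO Items_Tags VALUES (?, ?)", [(1, 1), (1, 2), (2, 1)])


def items_with_tag(conn, tag_title):
    """Return the titles of all items carrying the given tag (hypothetical helper)."""
    rows = conn.execute(
        """
        SELECT i.Item_Title
        FROM Items i
        JOIN Items_Tags it ON it.Item_ID = i.Item_ID
        JOIN Tags t ON t.Tag_ID = it.Tag_ID
        WHERE t.Tag_Title = ?
        ORDER BY i.Item_ID
        """,
        (tag_title,),
    ).fetchall()
    return [row[0] for row in rows]
```

Filtering by tag is then a single join through Items_Tags, for example items_with_tag(conn, "sql").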
Actually I believe de-normalising the tags table might be a better way forward, depending on scale.
This way, the tags table simply has tagid, itemid, tagname.
You'll get duplicate tagnames, but it makes adding/removing/editing tags for specific items MUCH simpler. You don't have to create a new tag, remove the allocation of the old one and allocate the new one; you just edit the tagname.
For displaying a list of tags, you simply use DISTINCT or GROUP BY, and of course you can count how many times a tag is used easily, too.
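To make that concrete, here is a small sketch of the single-table variant with Python's sqlite3 module. The column names follow the answer (tagid, itemid, tagname); the sample rows and the helper names tag_counts and rename_tag_on_item are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tags (tagid INTEGER PRIMARY KEY, itemid INTEGER, tagname TEXT)"
)
# Hypothetical sample data: duplicate tagnames are expected in this layout.
conn.executemany(
    "INSERT INTO tags (itemid, tagname) VALUES (?, ?)",
    [(1, "sql"), (1, "python"), (2, "sql"), (3, "sql")],
)


def tag_counts(conn):
    """List each distinct tag with its usage count, most used first."""
    return conn.execute(
        "SELECT tagname, COUNT(*) FROM tags "
        "GROUP BY tagname ORDER BY COUNT(*) DESC, tagname"
    ).fetchall()


def rename_tag_on_item(conn, itemid, old_name, new_name):
    """Edit a tag for one item only: a single UPDATE, no re-allocation."""
    with conn:
        conn.execute(
            "UPDATE tags SET tagname = ? WHERE itemid = ? AND tagname = ?",
            (new_name, itemid, old_name),
        )
```

The trade-off shows up in the second helper: renaming a tag on one item touches one row, but renaming a tag everywhere means updating every row that carries it.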
If you don't mind using a bit of non-standard stuff, Postgres can store all the tags for an item in a single array column of type text[] (version 9.4 and up also offers the JSONB type as an alternative).
Your schema would be:
Table: Items
Columns: Item_ID:int, Title:text, Content:text
Table: Tags
Columns: Item_ID:int, Tag_Title:text[]
For more info, see this excellent post by Josh Berkus: http://www.databasesoup.com/2015/01/tag-all-things.html
That post compares several options thoroughly for performance, and the one suggested above comes out as the best overall.
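You can't really talk about slowness based on the data you provided in the question. And I don't think you should even worry too much about performance at this stage of development. It's called premature optimization.
However, I'd suggest that you include a Tag_ID column in the Tags table. It's usually good practice for every table to have an ID column.
I'd suggest using an intermediate third table for storing the tags<=>items associations, since there is a many-to-many relationship between tags and items, i.e. one item can be associated with multiple tags and one tag can be associated with multiple items. HTH, Valve.
If space is going to be an issue, use a third table Tags(Tag_Id, Title) to store the text of the tags, and then change your Tags table to be (Tag_Id, Item_Id). Those two values should also provide a unique composite primary key.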
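A minimal sketch of that composite key, using Python's sqlite3 module. The association table is called Items_Tags here purely to keep the two tables apart; that name and the helper tag_item are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tags (Tag_Id INTEGER PRIMARY KEY, Title TEXT NOT NULL);
    -- Items_Tags is a hypothetical name for the (Tag_Id, Item_Id) table.
    CREATE TABLE Items_Tags (
        Tag_Id  INTEGER NOT NULL,
        Item_Id INTEGER NOT NULL,
        PRIMARY KEY (Tag_Id, Item_Id)  -- the pair is unique by construction
    );
""")


def tag_item(conn, tag_id, item_id):
    """Store an association; return False if the same pair already exists."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Items_Tags (Tag_Id, Item_Id) VALUES (?, ?)",
                (tag_id, item_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The composite primary key rejected a duplicate (Tag_Id, Item_Id) pair.
        return False
```

With the key in place the database itself refuses to tag the same item twice with the same tag, so the application does not need a separate existence check.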
Items should have an "ID" field, and Tags should have an "ID" field (Primary Key, Clustered).
Then make an intermediate table of ItemID/TagID and put the "Perfect Index" on there.
Tag Schema: Tag tables and attributes:
Tables:
tags (each row keeps information about one particular tag)
taggings (each row keeps information about a trigger and who will receive it)
products_tags (each row links a tag to a particular product)
tag_status (each row keeps track of a tag's status)
Table: tags Attributes of tags table:
id(PK)
userId(FK users)(not null)(A tag belongs to only one user, but a user can create multiple tags, so it is a one-to-many relationship.)
genreId(FK products_geners)(not null)
name (string) (not null)
description (string)
status (int) (0=inactive, 1=pending, 2=active; there could be more flags)
rank (int) (rank is the popularity of a particular tag; this field can be used for sorting among similar tags)
type (int) (0=type1, 1=type2, 2=type3)
photo(string)
visibility (int) (0=public, 2=protected, 3=private) (private means the tag is visible only to the users assigned to a product; protected means the tag is visible only to friends and followers of its creator; public means the tag is searchable by everyone, such as all admin-created tags)
createdAt (timestamp of when the tag was created)
updatedAt (timestamp of when the tag was last updated)
deletedAt (default value null) (timestamp of when the tag was deleted; we need this field because we will delete the tag permanently from the audit table)
Note: Keeping field no. 10 will come in handy later.
Table: taggings
This table will be used for triggering actions such as broadcasting to other users' feeds or sending them notifications. After a row is inserted into this table, a service will read the row, take the associated action, and then remove the row.
Attributes of taggings table:
Id(PK)
tagId (a tagging row belongs to only one tag, but a tag can have multiple rows)
taggableId (id of a user who will receive notification)
taggableType(int) (0=notification, 1=feed message)
taggerId(the person who triggered the broadcast)
taggerType(ad, product, news)
createdAt (timestamp of when the row was created)
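The read, act, remove cycle described for the taggings table could look roughly like the sketch below, written with Python's sqlite3 module. The column types, the process_taggings function and the deliver callback are all assumptions for illustration; a real service would also need to handle failures and concurrent workers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE taggings (
        id           INTEGER PRIMARY KEY,
        tagId        INTEGER NOT NULL,
        taggableId   INTEGER NOT NULL,   -- user who receives the trigger
        taggableType INTEGER NOT NULL,   -- 0=notification, 1=feed message
        taggerId     INTEGER NOT NULL,   -- who triggered the broadcast
        taggerType   TEXT NOT NULL,      -- ad, product, news
        createdAt    TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")


def process_taggings(conn, deliver):
    """Read pending rows oldest first, act on each one, then remove it.

    `deliver` is a hypothetical callback taking (user_id, tag_id, kind).
    Returns the number of rows processed.
    """
    rows = conn.execute(
        "SELECT id, tagId, taggableId, taggableType FROM taggings ORDER BY id"
    ).fetchall()
    for row_id, tag_id, user_id, taggable_type in rows:
        kind = "notification" if taggable_type == 0 else "feed"
        deliver(user_id, tag_id, kind)
        # Remove the row only after the action has been taken.
        with conn:
            conn.execute("DELETE FROM taggings WHERE id = ?", (row_id,))
    return len(rows)
```

Deleting each row only after its action has run means a crash mid-batch leaves the unprocessed rows in place for the next pass.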
Table: products_tags
From the user's perspective, a user is able to create a tag after instantiating a product, so the table below keeps track of which products have which tags.
Attributes of products_tags table:
Id (PK)
productId(FK)
tagId(FK)
Table: tag_status
When a user creates a tag, a row is created in this table with the tagId and a default status of inactive/pending. The admin pulls all tags from the tags table where status=pending/inactive. After reviewing a tag, if the admin approves it, the status value in the tags table becomes approved and the tag_status row is removed. If the admin rejects it, the status field of the tag_status row becomes rejected, a trigger is broadcast, and the receiver sends a notification to the user associated with that tag, with a message saying that the tag was rejected.
id(PK)
senderId(Id of the user)
receiverId(Id of admin user)
createdAt(timestamp of created at)
updatedAt(timestamp of updated at)
deletedAt(timestamp of deletedAt) default value null
expiredAt (if a tag never gets approved it will expire after a certain time for removing its information from the database. If a rejected tag gets updated by user then expiredAt will reset to new future time)
status
Message (string varchar(256)) (message for user)
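The review workflow for tag_status can be sketched as follows with Python's sqlite3 module. Several things here are assumptions: the tagId column (mentioned in the description but not in the attribute list), the string encoding of the tag_status status values, mapping "approved" to 2=active in the tags table, and the helper names submit_tag and review_tag. Only the columns needed for the workflow are included, and the broadcast/notification step is left out.

```python
import sqlite3

PENDING, ACTIVE = 1, 2  # values of tags.status from the schema above

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tags (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        status INTEGER NOT NULL
    );
    CREATE TABLE tag_status (
        id      INTEGER PRIMARY KEY,
        tagId   INTEGER NOT NULL,   -- assumed column linking back to tags
        status  TEXT NOT NULL,      -- assumed encoding: 'pending' / 'rejected'
        message TEXT
    );
""")


def submit_tag(conn, name):
    """Create a tag in pending state together with its tag_status row."""
    with conn:
        tag_id = conn.execute(
            "INSERT INTO tags (name, status) VALUES (?, ?)", (name, PENDING)
        ).lastrowid
        conn.execute(
            "INSERT INTO tag_status (tagId, status) VALUES (?, 'pending')",
            (tag_id,),
        )
    return tag_id


def review_tag(conn, tag_id, approved, message=None):
    """Apply the admin's decision to a pending tag."""
    with conn:
        if approved:
            # Approved: activate the tag and drop its review row.
            conn.execute(
                "UPDATE tags SET status = ? WHERE id = ?", (ACTIVE, tag_id)
            )
            conn.execute("DELETE FROM tag_status WHERE tagId = ?", (tag_id,))
        else:
            # Rejected: keep the review row, with a message for the user.
            conn.execute(
                "UPDATE tag_status SET status = 'rejected', message = ? "
                "WHERE tagId = ?",
                (message, tag_id),
            )
```

Both steps of each decision run inside one transaction, so a tag is never left activated with a stale review row, or the other way round.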