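1) So, do you have this right? When indexed, a multi-valued (list) property is limited to roughly 20K values. That applies to you, because you will run queries against subscriber IDs. See: "What is the maximum size/limit of ListProperty for Google App Engine datastore?"
To sum up, the limits you will face in this use case are:
- Indexed multi-valued property size (about 20K values)
- Entity size (1MB); unless you store blobs in the entity, this should not be a problem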
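To make the 20K figure concrete, here is a rough sketch of the arithmetic for how many entities a subscriber list would have to be split across. The class name `ShardMath` and its constant are hypothetical; the cap simply reuses the approximate value quoted above and is not an official constant.

```java
/** Hypothetical helper: back-of-the-envelope shard arithmetic, not part of any GAE API. */
final class ShardMath {
    // Assumption: the approximate cap on indexed list values quoted above.
    static final int MAX_INDEXED_VALUES_PER_ENTITY = 20_000;

    /** Number of entities needed to hold the given number of subscriber keys. */
    static int shardsNeeded(int subscribers) {
        if (subscribers <= 0) {
            return 0;
        }
        // Integer ceiling division.
        return (subscribers + MAX_INDEXED_VALUES_PER_ENTITY - 1) / MAX_INDEXED_VALUES_PER_ENTITY;
    }

    public static void main(String[] args) {
        // One million subscribers spread over entities of at most 20,000 keys each.
        System.out.println(ShardMath.shardsNeeded(1_000_000));
    }
}
```

With these assumptions, a tweet followed by one million users would need 50 such entities, which is why the list has to be split by hand.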
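2) The breakdown (splitting the data across entities) has to be handled manually, as I am not aware of any persistence framework that does it for you. Objectify is the only persistence framework specialized enough for the GAE datastore to possibly have this kind of feature, but I don't know for sure, since I don't use it.
3) You need a clear understanding of the constraints that led you to model your use case on the GAE datastore. It seems to me that you are still heavily influenced by relational database modeling:
Since you are planning for millions of users, you are building your application for scale and performance. These "joins" are exactly what you must avoid, which is why you are not using an RDBMS in the first place. The point is: DUPLICATE! Denormalize so that your data matches your use cases.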
public class UserEntity {
@Id Key id;
String name;
/** INDEXED : to retrieve a user by display name */
String displayName;
/** For the sake of the example below */
int tweetCount;
/**
* USE CASE : See a user's followers from their "profile" page.
*
* Easily get subscriber data from the user entity.
* Each subscriber's user data is duplicated into a UserSubscriberChildEntity.
* You just need to run an ancestor query on UserSubscriberChildEntity using the User id.
*/
List<UserSubscriberChildEntity> subscribers;
}
/** Duplicate user data in this entity, retrieved easily with an ancestor query */
public class UserSubscriberChildEntity {
/** The id of this entity */
@Id Key subscriberId;
/** Duplicate your User Entity data */
String name;
String displayName;
/** The id from the UserEntity referenced */
String userId;
}
public class TweetEntity {
@Id Key id;
/**
* The actual text message
*/
String tweetContent;
/**
* USE CASE : display the tweet maker name alongside the tweet content.
*
* Duplicate user data to prevent an expensive join when not needed.
* You will always need to display this along with the tweet content !
* Model your entity based on what you want to see when you display them
*/
String tweetMakerName;
String tweetMakerDisplayName;
/**
* USE CASE
* 1) to retrieve tweets MADE by a given user
* 2) In case you actually need to access the User entity
* (for example, if you remove this tweet and want to decrease the user tweet counter)
*
* INDEXED
*/
Key tweetMakerId;
/**
* USE CASE : display tweet subscribers from the "tweet page"
*
* Same as "UserSubscriberChildEntity", retrieve data fast by duplicating
*/
List<TweetSubscriberChildEntity> subscribers;
}
Now the core question: how do you retrieve "all the tweets one user has subscribed to"?
Shard your subscriptions across entities:
/**
* USE CASE : Retrieve tweets one user subscribed to
*
* Same goes for User subscription
*/
public class TweetSubscriberShardedEntity {
/** unused */
@Id Key shardKey;
/** INDEXED : Tweet reference */
Key tweetId;
/** INDEXED : Users reference */
List<Key> userKeys = new ArrayList<Key>();
/** INDEXED : subscriber count, to retrieve shards that are actually under the limitation of 20K */
int subscribersCount = 0;
/**
* Add a subscriber and increment subscribersCount
*/
public void addSubscriber(Key userId) {
userKeys.add(userId);
subscribersCount++;
}
}
An example tweet service tying everything together:
/**
* Pseudo code
*/
public class TweetService {
public List<TweetEntity> getTweetsSubscribed(Key userId) {
List<TweetEntity> tweetsFollowed = new ArrayList<TweetEntity>();
// Get all the subscriptions from a user
List<TweetSubscriberShardedEntity> shards = datastoreService.find("from TweetSubscriberShardedEntity where userKeys contains (userId)");
// Iterate over each subscription to retrieve the complete Tweet
for (TweetSubscriberShardedEntity shard : shards) {
TweetEntity tweet = datastoreService.get(TweetEntity.class, shard.getTweetId());
tweetsFollowed.add(tweet);
}
return tweetsFollowed;
}
public void subscribeToTweet(Key subscriberId, Key tweetId) {
TweetSubscriberShardedEntity shardToUse = null;
// Only get the first shard of this tweet with under 20000 subscribers.
// Do not filter on the subscriber here: they are not in any shard yet.
TweetSubscriberShardedEntity shardNotFull = datastoreService.find("
FROM TweetSubscriberShardedEntity
WHERE tweetId == tweetId
AND subscribersCount < 20000
LIMIT 1");
if (shardNotFull == null) {
// If no shard with free space exists, create one
shardToUse = new TweetSubscriberShardedEntity();
}
else {
shardToUse = shardNotFull;
}
// Link user and tweet; addSubscriber also keeps subscribersCount in sync
shardToUse.setTweet(tweetId);
shardToUse.addSubscriber(subscriberId);
// Save shard
datastoreService.put(shardToUse);
}
/**
* Hard to put in a transaction with so many entities updated !
* See cross entity group docs for more info.
*/
public void createTweet(UserEntity creator, TweetEntity newTweet) {
creator.tweetCount++;
newTweet.tweetMakerName = creator.name;
newTweet.tweetMakerDisplayName = creator.displayName;
newTweet.tweetMakerId = creator.id;
// Duplicate User subscribers to Tweet
for (UserSubscriberChildEntity userSubscriber : creator.subscribers) {
// Create a Tweet child entity
TweetSubscriberChildEntity tweetSubscriber = new TweetSubscriberChildEntity();
tweetSubscriber.name = userSubscriber.name;
// ... (duplicate all data)
newTweet.subscribers.add(tweetSubscriber);
// Create a shard with the previous method (subscriber first, then tweet)
subscribeToTweet(userSubscriber.subscriberId, newTweet.id);
}
// Update the user (tweet count)
datastoreService.put(creator);
// Create the new tweet and child entities (duplicated subscribers data)
datastoreService.put(newTweet);
}
}
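Since the service above is pseudo code against an unspecified `datastoreService`, here is a small in-memory sketch of just the shard-selection logic, so it can be run and tested without a datastore. Everything in it is an assumption made for illustration: the class names `SubscriptionShard` and `InMemoryShardStore` are hypothetical, plain strings stand in for datastore `Key`s, and a list scan stands in for the indexed queries.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory stand-in for one shard entity; strings stand in for datastore Keys. */
final class SubscriptionShard {
    final String tweetId;
    final List<String> userKeys = new ArrayList<>();

    SubscriptionShard(String tweetId) {
        this.tweetId = tweetId;
    }
}

/** Hypothetical in-memory model of the sharding logic; not a datastore client. */
final class InMemoryShardStore {
    private final int maxPerShard;
    private final List<SubscriptionShard> shards = new ArrayList<>();

    InMemoryShardStore(int maxPerShard) {
        this.maxPerShard = maxPerShard;
    }

    /** Adds the user to the first shard of the tweet that has room, creating a shard if none does. */
    void subscribe(String userId, String tweetId) {
        SubscriptionShard target = null;
        for (SubscriptionShard shard : shards) {
            if (!shard.tweetId.equals(tweetId)) {
                continue;
            }
            if (shard.userKeys.contains(userId)) {
                return; // already subscribed, nothing to do
            }
            if (target == null && shard.userKeys.size() < maxPerShard) {
                target = shard;
            }
        }
        if (target == null) {
            target = new SubscriptionShard(tweetId);
            shards.add(target);
        }
        target.userKeys.add(userId);
    }

    /** Returns the ids of the tweets the user subscribed to, in shard creation order. */
    List<String> tweetsSubscribedBy(String userId) {
        List<String> result = new ArrayList<>();
        for (SubscriptionShard shard : shards) {
            if (shard.userKeys.contains(userId)) {
                result.add(shard.tweetId);
            }
        }
        return result;
    }

    int shardCount() {
        return shards.size();
    }
}
```

A limit of 2 instead of 20000 makes the spill-over easy to observe: subscribing three users to the same tweet leaves two shards for that tweet. In the real datastore, the two loops would be replaced by the indexed queries on `tweetId`, `userKeys` and `subscribersCount`.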