2022-05-17 08:50:43

What are the key points of a news feed system?
follow and unfollow
the news feed is different for every user.

There are two models for building the news feed: pull (active) and push (passive).

Pull model:
When a user requests his news feed, the system fetches the latest 100 tweets of each user he follows and merges them into a feed of the 100 latest tweets (ordered by timestamp).
algorithm: getNewsFeed + merge K sorted arrays.

• followings = DB.getFollowings(user=request.user) // get the users this user follows
• news_feed = empty
• for follow in followings: // iterate over the followings
•     tweets = DB.getTweets(follow.to_user, 100) // latest 100 tweets of each one
•     news_feed.merge(tweets) // collect them
• sort(news_feed) // merge K sorted arrays by timestamp
• return news_feed

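time complexity analysis: with N followings, reading the news feed costs N DB reads plus the time of one K-way merge (K = N sorted lists). The merge runs in memory, so it is usually small next to the N DB reads.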
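
A minimal Python sketch of the pull read path described above. The in-memory dicts `FOLLOWINGS` and `TWEETS` and the helper `get_tweets` are hypothetical stand-ins for the DB calls in the pseudocode; the merge uses the standard library's `heapq.merge`, which assumes every input list is already sorted the same way (here: newest first).

```python
import heapq
from itertools import islice

# Hypothetical in-memory stand-ins for the DB tables in the notes.
FOLLOWINGS = {"alice": ["bob", "carol"]}
# Each timeline is stored newest first as (timestamp, author, text).
TWEETS = {
    "bob": [(5, "bob", "b2"), (1, "bob", "b1")],
    "carol": [(4, "carol", "c2"), (3, "carol", "c1")],
}


def get_tweets(user, limit):
    """Stand-in for DB.getTweets: up to `limit` latest tweets, newest first."""
    return TWEETS.get(user, [])[:limit]


def get_news_feed(user, limit=100):
    """Pull model: one read per following, then a K-way merge by timestamp."""
    timelines = [get_tweets(f, limit) for f in FOLLOWINGS.get(user, [])]
    # reverse=True tells heapq.merge the inputs are sorted largest-first,
    # which matches the newest-first timelines.
    merged = heapq.merge(*timelines, key=lambda t: t[0], reverse=True)
    # The merge is lazy, so only the first `limit` items are ever produced.
    return list(islice(merged, limit))
```

Calling `get_news_feed("alice", limit=2)` in this sketch reads both timelines but only materializes the two newest tweets.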
the principle of Pull:
Push Model:
A push happens when a new tweet is posted.
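time complexity analysis: posting a tweet costs 1 DB write for the tweet plus N DB writes for the fanout, where N is the number of followers; the fanout runs asynchronously.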
• postTweet(request, tweet_info):
•     tweet = DB.insertTweet(request.user, tweet_info)
•     AsyncService.fanoutTweet(request.user, tweet) // executed asynchronously
•     return success

• AsyncService::fanoutTweet(user, tweet):
•     followers = DB.getFollowers(user)
•     for follower in followers:
•         DB.insertNewsFeed(tweet, follower)
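
A minimal Python sketch of the push write path above, under some assumptions: the tables are plain in-memory containers, and a `deque` plus an explicit `run_fanout_worker()` call stands in for the async service (a real system would use a message queue and separate workers). All names here are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical in-memory stand-ins for the tables in the notes.
FOLLOWERS = {"bob": ["alice", "dave"]}
TWEETS = []                     # tweet table
NEWS_FEED = defaultdict(list)   # news feed table: owner -> tweets, newest first
FANOUT_QUEUE = deque()          # stands in for the async service's task queue


def post_tweet(user, text, timestamp):
    """Store the tweet, enqueue the fanout, and return without waiting."""
    tweet = (timestamp, user, text)
    TWEETS.append(tweet)                  # 1 write for the tweet itself
    FANOUT_QUEUE.append((user, tweet))    # fanout happens later
    return "success"


def run_fanout_worker():
    """Drain the queue: one news feed write per follower of each author."""
    while FANOUT_QUEUE:
        user, tweet = FANOUT_QUEUE.popleft()
        for follower in FOLLOWERS.get(user, []):
            # Assumes tweets are fanned out in timestamp order, so inserting
            # at the front keeps each feed newest first.
            NEWS_FEED[follower].insert(0, tweet)


def get_news_feed(user, limit=100):
    """Push model read path: a single lookup in the precomputed feed."""
    return NEWS_FEED.get(user, [])[:limit]
```

The point of the split is visible in the sketch: `post_tweet` returns immediately, and followers only see the tweet once the worker has processed the queue.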

the principle of push:
facebook: pull
ig: push+pull
twitter: pull
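That said, how do we choose? Either model works, but each of its drawbacks needs a solution. This is handled in Scale, the last step of the 4S of system design.
  1. Celebrity users: with the push model, the whole fanout may take several hours. Since push does not work, do we switch to pull? It is not that simple to just switch models!
    The right line of thinking: first try to optimize with minimal changes to the existing model, for example by adding a few machines dedicated to push tasks. Or evaluate long-term growth to judge whether switching the whole model is worth it.
    A better answer: for ordinary users we only push. For celebrity users we do not push to all followers; instead, when a follower needs the feed, we fetch from the celebrity's timeline and merge the result into that follower's news feed.
    2. Celebrity users whose follower count fluctuates, shrinking or changing in some other way.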
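
The "better answer" in item 1 can be sketched as follows. This is only an illustration under assumptions: `CELEBRITY_THRESHOLD`, `is_celebrity`, and the in-memory dicts are hypothetical, and the fanout is done inline rather than asynchronously to keep the example short.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 2  # hypothetical cutoff; a real one would be far larger

# Hypothetical in-memory stand-ins for the tables.
FOLLOWERS = {"star": ["alice", "bob", "carol"], "bob": ["alice"]}
FOLLOWINGS = {"alice": ["star", "bob"]}
TIMELINES = {}   # author -> own tweets, newest first
NEWS_FEED = {}   # owner -> pushed tweets, newest first


def is_celebrity(user):
    return len(FOLLOWERS.get(user, [])) > CELEBRITY_THRESHOLD


def post_tweet(user, text, timestamp):
    """Always write to the author's timeline; fan out only for ordinary users."""
    tweet = (timestamp, user, text)
    TIMELINES.setdefault(user, []).insert(0, tweet)
    if not is_celebrity(user):
        for follower in FOLLOWERS.get(user, []):
            NEWS_FEED.setdefault(follower, []).insert(0, tweet)


def get_news_feed(user, limit=100):
    """Merge the pushed feed with timelines pulled from followed celebrities."""
    sources = [NEWS_FEED.get(user, [])[:limit]]
    for followed in FOLLOWINGS.get(user, []):
        if is_celebrity(followed):
            sources.append(TIMELINES.get(followed, [])[:limit])
    merged = heapq.merge(*sources, key=lambda t: t[0], reverse=True)
    return list(islice(merged, limit))
```

Note that this sketch recomputes the celebrity flag from the live follower count on both the write and the read path. That is the situation item 2 points at: if an author crosses the threshold between posting and reading, a tweet that was never pushed may also not be pulled, so the flag would need to be stored or changed more conservatively.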

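push: limited resources, low real-time requirements, users post few tweets, bidirectional friendships (i.e. no celebrities, e.g. WeChat Moments)
pull: ample resources, high real-time requirements, users post heavily, the celebrity problem exists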