Today, Twitter open sourced the code for its recommendation algorithm, the company announced in a blog post. The flow chart above shows the main components of the algorithm, and broadly describes the decision process by which Twitter shows users tweets. Find the GitHub repository here.
In its blog post, Twitter indicates that the main goal of the recommendation algorithm is essentially to optimize user engagement:
The recommendation pipeline is made up of three main stages that consume these features:
(1) Fetch the best Tweets from different recommendation sources in a process called candidate sourcing.
(2) Rank each Tweet using a machine learning model.
(3) Apply heuristics and filters, such as filtering out Tweets from users you’ve blocked, NSFW content, and Tweets you’ve already seen.
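The three stages quoted above can be sketched as a small pipeline. This is only an illustration, not Twitter's code: the `Tweet` class, the function names, and the caller-supplied `score` function are all hypothetical stand-ins, and the real ranking stage is a machine learning model rather than a simple sort key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    # Hypothetical minimal tweet record, for illustration only.
    tweet_id: int
    author: str
    nsfw: bool = False


def source_candidates(pool, limit):
    """Stage 1 (candidate sourcing): here we simply take the first `limit` tweets."""
    return pool[:limit]


def rank(candidates, score):
    """Stage 2 (ranking): order candidates by a score function, highest first."""
    return sorted(candidates, key=score, reverse=True)


def apply_filters(ranked, blocked, seen):
    """Stage 3 (heuristics and filters): drop blocked authors, NSFW, and seen tweets."""
    return [
        t for t in ranked
        if t.author not in blocked and not t.nsfw and t.tweet_id not in seen
    ]


def build_timeline(pool, score, blocked, seen, limit=1500):
    """Chain the three stages in the order the blog post lists them."""
    candidates = source_candidates(pool, limit)
    return apply_filters(rank(candidates, score), blocked, seen)
```

Keeping each stage as its own function mirrors the way the blog post describes the flow: sourcing narrows the pool, ranking orders it, and filtering removes what should never be shown.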
The blog post goes into detail about each of these three stages. For example, candidate sourcing, as the company refers to it, is the process by which the algorithm identifies potential tweets to surface in the recommendation timeline. Currently, the company starts with a target of 1,500 tweets, which is on average split evenly between people you follow and people you don’t:
For each request, we attempt to extract the best 1500 Tweets from a pool of hundreds of millions through these sources. We find candidates from people you follow (In-Network) and from people you don’t follow (Out-of-Network). Today, the For You timeline consists of 50% In-Network Tweets and 50% Out-of-Network Tweets on average, though this may vary from user to user.
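As a rough illustration of that split, the sketch below fills a fixed quota from each source. It is a simplification under stated assumptions: the blog post describes 50/50 as an average that varies by user, not a hard quota, and the function name and the `in_network_share` parameter are hypothetical.

```python
def mix_candidates(in_network, out_of_network, total=1500, in_network_share=0.5):
    """Take up to `total` candidates, split between the two sources.

    Both input lists are assumed to be ordered best-first already.
    """
    # Quota for tweets from followed accounts, capped by what is available.
    n_in = min(len(in_network), round(total * in_network_share))
    # Whatever is left of the budget goes to Out-of-Network tweets.
    n_out = min(len(out_of_network), total - n_in)
    return in_network[:n_in] + out_of_network[:n_out]
```

If one source runs short, this version lets the other source fill the remaining slots, which is one possible way the share could drift away from 50% for a given user.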
The algorithm treats In-Network and Out-of-Network tweets differently. In-Network tweets are ranked using Real Graph, a model that predicts how likely two users are to engage with each other. Identifying relevant Out-of-Network tweets is “trickier,” and is focused on finding tweets that will be most relevant to the user. The company decides what’s most relevant by attempting to show users tweets that people they follow engaged with, and tweets from people who have liked tweets similar to those the user has liked. Twitter also uses “embedding spaces” to determine what’s relevant:
Embedding space approaches aim to answer a more general question about content similarity: What Tweets and Users are similar to my interests? …
There are 145k communities, which are updated every three weeks. Users and Tweets are represented in the space of communities, and can belong to multiple communities.
We can embed Tweets into these communities by looking at the current popularity of a Tweet in each community. The more that users from a community like a Tweet, the more that Tweet will be associated with that community.
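The idea in the quote above can be sketched by counting likes per community. This is a toy illustration, not the method in Twitter's repository: the function name, the `user_communities` mapping, and the choice to normalize counts into shares are all assumptions, since the blog post only says that more likes from a community mean a stronger association.

```python
from collections import Counter


def embed_tweet(liker_ids, user_communities):
    """Represent a tweet as its share of likes coming from each community.

    `user_communities` maps a user id to the communities that user belongs to;
    a user can belong to several, as the blog post notes.
    """
    counts = Counter()
    for user_id in liker_ids:
        for community in user_communities.get(user_id, ()):
            counts[community] += 1
    total = sum(counts.values())
    if total == 0:
        return {}  # no likes from known users: no community association yet
    # Normalizing to shares is an assumption made for this sketch.
    return {community: n / total for community, n in counts.items()}
```

For example, a tweet liked by three users in a hypothetical "pop" community, one of whom also belongs to "news", ends up mostly associated with "pop" and weakly with "news".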
“At the end of the day you should be able to trust what you see, and know that it is not manipulated, or is the least manipulated information in the world,” CEO Elon Musk said in a Twitter Spaces today upon release of the algorithm’s code.
“Our optimization is unregretted user-minutes,” he said. “We don’t want users to have a hangover when they’re done.”
“The goal is to build trust through transparency with users,” Musk said in the Twitter Space. “I don’t think you should trust any social media algorithm that is a black box and you don’t know what’s going on in there. We’re trying to be the most trusted place on the internet, where you know why things are happening on Twitter. And it [should be] the least game-able system on the internet, is our goal.”
To "open source" something means to make the source code, design, or content of a project, product, or software freely available to the public. By doing this, Twitter is now allowing anyone to view, modify, and distribute the material, typically under specific licensing conditions that ensure the open nature of the project.
When a project or software is open-sourced, it encourages collaboration, innovation, and transparency. People from around the world can contribute their ideas, skills, and expertise to improve the project, fix bugs, or create new features. This process can lead to faster development, increased reliability, and a stronger sense of community involvement.
“It’s going to be quite embarrassing [at first], and people are going to find a lot of mistakes that we are going to fix quickly,” Musk said.
Amjad Masad says the algorithm identifies four categories of Twitter users: power users, Democrats, Republicans, and Elon Musk himself.
Tanay Jaipuria found code that inserts potentially irrelevant tweets in the recommendation timeline, just because the user has Twitter Blue.
@0xCygaar discovered, potentially, five things that determine a tweet’s reach: the account’s number of blocks, mutes, abuse reports, spam reports, and unfollows.
This article is being updated throughout the day today, and some parts were drafted by GPT-4.
-Brandon Gorrell