Using AI-generated content to introduce a project that emphasizes human data.

Guess who I'm talking about?

Recently, I came across the slogan on the Sapien homepage, 'Powering AI with Human Data', which I found quite interesting. I then searched Twitter for related articles, but they were all AI-generated content tagged with cookies.

The posts discussing a Human Data project aren't using Human Data themselves 😅😅


Let me briefly explain this project, using actual Human Data:

Sapien is a data labeling factory, very similar to the data labeling business of @SaharaLabsAI: the platform publishes labeling tasks, and users complete them to earn rewards. The reward and punishment model is also very similar: tasks are graded, users are graded too, and users with higher credit scores can take on more advanced tasks.

If you want to know more about Sahara, you can check my previous tweets.

However, there are differences:

First point: Sapien introduces a staking and reputation system.

Before starting a task, users need to stake a certain amount of SPN tokens. On completion, if the labeling results are assessed as high quality, users are rewarded with points and their reputation level increases; otherwise, they may lose a portion of their staked tokens. This pushes users to take tasks seriously.
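
To make the stake-then-settle flow concrete, here is a minimal sketch of how I picture it. The threshold, slash fraction, point values, and every name in it are my own assumptions for illustration, not Sapien's actual parameters or code.

```python
from dataclasses import dataclass

# All numbers below are illustrative assumptions, not Sapien's real parameters.
QUALITY_THRESHOLD = 0.8   # hypothetical score needed to count as "high quality"
SLASH_FRACTION = 0.25     # hypothetical share of the stake lost on a failed task
REWARD_POINTS = 10        # hypothetical points granted for a passed task


@dataclass
class Labeler:
    balance: float        # SPN tokens the user holds
    points: int = 0
    reputation: int = 0


def settle_task(user: Labeler, stake: float, quality: float) -> Labeler:
    """Settle one task: reward a high-quality result, slash a low-quality one."""
    if stake > user.balance:
        raise ValueError("stake exceeds balance")
    if quality >= QUALITY_THRESHOLD:
        # High quality: the stake is returned intact, points and reputation go up.
        user.points += REWARD_POINTS
        user.reputation += 1
    else:
        # Low quality: part of the stake is forfeited.
        user.balance -= stake * SLASH_FRACTION
    return user
```

For example, with these made-up numbers, a user holding 100 SPN who stakes 20 and fails the quality check would end up with 95 SPN and no new points.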

Second point: a task push algorithm.

The official introduction states that there will be a task push algorithm, but there hasn't been any mention of this in the documentation yet. This is actually what I want to see; since we emphasize 'Human Data', we need to pay attention to people. People have personalities and interests, and the premise of making tasks entertaining is to cater to their preferences. I hope Sapien can do well in task algorithms regarding search and recommendation, rather than just simple task distribution.

By the way, let me introduce the concept of 'search and recommendation':

The full term covers search, advertising, and recommendation; the shorthand takes the first character of each word.

Search: When a user inputs a search term (query) in the search box, they hope to find the most relevant document(s) from a collection of documents.

Advertising: When you are scrolling through your friends' Moments, the system shows you an advertisement. The advertising system selects the ad you are most likely to click (convert on) from the many ads submitted by various advertisers.

Recommendation: When a user performs a swipe up action on Douyin, the system launches the next video, hoping to find the video that the user currently wants to watch from a large collection of videos.
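
Seen from far away, all three systems do the same thing: score a pool of candidates and surface the best one. The sketch below is a heavy simplification of that shared shape. The scoring functions and all the numbers are made up for illustration; real systems use learned models rather than lookups like these.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def pick_best(candidates: Iterable[T], score: Callable[[T], float]) -> T:
    """Return the candidate with the highest score."""
    return max(candidates, key=score)


# Search: score a document by how many query words it contains.
def keyword_overlap(query: str) -> Callable[[str], float]:
    words = set(query.lower().split())
    return lambda doc: float(len(words & set(doc.lower().split())))


docs = ["how to stake SPN tokens", "data labeling for beginners"]
best_doc = pick_best(docs, keyword_overlap("data labeling"))

# Advertising: score an ad by a (made-up) predicted click probability.
predicted_click = {"ad_a": 0.02, "ad_b": 0.07}
best_ad = pick_best(predicted_click, lambda ad: predicted_click[ad])

# Recommendation: score a video by a (made-up) predicted interest for this user.
predicted_interest = {"cooking_clip": 0.4, "cat_clip": 0.9, "news_clip": 0.1}
next_video = pick_best(predicted_interest, lambda v: predicted_interest[v])
```

The only thing that changes between the three is where the score comes from: the query, the advertiser's goal, or the user's own history.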

So why do we find it hard to stop scrolling through Douyin or Xiaohongshu? It's actually because the search and recommendation algorithm is enticing us to keep scrolling. For products, this can greatly increase user stickiness, engagement, and time spent.

Returning to Sapien: if data labeling tasks could be assigned through a similar algorithm, so that users are matched with the tasks they most want to do, that would be the innovation I want to see.
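
As a rough idea of what interest-based task matching could look like, here is a small sketch that ranks tasks by how much their tags overlap with a user's interests (Jaccard similarity). This is purely my own illustration: the tag sets, task names, and functions are hypothetical, and Sapien's documentation does not describe its push algorithm.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two tag sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend_tasks(
    interests: set[str], tasks: dict[str, set[str]], top_k: int = 2
) -> list[str]:
    """Return the names of the top_k tasks whose tags best match the interests."""
    ranked = sorted(
        tasks,
        key=lambda name: jaccard(interests, tasks[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical task board: task name -> descriptive tags.
task_board = {
    "tag_cat_photos": {"images", "animals"},
    "transcribe_audio": {"audio", "speech"},
    "draw_boxes_on_cars": {"images", "vehicles"},
}

picks = recommend_tasks({"images", "animals"}, task_board)
```

A real version would also learn from which tasks a user actually finishes and enjoys, but even simple tag matching is a step beyond handing everyone the same list.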

Currently, I can only see a few tasks on the task board, even though the official website states that the total number of tasks exceeds 100M, so the board is probably still in the experimental stage. I also hope @JoinSapien sees this: I could make some contributions to task distribution, and I see data as a very promising field.

#AI #Sapien