For more than 20 years, eBay has followed the same traditional recommendation thinking common across e-commerce, using it to recommend products on the entry page, the product search page, the price-comparison section, the similar-items section, the cross-selling section, the promoted-products section, and even the so-called "personalized recommendation" section.
This traditional practice is similarity-based recommendation: it takes the "main product" that the customer has selected or is viewing as the benchmark, compares its features against those of candidate products, ranks the candidates by similarity, and recommends the most similar products to the customer.
Over the years, eBay has continued to strengthen this recommendation technology: from calculating similarity with TF-IDF, which compares word frequencies, to using NLP models to compare the semantics of product titles. It has also kept adding product features of different dimensions, to compare further aspects of product similarity. In addition, depending on the recommendation scenario, the main product may be replaced with other information, such as the customer's product search query or recently viewed products.
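To make the baseline concrete, here is a minimal, illustrative sketch of TF-IDF similarity over product titles, built with only the Python standard library. This is not eBay's implementation; the corpus, tokenization, and weighting scheme are simplified assumptions.

```python
import math
from collections import Counter

def tfidf_vectors(titles):
    """Build sparse TF-IDF vectors (dicts) for a small corpus of product titles."""
    docs = [t.lower().split() for t in titles]
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))  # document frequency per word
    vecs = []
    for d in docs:
        tf = Counter(d)
        # term frequency scaled by inverse document frequency
        vecs.append({w: (tf[w] / len(d)) * math.log(n / df[w]) for w in tf})
    return vecs

def cosine(a, b):
    """Cosine similarity between two sparse vectors represented as dicts."""
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

titles = [
    "apple iphone 13 case silicone",
    "apple iphone 13 case leather",
    "mens running shoes size 10",
]
v = tfidf_vectors(titles)
```

With these toy titles, the two phone-case listings score far higher against each other than against the shoes listing, which is exactly the "main product vs. candidates" ranking the article describes.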
In strengthening its recommendation technology, eBay has not hesitated to invest heavily in time, manpower, and hardware, repeatedly running large-scale, fine-grained tests and optimizations. The company often performs thousands of experiments for a single feature, or trains multiple versions of an AI model, just to make product recommendations as relevant as possible. For example, to improve the similarity recommendation system's understanding of product descriptions, it used more than 3 billion product titles to train two proprietary NLP models, and fine-tuned them again before putting them into production.
These practices have delivered some results. However, no matter how the technology evolves, in the final analysis it still compares feature similarity between products, on the assumption that higher similarity means higher relevance, and higher relevance means stronger customer purchase intent.
Train an exclusive NLP model on 3 billion product titles to strengthen semantic similarity understanding
For more than 25 years, eBay has experimented with various product-similarity recommendations. The most commonly compared product feature is the product title. For years, its methods for comparing titles were also traditional: statistical techniques such as TF-IDF, which has been around for some 50 years, and the Jaccard similarity coefficient, which compares the intersection and union of the titles' word sets.
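The Jaccard coefficient mentioned above can be sketched in a few lines; this is the textbook set-based formula applied to title word sets, not eBay's production code.

```python
def jaccard(title_a, title_b):
    """Jaccard similarity: |intersection| / |union| of the two titles' word sets."""
    a, b = set(title_a.lower().split()), set(title_b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

# Two titles sharing 4 of 6 distinct words score 4/6; disjoint titles score 0.
score = jaccard("apple iphone 13 case silicone", "apple iphone 13 case leather")
```

Unlike TF-IDF, this measure ignores word frequency and rarity entirely, which is one reason title comparison later moved to learned embeddings.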
It wasn't until the 2020s that eBay began switching to NLP models for embedding-vector comparison. It adopted the BERT family, the dominant models at the time. After experimenting with RoBERTa, an optimized version of BERT, eBay found that inferring similarity with an NLP model significantly outperformed the traditional statistical methods. To make similarity inference as accurate as possible, it used its own 3 billion historical product titles, together with roughly 2.5 billion English, German, French, Italian, and Spanish sentences from Wikipedia, as training data to build its own exclusive BERT model, eBERT. Then, to save computing cost, eBay used knowledge distillation, a model-compression method with eBERT as the teacher network, to train a lightweight student network, MicroBERT.
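The teacher-student distillation step can be illustrated with the standard soft-target loss: the student is trained to match the teacher's temperature-softened output distribution. This is a generic, minimal sketch in plain Python, not eBay's training code; the logits and temperature are assumed values.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(z / temperature) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.
    Minimizing this pushes the student (e.g. MicroBERT) to mimic the
    teacher (e.g. eBERT) at a fraction of the inference cost."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

A higher temperature softens the distributions, exposing the teacher's relative preferences among non-top classes, which is the extra signal distillation exploits.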
To further improve MicroBERT's similarity inference, eBay added product co-click data and fine-tuned the model with the InfoNCE loss function, shortening the embedding-vector distance between similar products. The fine-tuned model, called Siamese MicroBERT, was then deployed in actual recommendation scenarios.
However, even with the popular BERT model and a variety of optimization and improvement techniques, the model eBay trained on product-title data still produces recommendations rooted in traditional product-similarity thinking.
Integrate image and text embedding vectors to compare multimodal product features
After comparing product similarity with text, eBay began trying to compare products by image similarity as well.
In the past, although eBay used AI models to generate both image and text embedding vectors, the two were stored and processed independently and were difficult to use together. In practice, the recommendation system still relied mainly on the text embedding vector of the product title. Recommending from title-text similarity alone can surface products whose descriptions are close but whose images are dissimilar, low quality, or inconsistent with the text, causing customers to lose interest.
To apply image vectors effectively in recommendation, eBay projects the image and text vectors into the same vector space, integrates them into a multimodal embedding vector, and then compares similarity. It used the TransH knowledge-graph model to project the embedding vectors onto the same hyperplane, ensuring that they represent the same product, and then used a triplet loss function to minimize the distance between embedding vectors of similar products. After these steps, eBay can use both image and text embedding vectors to compare product similarity.
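The two operations named above, a TransH-style hyperplane projection and a hinge triplet loss, can be sketched in a few lines of plain Python. This is an illustrative reading of those techniques under simplified assumptions (unit normal vector, Euclidean distance), not eBay's actual model.

```python
import math

def project_to_hyperplane(v, w):
    """TransH-style projection of vector v onto the hyperplane with
    unit normal w: v - (v . w) w."""
    d = sum(x * y for x, y in zip(v, w))
    return [x - d * y for x, y in zip(v, w)]

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss: push the similar item (positive) closer to the
    anchor than the dissimilar item (negative) by at least `margin`."""
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)
```

Projecting both modalities onto a shared hyperplane first, then applying the triplet loss, is what lets image and text vectors be compared in one space.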
eBay used a Siamese two-tower model to perform two rounds of embedding-vector comparison, and from the similarity calculated the probability that the main product and the recommended product would be clicked together; the higher the similarity, the higher the probability that customers would click the recommendation.
The first comparison measures the distance between a product's own image and text embedding vectors, to rule out listings where image-text inconsistency would weaken the similarity. The second compares the image embedding vectors of the main product and the candidate product.
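The two comparisons above amount to a consistency gate followed by a similarity score. Here is a minimal sketch of that flow using cosine similarity; the threshold value and the hard gating are illustrative assumptions, not eBay's actual scoring function.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def two_stage_score(main_img, cand_img, cand_txt, consistency_threshold=0.5):
    """Stage 1: drop candidates whose own image and text embeddings disagree
    (image-text inconsistency). Stage 2: score surviving candidates by
    image similarity to the main product."""
    if cosine(cand_img, cand_txt) < consistency_threshold:
        return 0.0  # inconsistent listing: do not recommend
    return cosine(main_img, cand_img)
```

A production system would likely blend the consistency signal into the ranking score rather than hard-gate it; the hard cut here just makes the two stages explicit.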
Since then, eBay's recommendation model has developed from text-only to multimodal, combining text with image features to compare product similarity. Although this was a big step forward for eBay's recommendation technology, it is still focused on product similarity, and whatever product similarity cannot achieve, combining images with text still cannot achieve.
Transform customer behavior into product features, and project product information into the same vector space
eBay is not unaware of the importance of customer data; it has used customer data to handle recommendations when no main product is available. But limited by the framework of product-similarity thinking, even with customer data in hand, eBay still used it to strengthen product similarity, never really exploiting the value of the diverse interest dimensions that customer data contains.
For example, in homepage sections such as "similar to your recent views," eBay sometimes has no main-product information from the customer's browsing. For these cases it uses a so-called "customer-oriented" ranking model with two parts, "deep" and "wide." On the deep side, eBay converts the products a customer browses and searches for, along with the frequency and sequence of those behaviors, into corresponding product features, forming a "customer vector" that plays the role of the main product in the usual recommendation comparison.
On the wide side, eBay memorizes common product-sales and customer-behavior features to create general (rather than personalized) signals, such as popular products, related products, and customer habits. Combining the outputs of these two parts, the model can recommend products based on past customer behavior even when the customer has not clicked on a main product.
Although eBay used customer behavior data to build a "customer vector" for recommendation, that vector merely extracts a series of product features from behavior and compares them for similarity against candidate products; it still does not break away from product similarity as the basis for recommendation.
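One simple way to realize the "customer vector" described above is a recency-weighted average of the embeddings of recently viewed products, so that it can stand in for the main product in the usual similarity comparison. This is a hypothetical sketch of that aggregation; the decay scheme is an assumption, not eBay's documented method.

```python
def customer_vector(viewed_embeddings, decay=0.8):
    """Aggregate embeddings of recently viewed products into one 'customer
    vector', weighting recent views more heavily.
    `viewed_embeddings` is ordered oldest -> newest."""
    dim = len(viewed_embeddings[0])
    acc = [0.0] * dim
    total = 0.0
    weight = 1.0
    for emb in reversed(viewed_embeddings):  # newest first
        for i in range(dim):
            acc[i] += weight * emb[i]
        total += weight
        weight *= decay  # older views count less
    return [x / total for x in acc]

# Newest view dominates: weights 1.0 (newest) and 0.5 (older) with decay=0.5.
cv = customer_vector([[0.0, 1.0], [1.0, 0.0]], decay=0.5)
```

The resulting vector can be fed to exactly the same similarity comparison as a main product, which is the article's point: the customer data ends up serving product similarity rather than a genuinely different signal.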
The product-similarity paradigm limits both thinking and recommendation effectiveness
eBay's emphasis on product similarity is deeply ingrained. Whenever it comes to optimizing the recommendation method, the team's first thought is: "How do we strengthen similarity inference?"
For example, eBay has a paid ad placement that recommends similar products. The recommendation model behind it is trained with gradient-boosted trees and uses a variety of ranking signals, including popularity, product quality, and similarity; to improve the ad conversion rate, the impact of popularity on the ranking was amplified.
As a result, products recommended through this placement could differ greatly from the main product; on a watch listing, for instance, the recommendations included products whose images and appearance were far from the main product.
When the eBay team discovered this problem, they directly chose to increase the weight of product similarity, instead of studying and fine-tuning the weights of the different factors in the model. They mistakenly believed that simply boosting similarity would make the recommendations more relevant and more likely to lead consumers to buy.
They even deliberately excluded the influence of popularity, fine-tuning a new model to filter the results of the old one. The new model removed all popularity-related ranking features and added an objective based on purchase probability, giving higher weight to products with high similarity scores. However, after the new model was applied, customer purchase rates did not rise as expected; it was effective only for some products.
This result shows that other influencing factors needed to be considered. But whenever the eBay team found the recommendation mechanism going wrong, they habitually treated product similarity as a panacea and thought about solutions from that angle first.
eBay's approach to improving the recommendation mechanism was not to experiment on "which recommendation element is wrong," but only on "how, and by how much, to strengthen the similarity weight." This over-emphasis on product similarity limited the engineering team's thinking.
Over the years, eBay has continued to improve its recommendation mechanism, and has indeed made its similarity-based approach more effective. However, Nitzan Mekel-Bobrov admits that modernizing the old practices is not enough to break the bottleneck eBay's recommendation mechanism faces. Even if similarity recommendation is pushed to the extreme, the scenarios this way of thinking could never handle in the first place still cannot be handled.
He believes that for eBay's recommendation mechanism to make a breakthrough, it must fundamentally change its thinking. First, when defining product relevance, it must go beyond its 25-year obsession with similarity and consider more factors.
Beyond that, eBay has to think outside the platform's own taxonomy. Otherwise, when the products a customer is interested in are not classified in the same category by the platform, a product-similarity recommender still cannot recommend them, even when the relevance is obvious to anyone. The new thinking should start from the customer's point of view: try to understand how customers perceive the products, and adjust product categories and solutions according to each customer's interests, to achieve hyper-personalized product recommendations.