"you may also like"? Find out the routine, and come to DIY a recommendation system yourself

Posted May 25, 2020 · 15 min read


Although June has not yet arrived, the summer promotions of the major e-commerce sites are already in full swing: discounts, spend-more-save-more deals, flash sales, coupons... And after you have successfully emptied your shopping cart, there is the ever-present "you may also like" section. If you shop often, you will notice that these recommendations usually fit your taste quite well. How do e-commerce sites pull this off? The answer lies in the concept of a "recommendation system".

How does a product recommendation system work?

A recommendation system lets an e-commerce website supply product information and suggestions to customers, helps users decide what to buy, and simulates a salesperson guiding the customer through the purchase. Personalized recommendation goes a step further: it suggests information and products of interest to each user based on that user's characteristics and purchase behavior.
The earliest recommendation algorithms were usually built on association rules; the famous "beer and diapers" story is the classic case of using association rules to recommend products and drive sales. Algorithms such as Apriori (proposed by researchers at IBM) were born for exactly this kind of frequent-itemset mining, and Amazon became famous for applying such recommendations at scale.

When we buy books on Amazon, we often see the following two prompts:

  1. "These books are frequently bought together," often with a bundle discount;
  2. "Customers who bought this book also bought" other books.

Amazon mines the massive purchase records on its platform, discovers these patterns, and then applies them to actual sales. The results showed that this kind of algorithmic optimization played a large part in improving Amazon's performance at the time.
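The core of this "frequently bought together" mining can be sketched in a few lines of Python. The toy transactions below are invented for illustration; a real miner such as Apriori additionally prunes infrequent single items before generating candidate pairs, which is what makes it tractable on massive purchase logs.

```python
from collections import Counter
from itertools import combinations

# Invented toy baskets, echoing the "beer and diapers" story.
transactions = [
    {'beer', 'diapers', 'chips'},
    {'beer', 'diapers'},
    {'diapers', 'wipes'},
    {'beer', 'chips'},
]

# Count how often each unordered pair of items appears in the same basket.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support = fraction of baskets containing the pair; keep frequent pairs.
min_support = 0.5
frequent = {pair: count / len(transactions)
            for pair, count in pair_counts.items()
            if count / len(transactions) >= min_support}
# frequent now maps e.g. ('beer', 'diapers') to its support of 0.5
```

Pairs that clear the support threshold become "bought together" suggestions; a full association-rule miner would also compute confidence before recommending.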

Today, as e-commerce keeps expanding and product catalogs grow rapidly, customers have to spend a lot of time finding the goods they want to buy. Wading through large amounts of irrelevant information and products will inevitably drive away consumers drowning in information overload. Personalized recommendation systems emerged to solve exactly these problems.

A personalized recommendation system is an advanced business-intelligence platform built on large-scale data mining, which helps e-commerce websites provide their customers with fully personalized decision support and information services. Unlike traditional rule-based recommendation, personalized recommendation algorithms usually apply machine learning, or even deep learning, to thoroughly mine user attributes and behavior before making recommendations.
Among the many recommendation algorithms in common use, the most classic is recommendation based on matrix factorization. The idea is simple: every user and every item has its own latent characteristics. Matrix factorization decomposes the rating matrix into a user-feature matrix and a feature-item matrix, yielding each user's preferences and each item's characteristics. The idea has also been extended and generalized into deep learning and embeddings, which makes the resulting models both more accurate and more flexible.
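As a concrete illustration of the factorization idea, here is a minimal NumPy sketch. The toy rating matrix, factor count, learning rate, and iteration count are all invented for this example; production systems use regularized, stochastic training on far larger and sparser matrices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented toy rating matrix: 4 users x 5 items, 0 means "not rated".
R = np.array([[5, 3, 0, 1, 0],
              [4, 0, 0, 1, 0],
              [1, 1, 0, 5, 4],
              [0, 1, 5, 4, 0]], dtype=float)
mask = R > 0                            # only observed ratings contribute

k = 2                                   # number of latent factors
U = 0.1 * rng.standard_normal((4, k))   # user-feature matrix
V = 0.1 * rng.standard_normal((5, k))   # item-feature matrix

lr = 0.01
for _ in range(5000):                   # plain full-batch gradient descent
    E = mask * (R - U @ V.T)            # error on observed entries only
    U_grad, V_grad = E @ V, E.T @ U
    U += lr * U_grad
    V += lr * V_grad

pred = U @ V.T                          # reconstructed, completed rating matrix
rmse = np.sqrt((mask * (R - pred) ** 2).sum() / mask.sum())
```

After training, the previously unrated (zero) positions of `pred` hold predicted scores, and ranking those predictions per user is exactly what a recommender does.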

Try it out: use Amazon SageMaker to build a Gluon-based recommendation system

The walkthrough below uses Amazon SageMaker, a fully managed service that helps developers and data scientists build, train, and deploy ML models. It covers the entire ML workflow: labeling and preparing data, choosing an algorithm, training the model, tuning and optimizing it for deployment, and running predictions.

This solution is based on the Gluon API. Gluon is an open-source deep learning interface jointly launched by Amazon and Microsoft: a clear, concise, and simple yet powerful deep learning API that speeds up learning deep learning without forcing developers to worry about the underlying framework. Gluon provides a flexible interface that simplifies prototyping, building, training, and deploying deep learning models without sacrificing training speed.

The following sections show how to use Amazon SageMaker's custom-script mode (Bring Your Own Script, BYOS for short) to run a Gluon training job (MXNet backend) and deploy the resulting model for inference.

First, let's look at how to run this project in an Amazon SageMaker notebook: run the training job locally first, then deploy it by calling the relevant Amazon SageMaker APIs directly.

Solution Overview

In this example, we will use Amazon SageMaker to do the following:

  1. Environment preparation
  2. Download the dataset with a Jupyter notebook
  3. Preprocess the data
  4. Train on the local machine
  5. Use Amazon SageMaker BYOS for model training
  6. Managed deployment and inference testing
1. Environment preparation

First, create an Amazon SageMaker notebook instance. The ml.p3.2xlarge instance type is recommended, because in this example the local-machine training step is used to test our code. It is also recommended to increase the volume size to 10 GB or more, since the project needs to download some additional data.


After the notebook starts, open a terminal on its page and run the following commands to download the code. Alternatively, find Introduction to Applying Machine Learning/gluon_recommender_system.ipynb under SageMaker Examples and click Use to run the code.

cd ~/SageMaker
git clone https://github.com/awslabs/amazon-sagemaker-examples.git
cd amazon-sagemaker-examples/introduction_to_applying_machine_learning/gluon_recommender_system
2. Download the dataset with a Jupyter notebook

This article uses an official Amazon open dataset containing Amazon customers' ratings of roughly 160,000 digital videos, with scores from 1 to 5. You can visit the [dataset homepage](https://s3.amazonaws.com/amazon-reviews-pds/readme.html) for the complete data description and download instructions. Since this dataset is quite large, we will store it in a temporary directory.

!mkdir /tmp/recsys/
!aws s3 cp s3://amazon-reviews-pds/tsv/amazon_reviews_us_Digital_Video_Download_v1_00.tsv.gz /tmp/recsys/
3. Data preprocessing

After downloading, the data can be read, explored, and preprocessed with Python's Pandas library.

First, run the following code to load data:

df = pd.read_csv('/tmp/recsys/amazon_reviews_us_Digital_Video_Download_v1_00.tsv.gz', delimiter='\t', error_bad_lines=False)

You will then see results like the following (the data has many columns, so the screenshot is truncated):


We can see that the dataset contains many features (columns); the meaning of each column is as follows:

  • marketplace: two-letter country code, here "US"
  • customer_id: a random identifier for the user who posted the review, unique per user
  • review_id: unique identifier for the review
  • product_id: Amazon standard product code
  • product_parent: parent product code; many products belong to the same parent product
  • product_title: product description
  • product_category: product category
  • star_rating: the review's star rating, from 1 to 5
  • helpful_votes: number of helpful votes for the review
  • total_votes: total number of votes for the review
  • vine: whether the review is part of the Vine program
  • verified_purchase: whether the review comes from a customer who actually purchased the product
  • review_headline: the title of the review
  • review_body: the review content
  • review_date: the review date

In this example, we will build the model using only three columns: customer_id, product_id, and star_rating — the minimum data a recommendation system needs. Adding the remaining feature columns when building the model can effectively improve its accuracy, but that is beyond the scope of this article. We will also keep the product_title column for verifying the results.

df = df[['customer_id', 'product_id', 'star_rating', 'product_title']]

Because most videos have been watched by only a small fraction of users, our data is very sparse. Recommendation models can generally handle sparse data well, but doing so usually requires large-scale data for training. To keep this example running smoothly, we will first verify the sparsity, then clean the data and use a denser reduced_df for model training.

The sparsity (the "long-tail effect") can be verified with the following code:

customers = df['customer_id'].value_counts()
products = df['product_id'].value_counts()
quantiles = [0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.96, 0.97, 0.98, 0.99, 1]
print('customers\n', customers.quantile(quantiles))
print('products\n', products.quantile(quantiles))

As you can see, only 5% of customers have rated 5 or more videos, and only 25% of videos have been rated by more than 10 users.

Next, we will filter the data to remove long-tail users and products:

customers = customers[customers >= 5]
products = products[products >= 10]
reduced_df = df.merge(pd.DataFrame({'customer_id': customers.index})).merge(pd.DataFrame({'product_id': products.index}))

Then re-encode the users and products with contiguous integer indices:

customer_index = pd.DataFrame({'customer_id': customers.index, 'user': np.arange(customers.shape[0])})
product_index = pd.DataFrame({'product_id': products.index, 'item': np.arange(products.shape[0])})
reduced_df = reduced_df.merge(customer_index).merge(product_index)

Next, we split the prepared dataset into a training set and a validation set. The validation set is used only to verify model performance and takes no part in training:

test_df = reduced_df.groupby('customer_id').last().reset_index()
train_df = reduced_df.merge(test_df[['customer_id', 'product_id']],
                            on=['customer_id', 'product_id'],
                            how='outer',
                            indicator=True)
train_df = train_df[(train_df['_merge'] == 'left_only')]

Finally, we convert the datasets from Pandas DataFrames to MXNet NDArrays, because the MXNet-based Gluon interface will be used for model training:

batch_size = 1024
train = gluon.data.ArrayDataset(nd.array(train_df['user'].values, dtype=np.float32),
                                nd.array(train_df['item'].values, dtype=np.float32),
                                nd.array(train_df['star_rating'].values, dtype=np.float32))
test = gluon.data.ArrayDataset(nd.array(test_df['user'].values, dtype=np.float32),
                               nd.array(test_df['item'].values, dtype=np.float32),
                               nd.array(test_df['star_rating'].values, dtype=np.float32))
train_iter = gluon.data.DataLoader(train, shuffle=True, num_workers=4, batch_size=batch_size, last_batch='rollover')
test_iter = gluon.data.DataLoader(test, shuffle=True, num_workers=4, batch_size=batch_size, last_batch='rollover')
4. Train on the local machine

First, we train locally with a simple network. We subclass the Gluon interface to build an MFBlock class with the main components listed below. For more on building custom network models with Gluon, see Hands-on Deep Learning. Briefly:

  • embedding layers that map the input user/item indices to fixed-length dense vectors; here we choose 64 dimensions (adjustable)

  • dense (fully connected) layers with the ReLU activation function

  • dropout layers to prevent overfitting

    class MFBlock(gluon.HybridBlock):
        def __init__(self, max_users, max_items, num_emb, dropout_p=0.5):
            super(MFBlock, self).__init__()

            self.max_users = max_users
            self.max_items = max_items
            self.dropout_p = dropout_p
            self.num_emb = num_emb

            with self.name_scope():
                self.user_embeddings = gluon.nn.Embedding(max_users, num_emb)
                self.item_embeddings = gluon.nn.Embedding(max_items, num_emb)

                self.dropout_user = gluon.nn.Dropout(dropout_p)
                self.dropout_item = gluon.nn.Dropout(dropout_p)

                self.dense_user = gluon.nn.Dense(num_emb, activation='relu')
                self.dense_item = gluon.nn.Dense(num_emb, activation='relu')

        def hybrid_forward(self, F, users, items):
            a = self.user_embeddings(users)
            a = self.dense_user(a)

            b = self.item_embeddings(items)
            b = self.dense_item(b)

            predictions = self.dropout_user(a) * self.dropout_item(b)
            predictions = F.sum(predictions, axis=1)
            return predictions

    # set up the network
    num_embeddings = 64
    net = MFBlock(max_users=customer_index.shape[0],
                  max_items=product_index.shape[0],
                  num_emb=num_embeddings,
                  dropout_p=0.5)

    # initialize network parameters
    ctx = mx.gpu()
    net.collect_params().initialize(mx.init.Xavier(magnitude=60),
                                    ctx=ctx,
                                    force_reinit=True)

    # set optimization parameters
    opt = 'sgd'
    lr = 0.02
    momentum = 0.9
    wd = 0.

    trainer = gluon.Trainer(net.collect_params(),
                            opt,
                            {'learning_rate': lr,
                             'wd': wd,
                             'momentum': momentum})
We also need an evaluation function to assess the model; MSE will be used here:

def eval_net(data, net, ctx, loss_function):
    acc = MSE()
    for i, (user, item, label) in enumerate(data):
        user = user.as_in_context(ctx)
        item = item.as_in_context(ctx)
        label = label.as_in_context(ctx)

        predictions = net(user, item).reshape((batch_size, 1))
        acc.update(preds=[predictions], labels=[label])

    return acc.get()[1]

Next, define the training loop and run a few epochs as an example:

def execute(train_iter, test_iter, net, epochs, ctx):
    loss_function = gluon.loss.L2Loss()
    for e in range(epochs):
        print("epoch: {}".format(e))
        for i, (user, item, label) in enumerate(train_iter):
            user = user.as_in_context(ctx)
            item = item.as_in_context(ctx)
            label = label.as_in_context(ctx)

            with mx.autograd.record():
                output = net(user, item)
                loss = loss_function(output, label)
            loss.backward()
            trainer.step(batch_size)

        print("EPOCH {}: MSE ON TRAINING and TEST: {}, {}".format(e,
              eval_net(train_iter, net, ctx, loss_function),
              eval_net(test_iter, net, ctx, loss_function)))
    print("end of training")
    return net

epochs = 3
trained_net = execute(train_iter, test_iter, net, epochs, ctx)

The printed training log is shown in the figure below; it confirms that training runs successfully and that the loss decreases as the number of iterations grows:


5. Use Amazon SageMaker BYOS for model training

In the example above, we trained a smaller model step by step in the local environment to verify our code. Now we organize the code and run a scalable, distributed managed training job on Amazon SageMaker.

First, we organize the code above into a single Python script, then run it in the pre-built MXNet container on SageMaker. The container can be used in many flexible ways; for details, see mxnet-sagemaker-estimators.

Next, we perform the following steps:

  • Encapsulate all data-preprocessing work into one function, prepare_train_data in this article
  • Copy in all the training-related code (functions and classes)
  • Define a function named train that:
    reads the data placed on the SageMaker training cluster,
    takes the hyperparameters as a dictionary argument (in the earlier example they were defined globally),
    and creates the network and performs training

The recommender.py script in the downloaded code directory is the result of this editing.
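As a rough sketch of how such a script is organized (the function bodies and the entry-point signature below are illustrative of SageMaker's legacy MXNet script interface, not the verbatim contents of recommender.py — check the script in the repo for the real code):

```python
import os

def prepare_train_data(training_dir):
    # In the real script this would read the .tsv.gz file that SageMaker
    # copied into the container from the S3 'train' channel and return
    # preprocessed data; here we just list the files as a placeholder.
    return sorted(os.listdir(training_dir))

def train(hyperparameters, channel_input_dirs, num_gpus, **kwargs):
    # Hyperparameters arrive as a dict (often of strings); fall back to defaults.
    num_embeddings = int(hyperparameters.get('num_embeddings', 64))
    lr = float(hyperparameters.get('lr', 0.02))

    data_files = prepare_train_data(channel_input_dirs['train'])

    # ... build the MFBlock network and run the training loop here,
    #     the same code as in section 4 (omitted in this sketch) ...
    return {'num_embeddings': num_embeddings, 'lr': lr,
            'n_files': len(data_files)}
```

The key point is that everything previously defined globally (hyperparameters, data paths) now flows in through the train function's arguments, so SageMaker can inject them at job launch.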

Now we copy the data into our own S3 bucket so that Amazon SageMaker can read it directly during training. This approach is standard for large datasets and production environments.

boto3.client('s3').copy({'Bucket': 'amazon-reviews-pds',
                         'Key': 'tsv/amazon_reviews_us_Digital_Video_Download_v1_00.tsv.gz'},
                        bucket,
                        prefix + '/train/amazon_reviews_us_Digital_Video_Download_v1_00.tsv.gz')

Finally, we create an MXNet estimator through the SageMaker Python SDK, passing in the following settings:

  • The training instance type and count. The MXNet container provided by SageMaker supports both single-machine and multi-GPU training; switching is just a matter of changing these parameters
  • The S3 path for model storage and the corresponding permission settings
  • The model's hyperparameters. Here we increase the number of embedding dimensions; as we will see later, this gives a better result than before. The hyperparameters can be tuned further to obtain an even more accurate model

Once configured, call .fit() to start training. This creates a SageMaker training job that loads the data and the program, runs the train function in our recommender.py script, and saves the model artifacts to the given S3 path.

m = MXNet('recommender.py',
          py_version='py3',
          role=role,
          train_instance_count=1,
          train_instance_type="ml.p2.xlarge",
          output_path='s3://{}/{}/output'.format(bucket, prefix),
          hyperparameters={'num_embeddings': 512,
                           'opt': opt,
                           'lr': lr,
                           'momentum': momentum,
                           'wd': wd},
          framework_version='1.1')
m.fit({'train': 's3://{}/{}/train/'.format(bucket, prefix)})

After training starts, we can see the training job in the Amazon SageMaker console. Click through to the details to view the training log output and to monitor the machine's GPU, CPU, and memory utilization, confirming that the program is working normally.



6. Managed deployment and inference testing

After training completes, the model can easily be deployed as a real-time endpoint that can be called in the production environment:

predictor = m.deploy(initial_instance_count=1,
                     instance_type='ml.m4.xlarge')
predictor.serializer = None


After the command above runs, the corresponding endpoint configuration appears in the console. Once it has been created successfully, you can test it: either issue an HTTP POST request, or simply call the SDK's .predict() directly and get the prediction back: [5.446407794952393, 1.6258208751678467]

predictor.predict(json.dumps({'customer_id': customer_index[customer_index['user'] == 6]['customer_id'].values.tolist(),
                              'product_id': ['B00KH1O9HW', 'B00M5KODWO']}))

On this basis, we can also compute the model's error on the test set; the result is 1.27. This is better than the 1.65 we got locally with the embedding size set to 64, which shows that we can keep optimizing the model by adjusting the network structure.

test_preds = []
for array in np.array_split(test_df[['customer_id', 'product_id']].values, 40):
    test_preds += predictor.predict(json.dumps({'customer_id': array[:, 0].tolist(),
                                                'product_id': array[:, 1].tolist()}))
test_preds = np.array(test_preds)
print('MSE:', np.mean((test_df['star_rating'] - test_preds) ** 2))

Summary
This article described how to use Amazon SageMaker to build a simple Gluon-based recommendation system and deploy it — a good introductory tutorial for getting started with recommendation systems. Note, however, that as a basic example this article does not cover hyperparameter tuning, network-structure optimization, or the introduction of additional features, all of which are necessary for a complete recommendation system and for improving accuracy. If you need a more complex and advanced recommendation model, or want to build other applications on Gluon, please look out for our subsequent releases, or refer to the official Amazon SageMaker documentation.

