Introduction
Using the same dataset and problem that I was trying to solve in Machine Learning for the Lazy Engineer – BigQuery, I am going to show you how to come up with a solution using Vertex AI. I suggest you read through the previous post to get the context.
Solution
Quoting the heading from the product page for Vertex AI:
Build, deploy, and scale machine learning (ML) models faster, with fully managed ML tools for any use case.
https://cloud.google.com/vertex-ai
I am going one step further with this solution by exposing an HTTPS endpoint (online prediction) that accepts a message and classifies it as either Ham or Spam.
Steps
- Create a Dataset
From the Vertex AI service page, select the Create Dataset option and then select the Text data type:
- Import data into the Dataset
Use the same dataset as was used previously, but swap the columns so that the message is in the first column, and remove the first row (the header). Then upload it to a GCS bucket. Once uploaded, select the previously created dataset and click Import:
You will receive an email once the import is complete. At that point, you should see something like this:
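The column swap and header removal described in this step can be scripted instead of done by hand. The following is a minimal sketch, assuming the original CSV has a header row followed by two columns in the order label, message; the function name `swap_columns` and the sample rows are made up for illustration.

```python
import csv
import io


def swap_columns(src, dst):
    """Copy rows from src to dst as (message, label), dropping the header row.

    Assumes src is a CSV with a header row and columns in the order
    label, message. Returns the number of rows written.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    next(reader, None)  # skip the header row
    rows_written = 0
    for row in reader:
        if len(row) < 2 or not row[1].strip():
            continue  # skip blank or malformed rows
        label, message = row[0].strip(), row[1]
        writer.writerow([message, label])
        rows_written += 1
    return rows_written


# Tiny in-memory demonstration with made-up rows.
sample = io.StringIO(
    'label,message\n'
    'ham,"Ok, see you at 5"\n'
    'spam,WINNER! Claim your prize now\n'
)
converted = io.StringIO()
count = swap_columns(sample, converted)
print(converted.getvalue())
```

For real files, pass handles opened with `open(path, newline="")` in place of the `StringIO` objects, then upload the converted file to the GCS bucket as before.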
- Train the model
Click on the “TRAIN NEW MODEL” button in the above screenshot. You should see a screen like this:
The model training took more than 4 hours for me. Once the model is trained, you will receive an email.
- Expose an Endpoint for prediction
Select Online Prediction and then create an endpoint as shown in the two screenshots below:
The endpoint can be tested using the REST API or the SDKs, or from the console as shown in the screenshots below. The bottom screenshot shows the probabilities of the message being Spam and Ham when testing from the console.
Conclusion
It is pretty easy to train and deploy a model in Vertex AI. I noticed that predictions were much quicker (under 300 ms) than when using BigQuery. However, this solution costs a bit more (around $10 for 6 hours, compared to little to nothing when using BigQuery). I encourage you to give this a try, but please delete the endpoint once you have no use for it (to save money):
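Testing the endpoint and then deleting it can also be done from Python with the `google-cloud-aiplatform` SDK. The sketch below is an outline under stated assumptions, not a verified recipe: the project ID, region, and endpoint ID are placeholders, and the instance shape (`content` plus `mimeType`) and response keys (`displayNames`, `confidences`) are what I would expect for an AutoML text classification model, so check them against what your own endpoint returns.

```python
def build_instance(message):
    """Wrap a message in the instance shape assumed for text classification."""
    return {"content": message, "mimeType": "text/plain"}


def top_label(prediction):
    """Return the (label, confidence) pair with the highest confidence.

    Assumes the prediction has parallel "displayNames" and "confidences" lists.
    """
    pairs = zip(prediction["displayNames"], prediction["confidences"])
    return max(pairs, key=lambda pair: pair[1])


def classify(endpoint, message):
    """Send one message to the endpoint and return its most likely label."""
    response = endpoint.predict(instances=[build_instance(message)])
    return top_label(response.predictions[0])


def main():
    # Requires `pip install google-cloud-aiplatform` and application
    # default credentials; imported here so the helpers work without it.
    from google.cloud import aiplatform

    # Placeholder values: replace with your own project, region, and endpoint ID.
    aiplatform.init(project="my-project", location="us-central1")
    endpoint = aiplatform.Endpoint("1234567890")

    print(classify(endpoint, "Congratulations, you have won a free cruise!"))

    # Clean up so the endpoint stops accruing cost. force=True undeploys
    # any deployed models before deleting the endpoint itself.
    endpoint.delete(force=True)
```

Call `main()` once the placeholders are filled in. Keeping the request and response handling in small helper functions means they can be checked locally without touching the billed endpoint.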