BigQuery ML Example

November 12, 2022

Here is an example of how to use BigQuery ML to create a logistic regression model that predicts whether a customer will churn:

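# Import the BigQuery client library (BigQuery ML itself is driven through SQL)
from google.cloud import bigquery

# Create a client; it uses your default project and credentials
client = bigquery.Client()

# Source table as named in this example; substitute a table you can query
source_table = "bigquery-public-data.samples.churn"

# Create and train the model. In BigQuery ML, CREATE MODEL does both.
# `my_dataset` is a placeholder for a dataset in your own project.
create_model_sql = f"""
CREATE OR REPLACE MODEL `my_dataset.my_model`
OPTIONS (
  model_type = 'LOGISTIC_REG',
  input_label_cols = ['churn']
) AS
SELECT tenure, contract, monthly_charges, churn
FROM `{source_table}`
"""
client.query(create_model_sql).result()  # Wait for training to finish

# Make a prediction for a single customer
predict_sql = """
SELECT *
FROM ML.PREDICT(
  MODEL `my_dataset.my_model`,
  (SELECT 12 AS tenure, 'month-to-month' AS contract, 100 AS monthly_charges)
)
"""
prediction = list(client.query(predict_sql).result())

# Print the prediction
for row in prediction:
    print(dict(row.items()))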

This code creates a logistic regression model named my_model and trains it on the table bigquery-public-data.samples.churn. In BigQuery ML, creating and training a model are a single step: a CREATE MODEL statement trains on the rows its query returns. The table is expected to hold customer data, with the churn column indicating whether a customer has churned. The tenure, contract, and monthly_charges columns are the input features.

Once the model is trained, the code uses it to predict whether a customer with 12 months of tenure, a month-to-month contract, and monthly charges of $100 will churn. The result is stored in the prediction variable.
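
The single-row input described above can also be supplied as named query parameters instead of being written into the SQL text. The sketch below only builds the query string and a parameter list, so it runs without a BigQuery connection. The helper name build_predict_query, the model path my_dataset.my_model, and the type mapping are assumptions made for illustration. Each (name, type, value) tuple is intended to be wrapped in a bigquery.ScalarQueryParameter and passed through bigquery.QueryJobConfig(query_parameters=...).

```python
# Hypothetical helper: builds an ML.PREDICT query that uses named parameters.
# The model path and the type mapping are assumptions for illustration.

def _bq_type(value):
    """Map a Python value to a BigQuery scalar type name."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "BOOL"
    if isinstance(value, int):
        return "INT64"
    if isinstance(value, float):
        return "FLOAT64"
    if isinstance(value, str):
        return "STRING"
    raise TypeError(f"unsupported feature type: {type(value).__name__}")


def build_predict_query(model_path, features):
    """Return (sql, params) for a single-row ML.PREDICT call.

    params is a list of (name, bigquery_type, value) tuples.
    """
    for name in features:
        # Column names are placed in the SQL text, so reject anything unusual.
        if not name.isidentifier():
            raise ValueError(f"invalid column name: {name!r}")
    select_list = ", ".join(f"@{name} AS {name}" for name in features)
    sql = (
        f"SELECT * FROM ML.PREDICT(MODEL `{model_path}`, "
        f"(SELECT {select_list}))"
    )
    params = [(name, _bq_type(value), value) for name, value in features.items()]
    return sql, params


sql, params = build_predict_query(
    "my_dataset.my_model",
    {"tenure": 12, "contract": "month-to-month", "monthly_charges": 100},
)
print(sql)
print(params)
```

Keeping the values out of the SQL text avoids quoting problems when feature values come from user input.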

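To run this code, you need a Google Cloud project with the BigQuery API enabled (BigQuery ML is part of BigQuery and has no separate API to enable), the google-cloud-bigquery Python package, and credentials that allow you to create models in a dataset. You can run the Python code in a Jupyter notebook. BigQuery ML statements such as CREATE MODEL and ML.PREDICT can also be run directly as SQL in the BigQuery console.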

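Once you have run the code, it prints the prediction. The result is a row rather than a single boolean value: ML.PREDICT returns a predicted_churn column holding the predicted label and a predicted_churn_probs column holding the probability for each class, along with the input columns.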
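
To show how such a result might be read, the sketch below works on a plain dictionary shaped like one ML.PREDICT output row for a label column named churn. The values are invented for illustration and are not real model output, the label is assumed to be boolean, and the helper name summarize_prediction is hypothetical.

```python
# Illustrative row only: the values below are invented, not real model output.
example_row = {
    "predicted_churn": True,
    "predicted_churn_probs": [
        {"label": True, "prob": 0.7},
        {"label": False, "prob": 0.3},
    ],
    "tenure": 12,
    "contract": "month-to-month",
    "monthly_charges": 100,
}


def summarize_prediction(row, label_column="churn"):
    """Return (predicted_label, probability_of_that_label) from one row."""
    predicted = row[f"predicted_{label_column}"]
    probs = row[f"predicted_{label_column}_probs"]
    # Find the probability entry that matches the predicted label.
    for entry in probs:
        if entry["label"] == predicted:
            return predicted, entry["prob"]
    raise ValueError("predicted label not found in probability list")


label, prob = summarize_prediction(example_row)
print(f"predicted churn: {label} (probability {prob:.2f})")
```

The same function could be applied to a row returned by the client after converting it with dict(row.items()).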