☁️
Cloud Computing
  • Introduction
  • Terminologies
    • Container
    • Kubernetes (K8s)
    • Serverless Computing
  • Services
    • Docker
    • Terraform
  • ☁️Cloud Computing Platforms
    • Google Cloud
      • Google Cloud Essentials
      • Management
        • Cloud IAM
      • Compute
        • Compute Engine
        • Kubernetes Engine
      • Resources
    • IBM Cloud
      • IBM Cloud Shell
      • Compute
      • Containers
      • Developer tools
      • Integration
      • Storage
      • Cloud Paks
    • Microsoft Azure
      • Compute
        • Functions
        • App Services
      • Networking
      • Storage
      • Web
      • Mobile
      • Databases
        • Cosmos DB
      • Analytics
      • AI + Machine Learning
      • Internet of things
      • Security
      • DevOps
      • Monitoring
      • Management and governance
      • Azure Stack
    • Amazon Web Services
    • Resources
  • Qwiklabs Challenge Labs
    • Create and Manage Cloud Resources
    • Deploy and Manage Cloud Environments with Google Cloud
    • Create ML Models with BigQuery ML
    • Insights from Data with BigQuery
    • Build a Website on Google Cloud
    • Build and Deploy a Docker Image to a Kubernetes Cluster
    • Build and Secure Networks in Google Cloud
    • Set Up and Configure a Cloud Environment in Google Cloud
    • Build and Optimize Data Warehouses with BigQuery: Challenge Lab
    • Scale Out and Update a Containerized Application on a Kubernetes Cluster
  • Whizlabs Challenge League
Powered by GitBook
On this page
  • Task 2: Create a forecasting BigQuery machine learning model
  • Task 3: Create the second machine learning model
  • Task 4: Evaluate the two machine learning models
  • Task 5: Use the subscriber type machine learning model to predict average trip durations

Was this helpful?

  1. Qwiklabs Challenge Labs

Create ML Models with BigQuery ML

Task 2: Create a forecasting BigQuery machine learning model

CREATE OR REPLACE MODEL austin.model_1
OPTIONS(
    input_label_cols=['duration_minutes'], 
    model_type='linear_reg'
)AS SELECT duration_minutes, 
    address AS location, 
    start_station_name,
    CAST(EXTRACT(dayofweek FROM start_time) AS STRING) as dayofweek, 
    CAST(EXTRACT(hour FROM start_time) AS STRING) AS hourofday, 
FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips` AS trips 
JOIN `bigquery-public-data.austin_bikeshare.bikeshare_stations` AS stations 
ON trips.start_station_id = stations.station_id 
WHERE EXTRACT(year from start_time) = 2018;

Task 3: Create the second machine learning model

CREATE OR REPLACE MODEL austin.model_2 
OPTIONS(input_label_cols=['duration_minutes'], model_type='linear_reg') 
AS SELECT duration_minutes, subscriber_type, start_station_name, 
CAST(EXTRACT(hour FROM start_time) AS STRING) AS hourofday 
FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips` AS trips 
WHERE EXTRACT(year from start_time) = 2018;

Task 4: Evaluate the two machine learning models

SELECT SQRT(mean_squared_error)AS rmse, mean_absolute_error 
FROM ML.EVALUATE(MODEL austin.model_1, (
    SELECT duration_minutes, address AS location, start_station_name, 
    CAST(EXTRACT(dayofweek FROM start_time) AS STRING) as dayofweek, 
    CAST(EXTRACT(hour FROM start_time) AS STRING) AS hourofday, 
    FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips` AS trips 
    JOIN `bigquery-public-data.austin_bikeshare.bikeshare_stations` AS stations 
    ON trips.start_station_id = stations.station_id 
    WHERE EXTRACT(year from start_time) = 2019)
);
SELECT SQRT(mean_squared_error)AS rmse, mean_absolute_error 
FROM ML.EVALUATE(MODEL austin.model_2, (
    SELECT duration_minutes, subscriber_type, start_station_name, 
    CAST(EXTRACT(hour FROM start_time) AS STRING) AS hourofday 
    FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips` AS trips 
    WHERE EXTRACT(year from start_time) = 2019)
);

Task 5: Use the subscriber type machine learning model to predict average trip durations

SELECT start_station_name, 
    AVG(duration_minutes) AS avg_duration, 
    COUNT(*) AS total_trips 
FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips` 
WHERE EXTRACT(year FROM start_time) = 2019 
GROUP BY start_station_name 
ORDER BY total_trips DESC;
SELECT avg(predicted_duration_minutes), 
    COUNT(duration_minutes) 
FROM Ml.PREDICT(MODEL austin.model_2, (
    SELECT duration_minutes, subscriber_type, 
        start_station_name,  
        CAST(EXTRACT(hour FROM start_time) AS STRING) AS hourofday 
    FROM `bigquery-public-data.austin_bikeshare.bikeshare_trips` AS trips 
    WHERE EXTRACT(year from start_time)=2019 
    AND start_station_name = '21st & Speedway @PCL' 
    AND subscriber_type='Single Trip' )
);
PreviousDeploy and Manage Cloud Environments with Google CloudNextInsights from Data with BigQuery

Last updated 3 years ago

Was this helpful?