In this blog you will see what BigQuery is, its standout feature BigQuery ML (BQML), the areas where BQML applies, and a clear example showing how easily you can build a machine learning model with simple SQL code.
The blog covers the following topics: an overview of BigQuery, its main features, BigQuery ML, and a worked linear regression example.
Let's dive in.
BigQuery is a fully managed data warehouse that helps you manage and analyse your data, with built-in capabilities such as machine learning, business intelligence and geospatial analysis. Its serverless architecture means there is no infrastructure to administer, so you can use SQL queries to tackle your company's most critical questions. Thanks to BigQuery's robust, distributed analytical engine, you can query terabytes of data in a matter of seconds and petabytes in a matter of minutes.
BigQuery's main features are built-in ML integration (BigQuery ML), multi-cloud functionality (BigQuery Omni), geospatial analysis (BigQuery GIS), a foundation for BI (BigQuery BI Engine), free access (BigQuery Sandbox), and automated data transfer (BigQuery Data Transfer Service). In this blog we will discuss the most notable of these: BigQuery ML.
A standout feature of BigQuery: BigQuery ML
BigQuery ML lets you create and run machine learning models in BigQuery using standard SQL queries. Machine learning on huge datasets typically requires extensive programming and knowledge of ML frameworks. These requirements restrict solution development within an organization to a small group of people, and they exclude data analysts who understand the data but lack machine learning and programming skills. This is where BigQuery ML comes in handy: it lets data analysts build and evaluate models on large volumes of data, directly in BigQuery, using their existing SQL tools and skills.
The major advantages I’ve identified using BQML
CREATE OR REPLACE MODEL
  `testproject-351804.regression.house_prices2`
OPTIONS (
  model_type = 'linear_reg',
  input_label_cols = ['price'],
  l2_reg = 1,
  early_stop = FALSE,
  max_iterations = 12,
  optimize_strategy = 'batch_gradient_descent'
) AS
SELECT
  avg_house_age,
  avg_rooms,
  avg_bedrooms,
  avg_income,
  population,
  price / 100000 AS price  -- scale the label down to units of 100,000
FROM
  `regression.usa_housing_train`
SELECT avg_house_age, avg_rooms, avg_bedrooms, avg_income, population, price / 100000 AS price FROM `regression.usa_housing_train`
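Once the model has been created, it can help to check how training progressed. The following is a minimal sketch, assuming the model `regression.house_prices2` from the statement above has finished training. `ML.TRAINING_INFO` returns one row per training iteration, so with `max_iterations = 12` you would expect at most 12 rows.

```sql
-- Inspect per-iteration training statistics for the model created above.
-- Assumption: the model `regression.house_prices2` exists in the current project.
SELECT
  iteration,    -- zero-based training iteration
  loss,         -- loss on the training data after this iteration
  eval_loss,    -- loss on held-out data (NULL if no evaluation split was used)
  duration_ms   -- time spent on this iteration
FROM
  ML.TRAINING_INFO(MODEL `regression.house_prices2`)
ORDER BY
  iteration;
```

A loss that is still falling steeply at the last iteration suggests the model might benefit from a higher `max_iterations`.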
SELECT
  *
FROM
  ML.EVALUATE(
    MODEL `regression.house_prices2`,
    TABLE `testproject-351804._8b41b9f5a2e85d72c62e834e3e9dd60a58ba542d.anoncb5de70d_1e3d_4213_8c5d_bb10d6b9385b_imported_data_split_eval_data`
  )
The metric to look at is R-squared, the coefficient of determination. Higher is better: the closer it is to 1, the more of the variation in price the model explains.
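To focus on R-squared rather than reading every column, you can select only the metrics you care about. This is a sketch, assuming the same model name as above. Calling `ML.EVALUATE` without a table argument evaluates against the data BigQuery ML held out during training (or the training data if nothing was held out), which may differ from the explicit evaluation table used earlier.

```sql
-- Pull out the headline regression metrics for the model.
-- Assumption: no table is passed, so BigQuery ML uses its own evaluation data.
SELECT
  r2_score,             -- coefficient of determination; closer to 1 is better
  mean_absolute_error,  -- in scaled units, because price was divided by 100000
  mean_squared_error
FROM
  ML.EVALUATE(MODEL `regression.house_prices2`);
```

Because the label was scaled by 1/100000 at training time, the error metrics are in those scaled units, while `r2_score` is unaffected by the scaling.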
Making predictions with the model is as simple as calling ML.PREDICT:
SELECT * FROM ML.PREDICT(MODEL `regression.house_prices2`, TABLE `regression.usa_housing_predict`)
See how efficient BigQuery ML is: it predicted the house prices from a model trained on avg_house_age, avg_rooms, avg_bedrooms, avg_income and population.
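Since the model was trained on `price / 100000`, its predictions come back in the same scaled units. A minimal sketch of converting them back is shown here. It relies on BigQuery ML naming the output column `predicted_` plus the label name (so `predicted_price`), while the alias `predicted_price_unscaled` is just an illustrative name.

```sql
-- Return the input rows together with the prediction in original price units.
SELECT
  *,
  predicted_price * 100000 AS predicted_price_unscaled  -- undo the training-time scaling
FROM
  ML.PREDICT(
    MODEL `regression.house_prices2`,
    TABLE `regression.usa_housing_predict`
  );
```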
Now you know how to create linear regression models in BigQuery ML. We have discussed how to build a model, assess it, apply it to make predictions, and analyse model coefficients.
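For the coefficient analysis mentioned above, one option is the `ML.WEIGHTS` function, which returns the learned weight for each input feature of a linear model. This is a sketch under the assumption that the model is the `regression.house_prices2` model trained earlier.

```sql
-- List the learned coefficients of the linear regression model.
SELECT
  processed_input,  -- feature name (the intercept is reported as its own row)
  weight            -- learned coefficient for that feature
FROM
  ML.WEIGHTS(MODEL `regression.house_prices2`)
ORDER BY
  ABS(weight) DESC;
```

The weights are expressed on each feature's own scale, so a large weight on a feature with small values (such as avg_bedrooms) is not directly comparable to a small weight on a feature with large values (such as avg_income).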
In upcoming blogs you will see other unique features of BigQuery, such as geospatial analytics and arrays/structs.
Hope you find this useful.
By Ankit Kumar Ojha
By Uma Raj