Last year I built a couple of proof-of-concept solutions that used AI (Artificial Intelligence) to make predictions from data. At the time I looked around for a way to use PowerShell to build AI models and make the predictions. I stumbled on this presentation referencing this article from Tome Tanasovski. That got me started on H2O AI and led to successfully completing those proofs of concept. As Tome hinted, though, it would be fantastic if there were an H2O AI PowerShell Module to simplify using H2O AI from PowerShell. This post covers the first release of my H2O AI PowerShell Module, based on Tome’s efforts and a number of my own requirements for such a module. It is a simple wrapper for a series of POST requests to the H2O AI Server.
Prerequisites
- Java SE Runtime Environment
- H2O AI Open Source Platform is based on Java and is licensed under the Apache License, Version 2.0. In my environment I’m currently using Java version “1.8.0_251”.
- Java must be available on your PATH environment variable so that it can be launched from PowerShell.
- H2O AI
- Download H2O AI and extract to the local host (e.g. c:\h2o)
- Note: the H2O AI download is currently ~240 MB (version 3.14.0.7). Uncompressed it is ~244 MB.
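Before going further it can be worth confirming that Java really is reachable from the PowerShell session. The following is a minimal sketch; `Test-CommandOnPath` is a hypothetical helper name and not part of the H2O AI PowerShell Module.

```powershell
# Hypothetical helper (not part of the H2OAI module):
# returns $true when an executable with the given name can be resolved from PATH.
function Test-CommandOnPath {
    param(
        [Parameter(Mandatory)]
        [string]$Name
    )
    $null -ne (Get-Command -Name $Name -CommandType Application -ErrorAction SilentlyContinue)
}

if (Test-CommandOnPath -Name 'java') {
    # java writes its version banner to stderr, so merge stderr into the output stream
    & java -version 2>&1 | ForEach-Object { "$_" }
}
else {
    Write-Warning 'java was not found on PATH; H2O AI will not start until it is.'
}
```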
H2O AI PowerShell Module
The module is compatible with Windows PowerShell 5.1+ and PowerShell 6.x+. It can be installed from the PowerShell Gallery using Install-Module or manually downloaded from GitHub here.
Install-Module H2OAI
The H2O AI PowerShell Module contains five cmdlets in total. Two are used internally and three are used to orchestrate H2O AI:
- Start-H2O (Start the H2O AI Server)
- Stop-H2O (Stop the H2O AI Server)
- Get-H2OPrediction (Get a Prediction using H2O AI)
H2O AI Algorithms
The H2O AI PowerShell Module will accept the following algorithms with Get-H2OPrediction. Your Training and Prediction data will need to be of the appropriate type for the algorithm to work.
Note: only GLM, GBM and DeepLearning have had any level of testing.
- ‘glm’, ‘gbm’, ‘glrm’, ‘aggregator’, ‘deeplearning’, ‘drf’, ‘isolationforest’, ‘kmeans’, ‘naivebayes’, ‘pca’, ‘targetencoder’, ‘word2vec’
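If you build the algorithm name dynamically in your own scripts, you may want to check it against the list above before calling the module. This is only a sketch of one way to do that with `ValidateSet`; `Test-H2OAlgorithm` is a hypothetical name, and I am not claiming the module validates its input this way.

```powershell
# Hypothetical helper (not part of the H2OAI module):
# rejects any algorithm name that is not in the list from this post.
function Test-H2OAlgorithm {
    param(
        [Parameter(Mandatory)]
        [ValidateSet('glm', 'gbm', 'glrm', 'aggregator', 'deeplearning', 'drf',
                     'isolationforest', 'kmeans', 'naivebayes', 'pca',
                     'targetencoder', 'word2vec')]
        [string]$Algorithm
    )
    # ValidateSet ignores case by default, so normalise to lower case on the way out
    $Algorithm.ToLowerInvariant()
}

Test-H2OAlgorithm -Algorithm 'GBM'    # returns gbm

try {
    Test-H2OAlgorithm -Algorithm 'randomguess'
}
catch {
    Write-Warning "Rejected: $($_.Exception.Message)"
}
```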
Start H2O AI
Import the H2O AI PowerShell Module and start H2O AI.
Import-Module H2OAI
# Path to h2o.jar
$dir = "C:\H2O\h2o-3.14.0.7"
Start-H2O -H2oPath "$($dir)\h2o.jar"
Stop H2O AI
By default, Start-H2O stores the Process ID of H2O AI in a global variable. Issuing the Stop-H2O command stops the process with that ID.
Stop-H2O
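To illustrate the general pattern of remembering a Process ID in a global variable and stopping it later, here is a minimal sketch. The function names and the `$global:TrackedProcessId` variable are my own placeholders for illustration; they are not the names the module uses internally.

```powershell
# Illustrative only: names below are hypothetical, not the module's internals.
$global:TrackedProcessId = $null

function Start-TrackedProcess {
    param(
        [Parameter(Mandatory)][string]$FilePath,
        [Parameter(Mandatory)][string[]]$ArgumentList
    )
    # -PassThru returns the process object so its Id can be remembered
    $proc = Start-Process -FilePath $FilePath -ArgumentList $ArgumentList -PassThru
    $global:TrackedProcessId = $proc.Id
    $proc.Id
}

function Stop-TrackedProcess {
    if ($null -ne $global:TrackedProcessId) {
        Stop-Process -Id $global:TrackedProcessId -Force -ErrorAction SilentlyContinue
        $global:TrackedProcessId = $null
    }
}

# Usage, assuming java is on PATH and h2o.jar is at the path used earlier in this post:
# Start-TrackedProcess -FilePath 'java' -ArgumentList '-jar', 'C:\H2O\h2o-3.14.0.7\h2o.jar'
# Stop-TrackedProcess
```

Keeping the ID in one well-known place is what lets the stop command run without any parameters.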
Import Training Data, Build a Model and make a Prediction
Get-H2OPrediction is an all-in-one cmdlet, which keeps using the module simple.
Pass Get-H2OPrediction:
- a dataset
- a model algorithm
- a data split (defaults to 85% Train 15% Test)
- data to make a prediction from and
- the column to predict
The default URL for the H2O AI Server is http://localhost:54321
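A quick way to confirm that the server is listening on that URL is to query it directly. The sketch below assumes the H2O-3 REST endpoint `/3/Cloud` and its `cloud_healthy` field; `Test-H2OServer` is a hypothetical helper name, not a cmdlet from the module.

```powershell
# Hypothetical helper (not part of the H2OAI module).
# Assumes the H2O-3 REST API exposes GET /3/Cloud with a 'cloud_healthy' field.
function Test-H2OServer {
    param(
        [string]$BaseUrl = 'http://localhost:54321',
        [int]$TimeoutSec = 5
    )
    try {
        $cloud = Invoke-RestMethod -Uri "$BaseUrl/3/Cloud" -Method Get -TimeoutSec $TimeoutSec
        [bool]$cloud.cloud_healthy
    }
    catch {
        # Connection refused, timeout, or a non-success status all count as "not available"
        $false
    }
}

if (Test-H2OServer) {
    'H2O AI Server is responding.'
}
else {
    Write-Warning 'H2O AI Server is not reachable on the default URL.'
}
```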
Note: the predict column name is case sensitive and must match the column heading in your dataset exactly. If the dataset has the column heading ‘class’ and you call Get-H2OPrediction with -predictColumn Class, it will FAIL.
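Because PowerShell itself is case insensitive by default, this mismatch is easy to miss. A small pre-flight check can catch it; the sketch below uses the case-sensitive `-ccontains` operator. `Test-ColumnNameExact` is a hypothetical helper, and the measurement column names in the sample data are made up for illustration (only ‘class’ comes from the example above).

```powershell
# Hypothetical helper (not part of the H2OAI module):
# case-sensitive check that a column heading exists in the dataset.
function Test-ColumnNameExact {
    param(
        [Parameter(Mandatory)][object[]]$Data,
        [Parameter(Mandatory)][string]$Column
    )
    # -ccontains is the case-sensitive variant of -contains
    $Data[0].PSObject.Properties.Name -ccontains $Column
}

# Illustrative data; the measurement headings are placeholders
$data = @'
sepal_len,sepal_wid,petal_len,petal_wid,class
5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
'@ | ConvertFrom-Csv

Test-ColumnNameExact -Data $data -Column 'class'    # True
Test-ColumnNameExact -Data $data -Column 'Class'    # False
```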
Iris Example
This uses Tome’s Iris example with the module: train a Deep Learning model on Iris flower data split 85% / 15% for Train and Test, then determine the Iris sub-species given flower measurements.
Time Series Example
Train a Generalised Data Model on a time series dataset with an 85% / 15% split for Train and Test, and predict the next Close value given the last value in the test/train dataset.
Example DataSet
$sourceData | Select-Object -First 10 | Format-Table
Open High Close Low  Volume Date
---- ---- ----- ---  ------ ----
3.68 3.8  3.74  3.68 50208  23-01-2017
3.74 3.75 3.75  3.66 47972  24-01-2017
3.75 3.8  3.8   3.73 33952  25-01-2017
3.8  3.8  3.79  3.77 32822  27-01-2017
3.8  3.8  3.73  3.68 21552  30-01-2017
3.75 3.75 3.65  3.6  50763  31-01-2017
3.69 3.69 3.64  3.61 59377  01-02-2017
3.64 3.66 3.55  3.51 120869 02-02-2017
3.64 3.66 3.49  3.49 75814  03-02-2017
3.49 3.54 3.44  3.43 86494  06-02-2017
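As a rough sketch of preparing the "last value" that the prediction is based on, the snippet below rebuilds the final three rows of the sample above and picks out the most recent one. It only prepares the input; it does not call the module. The dates are parsed as dd-MM-yyyy, which matches the sample (for example 31-01-2017), and the variable names other than `$sourceData` are my own.

```powershell
# Last three rows of the sample dataset shown above
$sourceData = @'
Open,High,Close,Low,Volume,Date
3.64,3.66,3.55,3.51,120869,02-02-2017
3.64,3.66,3.49,3.49,75814,03-02-2017
3.49,3.54,3.44,3.43,86494,06-02-2017
'@ | ConvertFrom-Csv

# The most recent row is the basis for predicting the next Close value
$latest = $sourceData | Select-Object -Last 1

# ConvertFrom-Csv yields strings, so convert explicitly and culture-independently
$culture     = [cultureinfo]::InvariantCulture
$latestDate  = [datetime]::ParseExact($latest.Date, 'dd-MM-yyyy', $culture)
$latestClose = [double]::Parse($latest.Close, $culture)

"Latest row: {0:yyyy-MM-dd}, Close {1}" -f $latestDate, $latestClose.ToString($culture)
```

Parsing with an explicit format avoids the dates being read as MM-dd-yyyy on machines with a US locale.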
The H2O AI PowerShell Module is then used to predict a value from this time series dataset.
Summary
Big thanks to Tome Tanasovski for his work in figuring out the H2O AI Server REST calls. Hopefully this module is of benefit to others. If something doesn’t work then I’ll happily accept Pull Requests on the module here.