H2O AI PowerShell Module

Last year I built a couple of proof-of-concept solutions that used AI (Artificial Intelligence) to make predictions from data. At the time I looked around for a way to build AI models and make predictions from PowerShell, and stumbled on a presentation referencing an article by Tome Tanasovski. That got me started on H2O AI, and I completed those proofs of concept with it. As Tome hinted, though, it would be fantastic to have an H2O AI PowerShell Module that simplifies using H2O AI from PowerShell. This post covers the first release of my H2O AI PowerShell Module, based on Tome’s efforts and shaped by what I wanted and needed from such a module. It is a simple wrapper for a series of POST requests to the H2O AI Server.

Prerequisites

  1. Java SE Runtime Environment
    • H2O AI Open Source Platform is based on Java and is licensed under the Apache License, Version 2.0. In my environment I’m currently using Java version “1.8.0_251”.
    • Java needs to be available on your environment path.
  2. H2O AI
    • Download H2O AI and extract it to the local host (e.g. c:\h2o)
      • Note: the H2O AI download is currently ~240 MB (version 3.14.0.7). Uncompressed it is ~244 MB.
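Before going further it can help to confirm both prerequisites are in place. The following is a minimal sketch, not part of the module: Test-H2OPrerequisite is a hypothetical helper name, and the jar path in the example comment assumes the c:\h2o extract location and the 3.14.0.7 version mentioned above.

```powershell
# Hypothetical helper (not part of the H2OAI module): reports whether a Java
# runtime can be found on PATH and whether h2o.jar exists at the given path.
function Test-H2OPrerequisite {
    param(
        [Parameter(Mandatory)]
        [string]$H2oJarPath
    )

    # Get-Command returns nothing (errors suppressed) when java is not on PATH
    $java = Get-Command -Name java -CommandType Application -ErrorAction SilentlyContinue |
        Select-Object -First 1

    [pscustomobject]@{
        JavaFound = [bool]$java
        JavaPath  = if ($java) { $java.Path } else { $null }
        JarFound  = Test-Path -LiteralPath $H2oJarPath -PathType Leaf
    }
}

# Example (path is an assumption based on the extract location above):
# Test-H2OPrerequisite -H2oJarPath 'C:\H2O\h2o-3.14.0.7\h2o.jar'
```

If JavaFound is False, the Java install folder most likely still needs adding to the PATH environment variable.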

H2O AI PowerShell Module

The module is compatible with Windows PowerShell 5.1+ and PowerShell 6.x+. It can be installed from the PowerShell Gallery using Install-Module, or downloaded manually from GitHub.

Install-Module H2OAI

The H2O AI PowerShell Module contains five cmdlets in total. Two are used internally, and the following three orchestrate H2O AI:

  • Start-H2O (Start the H2O AI Server)
  • Stop-H2O (Stop the H2O AI Server)
  • Get-H2OPrediction (Get a Prediction using H2O AI)
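To see what the module exposes on your own machine, the built-in Get-Command cmdlet can list a module's commands. A small sketch follows; Get-ModuleCommandName is a hypothetical wrapper I've named for illustration, and it assumes the module has been installed under the name H2OAI as shown above.

```powershell
# Hypothetical helper: returns the sorted names of the commands a module exports.
function Get-ModuleCommandName {
    param(
        [Parameter(Mandatory)]
        [string]$ModuleName
    )

    @(Get-Command -Module $ModuleName -ErrorAction SilentlyContinue |
        Sort-Object -Property Name |
        Select-Object -ExpandProperty Name)
}

# Returns nothing if the H2OAI module is not installed
Get-ModuleCommandName -ModuleName H2OAI
```

Once the names are known, Get-Help with -Full (for example Get-Help Get-H2OPrediction -Full) shows the parameters each cmdlet accepts.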

H2O AI Algorithms

The H2O AI PowerShell Module will accept the following algorithms with Get-H2OPrediction. Your Training and Prediction data will need to be of the appropriate type for the algorithm to work.

Note: only GLM, GBM and DeepLearning have had any level of testing.

    • ‘glm’, ‘gbm’, ‘glrm’, ‘aggregator’, ‘deeplearning’, ‘drf’, ‘isolationforest’, ‘kmeans’, ‘naivebayes’, ‘pca’, ‘targetencoder’, ‘word2vec’
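If you are scripting around the module, it may be worth checking an algorithm name against that list before calling anything. This is only a sketch: Resolve-H2OAlgorithmName is a hypothetical helper, and lower-casing the name is my assumption (the list above is given in lower case, but I haven't confirmed how the module treats other casing).

```powershell
# Hypothetical helper: checks a name against the algorithm list above and
# returns it in lower case, or throws if it is not in the list.
function Resolve-H2OAlgorithmName {
    param(
        [Parameter(Mandatory)]
        [string]$Name
    )

    $supported = 'glm', 'gbm', 'glrm', 'aggregator', 'deeplearning', 'drf',
                 'isolationforest', 'kmeans', 'naivebayes', 'pca',
                 'targetencoder', 'word2vec'

    # Assumption: normalise to lower case to match the list as written
    $normalised = $Name.Trim().ToLowerInvariant()

    if ($supported -contains $normalised) {
        $normalised
    }
    else {
        throw "Unsupported algorithm '$Name'. Expected one of: $($supported -join ', ')"
    }
}

Resolve-H2OAlgorithmName -Name 'DeepLearning'
```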

Start H2O AI

Import the H2O AI PowerShell Module and start H2O AI.

Import-Module H2OAI

# Folder that contains h2o.jar
$dir = "C:\H2O\h2o-3.14.0.7"
Start-H2O -H2oPath "$dir\h2o.jar"

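The server can take a few seconds to come up, so it can be handy to check that it is answering before asking for predictions. The sketch below is not part of the module. Get-H2OCloudStatus is a hypothetical helper, and it assumes the server is on the default http://localhost:54321 and that the H2O REST API exposes a /3/Cloud status endpoint with version and cloud_healthy fields; check this against the REST documentation for your H2O version.

```powershell
# Hypothetical helper (not part of the H2OAI module): returns the cloud status
# object from the H2O AI Server, or $null if the server cannot be reached.
function Get-H2OCloudStatus {
    param(
        [string]$Url = 'http://localhost:54321',
        [int]$TimeoutSec = 5
    )

    try {
        # Assumption: /3/Cloud is the status endpoint of the H2O REST API
        Invoke-RestMethod -Uri "$Url/3/Cloud" -Method Get -TimeoutSec $TimeoutSec
    }
    catch {
        # Connection refused, timeout, or a non-success HTTP status
        $null
    }
}

$status = Get-H2OCloudStatus
if ($status) {
    "H2O version $($status.version), healthy: $($status.cloud_healthy)"
}
else {
    'H2O AI Server is not responding yet'
}
```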

Stop H2O AI

By default, when Start-H2O is used, a global variable is set with the Process ID of H2O AI. Issuing the Stop-H2O command will stop the process with that ID.

Stop-H2O
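For anyone curious about the pattern, here is a rough sketch of tracking a process ID in a global variable and stopping it later. It illustrates the general technique only and is not the module's actual code: the function names and the $global:TrackedProcessId variable are hypothetical.

```powershell
# Illustrative sketch only: the names below are hypothetical, not the module's.
function Start-TrackedProcess {
    param(
        [Parameter(Mandatory)][string]$FilePath,
        [Parameter(Mandatory)][string[]]$ArgumentList
    )

    # -PassThru returns the process object so its Id can be remembered
    $process = Start-Process -FilePath $FilePath -ArgumentList $ArgumentList -PassThru
    $global:TrackedProcessId = $process.Id
    $process.Id
}

function Stop-TrackedProcess {
    if ($global:TrackedProcessId) {
        Stop-Process -Id $global:TrackedProcessId -ErrorAction SilentlyContinue
        $global:TrackedProcessId = $null
    }
}

# Example (assumes java is on PATH and the jar path used earlier):
# Start-TrackedProcess -FilePath java -ArgumentList '-jar', 'C:\H2O\h2o-3.14.0.7\h2o.jar'
# Stop-TrackedProcess
```

One consequence of this approach is that the ID lives only in the current session, so a new PowerShell window would not know about a server started elsewhere.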

Import Training Data, Build a Model and make a Prediction

Get-H2OPrediction is an all-in-one cmdlet that imports the training data, builds a model and makes a prediction in a single call.

Pass Get-H2OPrediction:

  • a dataset
  • a model algorithm
  • a data split (defaults to 85% Train 15% Test)
  • data to make a prediction from and
  • the column to predict

The default URL for the H2O AI Server is http://localhost:54321
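To make the default data split concrete, here is a purely illustrative sketch of what 85% Train / 15% Test means for a dataset. Split-TrainTest is a hypothetical helper, and it splits sequentially; the module and H2O AI may well split differently (randomly, for example), so treat this as arithmetic rather than a description of what happens on the server.

```powershell
# Illustrative only: a sequential split into Train and Test portions.
function Split-TrainTest {
    param(
        [Parameter(Mandatory)][object[]]$Data,
        [ValidateRange(0.0, 1.0)][double]$TrainRatio = 0.85
    )

    # Round to the nearest whole row
    $trainCount = [int][math]::Round($Data.Count * $TrainRatio)

    [pscustomobject]@{
        Train = @($Data | Select-Object -First $trainCount)
        Test  = @($Data | Select-Object -Skip $trainCount)
    }
}

$split = Split-TrainTest -Data (1..20)
"Train rows: $($split.Train.Count), Test rows: $($split.Test.Count)"
```

With 20 rows this gives 17 rows for training and 3 for testing.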

Note: the Predict Column name is case sensitive and must match the column heading in your dataset. If the dataset has the column heading ‘class’ and you call Get-H2OPrediction with -predictColumn Class, it will FAIL.
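A quick way to avoid that failure is to compare the name you plan to pass against the dataset's headings with a case-sensitive match before calling the module. The sketch below is an assumption-laden example: Test-ColumnNameCase is a hypothetical helper and the sample column names, other than class, are made up for illustration.

```powershell
# Hypothetical helper: True only if the dataset has a column whose name
# matches exactly, including case.
function Test-ColumnNameCase {
    param(
        [Parameter(Mandatory)][object[]]$Data,
        [Parameter(Mandatory)][string]$ColumnName
    )

    # Column headings are the property names of the first row
    $columns = $Data[0].PSObject.Properties.Name

    # -ccontains is the case-sensitive form of -contains
    $columns -ccontains $ColumnName
}

# Sample data: column names other than 'class' are illustrative
$sample = @'
sepal_len,sepal_wid,petal_len,petal_wid,class
5.1,3.5,1.4,0.2,Iris-setosa
'@ | ConvertFrom-Csv

Test-ColumnNameCase -Data $sample -ColumnName 'class'
Test-ColumnNameCase -Data $sample -ColumnName 'Class'
```

The first call returns True and the second returns False, which is the situation where the prediction call would fail.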

Iris Example

The example below uses Tome’s Iris example with the module. It trains a Deep Learning model on Iris flower data split 85% / 15% for Train and Test, then determines the Iris sub-species from flower measurements.

H2O AI PowerShell Module Iris Prediction

Time Series Example

Train a Generalised Linear Model (GLM) on a time series dataset with an 85% / 15% split for Train and Test, and predict the next Close value given the last value in the train/test dataset.

Example DataSet

$sourceData | Select-Object -First 10 | Format-Table

Open High Close Low  Volume Date
---- ---- ----- ---  ------ ----
3.68 3.8  3.74  3.68 50208  23-01-2017
3.74 3.75 3.75  3.66 47972  24-01-2017
3.75 3.8  3.8   3.73 33952  25-01-2017
3.8  3.8  3.79  3.77 32822  27-01-2017
3.8  3.8  3.73  3.68 21552  30-01-2017
3.75 3.75 3.65  3.6  50763  31-01-2017
3.69 3.69 3.64  3.61 59377  01-02-2017
3.64 3.66 3.55  3.51 120869 02-02-2017
3.64 3.66 3.49  3.49 75814  03-02-2017
3.49 3.54 3.44  3.43 86494  06-02-2017
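Since the prediction works from the last value in the dataset, it is worth making sure the most recent row really is last. The Date column is day-month-year text, so a plain string sort would put 01-02-2017 ahead of 31-01-2017. Here is a small sketch, using three rows from the sample above, that sorts on the parsed date instead; the variable names are my own.

```powershell
# Three rows taken from the sample dataset above
$rows = @'
Open,High,Close,Low,Volume,Date
3.8,3.8,3.73,3.68,21552,30-01-2017
3.75,3.75,3.65,3.6,50763,31-01-2017
3.69,3.69,3.64,3.61,59377,01-02-2017
'@ | ConvertFrom-Csv

# Sort on the parsed date (dd-MM-yyyy), not on the raw text
$sorted = $rows | Sort-Object -Property {
    [datetime]::ParseExact($_.Date, 'dd-MM-yyyy', [cultureinfo]::InvariantCulture)
}

# The most recent row is the one to predict from
$latest = $sorted | Select-Object -Last 1
$latest
```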

The example below predicts a value from a time series dataset using the H2O AI PowerShell Module.

H2O AI PowerShell Module Time Series Prediction

Summary

Big thanks to Tome Tanasovski for his work in figuring out the H2O AI Server REST calls. Hopefully this module is of benefit to others. If something doesn’t work, I’ll happily accept Pull Requests on the module.