Tuesday, 9 July 2019

Deep Learning with H2O in Python

H2O.ai is focused on bringing AI to businesses through software. Its flagship product is H2O, the leading open source platform that makes it easy for financial services, insurance companies, and healthcare companies to deploy AI and deep learning to solve complex problems. More than 9,000 organizations and 80,000+ data scientists depend on H2O for critical applications like predictive maintenance and operational intelligence. The company – which was recently named to the CB Insights AI 100 – is used by 169 Fortune 500 enterprises, including 8 of the world’s 10 largest banks, 7 of the 10 largest insurance companies, and 4 of the top 10 healthcare companies. Notable customers include Capital One, Progressive Insurance, Transamerica, Comcast, Nielsen Catalina Solutions, Macy’s, Walgreens, and Kaiser Permanente.

Using in-memory compression, H2O handles billions of data rows in-memory, even with a small cluster. To make it easier for non-engineers to create complete analytic workflows, H2O’s platform includes interfaces for R, Python, Scala, Java, JSON, and CoffeeScript/JavaScript, as well as a built-in web interface, Flow. H2O is designed to run in standalone mode, on Hadoop, or within a Spark Cluster, and typically deploys within minutes.

H2O includes many common machine learning algorithms, such as generalized linear modeling (linear regression, logistic regression, etc.), Naïve Bayes, principal components analysis, k-means clustering, and word2vec. H2O implements best-in-class algorithms at scale, such as distributed random forest, gradient boosting, and deep learning. H2O also includes a Stacked Ensembles method, which finds the optimal combination of a collection of prediction algorithms using a process known as "stacking." With H2O, customers can build thousands of models and compare the results to get the best predictions.
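To make the "stacking" idea concrete, here is a minimal sketch in plain numpy (this is deliberately not the H2O API): two base predictors are combined by a least-squares meta-learner fitted on their predictions.

```python
import numpy as np

# Toy regression data.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=100)

# Base model 1: a simple linear fit.
slope, intercept = np.polyfit(x, y, 1)
pred1 = slope * x + intercept

# Base model 2: a naive constant (mean) predictor.
pred2 = np.full_like(y, y.mean())

# Meta-learner: least-squares weights over the base predictions ("stacking").
stacked = np.column_stack([pred1, pred2])
weights, *_ = np.linalg.lstsq(stacked, y, rcond=None)
blend = stacked @ weights

# Because the blend is a least-squares combination, its error can never be
# worse than the weaker base model's error.
mse = lambda p: np.mean((p - y) ** 2)
```

In H2O itself, the same idea is packaged as `H2OStackedEnsembleEstimator`, which takes a list of trained base models.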

Here is an example of using H2O's deep learning estimator in Python:



Step 1-  First, install the h2o package in Python.

In an Anaconda prompt (or any shell):
pip install h2o


Step 2-  Import h2o, start the cluster, and import the estimator-

import h2o
from h2o.estimators.deeplearning import H2ODeepLearningEstimator

h2o.init()


Step 3-  Load the Iris data set-

train = h2o.import_file("https://h2o-public-test-data.s3.amazonaws.com/smalldata/iris/iris_wheader.csv")


Step 4-  Create train and test sets with a 75/25 split-

splits = train.split_frame(ratios=[0.75], seed=1234)
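Note that `split_frame` assigns each row to a split at random, so the resulting sizes are approximately, not exactly, 75/25. A quick numpy illustration of that behaviour (this is a sketch of the idea, not H2O internals):

```python
import numpy as np

# Each of the 150 iris rows is independently assigned to train with
# probability 0.75, so the split sizes vary slightly around 112/38.
rng = np.random.default_rng(1234)
n_rows = 150
mask = rng.random(n_rows) < 0.75
train_idx = np.where(mask)[0]
test_idx = np.where(~mask)[0]
# train_idx and test_idx together cover all 150 rows exactly once.
```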



Step 5-  Configure the model-

model = H2ODeepLearningEstimator(
    distribution="AUTO",
    activation="RectifierWithDropout",
    hidden=[32, 32],
    input_dropout_ratio=0.2,
    l1=1e-5,
    epochs=10,
)
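For intuition about the `RectifierWithDropout` activation, here is a conceptual numpy sketch of one hidden layer (not H2O's actual implementation): a ReLU followed by randomly zeroing units during training, with inverted-dropout scaling so expected activations are unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 32))         # a mini-batch of 4 rows, 32 features
w = rng.normal(size=(32, 32)) * 0.1  # weights for one hidden layer of 32 units

hidden = np.maximum(x @ w, 0.0)      # rectifier (ReLU)
keep_prob = 0.8                      # dropping 20% of units, cf. a 0.2 dropout ratio
mask = rng.random(hidden.shape) < keep_prob
dropped = hidden * mask / keep_prob  # inverted dropout keeps expectations stable
```

(In the configuration above, `input_dropout_ratio=0.2` applies dropout to the input layer specifically; hidden-layer dropout is controlled by a separate parameter.)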


Step 6-  Train (fit) the model-

model.train(x=["petal_len"], y="sepal_len", training_frame=splits[0])

(Note: x takes a list of predictor columns and y takes a single response column name; since we predict sepal_len below, it is the response here.)


Step 7-  Predict with the trained model and store the result as a new column in the test data-

splits[1]['predicted_sepal_len'] = model.predict(splits[1])




One can now compare the actual (sepal_len) and forecasted (predicted_sepal_len) values.
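Once the two columns are pulled into Python (for example via `as_data_frame()` on the H2OFrame), the comparison reduces to an error metric. A sketch with stand-in numpy arrays in place of the real columns:

```python
import numpy as np

# Stand-in values playing the role of the sepal_len and predicted_sepal_len
# columns (the real values would come from splits[1] in the workflow above).
actual = np.array([5.1, 4.9, 6.3, 5.8, 7.1])
predicted = np.array([5.0, 5.1, 6.0, 5.9, 6.8])

mae = np.mean(np.abs(actual - predicted))            # mean absolute error
rmse = np.sqrt(np.mean((actual - predicted) ** 2))   # root mean squared error
```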

