
Introduction to H2O Hydrogen Torch at H2O World Sydney 2022




Sanyam Bhutani, Senior Data Scientist and Kaggle Grandmaster, showcases H2O Hydrogen Torch during the Technical Track sessions at H2O World Sydney.


H2O Hydrogen Torch is a no-code deep learning tool. It unlocks value from unstructured data, helping teams understand that data at scale. For the enterprise, H2O Hydrogen Torch enables AI transformation by changing the way teams deliver value to both customers and internal teams.





Sanyam Bhutani:


Hey, everyone. It's great to be here. Thanks for joining us, and to our virtual attendees, I'm looking at the cameras: good morning, good afternoon, good evening, wherever you are joining us from. How's everyone today? It feels like everyone had a heavy lunch. Okay, we'll get started. This is one of my favorite tools. At H2O, we build an array of tools, as you all know. I'll just do a quick poll by raising your hands: how many of you are familiar with deep learning techniques or deep learning terminology? Okay, a fair mix. And how many of you Kaggle? Has anyone Kaggled here? Okay, a few. And do you use these deep learning models at work every day? Okay, a few. So you're at the right talk, because this tool is supposed to help you with all of that.


What is Hydrogen Torch?


H2O's Hydrogen Torch is the next framework in our line of software. As you know, we are home to many Kaggle Grandmasters, many in the top 10 rankings. And fun fact, we are home to Kagglers who have been number one. Many of our people have reached number one in Competitions, which is the hardest category possible. What we as a company are trying to do is package their brains into software that we ship to you in an app store, which you can simply run from a browser. For today's talk, I'll cover Hydrogen Torch and the different things it enables, along with a short demo. I'll try to run through my slides, so if it feels rushed, know that I'm just giving you an overview; the demo might be more interesting for everyone.


Sanyam Bhutani's Introduction and Kaggle Background


I guess that's what we are here for. A quick word about our community: I lead our community efforts, so if you're interested in talking more with our Grandmasters, or hanging out and just connecting with people, you can head over to this QR code and join. A few words about me: as Phil said, I'm a Kaggle Grandmaster only in Discussions, so as you can imagine, I talk more than I build products. I build content around our community and I'm helping grow it. I also have a well-known podcast called Chai Time Data Science, so if you want to know more about our Grandmasters, you can check that out. Hydrogen Torch enables a few things. The next big thing in machine learning, I feel, is unstructured data, because by definition we have many more images on our phones.


How is Hydrogen Torch Different From Other Deep Learning Models?


The internet largely consists of text, and deep learning is only now starting to show its power in production, in the real world. Hydrogen Torch brings that power to you through the H2O AI app store. It helps you apply these models very easily and allows people with different levels of experience to put them in production. So if you don't have a lot of expertise in deep learning, but you have a lot of domain knowledge, as we saw earlier in the day, you can use this tool; there are different levels at which you can interact with it. It's also for experienced deep learning engineers: let's say you want to really handhold the model and decide exactly what you want to build, you're also allowed to do that. This is a no-code framework, so it's largely a UI-based interaction.


What Can You Use Hydrogen Torch For?


There are different use cases that we support right now across text, images, and audio, and we've just updated the medical imaging one as well, so that will be shipping really soon. You can select the problem type, train the model, and then deploy to production, all through a single framework. That's all Hydrogen Torch is, and I guess I could end my talk right here, but I'll give an overview of what it enables. Imagine you have your own data, which you can upload to the cloud. From there, you select the problem type you want to work with, and then you can fine-tune or kick off as many experiments as you like. At the end of it, we also make it really easy to download these models, either in a particular format or in any format you'd like to work with.


For most formats, we work closely with our customers to support them, and then you can deploy the models to production as well. The supported problem types right now are as follows: across text, images, and audio, we support the following, and I'll give a quick overview of all of these before giving a demo of one or two. We can decide amongst ourselves what you all would like to see. Across text, as you can see, most of the text problems we use machine learning for today are supported, and the same goes for images. We are just starting to build out our medical imaging support as well, so 3D CNNs for those of you who work with medical data. For audio data, we are shipping a bunch of features really soon, and we are always building on top of this.


So this slide might be outdated in just a week. And as you'd expect from H2O, we work really closely with our customers, so if features are requested, you can expect this list to be constantly updated. One thing I want to point out: as I mentioned earlier, we have an incredibly large Kaggle Grandmaster team, I think the largest. So when I say these problem types are supported, it's not just that you can upload your data and get a solution. You have all of the tricks that these Grandmasters, the best of the best, have come up with through literally fighting battles on the Kaggle leaderboard. You have their insights, their secrets, their tricks coded into all of these solutions. So even as a junior data scientist, if you just kick off the simplest experiment possible, there is a high chance the model will be quite accurate. I invite everyone to try the demo out, or just sign up on our cloud and try it out later on.


Use Cases For Hydrogen Torch


I'll give an overview of the different problem statements I just spoke about. In images, we support image classification and regression. You can think of this as predicting whether there's pneumonia in a chest X-ray sample or not, whether a landscape is from a certain country or not, or counting the number of coins in a photo. You could also imagine deploying this in manufacturing. I'm trying to think of more use cases, but anywhere you need to detect or classify an object inside an image, you can do that. Object detection, where you need to detect individual instances inside an image, is also really easy to do. And as I mentioned earlier, we are always shipping state-of-the-art models. So in the demo later, you'll see a quite opinionated set of supported models, and these, again, come from the Grandmaster team, which thinks these are the best possible models.


Semantic Segmentation With Hydrogen Torch


For that example, you could work with semantic segmentation. This is an example from one of the Kaggle competitions, called Carvana, where you need to cut out objects from the background. If you have the latest iOS update on your phone, it's mildly annoying, but in the gallery you can tap on people, extract them, and send them out as stickers. I found that interesting, I don't know about you all, and that is also semantic segmentation running on your phone. You could also do instance segmentation, where you segment out individual cars. So let's say you're trying to build a model to help traffic authorities bill toll charges or something like that; that's also possible with Hydrogen Torch. And you could do metric learning. Metric learning is used for comparing products, or comparing similarity during a similarity check.


Metric Learning With Hydrogen Torch


In deep learning, you create models that build internal representations inside themselves, and metric learning allows you to compare those. Now, you don't need deep knowledge of that: even within Hydrogen Torch we usually give suggestions, as you'd expect if you've seen Driverless AI, and you can just run through the options and build models for metric learning, comparing products, or searching for photos of the same landmark. Even on our phones, when you're roaming around and take a picture, and you select a certain object and it shows up in the browser with details, that is metric learning running on your phone. For text data we support a wide range of things as well, as I said earlier. You could classify text: you could predict customer satisfaction from reviews, or classify words as spammy or not. And you could have this across multiple languages; that's also being built in right now. Did I lose my slides?
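The comparison at the heart of metric learning can be sketched in a few lines: a model maps each item to an embedding vector, and similar items end up with similar vectors. Below is a minimal illustration using cosine similarity, one common choice; the vectors are made up for the example and stand in for real model embeddings.

```python
import math

def cosine_similarity(a, b):
    # Compare two embedding vectors produced by a trained model.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical embeddings: two photos of the same product, one unrelated scene.
shoe_a = [0.9, 0.1, 0.3]
shoe_b = [0.8, 0.2, 0.4]
landscape = [-0.5, 0.9, -0.1]

print(cosine_similarity(shoe_a, shoe_b))     # high: likely the same product
print(cosine_similarity(shoe_a, landscape))  # low: different content
```

In practice you would compare a query embedding against an indexed catalog of embeddings and return the closest matches.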


Okay, we're back. Sorry about that. You can also do something known as token classification, where you extract entities from individual sentences, helping your model understand what's going on inside the sentence. I'll start running through these, because I'm just giving an overview and we are all here for the demo, I assume. You could also do text span prediction, where you find relevant information inside transcripts. I think this could be a startup idea, but with Hydrogen Torch, you could also simplify legal language. I know no one likes reading contracts. I don't; maybe you do, but you could simplify those and really understand them. That's one possible use case you could try out with Hydrogen Torch. You could do sequence-to-sequence modeling, which means you could do translation or summarize text.


As I said earlier, you could do metric learning, so similarity searches for text as well, where you could compare fake reviews with real reviews. This is becoming more of a challenge as we get bots powered by GPT-3 and the next generation of transformer-based models. For audio data, we also support a wide variety of use cases. You could do simple classification or regression, where you identify birds, or the sounds of birds out there in the wild. Our Grandmaster team has won many of these competitions, so you can imagine it's their best tricks being coded into these use cases. You could also recognize the sentiment of a conversation. And just to appreciate how hard a problem this is: your model literally just sees the spectrogram at the end of this slide.
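To make the "your model just sees a spectrogram" point concrete, here is a toy short-time Fourier transform in NumPy. Real pipelines typically use mel-scaled spectrograms from a library such as librosa or torchaudio; this sketch only shows how a 1-D waveform becomes the 2-D grid that an image-style backbone can consume.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Naive short-time Fourier transform: slice the waveform into
    overlapping frames, window each frame, and take FFT magnitudes.
    The result is a (time x frequency) grid -- the 'image' the model sees."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    window = np.hanning(frame_len)
    return np.abs(np.fft.rfft(np.array(frames) * window, axis=1))

# A 440 Hz tone sampled at 8 kHz: energy concentrates in one frequency bin.
sr = 8000
t = np.arange(sr) / sr
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # 2-D: one row per time frame, one column per frequency bin
```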


Easier Model Interpretability Using Hydrogen Torch


By using Hydrogen Torch, you can just feed in an audio recording, and it takes care of everything in the most optimal way and helps you understand the sentiment of the conversation. We care a lot about model interpretability as well, and we like to make it easy for you to understand what's going on inside the model. How do you make sure it's working as it should? For text data, we allow you to visualize the most important words. As you can see in the different tabs, and we'll take a closer look during the demo, you can look at the insights from the validation and test sets, and see which are the most important words, or which examples your model got the most wrong. This can help you understand your data as well: maybe some of your data is mislabeled, and maybe cuss words are marked as positive, which shouldn't be the case. So you can take a deeper look, understand whether things are working as they should, and iterate.


Creating Image Classification Heat Maps


For image data, you can get a Grad-CAM visualization, so just to bring your attention to that: if you're not aware, this allows you to understand what your models are paying attention to. You've just built these models using Hydrogen Torch; now you'd like to understand what they're focusing on. In the heat maps, those funky, weirdly colored areas apart from the blue are what the model is focusing on. This looks like the right behavior, because if you're trying to classify flowers, you'd want your model to focus on the flowers and not the window. You can do these checks visually for image data to make sure things are working as they should. For audio data as well, you can look at the spectrograms and their Grad-CAM visualizations. Ideally, and again, this is a subjective discussion, your model should focus on the graphs. Believe it or not, those are graphs of the audio converted to spectrograms, and you can see that the model is indeed being triggered by them; it's actually looking at that. And as I said earlier, it's very easy to deploy all of these models you've built across different use cases. If you just care about getting predictions on a new dataset, you can simply download your predictions. You can download the scoring pipeline as well.
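For reference, the arithmetic behind a Grad-CAM heat map is short: average the gradients of the class score with respect to each feature map to get per-channel weights, take a weighted sum of the feature maps, and clip negatives. The sketch below shows just that core formula on toy tensors; Hydrogen Torch's actual visualization code is its own implementation.

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Core Grad-CAM arithmetic. feature_maps and gradients are both
    (channels, height, width) arrays from the last conv layer."""
    weights = gradients.mean(axis=(1, 2))              # one weight per channel
    cam = np.tensordot(weights, feature_maps, axes=1)  # weighted sum over channels
    cam = np.maximum(cam, 0)                           # keep only positive evidence
    return cam / cam.max() if cam.max() > 0 else cam   # normalize for display

# Toy tensors standing in for a CNN's activations and their gradients.
rng = np.random.default_rng(0)
fmaps = rng.random((8, 7, 7))
grads = rng.random((8, 7, 7))
heatmap = grad_cam(fmaps, grads)
print(heatmap.shape)  # a (7, 7) map you would upsample and overlay on the image
```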


Deploying Scoring Pipelines in Python Environments


The scoring pipeline allows you to predict on new data using any model in a Python environment. So you can take the scoring pipeline and deploy it in your own Python environment, or you can deploy to H2O MLOps and use a simple REST API request to get your predictions. All you have to do for that is basically click those three buttons, and everything will be okay. As I've been constantly saying, we make it fairly easy to do all of these steps. It's very well integrated into our ecosystem, especially with MLOps; it's really easy to deploy all of these models. So what I'll do now is switch to a demo, and I'll quickly share my screen.
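A REST scoring call generally reduces to POSTing a JSON payload of rows to the deployed model's endpoint. The sketch below only illustrates that pattern: the endpoint URL and the `fields`/`rows` payload shape are hypothetical, so check your MLOps deployment's documentation for the real schema.

```python
import json

# Hypothetical endpoint -- substitute the URL shown for your deployment.
ENDPOINT = "https://example-mlops-host/model/score"

def build_request(rows, fields):
    """Assemble a JSON scoring request body for a deployed model.
    The key names here are illustrative, not the actual schema."""
    return json.dumps({"fields": fields, "rows": rows})

body = build_request(rows=[["great product, would buy again"]],
                     fields=["review_text"])
print(body)

# Sending it is then one call with any HTTP client, e.g.:
#   requests.post(ENDPOINT, data=body,
#                 headers={"Content-Type": "application/json"})
```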


Hydrogen Torch Live Demo


Awesome. Is this visible in the back? So this is the home screen of Hydrogen Torch's interface. I have launched an instance from the app store and I have the app ready to go here. This is a live demo, so hopefully nothing goes wrong. From the home screen, you can see the three steps you care about: you can import a dataset, create an experiment, and you can also see the list of datasets I've already imported and the list of experiments I've been working on. So I get a nice overview of what my validation metrics look like, what problem type each experiment was, and what dataset it was from. And I'm personally a huge fan of the names the experiments are given. So let's kick off an experiment now, and I'll start by importing a dataset.


Importing Datasets to Train Your Models


It's fairly easy to import datasets from any platform you work with, which could be S3, a data lake, Kaggle, or you can literally upload your dataset directly as well. So let's say you wanted to import a Kaggle dataset: you could put in your username and your secret key, and it would get imported. Let's try a dataset from S3. Does anyone have any suggestions? Should we do semantic segmentation? Should we do image classification? Feel free to shout out. Image? Okay, let's do semantic segmentation. So I'll import this dataset, and what's happening now is that it's getting downloaded, really fast, I wish I had that internet, to the instance, and it's then loaded for configuration. As you can see, H2O has already figured out the defaults. I had already imported this once before, so the name gets a number one appended.


Setting Up a Training Model


It figured out that the problem type is semantic segmentation. If that weren't correct, I could click here; these are all of the use cases I just ran through, so I could select one of those. That looks correct. The data format is indeed what Hydrogen Torch likes, and if you're curious about that, you can look at the documentation and figure out what it likes and what it doesn't. The data frame has been picked up correctly as well, and the data folder looks correct. All of the other things look good to me; it figured out all of the columns by itself. So we are good to continue. I'll continue.


From here it will also perform a few sanity checks and show me the dataset. By sanity checks, I mean: if you've worked with any form of image data, usually there are corrupt images, and you set your model to train overnight and suddenly realize it crashed two hours in and isn't training. Usually that happens when you have corrupt images or things like that. All of those sanity checks just made sure nothing like that is going to happen, so we can continue further. Now I can take a visual look here. What I'm doing is hovering my mouse over different things, and for this problem type, we are trying to find individual pieces of clothing. It correctly identifies that yes, this is a coat, these are shorts, these are pants, and it identifies shoes as well. So we've loaded the dataset in the correct fashion; I did a quick check and I'm happy with this. I'll continue, and now I can view all of the datasets. What I'd like to do next is create an experiment, and we'll go ahead with our dataset. I'm feeling brave today, so we'll go with the master settings, and I'll ask all of you for suggestions. The problem type looks correct to me, and this experiment is going to be called stylish-ara, for some reason. The data frame folder has been picked correctly, and all of the other options are picked correctly. Should we train this on one fold or multiple folds?


Sorry, we support five folds, and I can't think of a use case where you would want 10 folds. Let's do two folds. I'll just go through the other options. We probably want to train on 15% of the data. What's going on here is, as I mentioned earlier, this is a completely no-code framework, and all of these options I'm clicking through are literally the experiment that I'm handholding; I've gone into the master settings. If you go through the normal settings, it's really easy: all of the defaults are baked in. But here we are selecting exactly what we want to kick off, and we can kick off multiple experiments just through this UI. We can increase the image size; I think we can make it 512 by 512. These are indeed three-channel images, and we can switch the normalization strategy to, maybe, ImageNet. You can change the augmentation strategy as well; I'll enable mixup. What backbone should be used here? Any suggestions? I have very strong opinions about the backbones I like.


Okay, I'll pick my favorite: let's go with efficientnet-b5. And we'll use the Unet architecture. We are doing semantic segmentation, which requires a backbone and an architecture. The architecture could be Unet, or it could be Unet++; the difference is basically minor nuances in how you shape the encoder and decoder, which I won't go into. I'll just go ahead with Unet++. The loss function looks good to me, and the optimizer looks correct, so I think we're good to launch this experiment now. I'll run the experiment, and as you can see, I was training on two folds, not 10. It has queued these experiments already, and in just a few seconds you'll see them get started, and you'll start seeing the insights from there. I'll refresh my screen.
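As an aside, the fold setting chosen above controls a standard k-fold routine: split the rows into k groups, train on k-1 of them, validate on the held-out group, and rotate. A minimal round-robin version is sketched below; the real splitter may well be stratified or grouped, so treat this purely as an illustration of the concept.

```python
def make_folds(n_rows, n_folds):
    """Assign each row index to a fold, round-robin style. Training on
    all folds but one and validating on the held-out fold, repeated for
    each fold, is what the 'number of folds' setting controls."""
    return [i % n_folds for i in range(n_rows)]

assignments = make_folds(n_rows=10, n_folds=2)
for fold in range(2):
    train = [i for i, f in enumerate(assignments) if f != fold]
    valid = [i for i, f in enumerate(assignments) if f == fold]
    print(fold, len(train), len(valid))  # every row is validated exactly once
```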


Interpreting Deep Learning Model Data


So what's going on now is that, with those settings, these experiments have already started, and in a few seconds I'll be able to get insights as the models are being trained. I get an overview, and, this is a completed experiment, I can get all of the charts of the metrics I care about. You can also take a look at prediction insights, as I pointed out earlier. This is an example of audio data, and I can see that the best examples were the following; I can't tell any difference here. And the worst examples are the following; visually, it looks like these images weren't converted correctly. Oh no, the experiment failed. This is a live demo, as I mentioned earlier. So I'll try another experiment, maybe something simpler, say image classification. I'll quickly select the options here, switch the backbone to something else because I'm feeling brave again, and kick off this experiment. Hopefully this one doesn't crash; I have proved that this is a live demo now. What I've done this time is skip the master settings: I went through the normal settings, which already suggest good opinions and default selections. What the master settings allow you to do, if you really want to handhold the model, let's say you're building thousands of models every day or every week, is really select exactly what you want to build and handhold the process. You can figure out the best possible accuracy you can squeeze out for your use case.


Now, since this did not crash, you'll start seeing the insights already. I can see that the learning rate is being changed with every step, every iteration, and I can also monitor the batch loss from here. How's my model doing? If this graph were going up, I would want to stop this experiment, but it looks like it's going in the right direction, so we'll let it run. And as this happens, I can also see the training data insights. I can see the flowers being predicted as whatever class they're labeled with. So this is a daisy.


These are tulips. I identified tulips. This looks like a rose, so that looks like a correct label to me. And we can also get insights from the validation data. We can see that the best examples were daisies for now, and this is all happening in real time. As the experiment is running, I'm able to poke around the model in real time and see what's going on. You can of course wait for the model to train, but as you kick these experiments off, you can always monitor what's going on, or go back and take a look after they're done.


I can also see that the worst examples look like the ones where the images weren't taken properly. This one just looks like grass, and I wouldn't want it in my dataset. So it looks like Hydrogen Torch is doing a good job so far. I'll leave the last few minutes for questions, and I'll quickly also point out the other problem types that we support. In the UI, I can see all of the problem types Hydrogen Torch currently supports, and it's always being improved on. As I said, medical imaging has just been shipped and will be up in the app in just a few days. Right now, the following image problems are supported, along with the following text and audio use cases, and you can kick these experiments off from this dashboard and get all your insights inside this app. So I'll end the demo here and leave it open to questions.


Thanks. I like my shoes as well, so I'm glad someone noticed.


Does Hydrogen Torch Support Generative Models?


Any plans to support generative models? I'm not sure if that's on the roadmap as of now.


Does Hydrogen Torch Use fastai?


Does Hydrogen Torch use fastai in the backend? Not to my knowledge, but we do use tricks very similar to fastai's. fastai is one of my favorite open source frameworks, and I think a lot of the tricks I personally learned from fastai you could have seen in the UI as well: differential learning rates, different test-time augmentations, all of that is supported really well in Hydrogen Torch. So whatever you learn from fastai should be visible there. What if we would like the model to detect the gender of the person rather than the fashion?


Do You Need GPU Power to Run Hydrogen Torch?


Now this becomes an ethics discussion, which is hard to resolve in the two minutes I have remaining. You would need a labeled dataset for that, and I won't go into the ethics of whether you should build such a model or go in that direction, but we do support datasets with labels. Someone says hello. Hello back. How do you explain the difference? I'll skip the next one. Do we need a GPU to run Hydrogen Torch? Yes, you do. But you can also sign up for our AI cloud, where you get a free 14-day trial. We give you GPU instances where you can run these examples, so you get 14 days of unlimited playtime, in a sense, with Hydrogen Torch.


How Does Interpretability Work Over Ensembles in Hydrogen Torch?


How does interpretability work over ensembles? For Hydrogen Torch's use cases, we give you validation insights and similar things, which aren't affected by ensembling. For other products, we have worked closely with experts and built that in really well, with Driverless AI and all of our products. That's something we really care about: giving you very accurate models, but also giving you insights and interpretability. So it's possible to make interpretable models with ensembles, but as you can imagine, with simpler models it's easier to poke around and get insights from them.


How Large Do Model Datasets Need to Be?


How large do the datasets need to be, and are the models pre-trained? Yes, the models I loaded are models pre-trained on ImageNet, and for datasets, it really depends on your constraints. We support multi-GPU, and if you have a large enough instance with a large enough data drive, you can kick off any experiment you could possibly think of. We are almost out of time, so I'll take the last question, which is: is it capable of image captioning? Yes, we allow semantic segmentation, so you could build a model on top of that if you'd like. I'm working overtime, so thanks. Thanks, everyone.