Part Two on Consistent Billionaires: Introducing The Surprise

In my last post, I mentioned that I had a surprise to share, so here it is...

*drum rolls*

*rolls up sleeves*

*cracks a knuckle or two*

It’s a web app called Billion Dollar Questions!


It’s a simple and fun web app that anyone can use to predict what sort of billionaire they’ll become. Simply tell the app who you are, and a model works its magic and tells you your future billionaire status. You can share your prediction on Twitter and Facebook to rack up cool points (if you are going to be Consistent anyway).

At this point, I think I should say that I can in no way guarantee that you’ll become a billionaire. My skills lie in Data Science, not in making money rain.

Here’s how to use it

Before you go any further, I highly recommend that you read my last post. That way, a lot of the stuff on the app will be familiar to you.

Using the app is pretty simple: fill in the form in the way that best describes you, click “Predict” and, in a few seconds, the app will tell you what sort of billionaire you’ll become. Here’s a GIF of how it works:


You can also use it on your desktop, tablet or mobile device!

Now that you’ve seen how it works, here’s the app:

Interested in How I Did It?

My work is divided into two parts and can be found on my GitHub repo here:

R’s Shiny

Shiny is an amazing tool from RStudio that lets you create R-driven web apps. These apps are easy to deploy, and anyone can use them without ever having to touch code. A Shiny app usually has three parts:

  1. The UI: This is what you see at the front end. It is made up of R-wrapped HTML, JS and CSS.
  2. The Server side: This is basically your usual R code. All R calculations, functions and scripts are run server side. In my case, this is where the model takes all your inputs, converts them to a data frame and carries out predictions.
  3. Global: This is optional and is used to declare variables globally so that multiple objects/functions can access them. It is advisable to use this only when necessary, because such variables or objects are loaded when the app starts, and if they are large or slow to load, they increase the loading time of your app. In my case, I read in the original data frame here, as well as the model I used for the app, since both objects are needed by multiple functions.

My UI, server and global variables are all in the app.R file in the GitHub repo shared above.
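
To give a feel for how the three parts fit together in one file, here is a minimal sketch of an app.R. It is not the code from my repo: the input names (age, industry), the helper inputs_to_frame() and the placeholder output are all made up for illustration, and a real app would call its model's prediction function where the comment says so.

```r
library(shiny)

# "Global" part: objects created here are built once when the app starts
# and are visible to both the UI and the server.
# These levels are hypothetical, for illustration only.
industry_levels <- c("Technology", "Retail", "Finance")

# Hypothetical helper: turn the form inputs into a one-row data frame,
# which is the shape most R models expect for prediction.
inputs_to_frame <- function(age, industry) {
  data.frame(
    age = as.numeric(age),
    industry = factor(industry, levels = industry_levels)
  )
}

# UI part: R functions that generate the HTML for the page.
ui <- fluidPage(
  titlePanel("Billion Dollar Questions (sketch)"),
  numericInput("age", "Your age", value = 30, min = 18, max = 100),
  selectInput("industry", "Your industry", choices = industry_levels),
  actionButton("predict", "Predict"),
  textOutput("result")
)

# Server part: ordinary R code that reacts to the inputs.
server <- function(input, output, session) {
  # eventReactive() waits for the button click before recomputing.
  new_row <- eventReactive(input$predict, {
    inputs_to_frame(input$age, input$industry)
  })

  output$result <- renderText({
    row <- new_row()
    # Placeholder: a real app would run its model on `row` here.
    paste("Received", nrow(row), "row for industry:", as.character(row$industry))
  })
}

app <- shinyApp(ui = ui, server = server)

# Launch only in an interactive session so that sourcing the file is harmless.
if (interactive()) runApp(app)
```

Keeping the "inputs to data frame" step in its own function means you can test it without starting the app at all.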

Some Custom HTML and CSS

Shiny provides a great way for Data Scientists to code up nice web apps without having to know Front-End tools like HTML, CSS and JavaScript. However, if you want more control over your app, you just might need to know a thing or two about those tools. The good news is that Shiny makes it pretty easy to drop them in. I wrote custom code using the HTML() function, and within it I can put my custom HTML exactly the way I would if I were building a website. I also had a custom stylesheet called style.css to give my app the feel I wanted and to make it mobile-friendly with a few media queries. Finally, I used the famous animate.css library to make my app look fun (you can see all the buttons jiggling away).
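
As a rough illustration of the pieces above, here is a small sketch of a UI that mixes raw HTML, a custom stylesheet and an animated button. The markup, the local file name animate.min.css and the class names are assumptions for the example, not what is in my repo. In particular, animate.css class names differ between versions, so check the version you download.

```r
library(shiny)

ui <- fluidPage(
  tags$head(
    # Files placed in the app's www/ folder are served from the app root,
    # so www/style.css is referenced simply as "style.css".
    tags$link(rel = "stylesheet", type = "text/css", href = "style.css"),
    # Assumption: a copy of animate.css saved as www/animate.min.css.
    tags$link(rel = "stylesheet", type = "text/css", href = "animate.min.css")
  ),
  # HTML() marks the string as raw HTML, so Shiny inserts it unescaped.
  HTML('
    <div class="intro">
      <h2>What sort of billionaire will you be?</h2>
      <p>Fill in the form and click Predict.</p>
    </div>
  '),
  # Extra classes are added to the button alongside Shiny's own.
  # "animated pulse" follows the animate.css version 3 naming.
  actionButton("predict", "Predict", class = "animated pulse")
)
```

The media queries themselves live in style.css, so the R side only has to link the file.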

Things to Keep in Mind

A number of people asked me why I used h2o and not R’s famous caret for my machine learning. The answer is the use case. The billionaire data had a significant number of missing values, and it had variables with over 50 different categories. Most machine learning implementations in R don’t deal well with these two things, whereas h2o handles both gracefully. You can check out h2o’s implementation approach here.

The code that I used to create the final model for the app, along with some interesting research which I did not introduce at this time, can be found here.