Making your first Capsule for Samsung Bixby – An Exercise in Teaching your Phone to Listen

 

Today Samsung made the Bixby Developer Studio available for download and use so that developers can start building capsules to publish on their marketplace starting in 2019. I am an early adopter of the new Bixby and wanted to share how to build a simple capsule using the new developer kit as well as share my experience with using the new platform. Readers of this blog know that I have made and published Skills for Amazon’s Alexa and published tutorials on how you can develop for that platform. Similarly this writeup will focus on a minimal application that will help you get started with the features of Bixby’s impressive development tools.

Bixby is a wonderful platform to develop for and it has top-notch development tools. Building software for Bixby is a lot like teaching someone a new skill. It uses natural language model training so you can show Bixby what parts of user phrases are important and it uses the idea of concepts to define Bixby’s understanding of what capability you are giving it. These concepts will be discussed in detail below.

Today we will be making a capsule to generate passwords made up of a random string of words, inspired by XKCD’s Password Strength comic. We will be taking advantage of Bixby’s visual interface to make a password users can easily remember as well as easily copy and use for their accounts. The XKCD algorithm sticks random, memorable words together so that passwords are complex but can also be easily recalled by the user. As the comic argues, they also carry more entropy than the short, character-substituted passwords people tend to pick, making them harder to crack by many brute-force methods. After giving the comic a quick look, read on.

As always, you can download and view the code on my GitHub!

THE PROBLEM

We want to build a Bixby capsule that can generate memorable passwords for users. These passwords should be of a user-specified length.

The overall requirements are:

  • Generates a password using regular English words
  • Takes in a user’s specified length
  • Displays the password graphically for copying
  • Displays a calculation of the entropy of the password so the user knows how good the password is.

THE SOLUTION

You should now have Bixby Studio installed on your system. As of writing it is available for Windows and macOS.

Create a new project by clicking File>New Capsule.

The first bit of code we will focus on is the generator.js file. This is where we define our entry point and what we are going to return.

Notice how we export the function generate. This is the function where I generate everything we need for the response you see in the screenshot above. We take our wordlist dictionary file (how to get that will be discussed shortly), we build our password using a user-specified length called numWords, and we calculate the entropy of the password. We then return a result we can parse into a nice, visual response like in the screenshot.

Whew, there’s a lot going on in here though! Let’s start with the wordlist. This is a JSON-formatted list of common English words I found searching for open-source corpora. Why JSON? As you might have surmised from the above code snippet, Bixby capsules are written in JavaScript! Importing this data as JSON makes it very easy to loop through and use, as you can see in my generate function. I stored this in a directory called lib but you can call it whatever you please. Just be sure to update the path in the generator!
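To make the word-picking step concrete, here is a minimal sketch of how a password could be built from a JSON wordlist. It is not the capsule's actual code: the function name buildPassword, the tiny inline wordlist, and the injectable random parameter are all assumptions made so the example runs on its own in Node.

```javascript
// Hypothetical stand-in for the JSON wordlist. The real capsule loads a much
// larger file from its lib directory; six words are inlined here for brevity.
const wordlistJson = '["correct", "horse", "battery", "staple", "cloud", "river"]';
const wordlist = JSON.parse(wordlistJson);

// Pick numWords random entries from the list and join them with spaces.
// `random` is injectable so the function can be tested deterministically.
// Math.random is not a cryptographically secure source of randomness;
// it is used here only to keep the sketch short.
function buildPassword(words, numWords, random = Math.random) {
  const picked = [];
  for (let i = 0; i < numWords; i++) {
    const index = Math.floor(random() * words.length);
    picked.push(words[index]);
  }
  return picked.join(' ');
}

// Prints four randomly chosen words separated by spaces.
console.log(buildPassword(wordlist, 4));
```

Because the list is just a parsed JSON array, swapping in a bigger corpus only means changing the data, not the loop.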

Next, we need to discuss how we get numWords. This is the user-input. We want the user to say ‘Make me a password with three words’ and Bixby needs to know how to do that.

In the resources directory you will find endpoints.bxb. The actions your capsule can take are called endpoints. Let’s define one for generating a password:

Let’s look at what we have here: We have authorization set to none because this endpoint is public and available to any user without authorization. We have specified an action endpoint for our generate function as defined in the generator.js snippet above and we have told Bixby that the input for this endpoint is numWords. We also tell it which file holds the definition for this endpoint: generator.js.

Now that Bixby has an available action in the form of the endpoint, we get to the really interesting stuff: teaching Bixby what everything in our capsule means. The way we do this is via a model. In the model directory we have actions and concepts. These make up Bixby’s understanding of what your capsule can do, and we just need to write some high-level markup to make this work. Let’s start with the action our capsule is going to have, which is generating passwords. This will tell us which concepts Bixby needs definitions for, so that we can move on to training our natural language model.

Above is the generator.model.bxb action file. You will find it in my action directory. What does this do? Read through the comments carefully. It defines the actions Bixby will take when running this capsule, and it covers all our bases regarding various user inputs! We tell it our action is to run our generate function. We tell it to collect numWords, and we tell it that numWords is of the numWords concept type, which we will define shortly. We tell Bixby that there can be at most one numWords (so that we ignore other numbers in the user’s invocation) and we tell Bixby that this value is required. If Bixby cannot find a number in the invocation to use, we define a default initialization with four words, the same as our XKCD comic! We then do some validation in the event we find a number in the user’s invocation. If numWords is zero or negative, we want to display some text telling the user that a password cannot have zero or fewer words (duh, but the bulk of software development is anticipating stupid). Finally we tell Bixby what our result is going to be: an instance of our PasswordResult concept, which will be of the type Calculation. This is a type Bixby provides for a result that it needs to compute or otherwise derive. Let’s get started defining what these concepts are.
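If it helps to see those input rules as ordinary code, here is a rough JavaScript illustration of the same logic. In the capsule these rules live in the action model rather than in JavaScript, so treat this purely as a mental model. The helper name resolveNumWords, the error text, and the choice to keep the first number found are all assumptions.

```javascript
// Default from the action model described above: four words, like the comic.
const DEFAULT_NUM_WORDS = 4;

// Hypothetical helper that mirrors the model's rules in plain JavaScript.
// Takes the numbers found in the user's request and returns either
// { numWords } on success or { error } when validation fails.
function resolveNumWords(numbersInRequest) {
  // No number in the request: fall back to the default.
  if (numbersInRequest.length === 0) {
    return { numWords: DEFAULT_NUM_WORDS };
  }
  // At most one value is allowed. Keeping the first number found is an
  // assumption made for this sketch.
  const candidate = numbersInRequest[0];
  // Reject zero, negative, and fractional word counts.
  if (!Number.isInteger(candidate) || candidate <= 0) {
    return { error: 'A password needs at least one word.' };
  }
  return { numWords: candidate };
}

console.log(resolveNumWords([]));   // falls back to the default
console.log(resolveNumWords([3]));  // uses the requested length
console.log(resolveNumWords([-2])); // reports a validation error
```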

If you are following along in the repo, look at the numWords concept.

This is a good minimal example of a Bixby concept. These are the variables that are key to our capsule working. You can think of them as teaching Bixby a new idea, slowly building for it the picture of what you are trying to achieve. We tell Bixby that NumWords is an integer (we don’t want fractional words). We also give a brief description of what this has to do with our capsule. For NumWords this is obvious: it is the number of words in the password.

Password is almost the same except this concept is given the ‘name’ type since we need an output string. We describe it as the output password. Entropy is similar: we describe it as the approximate bits of ‘randomness’ in our password and give it the integer type since it will be a number we calculate. Length, predictably, is an integer that represents the length of our password in words. This is used in the entropy calculation. Following the comic, we raise two to the power of the password’s entropy in bits (which grows with the number of words) and then divide by the number of guesses a computer could make in a year at 1000 guesses per second. This yields an estimate of the number of years the password would take to brute force under these circumstances. Finally, Years is given the integer type and described as the number of years simple brute forcing would take to crack this password. It is also part of the entropy calculation we display at the bottom of our result, as you can see in the above screenshot.
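The arithmetic is easier to follow with numbers plugged in. The sketch below works through the comic's own example: four words drawn from a 2048-word list, which gives 11 bits per word. The function names and the 365-day year are assumptions for illustration, and the capsule's real wordlist may be a different size.

```javascript
// The guessing rate used in the comic's estimate.
const GUESSES_PER_SECOND = 1000;
// Assumes a 365-day year for simplicity.
const SECONDS_PER_YEAR = 60 * 60 * 24 * 365;

// Each word picked uniformly from the list contributes log2(list size) bits.
function entropyBits(numWords, wordlistSize) {
  return numWords * Math.log2(wordlistSize);
}

// Years needed to try every possible password at the rate above.
function yearsToCrack(bits) {
  return Math.pow(2, bits) / (GUESSES_PER_SECOND * SECONDS_PER_YEAR);
}

const bits = entropyBits(4, 2048);
const years = yearsToCrack(bits);
console.log(bits + ' bits, about ' + Math.round(years) + ' years');
// prints "44 bits, about 558 years"
```

The comic rounds this figure to 550 years. Every extra word from the same list adds another 11 bits, which multiplies the estimate by 2048.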

The most complicated concept is our PasswordResult:

It has the type Structure because it contains multiple properties, namely every concept we have just defined. We give these properties types. I just made these the same as the property name for simplicity, but they can be used in more complicated capsules to link properties together with a descriptive type. We again describe each property and what it does, tell Bixby if the property is required, and tell Bixby that there can be at most one value for each. This result, as you may recall from the generate method, is what we will use to generate our visual response on the screen of the device. We have now explicitly told Bixby everything there is to know about how our capsule is going to work! It knows every concept and every result we are going to want. We now can teach Bixby how to handle speech.

Click training in the resources/en directory.

You will see a list of training examples I have provided to the natural language model. We are effectively training Bixby to parse user phrases and turn them into useful input for our capsule. This is an application of machine learning!

Notice the examples I have provided. One of them, ‘generate a password for me’, has no number in it. It is an example where Bixby should fall back to our default input of four words from above, like the XKCD comic. I also provide numerous examples that ask Bixby to generate a password in various ways with varying numbers of words. Notice how I have clicked on and highlighted the number in each training phrase and labeled this value as numWords! You will do this for each input your capsule needs, and the more examples the better. Bixby uses your labeled examples to learn that when a request sounds similar to them, it should open your capsule and pass along the part of the request that matches your labels as input.

Bixby is learning, so spend plenty of time here to make sure Bixby really gets it! Compiling the model makes Bixby learn each of your examples, and you can then view Bixby’s output for each one to be sure it has not mis-learned how to handle them. A well-trained model will make your users happier and your capsule easier to use.

This is my favorite part of the Bixby developer tools. It is intuitive and fun to use, and it offers machine learning enthusiasts a look into the underlying technologies behind Bixby. This is a defining attribute of the platform for me: it feels much more flexible than Alexa, which, from a developer’s point of view, seems to encourage a more robotic and specific interface for its skills.

With your model trained and your concepts laid out, the last thing to do is to specify how Bixby should display our output. This is done with dialogs and layouts.

Dialogs tell Bixby’s interface how to talk about our concepts (inputs) and our results. Each input you need gets its own dialog, and so does each result.

NumWords therefore gets a dialog like so:

This is pretty bare-bones: We define a concept dialog (input dialog), tell it to look for NumWords (like in our training!) and we provide some template text for this type if we wanted to display something related to this input (in my project I ended up not using it).

The Password Result Dialog defines the dialog for our result. This one is more important for this project as it will populate our layout.

We define an output (result) dialog, this time matching on our PasswordResult concept (passing in the output from calling generate with our numWords input), and then we tell Bixby what to write on the screen with the template text. Notice that this is the first bit of text in the above screenshot, the line that appears when Bixby displays a result and tells the user what it did for them!

The layouts for the visual part of the display (like this one, PasswordResult.layout.bml) look a lot like HTML! There are many documented UI widgets you can use such as pictures, hyperlinks, cards, and more. Here you can see we use a card to display the actual password, making it wrap onto the next line for long passwords and making them easy to copy. Down below in a div tag we display the password entropy. This is calculated using the formula from the XKCD comic, as described above. Finally we hyperlink to the comic that inspired this project as a way of giving credit.

A few more example passwords are shown below:

You can try it out for yourself in Bixby Studio! Simply click the icon that looks like a phone on the left hand side of the screen to open the Simulator, giving you an idea of what your capsule will look like on an actual Samsung device when the marketplace opens in a few months.

SHARING THE SOLUTION

This project can be found in its entirety on my GitHub! I hope this very early tutorial can help developers make their first steps into developing for Bixby, which I think has some very compelling development tools and technology behind it.
