How to train your own classification model

Do you remember the scene from “Her” where Samantha, the AI, goes through Theodore’s letters and classifies them into amazing and not-so-amazing?
While the AI solutions provided by natif.ai don’t mend broken hearts as Samantha does, they can definitely classify your documents for you!

At natif.ai, we understand that as a prosperous business, you are most likely facing a flood of documents from different sources without knowing what they contain and who should handle them. It probably requires intensive and tedious manual work for you to go through each document and decide whether to toss it away, pass it on to a co-worker, or sort it into a specific folder.

If this applies to you, you can stop worrying: natif.ai has launched a service that lets you train your own AI model on your own documents and sort them into classes of your choice!
Couldn’t get more convenient, huh?

So, let us show you how easy it is to train your own classifier in very simple and straightforward steps.

We start in the API Hub of our platform and choose “Create a new document classification API”.

1. Give your API a personality!

Give it a name, a description, and a picture to distinguish it from other APIs.

2. Tell the model your desired classes

It’s as easy as just naming your classes!
If you provide descriptive names for your classes, our generic classification model can even work out of the box without training. Naturally, you will get better classification results once the model has been trained on your data.

3. Tell the model which parts to focus on

If you know in advance that the relevant content sits on specific pages, you can let the model know so that it works even better.
Moreover, our models can classify based on both text and images within the documents. So, let the model know in advance if it should focus more on text or visual cues to get even better results faster!

If you are not sure yet, then no worries, the model will try everything for you.

4. Hurray!

Your model is ready but is initially just a simple generic one. It still needs your guidance to excel!
Right now, you can test the model on your own data. However, the results might not be fully satisfactory yet, because the model relies only on its generic knowledge. If your data poses unique challenges, you will need to give it some training first.

You can test this generic model using our live interface where you upload your own document(s).

If you don’t want to deal with manual uploading, it’s also possible to test through the API. You can send your files through an extremely simple single POST request!

And we already provide you with the code to do it in multiple programming languages.
Just copy and paste the code to give it a try!
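
To give you a feeling for what such a request looks like, here is a minimal Python sketch using only the standard library. Please treat it as an illustration rather than our official snippet: the URL, the form field name "file", and the "ApiKey" authorization scheme are assumptions made for this example, so take the real values from the code shown in the API Hub.

```python
import uuid
import urllib.request

# Hypothetical placeholders: use the endpoint and key shown in the API Hub.
API_URL = "https://example.invalid/classification/my-classifier"
API_KEY = "YOUR_API_KEY"


def build_upload_request(filename: str, content: bytes) -> urllib.request.Request:
    """Build one multipart/form-data POST request carrying a single document."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        # The form field name "file" is an assumption for this sketch.
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        "Authorization": f"ApiKey {API_KEY}",  # assumed auth scheme
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


# Build the request without sending it.
request = build_upload_request("invoice.pdf", b"%PDF-1.4 example bytes")
# To actually send it: urllib.request.urlopen(request)
```

The sketch only builds the request, so you can inspect it safely before pointing it at your real endpoint.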

5. Train your model

Now, it’s the long-awaited moment for your model. It is going to learn from your own data so its results can be tailored to your documents specifically.

If you have already sorted some files into classes (or can get class labels from your existing databases), then just give these sorted files to the model to directly learn from.

Here, you first select the name of the class, then upload the files that belong to that class.
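
If your sorted files already live in one folder per class, a few lines of Python can prepare one batch per class before you upload. This is only a sketch under that assumption: the folder layout and the helper name group_by_class are made up for this example and are not part of our platform.

```python
from collections import defaultdict
from pathlib import PurePath


def group_by_class(paths: list[str]) -> dict[str, list[str]]:
    """Group file paths by the name of their parent folder (assumed to be the class)."""
    grouped = defaultdict(list)
    for path in paths:
        grouped[PurePath(path).parent.name].append(path)
    return dict(grouped)


# Hypothetical layout: sorted/<class name>/<document>
files = [
    "sorted/invoice/a.pdf",
    "sorted/invoice/b.pdf",
    "sorted/contract/c.pdf",
]
batches = group_by_class(files)
```

Each key in batches is a class name to select, and each value is the list of files to upload for it.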

If your files are not sorted yet, then just give these unsorted documents to the model and let it know if it made a mistake in classifying.

In this setup, an uploaded document will be shown with the probabilities for each class as computed by the generic model. If the generic model is wrong, just select the class that you see fit, and the model will adjust when you start training.
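
Conceptually, the label that ends up in your training set is either the class the generic model considers most likely or the class you picked to correct it. The small Python sketch below illustrates that idea; the function names and the example scores are invented for this illustration and do not describe how our platform is implemented.

```python
from typing import Optional


def predicted_class(probabilities: dict[str, float]) -> str:
    """Return the class with the highest probability."""
    return max(probabilities, key=probabilities.get)


def training_label(probabilities: dict[str, float],
                   correction: Optional[str] = None) -> str:
    """Use the human correction if there is one, else keep the model's prediction."""
    return correction if correction is not None else predicted_class(probabilities)


# Made-up scores for one uploaded document.
scores = {"invoice": 0.55, "contract": 0.30, "letter": 0.15}

accepted = training_label(scores)                # the prediction is kept
corrected = training_label(scores, "contract")   # the human correction wins
```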

When you have uploaded and labelled enough documents per class, hit that “Train now” button and wait for the magic to happen.

We will send you an email once the training is finished. Come back to this page to see the evaluation results of your training!

6. Congratulations!

Your trained API is now on your dashboard and can be used whenever you want!

If it hasn’t finished training yet, you can already integrate the API. It will first deliver results from the generic model and provide you with improved classifications from the trained model once available.

Do you have further processes you would like to automate?

If what you want to do next is more complex than just classifying documents, check out our API Hub and see if it already contains what you need! Even if it doesn’t, just contact us and let us know what it is that you want to do.