Custom Splitting: How to Train Your Own Splitting Model 

Tired of spending hours manually splitting large files into individual documents? Splitting can be tedious, especially with pre-scan prep like page separations or barcodes. Our Generic Splitting Workflow simplifies this, but our Custom Splitting Workflow takes it further - leveraging AI to learn from your annotations and streamline bulk splitting with precision.

custom-splitting-placeholder
Have you ever wished for a way to tailor document splitting to perfectly suit your needs? Splitting large files into individual documents can be time-consuming, especially when it involves prepping them with page separations or barcodes before scanning. Our Generic Splitting Workflow simplifies this process for you. But what if you need to apply your own custom splitting rules? We’ve got you covered! 
 
Our Custom Splitting Workflow eliminates manual effort by using AI to learn from user-defined annotations, enabling businesses to efficiently and accurately split bulk document streams into individual files. Seamlessly integrated into natif.ai's configurable workflow management system, it adapts to any document type or stream complexity.
 
If that sounds like your situation, you can now train your own AI Splitting Workflow with natif.ai!

Let’s Start 

We start in the Workflow Overview of our platform and select “Train your own model now”.
workflow-overview-platform

Select Your Workflow 

Here you can find all our Custom AI Workflows. For our Splitting Workflow we select “Create Custom Splitting”. 
select-workflow-platform

Describe Your Custom Splitting Workflow 

We start by describing our workflow: give it a name and a short description. You can also upload an image, which helps you distinguish this workflow from the others.
describe-custom-splitting-workflow

Specify Your Documents 

Now we have to give the AI some information about our documents so it knows which tasks need to be done. This also improves the accuracy of your workflow. 
 
For a Custom Splitting Workflow the AI needs to know:
Are the documents always perfectly cropped or should they be cropped in the workflow?
Is the content on the documents in Latin or Japanese script?
Is the text printed or handwritten? Or can it be both? 
specify-documents-1
specify-documents-2

Your Workflow Is Created 

Your model is ready but is initially just a simple generic one. It still needs your guidance to excel! 

You can already test the model on your own data. The results may not be satisfactory yet, though, because the model still relies on its generic knowledge. If your data poses unique challenges, you will need to train the model first. To do this, we select “Upload Training Data”.
 
upload-training-data-overview

Upload Your Training Data 

 
Now, it’s the long-awaited moment for your model. It is going to learn from your own data so its results can be tailored to your documents specifically. 
 
If you want to keep your documents organized, you can create a document collection for them. You can also choose whether to upload your documents already split or merged together.
upload-training-data
Please upload at least 200 individual documents overall, or 50 per collection. Select documents that closely resemble the ones the model will process later, so the AI gets a complete picture of your documents and can process them with high accuracy.
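Before uploading, it can help to count your files against these minimums. The following is a minimal sketch, not part of the platform: the function name and the dictionary of collection counts are hypothetical, and it assumes the rule means "at least 200 documents in total, or at least 50 in every collection".

```python
from typing import Dict

MIN_TOTAL = 200          # minimum individual documents overall (from the text)
MIN_PER_COLLECTION = 50  # minimum individual documents per collection (from the text)


def meets_minimum(collection_counts: Dict[str, int]) -> bool:
    """Return True if the planned upload satisfies either stated minimum.

    Assumed reading of the rule: the upload is large enough when it has at
    least MIN_TOTAL documents overall, or when every collection has at
    least MIN_PER_COLLECTION documents.
    """
    if not collection_counts:
        return False
    total = sum(collection_counts.values())
    every_collection_ok = all(
        count >= MIN_PER_COLLECTION for count in collection_counts.values()
    )
    return total >= MIN_TOTAL or every_collection_ok


# Hypothetical collection names and counts, for illustration only.
print(meets_minimum({"invoices": 120, "contracts": 90}))
print(meets_minimum({"invoices": 30, "contracts": 40}))
```

If your reading of the rule is stricter (both conditions at once), replace the `or` in the last line of the function with `and`.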
 
If you upload your documents already split, you don't need to annotate them, because each file is already a single document and the AI knows where it begins and ends.
If you upload your documents merged together, you have to annotate them.
upload-data

Annotate Your Training Documents 

To annotate your merged documents, the platform shows each uploaded file with the split points computed by the generic model. If the generic model is wrong, use the buttons on the left to split or merge the documents. The model will learn from your corrections when you start training.
uploaded-documents
Now repeat this step for each of your uploaded documents. 
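Conceptually, an annotation is just a list of split points over the pages of a merged file. The sketch below illustrates that idea under our own assumptions: the function names are hypothetical, pages are numbered from 1, and a split point `p` means "a new document starts at page `p`". It does not reflect how the platform stores annotations internally.

```python
from typing import List, Tuple


def pages_to_documents(page_count: int, split_points: List[int]) -> List[Tuple[int, int]]:
    """Turn split points into (first_page, last_page) ranges, 1-indexed and inclusive."""
    # Page 1 always starts a document, so it is added as an implicit split point.
    starts = sorted({1, *split_points})
    if any(p < 1 or p > page_count for p in starts):
        raise ValueError("split points must lie within 1..page_count")
    # Each document ends on the page before the next one starts.
    ends = [s - 1 for s in starts[1:]] + [page_count]
    return list(zip(starts, ends))


def split_at(split_points: List[int], page: int) -> List[int]:
    """Add a split so that a new document starts at `page`."""
    return sorted(set(split_points) | {page})


def merge_at(split_points: List[int], page: int) -> List[int]:
    """Remove the split before `page`, merging it into the previous document."""
    return [p for p in split_points if p != page]


# Suppose the generic model proposed splits at pages 4 and 8 of a 10-page file.
proposed = [4, 8]
print(pages_to_documents(10, proposed))               # [(1, 3), (4, 7), (8, 10)]
# Correcting a wrong split: pages 1-7 actually belong together.
print(pages_to_documents(10, merge_at(proposed, 4)))  # [(1, 7), (8, 10)]
```

The split and merge buttons in the annotation view correspond to `split_at` and `merge_at` in this simplified picture.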

Start The AI Training 

Once you are done with annotating the documents, you can start the training. 
This means the AI now learns how to process your documents. 
You will receive an email once the training is completed – which is normally within the next 24 hours! 
start-ai-training

Integrate Your API 

You don't have to wait for the training to finish: your workflow API is already available and can be integrated right away. You can find all the information you need, such as code snippets and JSON response examples, in the workflow documentation.
 
ai-integration
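Once you have a JSON response from your workflow, you will typically want to turn it into page groups, one per split document. The sketch below shows that step only. The response shape used here (a `documents` list whose items carry a `pages` list) is invented for illustration and is not the actual natif.ai schema; consult the workflow documentation for the real field names.

```python
import json
from typing import Dict, List

# Hypothetical response body. The real schema is in the workflow documentation.
EXAMPLE_RESPONSE = """
{
  "documents": [
    {"pages": [1, 2, 3]},
    {"pages": [4, 5]}
  ]
}
"""


def page_groups(response_text: str) -> List[List[int]]:
    """Extract the page numbers of each split document from the assumed schema."""
    payload: Dict = json.loads(response_text)
    # A missing "documents" key is treated as "no documents found".
    return [list(doc["pages"]) for doc in payload.get("documents", [])]


groups = page_groups(EXAMPLE_RESPONSE)
for index, pages in enumerate(groups, start=1):
    print(f"document {index}: pages {pages[0]}-{pages[-1]}")
```

Keeping the parsing in one small function means only that function needs to change when you adapt it to the real response format.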

That’s It! 

Your API will automatically be adjusted when the training is completed! The training metrics will provide you with more information about the accuracy of your AI Workflow. 
 
If your next project is more complex than just splitting documents, check out our Workflow Overview to see if it already contains what you need. If it doesn't, just contact us and let us know how we can support you.