Annotation is the foundation for transforming raw data into actionable insights. With natif.ai’s platform, mastering annotation ensures your AI learns effectively, delivering precise results. This guide simplifies common annotation challenges, providing actionable steps and best practices to help you get the most out of your document automation workflows.
What do I have to pay attention to when I want to extract a date?
You need to extract complete dates (day, month, year) from documents.
Solution:
1. The “Date” field is specifically designed for full dates.
2. For partial formats, like “March 2024,” alternative methods are required.
Steps:
1. Use the “Date” field for complete dates only.
2. For partial formats or unconventional date representations, use the “Free Text” field and parse the data with your system.
Pro Tip:
Maintain consistent formatting across documents to improve AI learning.
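The parsing step for partial dates happens in your own system, after extraction. A minimal sketch of what that could look like, assuming the “Free Text” value arrives as a plain string such as “March 2024”. The function name and the list of accepted formats are illustrative assumptions, not part of the natif.ai platform:

```python
from datetime import datetime
from typing import Optional, Tuple

# Illustrative formats only; extend this tuple to match your documents.
# Note: %B and %b match English month names under the default locale.
PARTIAL_DATE_FORMATS = ("%B %Y", "%b %Y", "%m/%Y", "%m.%Y", "%Y-%m")


def parse_partial_date(text: str) -> Optional[Tuple[int, int]]:
    """Return (year, month) for a month-and-year string, or None if nothing matches."""
    cleaned = " ".join(text.split())  # collapse stray whitespace from extraction
    for fmt in PARTIAL_DATE_FORMATS:
        try:
            parsed = datetime.strptime(cleaned, fmt)
        except ValueError:
            continue  # try the next candidate format
        return parsed.year, parsed.month
    return None
```

Returning None for unrecognized values, rather than raising, makes it easy to route those documents to manual review.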
How can I extract calendar weeks?
Extracting calendar weeks, such as “Week 12,” is not natively supported.
Solution:
While a dedicated “Calendar Weeks” field is unavailable, the “Free Text” field can capture this data.
Steps:
1. Select the “Free Text” field.
2. Annotate the calendar week in your documents.
3. Post-process this data in your system as required.
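The post-processing in step 3 could be sketched as follows, assuming the “Free Text” value is a string such as “Week 12” and that your documents use ISO 8601 week numbering. The accepted prefixes (Week, Wk, CW, KW) and both function names are assumptions chosen for illustration:

```python
import re
from datetime import date
from typing import Optional

# Hypothetical prefixes; adjust to the wording used in your documents.
WEEK_PATTERN = re.compile(r"\b(?:week|wk|cw|kw)\.?\s*(\d{1,2})\b", re.IGNORECASE)


def parse_calendar_week(text: str) -> Optional[int]:
    """Return the week number found in the text, or None if absent or out of range."""
    match = WEEK_PATTERN.search(text)
    if match is None:
        return None
    week = int(match.group(1))
    return week if 1 <= week <= 53 else None


def week_start(year: int, week: int) -> date:
    """Return the Monday of the given ISO 8601 week (requires Python 3.8+)."""
    return date.fromisocalendar(year, week, 1)
```

Because “Week 12” carries no year, the year has to come from elsewhere, for example from a complete date extracted on the same document.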
What is the best way to extract checkboxes?
You want to extract checkboxes along with their associated text.
Solution:
Checkboxes need to be annotated in combination with the text they relate to for accurate extraction.
Steps:
1. Create a combined field with two text elements.
2. Annotate both the checkbox and its associated text.
3. Group them together to train the AI on their relationship.
Pro Tip:
Always annotate the checkbox and text consistently to avoid discrepancies in AI predictions.
How do I model multiple hierarchy levels on my documents?
Your documents contain multiple hierarchical levels, such as invoices with multiple deliveries and products.
Solution:
Using lists allows you to model hierarchical data effectively.
Steps:
1. Create separate lists for each hierarchy. For example:
• List 1: Delivery Numbers
• List 2: Products
2. Annotate each level within its respective list.
3. If required, merge the lists in your system using the provided coordinates.
Pro Tip:
Clearly define the relationships between lists to maintain data integrity.
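One way to merge the two lists by coordinates is to attach each product to the closest delivery number that precedes it in reading order. The sketch below assumes each extracted value comes with a page number and a vertical position. The Item and Delivery classes and their field names are hypothetical stand-ins, not the platform's actual output format:

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Item:
    value: str
    page: int    # assumed: 1-based page number
    top: float   # assumed: vertical position, smaller means higher on the page


@dataclass
class Delivery:
    number: str
    products: List[str] = field(default_factory=list)


def reading_position(item: Item) -> Tuple[int, float]:
    """Sort key that orders items page by page, top to bottom."""
    return (item.page, item.top)


def merge_by_position(deliveries: List[Item], products: List[Item]) -> List[Delivery]:
    """Attach each product to the last delivery number that appears before it."""
    ordered = sorted(deliveries, key=reading_position)
    merged = [Delivery(number=d.value) for d in ordered]
    for product in sorted(products, key=reading_position):
        owner = None
        for index, delivery in enumerate(ordered):
            if reading_position(delivery) <= reading_position(product):
                owner = index
            else:
                break
        # Products that appear before the first delivery number are skipped.
        if owner is not None:
            merged[owner].products.append(product.value)
    return merged
```

This heuristic only holds when products are printed below the delivery number they belong to; other layouts need a different rule.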
There is often too much space between two words in my document. How can I get the AI to read both of them anyway?
When two words are separated by a large gap, the AI may select only one of them, even though you want to extract all the words in the field.
Solution:
Create a “combined data field” that contains a text field. This lets you group all the associated words in that text field.
Steps:
1. Select “Add new extraction field”.
2. Then choose “Combined” as the field type.
3. Add the “Free Text” field to the combined data field.
How do I annotate correctly if a data field always appears several times in a document?
A field appears multiple times in every document.
Solution:
Annotate only one occurrence of the field to maintain consistency.
Steps:
1. Choose one instance of the repeated field to annotate.
2. If you need to annotate multiple occurrences, create separate fields (e.g., Vendor Name and Imprint Name).
Pro Tip:
Always annotate the same instance across documents to help the AI learn its location.
How do I annotate correctly if a data field appears several times on some documents and only once in others and in different places?
A field appears in different locations or inconsistently across documents.
Solution:
Lists allow you to annotate all occurrences systematically.
Steps:
1. Create a list of text fields.
2. Annotate all occurrences within the list, grouping each instance.
3. The list consolidates all relevant fields for consistent output.
Can I also assign the rows of a table to separate categories?
You want to extract the rows of a table where each row belongs to a different category.
Solution:
Separate lists allow you to categorize table rows effectively.
Steps:
1. Assign each table row to a unique list.
2. Annotate rows within their respective lists.
3. Include shared data fields (e.g., “Price”) within each list as needed.
How do I annotate a file that consists of several documents?
Your files contain multiple documents, and only some of them require annotation.
Solution:
Split the documents for easier processing or annotate unwanted sections with None.
Steps:
1. For mixed-document files, use the Splitting Workflow to separate documents.
2. If splitting is unnecessary, annotate irrelevant pages with None so the AI can ignore them.
Pro Tip:
For files containing identical documents (e.g., two invoices), annotate each individually for accuracy.
Pro Tips and Best Practices
• Consistency is Key: Annotate the same fields consistently across documents to improve AI learning.
• Use Lists Strategically: For varying field locations, lists provide flexibility and organization.
• Validate Your Annotations: Regularly review annotations to ensure they meet your intended output.
• Train Iteratively: As your AI learns, refine your annotations to address edge cases or inconsistencies.
Next Steps and Resources
• Watch the Annotation FAQ Video.
• Explore Advanced Tutorials: Access more guides on using natif.ai effectively.
• Need Help? Contact our support team.
By mastering these annotation techniques, you can transform your document workflows and harness the full potential of natif.ai’s intelligent platform. Get started today!