Ascent Advanced Forms 3.7
Extraction Module - SR01
Part Number 10001337-000
Revision A
December 23, 2004
These notes apply to Ascent Advanced Forms 3.7 and include the following sections:
User Documentation and Examples
New Features (since version 3.0)
Known Problems and Limitations
Trademark/Copyright Statements
Ascent Advanced Forms provides intelligent document recognition. It can be used to determine document types, perform document separation, and extract data from index fields. It can also be set to automatically rotate images and sort the pages of each document. Ascent Advanced Forms leverages the DOKuStar Extraction technology.
Ascent Advanced Forms can be added to your Ascent Capture workflow like any other custom module. You can use it as a replacement for the standard Ascent Capture Recognition Server module, or use it in conjunction with Ascent Capture Recognition Server.
For details, refer to the documentation provided with the Ascent Advanced Forms software.
User Documentation and Examples
Ascent Advanced Forms includes the following documentation. Some of the documentation is installed to your Ascent Capture installation location. (The default location is C:\Program Files\Ascent\Bin\DOKuStar\ExAC_doc.) Other Ascent Advanced Forms documentation is installed to an ODT-OCE folder. (The default location is C:\Program Files\ODT-OCE\Ascent Advanced Forms 3.7\Doc.)
Ascent Advanced Forms User's Manual (UserManual.pdf)
This manual introduces the Ascent Advanced Forms custom module and provides a quick tour of its interface. It also provides steps for installing the custom module, registering it with Ascent Capture, setting it up, and using it in your Ascent Capture workflow.
Ascent Advanced Forms Design Studio User's Manual (Help.pdf)
This manual introduces Design Studio and provides a quick tour of its interface. It includes details about extraction, classification, indexing, field types, testing, and more.
Ascent Advanced Forms Interfaces and File Formats (Interface.pdf)
This manual describes various interfaces that are provided with Ascent Advanced Forms.
Ascent Advanced Forms Tutorial (Tutorial.pdf)
This tutorial provides instructions for using Design Studio and introduces various examples provided with the software.
Examples
Several precompiled examples of projects and Visual Basic programs are installed with the software. They are installed to the \Examples folder in the installation location.
For details about the examples, refer to readme files and commented source code in the Examples folder. In addition, refer to the Ascent Advanced Forms Tutorial for descriptions of the examples.
Online Help
Several online help systems are available:
New Features (since version 3.0)
The following sections list new features added since version 3.0. The features are listed by the version in which they were introduced.
Introduced in 3.7 SR01
Extensions for the Invoice Option
The following features were added:
Single Click Entry for Validation
A new project setting was added to support generating character recognition data needed for the Ascent Advanced Forms Validation single click entry feature.
Runtime and Field Statistics
Following a test run, the new Show Runtimes and States command is now available in the context menu of the Result node. Show Runtimes and States can be used to display runtime and field statistics of many objects, such as fields, document types, and classes. For each index field, the statistics show how often the field returns OK, empty, and error states.
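As a rough illustration of what the field statistics summarize, the following Python sketch tallies how often each index field returned a given state. The input format (a list of field name and state pairs), the function name, and the sample field names are assumptions made for this example; they are not the Design Studio data model.

```python
from collections import Counter, defaultdict


def tally_field_states(results):
    """Count how often each field returned each state.

    `results` is an iterable of (field_name, state) pairs; this input
    format is an assumption made for the example.
    """
    stats = defaultdict(Counter)
    for field_name, state in results:
        stats[field_name][state] += 1
    return stats


# Hypothetical results from a small test run.
sample = [
    ("InvoiceNumber", "OK"),
    ("InvoiceNumber", "empty"),
    ("InvoiceNumber", "OK"),
    ("Total", "error"),
]
stats = tally_field_states(sample)
for field_name, counts in stats.items():
    print(field_name, dict(counts))
```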
BarCode Field
The BarCode field can now be used as a global field. If a regular expression is specified with formatting, the value of the field is returned in the specified format. The bar code value as it was read is returned in the new RawValue field.
Show Image Command
The new Show Image command is now available in the context menus of the Types register and the Classes register. It displays a list of all test documents in a submenu. This allows you to display any test document in the Setup Image register of the document window.
Introduced in 3.6 SR01
Extensions for the Invoice Option
The following features were added:
New CheckBoxGroup Field
This new field evaluates groups of check boxes on forms. It can be used as an independent field or as a subfield of a KeyValue or FirstOf field. An arbitrary number of PixelCount fields can be added as subfields to check for marked check boxes. Parameters allow you to restrict the number of marked check boxes. The field returns the concatenated result strings of all PixelCount subfields.
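The behavior described above can be pictured with a small Python sketch. It assumes that each PixelCount subfield yields a result string that is empty when its check box is not marked, and it uses hypothetical parameter names (min_marked, max_marked) for the restriction; how the product reports a violated restriction is not stated in these notes, so the sketch only returns a flag.

```python
def check_box_group(sub_results, min_marked=0, max_marked=None):
    """Concatenate subfield result strings and check the marked count.

    `sub_results` holds one result string per PixelCount subfield, in
    order. An empty string means "not marked" (an assumption).
    Returns (concatenated_value, within_limits).
    """
    marked = sum(1 for result in sub_results if result)
    value = "".join(sub_results)
    within_limits = marked >= min_marked and (
        max_marked is None or marked <= max_marked
    )
    return value, within_limits


# Two of three boxes are marked, and at most two are allowed.
print(check_box_group(["A", "", "C"], max_marked=2))
```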
Advanced Image Processing
Some functions of the RecoStar Imaging Toolkit are available under the new Advanced Imaging function. The RemoveLinesAndInverseText function replaces the former preprocessing InverseTextCorrection function.
Detailed Recognition Results
The new SaveValueDetails property for text fields allows you to save detailed character recognition results of the field value to the result file. The data can be used for validation (for example, with Ascent Advanced Forms Validation).
Runtime Statistics
Following a test run, statistics can now be displayed. The statistics show the runtimes of all processed items, such as document types, document classes, features, and index fields. For each item, the statistics show the total time, average time, and number of calls. The statistics could be helpful for analyzing and optimizing a project.
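The three figures shown per item (total time, average time, and number of calls) relate to each other as in the following Python sketch. The input format and the item names are assumptions for illustration only.

```python
def summarize_runtimes(samples):
    """Aggregate (item_name, seconds) pairs into total, calls, and average."""
    summary = {}
    for item_name, seconds in samples:
        entry = summary.setdefault(item_name, {"total": 0.0, "calls": 0})
        entry["total"] += seconds
        entry["calls"] += 1
    for entry in summary.values():
        # Average time is the total time divided by the number of calls.
        entry["average"] = entry["total"] / entry["calls"]
    return summary


# Hypothetical timings (in seconds) for two document types.
timings = [("Invoice", 0.5), ("Invoice", 1.5), ("Letter", 0.25)]
summary = summarize_runtimes(timings)
```

An item with a high total but a low average is called often; one with a high average is expensive per call. The two cases usually call for different optimizations.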
Address US Field
This field has been optimized.
Protected Custom Operators
It is now possible to save global fields together with all used subfields into an encrypted file. The file may also be license controlled. When a protected operator is loaded into the Globals register for a project, only the Zone property of the field is shown. The remaining properties of the field and its subfields are not shown and cannot be accessed. In server mode, the field runs only if the system identifier of the license is contained in the encrypted operator file.
New Invoice Option Fields
The following new Invoice option fields are available: InvoiceItemsCustom, InvoiceTotalsCustom, and USInvoiceItemsCustom.
Introduced in 3.2 SR04
Extensions for the Invoice Option
The fields for European invoices now support processing euro invoices from Italy. Invoices in lire cannot be processed.
Introduced in 3.2 SR02
Extensions for the Address US Field
This field now supports the Occurrences property. In addition, it can read flat addresses consisting of one line and detect addresses where the recipient information is missing.
Introduced in 3.2 SR01
Extensions for the Invoice Option
The Invoice option contains additional fields for processing US invoices. The fields for European invoices support additional countries (Belgium, Great Britain, the Netherlands, and Switzerland).
Reading US Addresses
The new Address US field reads US addresses.
Support for Multipage Documents
If an attribute file supplies the page number of each page within a multipage document, classification can be skipped for the subsequent pages, and you can control which index fields are processed on which pages. This allows a single document type and class to be used for all pages of a multipage document.
Extensions for Custom Operators
It is now possible to save global fields as custom operators that contain links to other global fields. All needed fields will be saved to the same file and will be loaded together.
Improved Usability
There are many improvements within the Design Studio:
Recognition Results for Single Fields
It is now possible to request detailed recognition results in the result file for specified fields only.
Extension for the Invoice Vendor Field
In post-processing, the values of specified database columns of the found database records can be searched and read on the document. The results will then be contained in the result file.
Introduced in 3.1 SR03
Additional Country Settings for Character Recognition
Ascent Advanced Forms now supports 50 countries and, in addition, the settings WesternEurope and CentralEurope.
Using DOKuStar under Different Users
Ascent Advanced Forms can now be used under any user account. The program RegisterDOKuStar.exe is no longer needed.
Introduced in 3.1 SR02
Additional Classifier Languages for Character Recognition
Character recognition now supports the following additional languages or language groups: CentralEuropean, Croatian, Czech, Hungarian, and Slovenian.
Introduced in 3.1 SR01
New Fields for Invoice Processing
New field types for invoice processing were added that allow you to read important fields. With these fields, complex parameterizations or distinguishing of different invoice layouts are no longer needed. The following field types are available:
In version 3.1, only German-language invoices are supported. Support for additional languages will be included in future releases.
Reading from Boxes
The new fields SimpleBox and StructuredBox process boxes that often appear on forms. The fields search for a box of a specified type in the search area of the field, remove the box lines, and read the text contained in the box.
Confidence Values for All Fields
Now all fields return a confidence value in their result. The confidence value is a rough indicator of the reliability of the result.
Extension for Collectors
It is now possible to specify features, a setup document, and test documents at collector nodes. Collectors can thus be used to evaluate common features of document types that are defined as subnodes. Still, classification will never terminate at a collector. If a collector evaluates to true, but all document types defined as subtypes evaluate to false, classification will continue with the next collector or document type, respectively.
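The rule that classification never terminates at a collector can be sketched in Python as follows. The node structure (dictionaries with a name, a matches function, and an optional subtypes list that marks a collector) is an assumption made for this example and does not reflect the project file format.

```python
def classify(nodes, page):
    """Return the name of the first matching document type, or None.

    A node with a "subtypes" key is treated as a collector: if it matches
    but none of its subtypes do, classification continues with the next
    node instead of stopping at the collector.
    """
    for node in nodes:
        if not node["matches"](page):
            continue
        if "subtypes" in node:
            result = classify(node["subtypes"], page)
            if result is not None:
                return result
            continue  # never terminate at a collector
        return node["name"]
    return None


def always(page):
    return True


def never(page):
    return False


# Hypothetical project: the collector matches, its only subtype does not.
nodes = [
    {"name": "Forms", "matches": always,
     "subtypes": [{"name": "OrderForm", "matches": never}]},
    {"name": "Letter", "matches": always},
]
print(classify(nodes, page=None))
```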
New Image Preprocessing Function Erase
The Erase function sets one or several rectangular areas to white. This allows you to erase pictures, logos, holes, and other data that may disturb the layout analysis of the document. This may increase the quality of the recognition results.
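Conceptually, erasing amounts to overwriting pixel rectangles with white. The Python sketch below shows the idea on a toy bitonal image stored as a list of rows; the pixel encoding (0 for white, 1 for black) and the rectangle convention (left, top, right, bottom, with right and bottom exclusive) are assumptions of the example, not the parameters of the product function.

```python
WHITE = 0  # assumption: 0 = white, 1 = black in this toy image


def erase(image, rects):
    """Set each (left, top, right, bottom) rectangle to white, in place.

    Right and bottom are exclusive. Rectangles are clipped to the image.
    """
    for left, top, right, bottom in rects:
        for y in range(max(top, 0), min(bottom, len(image))):
            row = image[y]
            for x in range(max(left, 0), min(right, len(row))):
                row[x] = WHITE
    return image


# A 4 x 3 all-black image with a 2 x 2 area erased.
image = [[1] * 4 for _ in range(3)]
erase(image, [(1, 0, 3, 2)])
for row in image:
    print(row)
```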
Extensions for the Table Field
New properties of the EndOfTable node specify conditions for an unconditional table end. In addition, it is now possible to use the Table field as a key or value subfield in a KeyValue field. Used together, these extensions can recognize several similar tables in a document as separate entities (e.g. by specifying a Table field as a value of a KeyValue field with Occurrences set to all).
In addition, it is now possible to use the Table field as a global field.
Extension for the FuzzyDb Field
It is now possible to use the Phrase field as a subfield of FuzzyDb fields.
Installing Different Versions of DOKuStar Products
Starting with version 3.1, it is possible to install different versions (greater than or equal to version 3.1) of DOKuStar products or to install different service releases of DOKuStar Extraction and DOKuStar Validation together.
Additional Classifier Languages for Character Recognition
An additional language setting (Western European) allows documents to be processed in several western European languages with a single Extraction project.
Introduced in 3.0
Logo Field
The Logo field is available as a classification feature and as a key in a KeyValue field. It allows the use of graphical information, such as company logos, for classification or for positioning the search areas of index fields. A logo editor is supplied to facilitate preparation of reference images for a Logo field.
InhouseAddressee Field
The InhouseAddressee field allows you to find the addressee of a document using a database file. The address is identified by the specified zip code of the target address. The field can be used for German-language documents only.
Address Field: New Support for Company Addresses
A new Company column is supported for the database file. The new CompanySynonyms property allows variants to be specified for any company name contained in the database file. The new Exclusions property allows you to specify that addresses must not be returned.
Improved Orientation Detection
The Orientation image preprocessing function (Upside Down in version 2.0) now allows you to detect and correct a rotation of 90 or 270 degrees.
New Properties for Phrase Field
It is now possible to specify exclusion phrases (phrases that must not be found) and to specify whether phrases must begin and end at word boundaries.
Flexible Control of Search Areas for FirstOf Field
All subfields of a FirstOf field now use their own search areas by default. A command in the context menu (a property in version 3.1) of each subfield allows you to specify that the search area of the FirstOf field should be used instead. Thus, some subfields may use the common zone specified at the FirstOf field, while some use an individual search area, which was not possible in version 2.0. Project conversion parameterizes the search areas automatically according to the value of the former UseCommonZone property.
Extended Result File Format
The result file format has been extended with some additional elements. The sections Result and Classify now have a state element. The element image is now followed by a new element attribute file that contains the name of an attribute file in the subelement name, if an attribute file has been used. If a DOKuStar Reader license is available, the Classify section will contain a new element DocumentType containing the document type that evaluated to true with its subtypes and all corresponding feature results. All fields now have an additional element source that specifies whether the field worked on the image or gives the name of the respective attribute if it worked on an attribute.
Extensions for KeyValue Field
The KeyValue field now has its own Occurrences property. Both subfields may now have their own Occurrences properties. This allows you to specify in more detail which key/value pairs for keys should be returned. Project conversion automatically sets the properties so that the field will behave in the same way as in version 2.0.
In addition, the PixelCount field is now also available as a value field.
Global Fields as Index Fields
Fields defined in the Globals register can now also be added as index fields.
Protection for Project Files
It is now possible to protect project files against reading and modification by encryption and against unauthorized use by a licensing mechanism.
Additional Classifier Languages for Character Recognition
Character recognition now supports the following additional languages or language groups: Scandinavian, Italian, Spanish, and Dutch.
Support for Large Document Formats
Documents up to DIN A0 format can now be processed.
Extension for the EmptyPage Feature
It is possible to specify a relative threshold for the maximum number of black pixels on an empty page.
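A relative threshold can be understood as a limit on the share of black pixels rather than on their absolute number. The Python sketch below assumes that the threshold is expressed as a fraction of all pixels on the page; the notes do not state how the product defines it, and the function and parameter names are hypothetical.

```python
def is_empty_page(image, max_black_ratio):
    """Return True if the share of black pixels does not exceed the limit.

    `image` is a list of rows with 1 for black and 0 for white
    (an assumed toy representation).
    """
    total = sum(len(row) for row in image)
    if total == 0:
        return True
    black = sum(1 for row in image for pixel in row if pixel == 1)
    return black / total <= max_black_ratio


# One black pixel out of eight is a ratio of 0.125.
page = [[0, 0, 0, 0], [0, 1, 0, 0]]
print(is_empty_page(page, max_black_ratio=0.2))
```

Unlike an absolute pixel count, a ratio gives the same verdict for the same page scanned at different resolutions or sizes.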
The following sections highlight compatibility issues among Ascent Advanced Forms versions:
In version 3.7, the following features are not compatible:
New Project Setting
When the new project setting "Single Click Entry - Create page data" is activated, the project can no longer be used with version 3.6.
DbLoad Property of the Invoice Vendor Field
The DbLoad property no longer supports the value byFileName. To load database files dynamically, the value bySigFile must be used instead. The signal file must be a text file containing a line with the name of the new database file. The file name must differ from the previously used file name.
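A signal file as described above could be produced with a few lines of Python. This is a minimal sketch: the file names are hypothetical, the UTF-8 encoding is an assumption, and the notes do not say where the signal file must be placed or whether other lines are allowed in it.

```python
from pathlib import Path


def signal_file_content(new_db_name, previous_db_name):
    """Return the text of a signal file naming the new database file."""
    if new_db_name == previous_db_name:
        # The new file name must differ from the previously used one.
        raise ValueError("new database file name must differ from the previous one")
    return new_db_name + "\n"


def write_signal_file(signal_path, new_db_name, previous_db_name):
    """Write the signal file as plain text (the encoding is an assumption)."""
    Path(signal_path).write_text(
        signal_file_content(new_db_name, previous_db_name), encoding="utf-8"
    )


# Hypothetical database file names.
print(repr(signal_file_content("vendors_b.csv", "vendors_a.csv")))
```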
Compared to DOKuStar Extraction version 3.2 SR04, the following features have been changed:
Image Preprocessing
The image processing function Inverse Text Correction is no longer available. It will still run if it is found in an old project, but it can no longer be added to a project. In old projects, the function should be removed and replaced by the new Advanced Imaging function.
Custom Operators
Custom operators can now only be saved from and loaded to the Globals register.
Compared to DOKuStar Extraction version 3.1 SR02, the following features have been changed:
Result File Format
The cell element of the Table field result now contains the additional elements of the respective column field. Thus, the following are always available: name, state, value, box, source, and confidence.
Compared to version 3.0, the following features are not compatible:
Project Description File Format
The parameter type of the OcrParam node has been changed to CharacterSetType.
Programming Interface
To avoid conflicts with other products, Ascent Advanced Forms no longer uses the PATH environment variable to find the binary files in its installation path. Consequently, the application must make available the interface library dscf.dll of Ascent Advanced Forms. The easiest way is to copy it to the directory of the application program. The library can be found in the bin directory of the Ascent Advanced Forms installation.
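If the application is deployed by a script, the copy step might look like the following Python sketch. The installation and application directories are placeholders that you must supply; only the file name dscf.dll and the bin subdirectory come from these notes.

```python
import shutil
from pathlib import PureWindowsPath


def dll_copy_plan(install_dir, app_dir, dll_name="dscf.dll"):
    """Return (source, target) paths for copying the interface library."""
    source = PureWindowsPath(install_dir) / "bin" / dll_name
    target = PureWindowsPath(app_dir) / dll_name
    return source, target


def copy_interface_library(install_dir, app_dir):
    """Copy the library next to the application (run on the Windows station)."""
    source, target = dll_copy_plan(install_dir, app_dir)
    shutil.copy2(str(source), str(target))


# Hypothetical directories, for illustration only.
source, target = dll_copy_plan(r"C:\AAF", r"D:\MyApp")
print(source, "->", target)
```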
Compared to version 2.0, the following features are not compatible:
Verify No Longer Available
The verification station DOKuStar Verify of DOKuStar 2.0 can no longer be used. If you already use Verify with a previous version of the software, please contact Océ Document Technologies.
Compatibility of Project Files
The project file format has changed. Projects created with prior versions will not run with Ascent Advanced Forms 3.x. They must be converted as described in Converting Project Files.
Old Result File Format No Longer Supported
The old (generic) result file format of previous versions is no longer available. The API call dscf_classify now produces the same standard result format as the call dscf_classifyXML.
Detect Orientation No Longer Available
The image preprocessing function Detect Orientation is no longer available. It can be replaced by the Orientation function (formerly Upside Down function) that is now able to detect rotation of 90 and 270 degrees. Project conversion replaces the function automatically.
Occurrences Parameters of the KeyValue Field
The KeyValue field now has its own Occurrences property. The key field and the value field may also each have an Occurrences property. Project conversion will set the properties so that the field behaves in the same way as in version 2.0.
Result of Address Field
If a database file is used, the Address field returns the reject state when it returns an address that could not be matched against the database. In this case, the OK state is returned only if database matching was successful and database lines are returned.
Result of Regular Expression Field
If formatting is used, the Regular Expression field now returns the formatted result in the value element. The unformatted value is returned in a new element rawvalue. For compatibility, the formatted value is also returned in the element formattedvalue.
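An application that reads the result file can pick up both values as in the Python sketch below. Only the element names value, rawvalue, and formattedvalue are taken from these notes; the enclosing field element, the sample data, and the fallback used when rawvalue is absent are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the real result file contains more elements.
FRAGMENT = """
<field>
  <name>InvoiceDate</name>
  <value>2004-12-23</value>
  <rawvalue>23.12.04</rawvalue>
  <formattedvalue>2004-12-23</formattedvalue>
</field>
"""


def read_regex_result(xml_text):
    """Return (formatted, raw) for a Regular Expression field result."""
    node = ET.fromstring(xml_text)
    formatted = node.findtext("value")
    raw = node.findtext("rawvalue")
    if raw is None:
        # Assumption: without formatting there is no rawvalue element.
        raw = formatted
    return formatted, raw


print(read_regex_result(FRAGMENT))
```

Code written for version 2.0 that expected the unformatted text in value would need to switch to rawvalue.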
SequenceNumber No Longer Available
The SequenceNumber property is no longer available at the Document Type node. Document types are now always processed in the order in which they are defined in the Types register of the project explorer. To change the processing order, the Document Types nodes can be moved using the Move Up and Move Down commands or Cut and Paste. Project conversion will reorder the document types accordingly, if sequence numbers have been used in the project.
Project Description File Format
Some elements of the project description file (.ipx) have changed:
Example (\n stands for a new line):
Version 3.x: <currency>DM\nDEM\nEURO</currency>
Version 2.0: <currency>{{EUR} {1}} {{EURO} {1}} {{€} {0}} ...
Minimum Hardware Requirements
The following table lists minimum hardware requirements for Ascent Advanced Forms:
| Processor: | Pentium class 1 GHz processor |
| System memory: | For DOKuStar server (production mode): At least 256 Mbytes RAM and 512 Mbytes for the swap file. Note that the required swap space for Design Studio depends on the amount of document data and number of images processed in a single test run. |
| Disk space: | 220 Mbytes (for both Ascent Advanced Forms and the Ascent Advanced Forms custom module) |
Note On the computer that runs DOKuStar server, no other applications or services that use CPU and RAM intensively should be running. Otherwise, resource bottlenecks could occur. For example, the Microsoft Office indexing service should not be started.
Certified Operating Systems
Ascent Advanced Forms has been certified by Kofax with the following operating systems.
Note Kofax Technical Support supports Ascent Advanced Forms installed on these operating systems only.
Certified Versions of Ascent Capture
Ascent Advanced Forms has been certified by Kofax with the following version of Ascent Capture:
Additional Software Requirements
The following additional software is required:
Adobe Acrobat Reader is available on the Ascent Advanced Forms installation CD.
Ascent Capture 6.1 must be installed on the computer where you want to install Ascent Advanced Forms 3.7. If you are using an earlier version of Ascent Capture, you must upgrade to version 6.1 before you install Ascent Advanced Forms 3.7. The Ascent Advanced Forms 3.7 installation will not complete if an earlier version of Ascent Capture is detected.
Only one version of the Ascent Advanced Forms custom module can be installed on a computer. If you are upgrading from version 3.1 or higher, the version 3.7 installation program will detect the previous version of the custom module and provide an option for uninstalling it. Choose the option to allow the installation program to uninstall the previous version.
Once version 3.7 is installed, you must register the Ascent Advanced Forms custom module with Ascent Capture. See Registering Custom Modules.
Read these important installation notes before you install Ascent Advanced Forms.
Ascent Capture Must Be Installed
Ascent Capture must be installed before you install the Ascent Advanced Forms custom module. If Ascent Capture is not installed, the Ascent Advanced Forms custom module installation program will not be able to complete. See Certified Versions of Ascent Capture.
For details on installing Ascent Capture, refer to the Ascent Capture and Ascent Capture Internet Server Installation Guide provided with Ascent Capture.
Internet Explorer 6.0 Must Be Installed
Internet Explorer 6.0 or higher must be installed before you install Ascent Advanced Forms. If Internet Explorer is not installed, the Ascent Advanced Forms installation program will not be able to complete.
For details on installing Internet Explorer, refer to your Microsoft documentation or the Microsoft Web site.
Administrator Rights
To install Ascent Advanced Forms, the logged-in user must have administrator rights on the station.
TMP Environment Variable
On the station where you are installing Ascent Advanced Forms, the TMP environment variable must point to a directory with adequate disk space. See Operating Requirements.
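Before starting the installation, the TMP setting can be checked with a short script such as the Python sketch below. It only reports the free space on the drive that holds the TMP directory; the amount that counts as adequate is not stated here, so the sketch deliberately does not hard-code a limit.

```python
import os
import shutil


def tmp_free_bytes(env=os.environ):
    """Return the free bytes on the drive holding TMP, or None.

    None means TMP is not set or does not point to an existing directory.
    """
    tmp_dir = env.get("TMP")
    if not tmp_dir or not os.path.isdir(tmp_dir):
        return None
    return shutil.disk_usage(tmp_dir).free


free = tmp_free_bytes()
if free is None:
    print("TMP is not set or does not point to a directory")
else:
    print(f"Free space in TMP location: {free // (1024 * 1024)} Mbytes")
```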
Multiple Installations
Starting with version 3.1, it is possible to install different versions of Ascent Advanced Forms on a computer. However, it is not possible to install two service releases of the same version. If version 3.0 or older is installed, it is not possible to install a second version.
When working with multiple versions, some restrictions apply:
Demonstration Software
The installation CD includes a demonstration program that introduces the functionality of Ascent Advanced Forms and Ascent Advanced Forms Validation. You can choose to install the demonstration software by clicking Install Demo Software from the main installation screen.
The demonstration installation program first installs Ascent Advanced Forms (if it is not already installed) and then installs DDSDemo into the Ascent Advanced Forms installation location. (A short description of the demonstration program is installed to DDSDemo\doc.)
Once DDSDemo is installed, Ascent Advanced Forms Validation can optionally be installed (if it is not already installed). If you do not want to use Ascent Advanced Forms Validation with the demonstration, you can choose to cancel the installation without installing Ascent Advanced Forms Validation.
Note The installation program for the demonstration software does not support removal of a previous version. Therefore, if a previous version of the demonstration software is already installed, you must remove it with the Add/Remove Programs utility before you install the new release of the demonstration software.
In addition, DDSDemo requires administrator rights and can only be used by the logged-in user who installed the program. If Ascent Advanced Forms Validation will be used with DDSDemo, it must be installed by the same user. The demonstration program will not work for other users.
The standard installation process installs two main components: Ascent Advanced Forms and the Ascent Advanced Forms Design Studio.
Note You must install Ascent Advanced Forms on each station where it will be used.
In addition, do not terminate the installation program using Task Manager. If you do, the software will not be installed correctly and the results will be unpredictable.
To install Ascent Advanced Forms
After you install Ascent Advanced Forms, you must verify that its license(s) have been activated. You can do this from the Ascent Capture License Utility, which allows you to view the status of Ascent licenses, as well as activate new or additional licenses for your Ascent Capture system.
Tip For easiest activation, it is recommended that you run the Ascent Capture License Utility on a station with Internet access. For client/server installations, you need only activate your Ascent Advanced Forms license(s) one time: you do not need to activate on each client station where you installed Ascent Advanced Forms.
If Internet access is not available, you must acquire an activation code from your Kofax Certified Solution Provider or the Kofax Web site before you attempt to activate the license(s). Contact your system administrator or Certified Solution Provider for assistance.
To view licenses
To activate licenses
Before you can add the Ascent Advanced Forms custom module to an Ascent Capture batch class, you must register it.
To register a custom module
Follow these instructions to uninstall Ascent Advanced Forms. To completely remove the product, you must uninstall both Ascent Advanced Forms and the Ascent Advanced Forms custom module.
Note When uninstalling, you may be prompted with "Shared File Detected" messages. If this occurs, select "Yes" to remove the files. You may also be prompted with "Locked File Detected" messages. If this occurs, select "Ignore" to remove the files.
In addition, files that were not installed by the installation program will not be uninstalled. If you want to completely remove the product from the computer, you must manually delete the files left in the installation folder.
To uninstall Ascent Advanced Forms
The project file format has changed since version 2.0. All projects that were created with version 2.0 or earlier and are to be used with Ascent Advanced Forms 3.7 must be converted to the 3.x format. It is not possible to convert version 3 projects back to the format of prior versions.
Ascent Advanced Forms provides a conversion tool to facilitate conversion. You can call it through batch files:
To run the conversion tool
For example, assume you enter:
DDSmigrate "C:\Program Files\ODT-OCE\Ascent Advanced Forms\Projects\MyOldProject.ipj"
This will create the output files MyOldProject_V2.ipj and MyOldProject_V3.ipj in C:\Program Files\ODT-OCE\Ascent Advanced Forms\Projects.
DDSmigrate.bat waits 10 seconds after the conversion finishes so that you can read error messages or check the result. To convert several projects from a batch file that you create, call DDSbatchmigrate.bat for each project file instead; it terminates without waiting.
It is also possible to create the version 3 project under the original project file name. To do this, add the -3 parameter to the call. In this case, the extension .orig is appended to the name of the source file, and the intermediate file in version 2 format gets the extension .tmp. For example:
@echo off
bash migtool.sh -3 -p %1
For this case, the above example creates the following output files: MyOldProject.ipj, MyOldProject.ipj.orig, and MyOldProject.ipj.tmp.
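The naming rules for the output files can be summarized in a small Python helper, which may be useful when a script has to locate the converted projects afterwards. The helper is a sketch that only mirrors the file names given in the examples above; it does not call the conversion tool, and its function and parameter names are hypothetical.

```python
from pathlib import PureWindowsPath


def migration_outputs(project_file, keep_original_name=False):
    """Return the file names the conversion is described as producing.

    With keep_original_name=True the naming for the -3 parameter is used.
    """
    path = PureWindowsPath(project_file)
    if keep_original_name:
        # Version 3 project keeps the name; source gets .orig, V2 copy .tmp.
        return [str(path), str(path) + ".orig", str(path) + ".tmp"]
    return [
        str(path.with_name(f"{path.stem}_V2{path.suffix}")),
        str(path.with_name(f"{path.stem}_V3{path.suffix}")),
    ]


for name in migration_outputs(r"C:\Projects\MyOldProject.ipj"):
    print(name)
```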
Note The conversion tool is a separate component installed by default with the Standard installation mode. If you do not want to install it, you can choose Minimum as the installation mode or choose a user-defined installation and do not select the Conversion component.
In addition, Verify projects cannot be converted, because the verification station of prior versions is no longer available.
Known Problems and Limitations
The following sections list known problems with this release.
Barcode Field with Regular Expression
A dictionary (expression element <D...>) is not supported in a regular expression of a bar code feature.
dscf_rpcCreateServer Function
If a computer name is specified as the value of the host parameter, the function call may fail. In this case, use the IP address as the host parameter.
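If only the computer name is known, it can be resolved to an IP address first. The Python sketch below uses the standard socket module for the lookup; it does not call dscf_rpcCreateServer itself, and passing the result on to that function is left to the calling application.

```python
import socket


def host_to_ip(host):
    """Resolve a computer name to an IPv4 address string.

    An IPv4 address passed in is returned unchanged.
    """
    return socket.gethostbyname(host)


# A dotted IPv4 address is returned as is; a name triggers a lookup.
print(host_to_ip("127.0.0.1"))
```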
Inverse Text Correction Image Processing Function
A fatal error may occur with images that contain a large amount of noise. Please remove this function from old projects and replace it with the new advanced imaging RemoveGraphicLines function.
Online Help Links
Problems may occur with the HTML help viewer or Internet Explorer 6.
regsvr32 /u <drive>:\winnt\system32\hhctrl.ocx
regsvr32 <drive>:\winnt\system32\hhctrl.ocx
The links should now work on the Windows 2000 system.
http://support.microsoft.com/?kbid=811630
Restart Ascent Advanced Forms Custom Module After Project File Changes
If you make changes to a project file while the Ascent Advanced Forms custom module is running, you must shut down and then restart the custom module for your changes to be used. The custom module will continue to use the old project settings for new batches until it is shut down/restarted.
Character Confidence Not Always Followed
In some cases, the specified character confidence is not used. Some characters that do not meet the specified confidence level are not rejected. This seems to occur in the Recipient and LastLine rows when using the USAddress field.
From time to time, fixes for Ascent Advanced Forms are provided as Service Packs posted to the Kofax Web site. Should you encounter a problem, check the Web site at www.kofax.com for available Service Packs.
For additional technical information about Kofax products, visit the Kofax Web site at www.kofax.com and select an appropriate option from the Support menu. The Kofax Support pages provide product-specific information, such as current revision levels, the latest drivers and software patches, online documentation and user manuals, updates to product release notes (if any), technical tips, and an extensive searchable knowledge base.
The Kofax Web site also contains information that describes support options for Kofax products. Please review the site for details about the available support options.
If you are a certified representative from an authorized company and need to contact Kofax Technical Support, please have the following information available:
Trademark/Copyright Statements
Copyright
Copyright © 2004 Kofax Image Products, Inc. All Rights Reserved.
The information contained in this document is the property of Kofax Image Products, Inc. Neither receipt nor possession hereof confers or transfers any right to reproduce or disclose any part of the contents hereof, without the prior written consent of Kofax Image Products, Inc. No patent liability is assumed, however, with respect to the use of the information contained herein.
Trademarks
Kofax, the Kofax logo, Ascent, and Ascent Capture are registered trademarks of Kofax Image Products, Inc.
Ascent Advanced Forms uses DOKuStar Extraction. Copyright Océ Document Technologies GmbH 2001-2004. DOKuStar is a registered trademark of Océ Document Technologies GmbH.
All other product names and logos are trade and service marks of their respective companies.
Disclaimer
The instructions and descriptions contained in this document were accurate at the time they were written. However, succeeding products and documents are subject to change without notice. Therefore, Kofax Image Products, Inc. assumes no liability for damages incurred directly or indirectly from errors, omissions, or discrepancies between the product and this document.
An attempt has been made to state all allowable values where applicable throughout this document. Any values or parameters used beyond those stated might have unpredictable results.