dtSearch Corp., a leading supplier of enterprise and developer text retrieval software along with document filters, announces Version 7.72 of its product line. The new version expands dtSearch’s proprietary document filters built into its text retrieval products. For customers in need of data parsing, conversion and extraction only, the dtSearch Engine (with APIs in native 64-bit/32-bit, Win/Linux C++, Java and .NET through current versions) also provides the document filters for separate OEM licensing.
Supported Data Types. dtSearch’s document filters support a broad range of data formats:
* Web-ready static data: covers integrated image and text support in HTML, XML/XSL and PDF.
* Web-based dynamic data: through the dtSearch Spider, covers integrated image and text support in PHP, ASP.NET, SharePoint, etc.
* Other databases: through the dtSearch Engine APIs, covers SQL-type databases along with the full-text of BLOB data; all products support Access, XBASE, XML, CSV, etc.
* MS Office documents: covers integrated image and text support in Word (DOC/DOCX), PowerPoint (PPT/PPTX), Excel (XLS/XLSX) and Access (MDB/ACCDB).
* Other “Office” documents and compression formats: covers PDF with integrated image and text support, RTF, OpenOffice, ZIP, RAR, GZIP/TAR, etc.
* Emails and email attachments: covers MS Exchange, Outlook (PST/MSG), Thunderbird (MBOX/EML), and other popular email types, including nested email attachments.
* Embedded image support: covers images in Word, PowerPoint, Excel, Access and RTF files, as well as Outlook and Thunderbird emails, including images in recursively embedded files.
For all supported formats, the document filters support data parsing and optional extraction, as well as conversion to HTML for browser display with highlighted hits.
New in Document Filters. Version 7.72 adds OneNote (*.one) support through current versions, including support for images and documents embedded in OneNote files. The new version also expands the document filter APIs, enhancing options for text extraction from individual files, nested objects, etc.
Terabyte Indexer. dtSearch enterprise and developer products can index over a terabyte of data in a single index, spanning multiple directories, emails and attachments, online data and other databases. The products can create and search any number of indexes. Indexed search time is typically less than a second, even across terabytes of data. The product line also supports highly concurrent, multithreaded searching.
dtSearch Spider and Federated Searching. dtSearch products offer federated searching across any number of directories, emails (with nested attachments), and databases. The dtSearch Spider adds local and remote, static and dynamic online content to a search. The Spider can index sites to any level of depth, with support for public and private or secure online content, including log-ins and forms-based authentication. dtSearch products support integrated relevancy ranking with highlighted hits of data across both online and offline repositories.
Developer SDKs. The dtSearch Engine for Win & .NET and the dtSearch Engine for Linux make dtSearch instant searching and document filters available for a wide range of Internet, Intranet and other commercial applications. The document filters are available both together with searching and for separate licensing. SDKs include native 64-bit and 32-bit C++, Java and .NET (through current versions) APIs. For over a hundred developer case studies, please see www.dtsearch.com/casestudies.html.
# # #
About dtSearch, www.dtsearch.com
The Smart Choice for Text Retrieval® since 1991, dtSearch has provided enterprise and developer text retrieval along with document filters for over 22 years. The company offers parsing, extraction and conversion, as well as searching, of a broad spectrum of data formats. Supported data types include databases, static and dynamic website data, popular “Office” formats, compression formats and email types (including the full-text of nested attachments).