Will Combined Search and Business Intelligence Go Mainstream?
Implementation considerations
Dec. 11, 2007 01:30 AM
Page 2 of 3
Defining the Scope of Enterprise BI Search
Search solutions must answer all relevant questions, whether they are about detailed data or summaries. Hence, the scope of BI search extends along a continuum from unstructured documents and aggregate reports to individual records and transactions stored in applications and databases (see Figure 1). While a solution can be implemented in stages, selected technologies should enable indexing and reporting along the entire continuum. This simple approach helps map vendor capabilities and determine which offers the best fit.
Unstructured BI-Relevant Content
BI-Specific Content: Reports, Records, and Transactions
Reports and transactions are BI-specific content types. Their original formats don’t really matter, because they can be transformed into, say, HTML or XML for indexing by Google. More important is the need to access data sources and applications in order to extract and enrich data, making the information meaningful for a natural language search. Specialized search engines have started to develop access and integration capabilities, but only BI vendors currently provide enterprise-level capabilities.
Reports – static aggregations of individual transactions – are stored in report libraries or file systems. Search engines can index reports independently or with BI vendors in the same way they index any other unstructured document. The lack of context makes it difficult to distinguish, for example, one profit report from another among the hits on the search results page. BI companies provide value by supplying metadata in the search results that the end user can use to identify the most relevant report. An integrated BI and search solution lets users retrieve reports, refresh the data, and modify the report content – important capabilities when up-to-date reports are required.
Only BI vendors can generate entirely new reports from the hits, for example when a user searches for inventory items that might be out of stock.
Most BI vendors index only reports. While it’s tempting to think that users don’t need anything more, most questions are about the details of individual records and transactions, especially in operational BI. Experts estimate that 80 percent of enterprise data is structured and that, from a decision-making point of view, the value of structured transactional data far exceeds that of unstructured data. That implies that enterprises should focus on indexing structured data first; unstructured content is mistaken for low-hanging fruit because it has been the core competency of search engines.
Search engines significantly expand BI query capabilities in this area. BI companies use structured queries to find or filter data in known data sources using known parameters. Search allows users to find data not only in structured (dimensional) fields but also in unstructured (CLOB or text) fields without prior knowledge of the data sources or the parameter values. Thus, customer records can be retrieved by names in structured fields or by customer clues recorded in the free-form text fields. Some BI companies provide the missing link through transactional indexing, which includes data access and metadata enrichment.
Transactional Indexing
Transactions can be enhanced by appending data from other tables, databases, and applications, or by pre-aggregating records. Help desk applications, for example, create a new entry for each communication with a customer, and relate it to a customer case using a reference key. Indexing each communication record separately will create fragmented search results; not indexing all customer communications will create an incomplete record for searching.
The solution requires enriching the incoming record with the available customer information, re-aggregating all communications into a single indexed message, and passing it to the search engine to replace the previously indexed record. This indexing process flow involves numerous steps: capturing the new incoming customer communication, creating dynamic joins with other tables and applications, running a procedure to aggregate the related case records, structuring and transforming the message into the indexing format required by the search engine, passing it to the search engine for re-indexing, and deleting the prior record.
Vendors have taken different approaches to transactional data indexing:
Crawling databases: Web search engines have adopted an approach to transactional indexing similar to document indexing – they crawl tables in databases using SQL select statements. Crawling is an acceptable choice for slowly changing tables, but not for large volumes of frequently changing data that needs to be available for search in near-real time. It is also not very effective for applications and highly normalized operational data stores.
Passing the search query to the application: This solution relies on some intelligence to determine how to match search terms with applications. It then relies on the application for data extraction and aggregation. This approach works well for simple queries, such as stock price information. Implementation becomes more daunting if users can run multiple queries against the same application. In those cases, a self-service application will likely offer more robust querying capabilities and be less confusing to the user.
Pushing application data to the index: Instead of letting the engine crawl the records, an application pushes data into the index using a search engine-provided indexing API.
The application makes all connections into the underlying data store and has complete control over scheduling, interfacing protocols, and data structures. The effort needed to configure and use this method depends on the complexity of extraction and transformation and on the application tools available for it.
Integrating data through SOA and process flows: These same APIs can let integration tools broaden the scope of the index. This method requires integration capabilities, including transformation tools, process flow capabilities, and adapters, to define and execute the process that captures and enriches transaction data in real time.
The first three methods are application-specific and would work in projects with limited scope. The fourth method is generic and will address all present and emerging search integration needs, but very few traditional BI companies have the expertise in modern integration architecture to implement it.
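The help desk flow described under Transactional Indexing (capture the new communication, join it with customer data, aggregate the case, transform it, and push it to replace the prior record) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's implementation: SearchIndex is a hypothetical in-memory stand-in for a search engine's indexing API, and the customer table, case keys, and field names are invented for the example.

```python
from collections import defaultdict


class SearchIndex:
    """Hypothetical in-memory stand-in for a search engine's indexing API."""

    def __init__(self):
        self.docs = {}

    def replace(self, doc_id, text):
        # Writing under the same id discards the previously indexed record.
        self.docs[doc_id] = text

    def search(self, term):
        # Naive case-insensitive substring match, for illustration only.
        term = term.lower()
        return sorted(d for d, text in self.docs.items() if term in text.lower())


# Illustrative reference data that would normally live in other tables.
CUSTOMERS = {"C1": {"name": "Acme Corp", "region": "EMEA"}}
CASE_TO_CUSTOMER = {"CASE-7": "C1"}


def index_communication(index, communications, record):
    """Capture one new communication and re-index its whole case."""
    case_id = record["case_id"]
    # Capture the incoming communication.
    communications[case_id].append(record["text"])
    # Join with customer data to enrich the record.
    customer = CUSTOMERS[CASE_TO_CUSTOMER[case_id]]
    # Aggregate every communication on the case into one body.
    body = " ".join(communications[case_id])
    # Transform into the (assumed) indexing format.
    document = (
        f"case: {case_id} customer: {customer['name']} "
        f"region: {customer['region']} text: {body}"
    )
    # Push to the index, replacing the prior record for this case.
    index.replace(case_id, document)
    return document


index = SearchIndex()
communications = defaultdict(list)
index_communication(index, communications,
                    {"case_id": "CASE-7", "text": "Router will not power on."})
index_communication(index, communications,
                    {"case_id": "CASE-7", "text": "Replacement unit shipped."})
```

After both calls the index holds a single document for the case, so a search on a word from either message, or on the customer name, returns that case once rather than one fragment per communication. Keying the index on the case identifier is the design choice that makes the replace step implicit; a real engine would need an explicit update or delete call.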