EnterpriseSearchCenter.com Home
 
RESOURCES FOR EVALUATING ENTERPRISE SEARCH TECHNOLOGIES
April 18, 2007

Table of Contents

Featured Content: Enterprise Search & The Future of Findability
Google vs. Microsoft -- Which Works Better for the Enterprise?
FAST buys Convera's RetrievalWare
Clarabridge Releases Content Mining Platform 2.2
New from RedDot/Open Text
UpSNAP Partners with MuseGlobal to Launch Mobile Metasearch
ClipBlast! Releases ClipBlast! 2.0
SearchInform Technologies Releases Internet Search Product; Announces New Version of SearchInform
On-demand e-mail archiving
Managing metadata

Featured Content: Enterprise Search & The Future of Findability

Q:  Let’s just jump right into the basics: What makes enterprise search so difficult? We keep hearing this question: Why can’t it be as simple as Google?

Peter Morville: Google is able to draw on the PageRank algorithm, which treats the links between websites as implicit votes of confidence. The most popular sites from a linking perspective float to the top.
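
As a rough illustration of the links-as-votes idea described above, here is a minimal Python sketch of a simplified PageRank-style power iteration. It is not Google's production algorithm; the damping factor of 0.85, the iteration count, and the four-page link graph are assumptions chosen only for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its "vote" evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page site, invented for this example.
web = {
    "home": ["products", "about"],
    "products": ["home"],
    "about": ["home"],
    "blog": ["home", "products"],
}
ranks = pagerank(web)
top = max(ranks, key=ranks.get)
print(top, round(ranks[top], 3))
```

In this toy graph, "home" collects the most inbound links and ranks first, while "blog", which nothing links to, stays at the baseline score. A page without inbound links gives this kind of ranking nothing to work with.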

At the enterprise level, you’re dealing with a smaller set of users. And, they don’t get as excited and enthusiastic about their enterprise’s content, so they’re not as willing or able to do all the cross-linking. Hence, the tendency is to fall back on the older model of relying on the words in the text and supplementing that with metadata.

There are additional challenges unique to enterprise search. There are so many different file types. Plus, search has to work across multiple departments and business units, and it’s not at all easy to reconcile the different goals, priorities, and perspectives that come with multidisciplinary efforts. You also have cultural and international challenges.

But, there are a number of things you can do to improve the situation: tune the search engine’s relevance ranking algorithms to the content; carefully design the results interfaces; take a close look at the content and make sure to remove the content that’s redundant, outdated, or trivial (also known as the “ROT”). If you can make the information space smaller, then the search is going to be a lot more successful.

Q:  You mentioned metadata. Can you speak to automated versus manual tagging?

PM: On one hand, it’s great to be able to automatically extract certain types of metadata. Things such as document type, author, and creation date are often easy to capture. But machines don’t do so well with topical or subject-oriented classification. They don’t understand what we librarians call “aboutness.” So, people should be involved in tagging some metadata fields, at least for the most important content.

Q:  There’s a lot of talk these days about guided navigation. Can you help readers understand what’s involved?

PM: The first successes with guided navigation occurred in the e-commerce market. Take, for example, Tower Records. You might enter a couple of keywords for a song whose name you know. You get back a massive results list, but on the left side of your screen, you see a narrow panel that provides you with metadata fields and values that provide an opportunity to further refine or narrow your query. So, you might limit this search to albums from the 1970s, or by genre, artist, or price. You can perform this narrowing in an iterative, interactive fashion until you find what you need. This approach is just starting to succeed in enterprise settings where there’s more work to be done to define those metadata fields and tag the documents, but there’s a real payoff from a findability perspective.
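
The refine-by-metadata loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalog, its titles and artists, and the field names are invented here, and a real guided navigation product would sit on top of a search index rather than a Python list.

```python
from collections import Counter

# Hypothetical catalog; every title, artist, and field name is invented.
CATALOG = [
    {"title": "Blue Morning", "artist": "The Harbor Lights", "genre": "Rock", "decade": "1970s"},
    {"title": "Blue Evening", "artist": "The Harbor Lights", "genre": "Rock", "decade": "1980s"},
    {"title": "Blue Notes", "artist": "Ada Quinn", "genre": "Jazz", "decade": "1970s"},
    {"title": "Red Horizon", "artist": "Ada Quinn", "genre": "Jazz", "decade": "1970s"},
]


def keyword_search(records, query):
    """Return records whose title or artist contains every query term."""
    terms = query.lower().split()
    return [
        r for r in records
        if all(t in (r["title"] + " " + r["artist"]).lower() for t in terms)
    ]


def facet_counts(records, fields):
    """Count how many results carry each value of each metadata field.

    These counts are what the side panel would display.
    """
    return {f: Counter(r[f] for r in records) for f in fields}


def refine(records, field, value):
    """Narrow a result list to records whose field equals the chosen value."""
    return [r for r in records if r[field] == value]


results = keyword_search(CATALOG, "blue")            # the initial keyword query
facets = facet_counts(results, ["decade", "genre"])  # values offered for narrowing
seventies = refine(results, "decade", "1970s")       # first click in the panel
seventies_jazz = refine(seventies, "genre", "Jazz")  # second click

print(dict(facets["decade"]))
print([r["title"] for r in seventies_jazz])
```

Each refinement works on the previous result list, which mirrors the iterative narrowing in the Tower Records example. The sketch also shows why tagging matters: a record with a missing or inconsistent "decade" or "genre" value would never surface through the panel.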

Q:  What about any problems involved with a guided-type navigation solution?

PM: The biggest challenge in the enterprise setting is tagging the documents. So, there’s a good amount of heavy-duty information architecture work to be done in defining the metadata fields and developing the controlled vocabularies or the values for each field. It is difficult but not necessarily all that time-consuming. But how do you tag all this stuff? That takes us back to the earlier point, whereby if you’re able to do a decent job with automatic metadata extraction, you might be able to tackle most of your content. However, the real high-value topical tagging is almost certainly going to have to be done by people.

This is where you really have to get into defining different tiers of content, and perhaps manually tag only a small percent of the most valuable enterprise-level content. That way, you’ll provide better access to that core content while the remaining content will still be accessible via traditional search.

Q:  Let’s talk a little about your book and where you think search is headed.

PM: Ambient Findability is very much focused on the future. At the crossroads of ubiquitous computing and the internet, I see us heading toward a world where we can find anyone or anything from anywhere at anytime. Technologies such as GPS, RFID, and cellular triangulation are making it possible to tag and track high-value objects and each other. As we create an “internet of things” populated by location-aware devices (not to mention sensors), the concepts of retrieval and wayfinding are converging in some interesting ways. As the book’s subtitle suggests, what we find changes who we become…

As a librarian, I’ve long been aware that the iterative and interactive nature of search involves a lot of learning. In fact, I would argue that in this age of knowledge work, search is among the most important ways that we learn. We enter a few keywords, find a document, read it, and discover that we have been searching for the wrong thing. So, we try a different search term, and we find an expert in the field. Perhaps we hire them as a consultant, and they teach us something else and send us in new directions. Ultimately the search shapes our goals and what we believe.

Q:  How prepared are enterprises to fully embrace the potential of search?

PM: It depends. I see a huge discrepancy between the few organizations that really get it and are way ahead, and the other organizations that are back in the 1990s with respect to their enterprise search systems. So, in the near term, we have an awful lot of work to do to simply make our existing systems work better—sometimes organizations take two steps forward and three steps back. It’s not uncommon for companies that integrate new content management systems and search engines to actually degrade the user experience in order to accommodate the new software.


In the longer term, I’m intrigued by the convergence of physical and digital systems and experiences. Organizations will have the opportunity to become more efficient by tracking a diverse inventory of people, information, and objects. Hospitals are already using a wireless product from Cisco that lets them track the location of their wheelchairs. And they’re showing a substantial return on investment, because they’re saving tremendous amounts of staff time: instead of searching around for that wheelchair that got stuck in a closet somewhere, they’re able to locate it immediately.

There are opportunities within many enterprises to tag and track high-value objects and really improve efficiency and customer service at the same time. As the digital and physical worlds converge, we’ll identify products and solutions we’ve never considered before. It’s an exciting time to be in this business. As science fiction writer Bruce Sterling noted, “the future isn’t just unwritten--it’s unsearched.”

Google vs. Microsoft -- Which Works Better for the Enterprise?

It wasn't exactly a duel to the death, but Jared Spataro, Microsoft's group product manager for enterprise search, and Nitin Mangtani, Google's lead program manager for the Google search appliance, definitely faced off at Gilbane San Francisco last week in a debate about which company has the better solution for the enterprise.

A common criticism of Google as an enterprise solution is that its dependency on links as a clue to relevancy is irrelevant in the corporate setting, where linking does not take place, at least not with the frequency of the open Web. Google's plug-and-play, one-size-fits-all appliance is also thought by some to oversimplify the high-end, specialized searching that may be needed in the enterprise.

That these criticisms are often advanced by vendors with a high-end product to sell, however, may give some users pause before rejecting Google merely because it is successful in the consumer space.

Mangtani told the audience at Gilbane, "We believe it's the same guy on Google and in the enterprise. You don't want to give him a nice experience as a consumer and a terrible one at work."

Spataro likened the Microsoft and Google businesses, saying that each recognizes the importance of enterprise search for its customers, each sees the market as having tiers (commodity on the low end, specialized needs on the top), and each has great market reach. There are differences as well, he observed: "Google wants to organize the world's information, while Microsoft has focused on empowering people to get more done."

Mangtani countered that Google appreciates that enterprise search is different from Internet search, and then read a litany of corporate customer names from among the 7,000 users of the Google search appliance. To show that Google is not just about unstructured information, he pointed to work Google has done on the consumer side to retrieve structured information in response to queries about the weather, shipment tracking, and other data-based requests.

Spataro emphasized Microsoft's understanding of all three search markets, noting particularly the ability to deliver high-end, specialized search products for the most demanding market segments.

By all accounts, the debate was a draw. 

--Dick Kaser, Information Today, Inc.

FAST buys Convera's RetrievalWare

FAST has agreed to purchase the assets of Convera's RetrievalWare business for $23 million in cash.

Under the terms of the agreement, FAST will retain Convera professionals serving its enterprise search customers. The deal enables Convera to focus its technology and professional service resources on the rapidly growing vertical search market serving specialist publishers. The acquisition is subject to customary closing conditions and is expected to close in the second quarter.

Convera and FAST have also announced that Convera has licensed FAST Ad Momentum, a private-label contextual advertising and monetization platform developed with the support of leading online publishers. FAST Ad Momentum will be integrated with Convera’s hosted vertical search solution and its Publisher Control Panel, which enables publishers to directly manage and improve the search experience for their professional communities and pursue search-based revenues for their Web sites.

Clarabridge Releases Content Mining Platform 2.2

Clarabridge, a text-mining software company, has announced Release 2.2 of its Content Mining Platform (CMP), a solution built specifically to enable the commercial use of text mining. The new version of CMP integrates entity extraction, fact extraction, categorization, sentiment extraction, and other natural language processing (NLP) capabilities in a single solution designed to allow general business users to convert internal and external source data into business intelligence, without special coding. Release 2.2 includes function-specific reports that organize customer feedback, measure customer sentiment, and facilitate root-cause analysis. With CMP, users can ask any question of any information source using any analysis technique to address business needs. The general availability of CMP 2.2 is scheduled for early in the second quarter of 2007.

(www.clarabridge.com)  

New from RedDot/Open Text

 Open Text reports that RedDot, the Open Text Web Solutions Group, has released the next versions of its content management and delivery software, RedDot CMS 7.5 and RedDot LiveServer 3.5.

The newest versions of RedDot CMS and LiveServer feature:

  • integration with Open Text Livelink--allows Livelink ECM content to be delivered to the Web, offering access to documents across intranets, extranets and Web sites;
  • improved solution for SAP Portal integration--permits adding RedDot content into the portal navigation, providing a more efficient experience for Web site administrators;
  • enhanced search--employs Microsoft SQL full-text index search to locate and update content faster;
  • a redesigned task monitoring interface--enables a quick view of each user's workflows and activities within RedDot CMS; and
  • a new foundation for add-on Web components--allows partners, customers and RedDot developers to build RedDot Web components with standardized tools, which can be placed within a page uniting content from the CMS with the component functionality.

UpSNAP Partners with MuseGlobal to Launch Mobile Metasearch

UpSNAP Inc., a provider of free mobile search and streaming mobile audio entertainment, and MuseGlobal, a provider of comprehensive search management systems worldwide, have announced that they are forming a partnership to offer mobile metasearch. MuseGlobal has created new search technology designed to enable UpSNAP Inc. customers to enjoy mobile metasearch access to ringtones, on-demand streaming audio, mobile radio, and more. Mobile metasearch technology from MuseGlobal searches multiple mobile sites simultaneously.

(www.upsnap.com; www.museglobal.com)  

ClipBlast! Releases ClipBlast! 2.0

ClipBlast!, a video search engine, has launched ClipBlast! 2.0, complete with Video Navigator--Web video technology engineered according to how people actually interact with video online. ClipBlast!'s Video Navigator can help viewers find the specific video that interests them from across the web, enabling them to search, browse, and personalize the video they want, when they want it.

At ClipBlast.com, viewers can:

  • search the entire internet--ClipBlast! crawls the entire web for all video content available and serves up all the new video that's being uploaded to the web every minute of every day, in real time;
  • browse video in different ways--viewers can select categories ranging from animals to wellness and major content providers from ABC to YouTube; and
  • personalize ClipBlast! to deliver what each viewer wants--viewers can build their personal Video Clip Library by saving their searches, preferred categories, favorite providers, and individual video clips, and can sign up to receive email as new, similar clips become available.

The ClipBlast! Video Navigator delivers search results that link viewers directly to content providers' sites. This approach is designed to allow companies to better protect copyrighted material, increase their own traffic, control the viewing experience, and monetize their own content. Participating sites can customize and embed ClipBlast! technology, enabling viewers to search and view a specific content provider's video.

(www.clipblast.com)  

SearchInform Technologies Releases Internet Search Product; Announces New Version of SearchInform

SearchInform Technologies, a developer of corporate full-text search technology, has released SearchInform Internet Server, designed to let users set up search across a given site or a list of resources. SearchInform Internet Server allows the owner of a website or information portal to offer its users full-text search across the contents of the whole resource. Search results are available in both HTML and XML formats, so they can be fitted to the website's custom design. SearchInform Internet Server can also provide vertical (topical) search across a preset list of resources. The program responds quickly to website updates (reindexing takes place at an interval set by the user) and returns more precisely relevant results than most global search engines, because search is limited to a preselected list of resources relevant to the user. The websites to be indexed can be selected not only by topic but also by domain zone (for instance, all sites in the .biz zone).

SearchInform Technologies has also introduced a new version of SearchInform, a corporate system for full-text search and for finding documents with similar content in large databases. The new version adds support for the popular DjVu format and changes the program's functionality and its operation on the local network. The program can now index and search documents in DjVu, a graphics format designed and optimized for storing scanned documents. The corporate functions have also changed: the process of scanning the local network for SearchInform servers has been enhanced so that the program locates only servers of the same version as the client application. That is, if the client application is version 3.3.08, it will find only servers of that version. Connecting to other server versions could make the system unstable. Main features of SearchInform 3.3.08 include: phrase search that takes stemming and a thesaurus into account; the new SoftInform Search Technology for finding similar documents; indexing speeds of 15 to 30 Gb/hour; an index size of 15-25% of the actual size of the text data; a query caching system; and support for over 60 text formats, Outlook and The Bat! e-mail messages, MP3 and AVI tags, and logs of MSN and ICQ instant messaging programs.

(www.searchinform.com)  

On-demand e-mail archiving

Fortiva reports its enhanced Archiving Suite is the only software-as-a-service offering that covers the full spectrum of archiving needs, including e-mail storage management with attachment stubbing, advanced search for legal discovery, and compliance with supervision and retention requirements.

Features include:

  • legal discovery search tools to allow legal counsel to manage discovery requests without the help of IT;
  • attachment stubbing to enable IT to reduce e-mail server storage by as much as 80 percent without limiting user access to e-mail;
  • real-time archive searching from Outlook to allow users to have complete control over their knowledgebase, providing instant access to retrieve accidentally deleted e-mail; and
  • a Web-based supervision tool to enable compliance departments to review and monitor e-mail.

Managing metadata

Inxight Software has introduced what it says is the first enterprise-class data integration platform designed specifically for the collection, exploration and cleansing of data derived from unstructured sources.

The Inxight SmartDiscovery Metadata Management System (MMS) is designed to profile data to discover inconsistencies and other anomalies and perform data cleansing activities (detecting, correcting or removing inaccurate records).

The Inxight Metadata Management System includes three modules:

  • the Metadata Connector, which directs extracted output from Inxight's SmartDiscovery text extraction;
  • the Metadata Repository, which leverages and extends the power of a standard Oracle database to hold Inxight-extracted information; and
  • the thin-client Metadata Editor, which allows users to modify and augment the results of Inxight's SmartDiscovery text extraction.

Inxight says that by combining the power of its automated text extraction with the precision of human review, MMS simplifies and accelerates text analytics projects, providing a resource for third-party system integrations and a link to Inxight-powered downstream operations that rely on accurate information.
