EnterpriseSearchCenter.com Home
RESOURCES FOR EVALUATING ENTERPRISE SEARCH TECHNOLOGIES
August 30, 2006

Table of Contents

Featured Content: Social Work--Adding Social Network Analysis to Search
New Book on Workplace Surveillance Helps IT Managers Address Employee Concerns
Convera Launches govmine
Endeca Announces SEO Capabilities for Guided Navigation-based Websites
IBM to purchase FileNet
Context is king
Stellent, securely
Open Text to acquire Hummingbird
Blip.tv Signs Licensing Deal with Media Companies

Featured Content: Social Work--Adding Social Network Analysis to Search

The web today is about participation and participant-created content. The most effective web search tools take this participation into consideration in the process of delivering relevant results. A look at these techniques (and some of the problems with them) can lead to insights into exploring the relationship between social context and search results inside the firewall as well.

The first name in web search, Google, examines participant behavior to enhance its search results. In social network analysis terms, it measures your degree centrality (how connected you are) through its PageRank approach. Degree centrality, one of the central factors in Google's algorithm, measures the reputation of a site through who links to the site and who the site links to. Google factors many other things into its search algorithm, such as how relevant your content is to the search term, and it keeps the algorithm secret to slow down those who want to cheat their way to a higher place in the search results. Still, your social standing on the internet is a key determinant of your place in those results.

One of the constant battles that Google has, along with all internet search engines, is staying ahead of the cheaters who want to position their sites artificially high in the search results. It is relatively easy to game the system and get an artificially high degree centrality measure because degree centrality is a very direct and simple social network analysis approach. You can set up automated ways to create links and you can link into highly rated sites to benefit from their standing. Google does not make its social network analysis visible because its mission is simply to provide keyword search results. It also might not want to, because doing so would give cheaters tips on how its ranking operates.
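
To make the idea concrete, here is a minimal Python sketch of degree centrality on a toy link graph. The site names and the `degree_centrality` helper are invented for illustration. This is not Google's algorithm, which is secret and weighs many other factors; the sketch only shows why a raw link count is easy to inflate.

```python
from collections import defaultdict


def degree_centrality(links):
    """Count distinct inbound plus outbound links for each site."""
    inbound = defaultdict(set)
    outbound = defaultdict(set)
    for source, target in links:
        if source == target:
            continue  # a site linking to itself earns nothing
        outbound[source].add(target)
        inbound[target].add(source)
    sites = set(inbound) | set(outbound)
    return {site: len(inbound[site]) + len(outbound[site]) for site in sites}


# Hypothetical toy web: three honest sites linking to each other.
links = [("news", "shop"), ("blog", "shop"), ("shop", "news")]
honest = degree_centrality(links)

# A cheater points ten throwaway pages at a single site.
link_farm = [(f"farm{i}", "spam") for i in range(10)]
gamed = degree_centrality(links + link_farm)

print(honest["shop"], gamed["spam"])  # prints: 3 10
```

With nothing but automated links, the made-up "spam" site outscores the best-connected honest site, which is the weakness the paragraph above describes.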

One-two Punch
There are other, hybrid approaches that make more direct use of social network analysis by combining it with search in a visible way that the searcher can leverage. For example, iQuest uses a form of social network analysis to enhance its search results, and content analysis to enhance its social network analysis results. It measures betweenness centrality (does your site sit on the paths between important websites?) to gauge the importance of the connections that pass through your site relative to the whole web. Thus, the keyword in a given search field does not have to be on your site; it can be on the other sites that connect to and through yours. iQuest's results correlate well with Google's, but it is harder to game because its social network measures are indirect.
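
As a rough illustration of betweenness centrality, the following Python sketch applies Brandes' algorithm to a small undirected graph. How iQuest computes its measure is not described here, so treat this as a generic textbook version; the graph, the node names, and the helper functions are all assumptions made for the example.

```python
from collections import deque


def build_graph(edges):
    """Turn a list of undirected links into {node: set_of_neighbors}."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from source.
        order = []
        predecessors = {node: [] for node in graph}
        paths = {node: 0 for node in graph}
        distance = {node: -1 for node in graph}
        paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    paths[w] += paths[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, crediting each predecessor.
        dependency = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Every undirected pair was counted once from each end.
    return {node: value / 2 for node, value in score.items()}


# Two tightly knit clusters joined only through one "bridge" site.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("x", "y"), ("x", "z"), ("y", "z"),
         ("c", "bridge"), ("bridge", "x")]
graph = build_graph(edges)
scores = betweenness_centrality(graph)
print(max(scores, key=scores.get))  # prints: bridge
```

The bridge site has only two links, fewer than its neighbors, yet it scores highest because every path between the two clusters runs through it. That position is harder to fake with a pile of automated links.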

Both tools return the appropriate sites for the search term with links to those sites. However, a big difference between the two occurs in the results displayed via iQuest, which also shows you the social network analysis.

iQuest also differs from other social network analysis tools because it contains a content analysis tool that looks inside the content of electronic communication. Most social network analysis tools simply look at the relationships between participants in this communication. A hybrid tool like iQuest shows you who is talking to whom, what they talk about, when they talk, and where those conversations take place.

Searching for Social Structure
Social network analysis has some interesting applications for enterprise search, though when you go behind the firewall, the nature of content changes. Most of the content on the internet is unstructured and made up of large amounts of narrative data. Within organizations, new kinds of data factor in, such as company records and email exchanges, and the mix shifts toward structured data. For example, you now find project status reports and financial results in spreadsheets.

Because their focus is on unstructured data, internet search engines like Google look at structured data as if it were unstructured, treating things like email addresses or phone numbers simply as data, rather than taking into account what they actually are. In a perfect world, a search engine would "know" whether it is looking at unstructured or structured data, and then understand and make use of the structure, if present.
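
A small Python sketch can show what it means for a search tool to recognize what a piece of data actually is rather than treating it as plain text. The patterns below are deliberately simplified assumptions (real email and phone formats vary far more), and the sample sentence is invented.

```python
import re

# Simplified, illustrative patterns; not a complete grammar for either type.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def tag_entities(text):
    """Return (type, value) pairs for recognized items, in order of appearance."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), label, match.group()))
    return [(label, value) for _, label, value in sorted(found)]


sample = "Contact Pat at pat@example.com or 555-010-0134 about the status report."
entities = tag_entities(sample)
print(entities)
```

An index that stores these typed values alongside the raw text could then answer questions such as "which documents mention this person's address" instead of matching the characters only.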

There are a number of enterprise search engines that can address structured and unstructured data, but most of these are not capable of looking at the social networks around this data. Just like the web, the world inside the firewall is made up of social relationships. Hybrid tools allow you to both find content and benefit from understanding the human interactions around content. You can look at the email communication within a project team to see if it is functional or dysfunctional. Are key team members being left out? Is someone blocking communication? Are all the right people included? And, of course, you can see what they are talking about.

When you combine social context awareness with search, you can see who is central to the conversations on specific topics occurring inside the firewall and who may be excluded. Who are the real contributors? Who is being discussed, and what is being said about them? What are the most-used words about a specific topic or person? The integration of social network analysis with search opens up new avenues of inquiry and can bring greater context to enterprise search.
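
The questions above can be sketched in a few lines of Python. The messages, names, team roster, and stopword list here are all made up for illustration, and a substring match stands in for a real search engine's notion of "on topic".

```python
import re
from collections import Counter

STOPWORDS = {"the", "for", "on", "a", "to"}

# Hypothetical email traffic inside a project team.
messages = [
    {"sender": "ana", "recipients": ["ben", "cho"], "body": "Launch budget needs review"},
    {"sender": "ben", "recipients": ["ana"], "body": "Launch budget approved"},
    {"sender": "cho", "recipients": ["ana"], "body": "Launch date moved"},
    {"sender": "dee", "recipients": ["ana"], "body": "Lunch on Friday?"},
]
team = {"ana", "ben", "cho", "dee"}


def on_topic(messages, topic):
    """Keep only messages whose body mentions the topic."""
    return [m for m in messages if topic.lower() in m["body"].lower()]


def participation(messages, topic):
    """Count how many on-topic messages each person sent or received."""
    counts = Counter()
    for msg in on_topic(messages, topic):
        counts[msg["sender"]] += 1
        counts.update(msg["recipients"])
    return counts


def top_words(messages, topic, n):
    """Most-used words in on-topic messages, ignoring stopwords."""
    words = Counter()
    for msg in on_topic(messages, topic):
        tokens = re.findall(r"[a-z]+", msg["body"].lower())
        words.update(t for t in tokens if t not in STOPWORDS)
    return words.most_common(n)


counts = participation(messages, "launch")
central = counts.most_common(1)[0]
excluded = team - set(counts)
frequent = top_words(messages, "launch", 2)
print(central, excluded, frequent)
```

In this toy data "ana" is central to the launch conversation and "dee" is left out of it entirely, the kind of finding a hybrid tool would surface next to the search results.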

See the Connections
Here is a simple table that summarizes the differences between internet search engines such as Google, enterprise search tools, social network analysis (SNA) tools, and hybrid tools that combine social network analysis with search.

This power naturally raises issues of privacy, and policies are needed to govern proper use. However, with the new internet we are living in a world of greater openness and interaction, as evidenced by the growing popularity of tools such as blogs and wikis, not to mention social networking tools. It is interesting that in the early days of QuickPlace, the Lotus collaboration tool, users were upset when senior management and IT people wanted to make access to the QuickPlace collaboration sites more open. Now, companies are increasingly using blogs and wikis as collaboration sites because they provide the very transparency that used to have people up in arms.

We are moving into a new world of amplified social interaction and increased transparency. Combining social network analysis with search inside and outside the firewall will help us make greater sense of this world through improved understanding of the human interactions that occur around content.


Unstructured Data | Structured Data | Links to and From | Relationships Between | Exact Data Not in Target
Internet Search
Enterprise Search: some
Hybrid Tools
SNA Tools

Back to Contents...

New Book on Workplace Surveillance Helps IT Managers Address Employee Concerns

We found this title in ITI's book shop and thought you might be interested, even though it's not directly related to enterprise search.

The Visible Employee

Using Workplace Monitoring and Surveillance to Protect Information Assets—Without Compromising Employee Privacy or Trust

 By Jeffrey M. Stanton and Kathryn R. Stam

CyberAge Books • 2006/376 pp/softbound ISBN 0-910965-74-9 Regular Price: $24.95

The misuse of an organization's information systems by employees, whether through error or by intent, can result in leaked and corrupted data, crippled networks, lost productivity, legal problems, and public embarrassment. As organizations turn to technology to monitor employee use of network resources, they are finding themselves at odds with workers who instinctively feel their privacy is being invaded. THE VISIBLE EMPLOYEE reports the results of an extensive four-year research project, covering a range of security solutions for at-risk organizations as well as the perceptions and attitudes of employees toward monitoring and surveillance. The result is a wake-up call for business owners, managers, and IT staff, as well as an eye-opening dose of reality for employees.

"An eye-opening book for employees with privacy concerns and employers worried about information security. Carefully researched and remarkable for its objectivity." -- Ted Demopoulos, Demopoulos Associates

"Employee monitoring and workplace privacy issues seem contentious at best, and intractable at worst. Read this book, and you will find yourself at the enlightened end of the learning curve. The authors distill solutions from workplaces with high security standards and high employee satisfaction. You don't have to reinvent the solution--just read it." -- Steve Kropper, Senior Vice President, Equinox Corporation

Convera Launches govmine

Convera Corporation, a provider of search technologies for professional workers, has announced the beta launch of govmine, a commercial search site designed for government professionals. The site lets government workers research data across the internet: they enter queries and the site presents the results.

(www.convera.com; www.govmine.com)

Endeca Announces SEO Capabilities for Guided Navigation-based Websites

Endeca, an information access company, has announced the first in a series of solutions and best practices aimed at increasing the effectiveness of search engine optimization (SEO) efforts for Guided Navigation-based websites.

The solutions and best practices are designed to give companies running business-critical applications on the Endeca Information Access Platform (IAP) new capabilities to complement search marketing initiatives and help web search engines to more accurately rank products and content. One of the first solutions is the Endeca Sitemap Generator, which is designed to ensure that the right pages are indexed by the key web search engines. It dynamically creates a sitemap containing every relevant navigation page configured within the MDEX Engine--the core technology that powers the Endeca IAP and resulting Guided Navigation user experience. In addition, the generator eliminates duplicate sets of content to avoid "PageRank" dilution through duplication. It supports several output formats including HTML output for websites and XML output for Google Sitemaps, and is configured for other data feeds.
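
To illustrate the general idea of generating a sitemap while dropping duplicate pages, here is a minimal Python sketch. It is not Endeca's Sitemap Generator. The URLs are invented, the rule "same refinements in a different order means the same page" is an assumption about how duplicates arise on a navigation-driven site, and the output uses the sitemaps.org 0.9 namespace.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def canonical(url):
    """Sort query parameters so reordered refinements compare equal."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query)))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def build_sitemap(urls):
    """Return sitemap XML with one <url> entry per distinct canonical page."""
    seen = set()
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        key = canonical(url)
        if key in seen:
            continue  # same content reached by a different click path
        seen.add(key)
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = key
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical navigation pages; the first two show the same products.
pages = [
    "http://shop.example.com/browse?brand=acme&color=red",
    "http://shop.example.com/browse?color=red&brand=acme",
    "http://shop.example.com/browse?brand=acme",
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Listing each page once under a single address is one way to keep a search engine from splitting a page's standing across several equivalent URLs.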

Among the collection of best practices, all offered to Endeca customers via the company's Customer Solutions organization, are ways to take advantage of the Endeca IAP's Web-based Management Suite to customize and dynamically create meta-tags. This capability allows non-technical business users to build context-sensitive presentation rules to optimize what a web search engine sees when it indexes specific pages.

(www.endeca.com)

IBM to purchase FileNet

As part of its Information on Demand initiative, which was formally launched in February, IBM has announced a definitive agreement to acquire content and business process management provider FileNet in an all-cash transaction at a price of approximately $1.6 billion, or $35 per share. The acquisition is subject to FileNet shareholder approval, regulatory reviews and other customary closing conditions. It is expected to close in the fourth quarter of 2006.

IBM says its Information on Demand strategy is designed to provide its clients with information precisely when and how they need it to improve their business processes, respond to market needs and identify new business opportunities.

Following completion of the acquisition, IBM says it intends to:

  • combine FileNet's operations with IBM's Content Management business in the Information Management unit,
  • preserve and enhance customer investments in both FileNet and IBM content management platforms,
  • integrate IBM's BPM and service oriented architecture technologies with the FileNet platform, and
  • train IBM and FileNet partners and services teams on both IBM and FileNet technology.

Context is king

In the latest version of its operating platform, Contextware has enhanced the ability to capture, organize, distribute and operate more effectively around business processes, policies, procedures, content and resources.

New in Version 2.4 is the Contextware Anywhere feature, which is said to allow customers to connect directly from their existing Web-accessible applications to Contextware. For example, the company says, employees can be working on the Internet or in any enterprise system on another part of the network and instantly jump into Contextware, review the business procedures and tasks relevant to the work they are doing and then jump back into the application from which they came. Additional enhancements include the ability to bookmark process information, improved user administration and security, as well as expanded authoring capabilities.

Stellent, securely

Stellent has acquired SealedMedia, a provider of enterprise digital rights management solutions, and Bitform, which offers content cleansing technologies. Stellent says integrating the new technologies will enable its customers to better secure and control content both inside and outside of the enterprise.

SealedMedia's software allows organizations to maintain complete control—for the full life of a document—over who can use sensitive information and when. Unique in the DRM space, SealedMedia extends security, control and tracking to information on remote user desktops, laptops and mobile wireless devices.

Bitform's Secure SDK identifies and cleanses or strips files of sensitive, confidential or proprietary metadata and hidden information that may pose risks to organizations if exposed, including tracked changes and comments, revision and author history, fast-save data, and database connection details, among many other elements.

Open Text to acquire Hummingbird

Open Text has entered into a definitive agreement to acquire all of Hummingbird's outstanding shares in an all-cash transaction. The deal is valued at US$27.85 per share, or approximately US$489 million. Open Text says Hummingbird's strength in the legal and government fields will significantly enhance its ECM offerings aimed at vertical markets and regulatory compliance.

Open Text also reports that its FirstClass division has released the latest version of its collaboration suite. The company explains that Version 8.3 enables collaboration and enhances communication through any Internet-accessible device. Applications include e-mail, instant messaging, calendars, contact management, workgroup collaboration, document sharing, file storage, Web publishing, blogging, podcasting, and unified voice and fax messaging.

Blip.tv Signs Licensing Deal with Media Companies

Blip.tv, an internet media company that hosts and distributes web-based TV shows and videoblogs, has announced that CNN has licensed its software to enable people around the world to send CNN newsworthy footage shot on a cell phone, camcorder, or digital camera. Beginning immediately, anyone can upload footage directly to CNN, via blip.tv software, for consideration for use by the CNN broadcast networks, CNN.com, and its other platforms and services.

Blip.tv has also announced software-licensing deals with the Oxygen Network and the William Shatner DVD Club. The agreements enable anyone to share content directly with these companies for contests, promotions, and web-based features. Blip.tv's software and tools were designed to enable companies to attract, collect, and manage user-generated content. Blip.tv also gives users the tools to distribute videos to blogs, to iTunes, video aggregators and search engines, del.icio.us, Flickr, and more. Blip.tv lets show creators maintain complete ownership of their content and supports Creative Commons licensing.

(www.blip.tv)
