Enterprise Search

Enterprise search is a sub-area of information retrieval. It refers to computer-aided, content-based search using a company-internal search engine that indexes content with the help of so-called crawlers.

The search is normally not carried out live on the original data sources but against a search index. This index primarily covers internal data sources, such as documents from various databases and files from file systems.

Hits, that is, the documents found, are displayed as a text excerpt ("snippet") in the context of the query. This preview allows the relevance of the results to be assessed quickly. Continuous indexing of the individual data sources keeps the results (the result set) up to date.

From a company's perspective, the benefit of enterprise search is that it supports employees in finding work-related information.

How it works

In most cases, search engines consist of three main components: a crawling/indexing engine, a query engine, and a ranking/relevancy engine.

The crawling/indexing engine retrieves documents and data from the sources and stores this information efficiently in a searchable structure. It also creates the document caches that are used to display the document preview in the result view. The query engine searches the index for hits and creates a list of results. The ranking/relevancy engine is responsible for sorting, that is, ordering, the results.
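The interplay of the three components can be illustrated with a minimal sketch. It assumes an in-memory inverted index and a simple TF-IDF score; the class name TinySearchEngine and its methods are hypothetical and do not correspond to any particular product, and real engines use considerably more elaborate index structures and ranking functions.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinySearchEngine:
    """Hypothetical toy engine: indexing, querying, and ranking in one class."""

    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self.cache = {}                    # doc_id -> stored text for previews

    def index(self, doc_id, text):
        """Indexing engine: cache the text and update the inverted index."""
        self.cache[doc_id] = text
        for term, freq in Counter(tokenize(text)).items():
            self.postings[term][doc_id] = freq

    def query(self, query_text):
        """Query engine: collect every document containing at least one term."""
        terms = tokenize(query_text)
        hits = set()
        for term in terms:
            hits.update(self.postings.get(term, {}))
        return self.rank(terms, hits)

    def rank(self, terms, hits):
        """Ranking engine: order hits by a simple TF-IDF score (an assumption)."""
        n_docs = len(self.cache)
        scores = {}
        for doc_id in hits:
            score = 0.0
            for term in terms:
                docs = self.postings.get(term, {})
                if doc_id in docs:
                    idf = math.log(1 + n_docs / len(docs))
                    score += docs[doc_id] * idf
            scores[doc_id] = score
        # Highest score first; ties are broken by document id.
        return sorted(scores, key=lambda d: (-scores[d], d))


engine = TinySearchEngine()
engine.index("a.txt", "travel expense policy and travel booking")
engine.index("b.txt", "expense report template")
engine.index("c.txt", "cafeteria menu")
print(engine.query("travel expense"))
```

In this sketch the document that contains both query terms is ranked ahead of the one that contains only one of them, while the unrelated document does not appear in the result list at all.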

A web browser is typically used as the user interface, and the results are presented in a form similar to that of Internet search engines.

Interfaces

Many enterprise search vendors offer a wide variety of adapters or connectors for popular enterprise applications so that their content can be shown in the search solution. In addition to direct queries against, for example, the customer database, plug-ins for groupware and e-mail applications and for content or document management systems are typical. Mounting a source as a separate file system (network drive) is also often possible. "Federated search" connectors are common as well: they pass the search query to a target system and integrate the partial results obtained into the overall result list.
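The federated search approach can be sketched as follows. The two connector functions and their data are purely hypothetical stand-ins for real target systems, and the sketch assumes that the scores returned by different systems are directly comparable, which in practice usually requires normalization.

```python
from concurrent.futures import ThreadPoolExecutor


def search_wiki(query):
    """Hypothetical connector: returns (title, score) pairs from a wiki."""
    pages = {"Travel policy": 0.9, "Onboarding guide": 0.4}
    return [(t, s) for t, s in pages.items() if query.lower() in t.lower()]


def search_mail(query):
    """Hypothetical connector: returns (subject, score) pairs from a mailbox."""
    mails = {"Re: travel booking": 0.7, "Lunch": 0.2}
    return [(t, s) for t, s in mails.items() if query.lower() in t.lower()]


def federated_search(query, connectors):
    """Send the query to every connector and merge the partial results."""
    with ThreadPoolExecutor() as pool:
        # Query all target systems concurrently; result order matches input order.
        partials = list(pool.map(lambda c: c(query), connectors.values()))
    merged = []
    for source, results in zip(connectors, partials):
        merged.extend((score, title, source) for title, score in results)
    # Highest score first; ties are broken by title.
    merged.sort(key=lambda r: (-r[0], r[1]))
    return [(title, source) for _, title, source in merged]


connectors = {"wiki": search_wiki, "mail": search_mail}
print(federated_search("travel", connectors))
```

Each result keeps the name of the system it came from, so the front end can show the user where a hit originates.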

Components

Generally, a distinction is made between the front end and the back end.

In addition to the individual connectors, the back end typically includes the crawler, the indexer, and the parser for search queries submitted by the various front ends. These queries are forwarded to the actual search engine, which compiles the information from the indexed data.

The front end generally offers greater design freedom. It may simply be an input field, or it may offer more convenience through suggestions for suspected typos, the display of related topics, or navigation via a tag cloud or faceted classification. Progressively narrowing the number of hits by adding keywords and criteria, or by selecting a sub-concept (for example, along a taxonomy tree), is also known as drill-down. Formatting of the results (for example, splitting them across several pages) is also typically done in the front end. The front end typically also includes all pure convenience features, such as the ability to save searches and run them again later.
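Drill-down over a faceted classification can be sketched in a few lines. The documents, field names, and functions below are hypothetical and serve only to illustrate the principle: the front end shows how many hits fall under each facet value and repeats the search with an additional filter once the user selects one.

```python
from collections import Counter

# Hypothetical result documents with two facet fields.
DOCS = [
    {"title": "Travel policy", "department": "HR", "type": "pdf"},
    {"title": "Travel expense form", "department": "Finance", "type": "xlsx"},
    {"title": "Travel booking guide", "department": "HR", "type": "docx"},
]


def search(keyword, filters=None):
    """Return documents whose title contains the keyword and match all filters."""
    filters = filters or {}
    return [
        d for d in DOCS
        if keyword.lower() in d["title"].lower()
        and all(d.get(field) == value for field, value in filters.items())
    ]


def facet_counts(results, field):
    """Count how many results fall under each value of a facet field."""
    return Counter(d[field] for d in results)


hits = search("travel")
print(facet_counts(hits, "department"))  # counts offered to the user as facets

# The user selects the facet value "HR"; the hit list is narrowed accordingly.
narrowed = search("travel", {"department": "HR"})
print([d["title"] for d in narrowed])
```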

Comparison of enterprise search and Internet search

Enterprise search and Internet search basically use similar techniques and algorithms. One of these is the crawler. Other common features are the large indexes and the sorting of results by relevance.

The following differences exist:

  • Security: To protect information and data against unauthorized access, those responsible must explicitly release their data sources for searching. Access to the information sought must comply with the rules and regulations applicable in the company and with data protection requirements. Integrated rights management ensures that users find only the company data they are permitted to access. In other words, users' authorizations for files and folders must be enforced within the company to prevent misuse of data inside and outside it.
  • Link structure: The ranking is not influenced by the parameter "number of links to a document". However, some applications and sources have their own indexes. To improve performance, the search engine should build on this existing indexing, which saves valuable resources. Because relevance cannot be determined from links, metadata becomes considerably more important in enterprise search.
  • Sources: The searchable data comes not only from web servers but also from various other locations. These include network drives, intranet applications, e-mail systems, and local data, as well as removable media such as USB flash drives or CD-ROMs.
  • Content: Content is not optimized or manipulated for indexing by a search engine, and there is no spam. Both structured and unstructured data are therefore suitable for use.
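The rights management mentioned under Security is often implemented by storing access information alongside each index entry and filtering hits at query time. The following is a minimal sketch under that assumption; the index layout, the group names, and the function secure_search are hypothetical.

```python
# Hypothetical index entries: each document lists the groups allowed to read it.
INDEX = [
    {"path": "/hr/salaries.xlsx", "allowed": {"hr"}},
    {"path": "/public/handbook.pdf", "allowed": {"hr", "engineering", "sales"}},
    {"path": "/eng/design.md", "allowed": {"engineering"}},
]


def secure_search(keyword, user_groups):
    """Return matching paths, but only those the user is allowed to see."""
    groups = set(user_groups)
    return [
        doc["path"]
        for doc in INDEX
        # A hit is returned only if the user shares a group with the document.
        if keyword in doc["path"] and groups & doc["allowed"]
    ]


print(secure_search("salaries", {"sales"}))  # a sales user finds nothing here
print(secure_search("salaries", {"hr"}))     # an HR user sees the file
```

Filtering inside the search function, rather than in the front end, means that a document the user may not access never leaves the back end, not even as a snippet.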

Comparison of enterprise search engine and database

In contrast to databases, whose purpose is to manage structured content, search engines are mainly used to make unstructured content accessible. Another major difference lies in the number of sources to be searched: enterprise search can search several different sources, whereas database queries are usually limited to one. The query language of search architectures is much simpler: only keywords need to be entered, and no database query language such as SQL is required. In addition, search engines are often many times faster: a query usually takes no more than about one second, whereas complex database queries can take several hours.

Current Situation

In the most recent update of its study The Diverse and Exploding Digital Universe, market researcher IDC forecasts a veritable explosion in the volume and variety of digital information. According to the study, the amount of digital information is currently growing by 60 percent annually. By 2011 it was expected to reach 1,800 exabytes (one exabyte equals 10 to the power of 18 bytes), which would correspond to a tenfold increase over 2006.

According to IDC, individuals are responsible for 70 percent of this data growth. Nevertheless, the IT departments of organizations and companies are involved in the storage, provision, transmission, and protection of about 85 percent of the resulting data. This fast-growing and multi-faceted flood of data confronts IT managers with an unprecedented level of complexity. In desperation, many companies try to keep the sprawl under control with unified, centralized systems for data management and storage. According to Juergen Lange, however, DMS solutions very quickly reach their limits. As a consequence, it becomes more difficult for employees to obtain the information they need.

This makes searching for and finding information an increasingly vital factor for companies. Compliance with security-related regulations plays a crucial role here. While this should be a matter of course for enterprise search solutions, the majority of freely available search engine software has gaps in this respect: after installation, such programs create a complete table of contents in a database on the computer, in which they store data content and usage behavior. Officially, these search engines then transmit reports to the outside.

Providers of such solutions assure that they transmit no personal data but only movement and behavioral data; how data protection is actually handled, however, mostly remains a mystery. After installation, the security mechanisms of many companies are therefore often ineffective. Indexes that capture the first ten thousand words of a document often reproduce its complete content. Such knowledge outside the German or European jurisdiction harbors an incalculable business risk; theft of and trade in information is a lucrative market.

In Germany and Europe there is, compared to the U.S., relatively little know-how and expertise in enterprise search solutions. Only a few German companies and European research projects have mastered this key technology. Here, policymakers are called upon to support the German Mittelstand. In addition, the judiciary must compel foreign suppliers to respect national and European data protection regulations.
