Semantic Web

The Semantic Web is a concept in the further development of the World Wide Web and the Internet. In the course of developing the Internet of Things and ubiquitous computing, it becomes necessary for machines to process the information that people have gathered. All information on the Internet that is expressed in human language should carry an unambiguous description of its meaning (semantics) that computers can understand, or at least process. Machines can make use of the data in the web of data woven by people only if they can assign a clear meaning to it; only then does the data represent information.

The Semantic Web is an instance of a semantic network and, as such, also an extension of the World Wide Web. The goal of the Semantic Web is to make the meaning of information usable by computers, so that information can be compiled automatically for an interested user in response to a query. Information on the Web should be interpretable by machines and automatically processable. With the help of the Semantic Web, information about places, people and things is meant to be related to one another on the basis of its content.

Example: Dresden is located on the Elbe. Paul Schuster was born in 1950 in Dresden.
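
The two example statements can be written down as subject-predicate-object triples, which is how the Semantic Web relates places, people and things to one another. The following is a minimal sketch in plain Python, assuming the river meant in the example is the Elbe; the predicate names (`located_on`, `born_in`, `birth_year`) and the helper functions are hypothetical names chosen for illustration, not a standard vocabulary.

```python
# Each statement is a (subject, predicate, object) triple.
# Predicate names are hypothetical; a real vocabulary would use URIs.
triples = [
    ("Dresden", "located_on", "Elbe"),
    ("Paul Schuster", "born_in", "Dresden"),
    ("Paul Schuster", "birth_year", 1950),
]


def objects(subject, predicate):
    """Return all objects stated for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def born_on_river(person):
    """Follow two links: person -> birthplace -> river."""
    return [
        river
        for place in objects(person, "born_in")
        for river in objects(place, "located_on")
    ]


print(born_on_river("Paul Schuster"))
```

Neither statement mentions a river for Paul Schuster directly; the answer arises only because the two statements share the resource "Dresden".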

Applications

When information is linked in a Semantic Web, relationships that were previously unrecognized can be discovered (see serendipity effect).

  • For consumers: When traveling, weather and traffic reports would be related to information about possible stops and the known preferences of the travelers.
  • In engineering: Related solutions are put in relation to one another in order to identify possible new approaches.
  • In science: Unstructured texts are analyzed for their content and related to one another.
  • In health: Publications are evaluated with regard to side effects or accompanying symptoms.
  • For buyers: Comparable offers are related to one another with respect to all the data requested.

History

The concept is based on a proposal by Tim Berners-Lee, the inventor of the World Wide Web.

Following the term Web 2.0, John Markoff speaks of Web 3.0 when the concepts of the Semantic Web are added to those of Web 2.0.

Basics

While the World Wide Web is a way to link data, the Semantic Web is a way to link information at the level of its meaning. The contents of the World Wide Web can currently be understood and interpreted by humans. Whether a piece of text is a first name, a last name, the name of a city or a company, or an address is not apparent from the structure of a web page. This hinders machine processing of the content, which would be desirable in view of the rapidly growing amount of available information.

The Semantic Web is intended as the solution to these problems. The data in a Semantic Web is structured and prepared in a form that allows computers to process it according to its meaning. In addition, a Semantic Web would allow computers (once the concept has been realized) to derive knowledge from the large body of information in the global data pool and to generate new knowledge. The origins of the Semantic Web lie partly in the research field of artificial intelligence.

In many areas of computer science, one faces the task of representing and communicating knowledge that has been acquired or conceived, such as facts, circumstances or rules in a technical application, in a business process or in legal procedures, or the content of documents or web pages. People can make use of stored knowledge by drawing on their basic and contextual knowledge of the relevant knowledge domain, by using textbooks, rule sets, lexicons and keyword indexes, and by connecting these with the stored content. If, however, machines are to take over search, communication and decision-making tasks with respect to the stored knowledge, or to exchange data that itself contains information about how it is to be structured and interpreted, they need a representation of the underlying concepts and their interrelationships. (See also: Semantic gap)

Knowledge Representation

One way to solve this problem is the concept of knowledge representation. Such concepts are used for the Semantic Web. A knowledge representation describes a field of knowledge, called a knowledge domain, with the help of standardized terms and defined relationships, and possibly also by derivation rules and corresponding source references. The Semantic Web follows a similar approach: a knowledge base without boundaries, but with features for linking all available sources at the semantic level.

Knowledge representation draws on three areas from other scientific fields:

  • Logic provides the formal structure for formulating rules with which a computer system is able to draw inferences.
  • Ontologies define the objects that exist in a particular environment.
  • Computability is the property that makes a knowledge base usable in practice.

Without logic, a knowledge representation is vague, because there are no criteria for checking whether particular statements are superfluous, redundant or even contradictory. Without an ontology, statements are hard to pin down and confusing, because the terms they use have not been specified. And it makes no sense to implement logic and ontology on a computer system without checking computability.

Concepts

In contrast to information retrieval and information extraction (IR/IE), which operate on unstructured data, the Semantic Web presupposes annotations (metadata) for constructing the knowledge representation. The meaning of the presented content is thus written down explicitly using a markup language rather than being interpreted heuristically after the fact, as in computational linguistics. The annotation uses specified vocabularies and ontologies, expressed for example in RDF or OWL.

The individual components are examined below.

Annotation

HTML/XML pages on the Web are annotated, for example, using the knowledge and ontology representation language RDF, or the Web Ontology Language (OWL), which builds on it. What is this meant to achieve?

First, the aim is to provide better means of categorization. Annotation is intended to convey the meaning of WWW links.

Second, annotation should make it possible to draw conclusions. For example, the annotation of a web page states that the page is about "football". The ontology used would then indicate that "football" is a particular kind of "sport". One could therefore conclude that the page deals with the more general topic "sport", although this was not explicitly stored in the metadata.

With a suitable choice of terms in the annotation, a high degree of automation in the processing of web pages could thus be achieved. It would be very desirable if, in the near future, semantic search engines could answer more complex queries directly by making use of semantic networks. The result of the query "How many goals did Diego Maradona score at the 1982 World Cup?" would then contain only the one piece of information that was asked for.
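
The football example can be sketched as a walk up a hierarchy of broader topics. This is an illustrative sketch only: the `broader` table is a hypothetical mini-ontology, and the extra level "leisure activity" is an assumption added to show that the conclusion carries over several steps.

```python
# Hypothetical mini-ontology: each topic maps to its more general topic.
broader = {
    "football": "sport",
    "sport": "leisure activity",  # assumed extra level, for illustration
}


def topics_of(annotated_topic):
    """Return the annotated topic plus every more general topic."""
    topics = [annotated_topic]
    seen = {annotated_topic}
    while topics[-1] in broader:
        parent = broader[topics[-1]]
        if parent in seen:  # guard against cycles in the ontology
            break
        topics.append(parent)
        seen.add(parent)
    return topics


page_annotation = "football"  # the only topic stored in the page's metadata
print(topics_of(page_annotation))
```

The page is annotated with "football" only; "sport" appears in the result because the ontology says so, not because the page's metadata does.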

Ontology

In computer science, the term ontology is used for the representation of complex knowledge relationships. In contrast to a taxonomy, which uses simple hierarchies, an ontology embodies a network of hierarchies in which pieces of information are, or could be, connected by logical relationships. These relationships are based on properties that must be assigned specifically to the information. Elements linked in this way are semantically connected. Ontologies consist of several kinds of components, such as terms (concepts), instances and relations.

Techniques

Quick Overview of relevant W3C Recommendations

  • OWL - Web Ontology Language - description language for classes and relations
  • RDF - Resource Description Framework - description language for information about a Web resource
  • SPARQL - SPARQL Protocol and RDF Query Language - query language for the Resource Description Framework
  • GRDDL - Gleaning Resource Descriptions from Dialects of Languages - "bridge" from XML resource descriptions to RDF
  • RDFa - RDF in Attributes - allows RDF to be embedded in XML documents (including XHTML)
  • RIF - Rule Interchange Format - language for formulating rules over semantic data

Several languages used to construct the knowledge representation in the Semantic Web are explained below.

XML and RDF

The concept of the Semantic Web is often associated only with RDF (Resource Description Framework), although the vision of the Semantic Web does not, of course, rule out other representations. In 2001, Berners-Lee et al. wrote in an article: "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation" (Scientific American, 2001-05).

RDF is a markup language for metadata based on so-called triples or statements, each consisting of a subject, a predicate (or property) and an object; triples can be seen as an extension of key-value pairs. Whereas a key-value pair can only assign a value to a property (e.g. contact = Musterstraße), a triple can relate an object, concept or value to another one in a semantically typed way. An example of such a triple is "Musterstraße contact_of John Doe": here Musterstraße is the subject, contact_of is the predicate and John Doe is the object. Typically, resources (usually web pages) are assigned particular values such as an author or a creation date; the URL of the web page is then the subject, the property "author" is the predicate, and the author's name is the object. Because the properties ideally come from well-known and widely used vocabularies, such as the Dublin Core Element Set (DC), which provides unique URIs for the most important kinds of metadata, computer programs can identify the information attached to a resource as metadata and interpret it accordingly, for example recognizing an author as an author.

The concept of RDF triples is strongly based on the conceptual graphs (CG) of John F. Sowa, which were published in 1976. However, the concept of conceptual graphs proved to be insufficiently formal and too imprecise. Serializing RDF-based descriptions in the best way is not a trivial problem, so on the one hand simpler notations such as N3 keep being invented, and on the other hand wide adoption does not happen overnight. This difficulty goes hand in hand with the lack of an immediate "reward" for the effort of adding metadata. The World Wide Web grew so quickly in particular because HTML is simple and publishing it is rewarded with immediate, worldwide availability on the Web.
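
The web-page case can be illustrated by writing such statements in the line-based N-Triples notation, where each line holds one subject, predicate and object. This is a minimal sketch in plain Python without an RDF library; the page URL and the values are hypothetical, and the Dublin Core element `creator` is assumed here to play the role of the "author" property.

```python
DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core Element Set namespace


def literal(text):
    """Quote a plain string literal, escaping backslashes, quotes and newlines."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def statement(subject_uri, predicate_uri, object_text):
    """Format one triple with a URI subject, URI predicate and literal object."""
    return f"<{subject_uri}> <{predicate_uri}> {literal(object_text)} ."


# Hypothetical page and values, used only for illustration.
page = "http://example.org/article.html"
print(statement(page, DC + "creator", "John Doe"))
print(statement(page, DC + "date", "2001-05-01"))
```

Because the predicate is a shared URI rather than a free-text key, any program that knows the Dublin Core vocabulary can recognize the first statement as naming the page's author.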

RDF Schema (RDF Vocabulary Description Language)

The Resource Description Framework model makes it possible to create individual XML-compliant documents that describe objects by means of statements. A judicious choice of resource names conveys information about the object. However, for a group of similar objects, such as books, which all have the same properties, RDF offers no way of defining a "framework" for all of these objects. For this purpose, the description language RDF Schema (RDFS, officially "RDF Vocabulary Description Language") was defined. It makes it possible to define terms and relate them semantically to one another. For example, RDFS can be used to specify that the property title is used to describe the title of a book. RDF Schema defines, for each property, which values are permitted, what meaning it has, what relations to other properties exist, and which types of resources may use the property. The W3C did not define a universal schema containing the various classes and properties; instead it specified a "schema definition language" with which the actual schemas can be defined. These schemas are also called vocabularies. In recent years, RDF Schema communities have formed that take on the task of designing RDF Schema metadata models, such as Dublin Core. This decentralized approach acknowledges that it is impossible to develop a single schema suitable for every use.
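
The title example can be sketched as a small schema plus one instance statement. Note one assumption about semantics: in RDFS itself, a declared domain is applied as an inference rule (whatever uses the property is concluded to be of that class) rather than as a validation check, and the sketch follows that reading. The class and resource names (`Book`, `Publication`, `book1`) are hypothetical, and the code simplifies by allowing one domain per property and one superclass per class.

```python
# Schema level: vocabulary definitions (hypothetical, not a published vocabulary).
schema = [
    ("title", "rdfs:domain", "Book"),
    ("Book", "rdfs:subClassOf", "Publication"),
]

# Instance level: one statement about a concrete resource.
data = [
    ("book1", "title", "An Example Title"),
]


def infer_types(schema, data):
    """Derive rdf:type statements from rdfs:domain and rdfs:subClassOf."""
    domains = {s: o for s, p, o in schema if p == "rdfs:domain"}
    parents = {s: o for s, p, o in schema if p == "rdfs:subClassOf"}
    inferred = set()
    for subject, prop, _ in data:
        cls = domains.get(prop)
        while cls is not None and (subject, "rdf:type", cls) not in inferred:
            inferred.add((subject, "rdf:type", cls))
            cls = parents.get(cls)  # climb to the superclass, if any
    return inferred


for triple in sorted(infer_types(schema, data)):
    print(triple)
```

The instance data never says that `book1` is a book; that follows from the schema's statement about the `title` property.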

Web Ontology Language

The Semantic Web and RDF/OWL were developed and standardized by the World Wide Web Consortium (W3C). For this reason, these are also the technologies that are most widely adopted.

The Web Ontology Language (OWL) is currently the most popular language for modeling ontologies and thus for building the Semantic Web. OWL is derived from the ontology language DAML+OIL and builds on RDF/RDFS. This means that the official exchange syntax is RDF. In the Semantic Web concept, OWL sits above XML. As with RDFS, OWL is used to formally describe the terms of a domain and their relationships. Compared to RDFS, however, OWL offers considerably more complex means of describing those relationships. In general, the difference between OWL and RDFS is that concepts can be specified more precisely in OWL, which yields a higher level of abstraction. Furthermore, reasoners that process OWL rather than RDFS can draw better logical conclusions, because OWL provides logical constructs that are not possible in RDFS. The Web Ontology Language exists in three different variants.

These are the language levels OWL Lite, OWL DL and OWL Full. For OWL Lite and OWL DL, restrictions have been defined that make it easier to develop tools. In OWL, logical inference is generally based on the so-called open-world assumption (OWA). The open-world assumption means that a reasoner does not assume that something does not exist unless it has been explicitly stated that it does not exist. Put generally: as long as something has not been stated to be true, a reasoner does not assume that it is false; it simply assumes that the knowledge has not yet been added to the knowledge base. In OWL, this can mean that no result set is found for a query. There is also a risk of triggering an infinite, or at least very long-running, computation.
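
The difference the open-world assumption makes can be sketched with a three-valued answer. This is a deliberately simplified illustration, not how an OWL reasoner is implemented: it only looks up explicitly stated facts and explicitly stated negations, and all data in it is hypothetical.

```python
# Explicitly asserted facts and explicitly asserted negations (hypothetical data).
known_true = {("Paul Schuster", "born_in", "Dresden")}
known_false = {("Paul Schuster", "born_in", "Berlin")}


def ask_open_world(triple):
    """Return True, False, or None when the knowledge base does not say."""
    if triple in known_true:
        return True
    if triple in known_false:
        return False
    return None  # unknown: the fact may simply not have been added yet


def ask_closed_world(triple):
    """Closed-world contrast: anything not asserted is treated as false."""
    return triple in known_true


query = ("Paul Schuster", "lives_in", "Dresden")
print(ask_open_world(query))    # the knowledge base gives no answer
print(ask_closed_world(query))  # the missing fact is read as "false"
```

Under the open-world reading the query stays unanswered, which corresponds to the case described above in which no result is found.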

Related standards

A similar approach to knowledge representation is offered, for example, by the ISO standard Topic Maps (TM). Unlike RDF, Topic Maps are not tied to a particular serialization format. A major difference between RDF and Topic Maps lies in the semantics of associations. While associations in RDF are always directed, in the Topic Maps standard they are undirected and role-based.

Projects related to the Semantic Web

Semantic Web techniques are slowly and partially beginning to gain acceptance. Examples of applications are:

Tools

  • Apache Stanbol: Set of modules to extend existing CMS with semantic methods
  • HERBIE - Hybrid Education and Research Base for Information Exchange
  • SEKT - Semantically Enabled Knowledge Technologies