Where is the Semantic Web going?

There are various markup formats and technologies in the area of linked data. How do they differ from one another? In your opinion, will one of them prevail?

Yes, there are various formats such as RDF, RDFa or, more recently, Schema.org. They differ mainly in their expressive power. On the one hand we have things like microformats, which are very easy to use but not very powerful; on the other hand we have RDF and RDFa, which are very powerful and extensible but also harder to use. They all have their place, and I think the trend will be that microformats, for example, can be converted into RDF, so that search engines understand both microformats and RDFa.


Does that mean that these formats can then be converted into one another?

Exactly, at least in one direction. These techniques already exist today and are already used, for example, by Sindice, a semantic search engine. It is not yet ready for mainstream use, but it shows that the approach already works.
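A minimal sketch of such a one-way conversion, under stated assumptions: the input is an already-parsed hCard microformat given as a plain dictionary, the function names `hcard_to_triples` and `to_ntriples` are hypothetical, and the choice of vCard-in-RDF properties is illustrative rather than a description of how Sindice or any other tool works.

```python
# Illustrative mapping from hCard class names to vCard-in-RDF property URLs.
# Which properties to map is an assumption made for this sketch.
VCARD = "http://www.w3.org/2006/vcard/ns#"
HCARD_TO_RDF = {
    "fn": VCARD + "fn",
    "email": VCARD + "email",
    "url": VCARD + "url",
}


def hcard_to_triples(subject, hcard):
    """Turn parsed hCard properties into (subject, predicate, object) triples.

    Properties that are not listed in HCARD_TO_RDF are skipped.
    """
    return [
        (subject, HCARD_TO_RDF[name], value)
        for name, value in hcard.items()
        if name in HCARD_TO_RDF
    ]


def to_ntriples(triples):
    """Serialize triples as N-Triples lines with plain string literals.

    Only backslashes and double quotes are escaped here; a complete
    serializer would also have to escape line breaks and other characters.
    """
    lines = []
    for s, p, o in triples:
        escaped = o.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)


# Made-up example data; "nickname" is deliberately not mapped in this sketch.
card = {"fn": "Jane Doe", "email": "jane@example.org", "nickname": "jd"}
triples = hcard_to_triples("http://example.org/people/jane#me", card)
print(to_ntriples(triples))
```

The mapping table is the only place that knows about both worlds, so a consumer that understands RDF can read the converted data without knowing anything about microformats.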


So one format will not displace the others; rather, one or the other will be used depending on the application, complexity and know-how ...

Yes, although it has to be said very clearly: if you have the know-how, you should use RDF and RDFa. They are the most flexible languages, give you the greatest freedom, and ultimately let you model any kind of information.


While we are on the subject of buzzwords, there is another one that comes up a lot here: Web 3.0. How do you see the connection? Is Web 3.0 the same thing as the Semantic Web?

Web 3.0 is a very broad term, and it is difficult to pin down what it covers. Of course you can say: Web 1.0 means static content, Web 2.0 means user-generated content and Web 3.0 means machine-understandable content. But as a technician, I find it rather difficult to give such buzzwords any concrete meaning.


Would you see the Semantic Web as a real paradigm shift with which the Web is entering a categorically new dimension, or is it just an extension?

Let's put it this way: it is a generalization. The great thing about the web is that you can refer to any resource without knowing where it is, just by giving a URL. Now it becomes possible to apply this idea of linking, the core idea of the web, not only to documents but at a much more granular level, the level of individual pieces of information. This means that you can make statements about people, events, locations and so on, and have clearly defined URLs for them. In this respect it is a quantum leap, because the web moves from a document-based approach to the real world: all things in the real world can, so to speak, be assigned URLs, and information about these things can be stored.


How do data collection and standardization work? A URL is something unique. If, for example, I want to unambiguously name the city of Paris in France, I need a URL. Is there one data platform for this, or several?

There are several such sources of information, for example DBpedia, which is a machine-readable export of Wikipedia. There is also, for example, geonames.org, an independent database that provides geographical information. In the Semantic Web you can state that certain URLs refer to the same thing, so you can state that DBpedia's Paris is the same as the Paris in geonames.org. This is already being done on a large scale: DBpedia and geonames, for example, directly contain the information that certain things are the same.

This is visualized well by the Open Data Graph (see Figure 1, richard.cyganiak.de/2007/10/lod/lod-datasets_2011-09-19_colored.html). It shows exactly which data sources exist and which links exist between them. The Semantic Web thrives on as many cross-links as possible between as many data sources as possible. You can see it in the graphic: DBpedia is in the middle, geonames.org is a little to the left, and you can see the cross-links between them. When you publish new data, the goal is to link it with as much other data as possible, in as meaningful a way as possible, so that you don't just create your own isolated world.
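The idea of declaring two URLs to be the same can be sketched in a few lines. This is only an illustration under assumptions: the two Paris URLs and the property URLs are meant as plausible examples and should be checked against the live datasets, the three statements are made up for the sketch, and the helper names `same_as_groups` and `describe` are hypothetical.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Illustrative identifiers; verify them against DBpedia and geonames.org.
DBPEDIA_PARIS = "http://dbpedia.org/resource/Paris"
GEONAMES_PARIS = "http://sws.geonames.org/2988507/"

# Made-up statements standing in for data from the two sources.
triples = [
    (DBPEDIA_PARIS, OWL_SAME_AS, GEONAMES_PARIS),
    (DBPEDIA_PARIS, RDFS_LABEL, "Paris"),
    (GEONAMES_PARIS, "http://www.geonames.org/ontology#countryCode", "FR"),
]


def same_as_groups(triples):
    """Return a lookup function that maps a URL to its sameAs group.

    A small union-find structure merges every pair of URLs that is
    connected by an owl:sameAs statement.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == OWL_SAME_AS:
            parent[find(s)] = find(o)
    return find


def describe(url, triples):
    """Collect every statement about `url` or about a URL declared equal to it."""
    find = same_as_groups(triples)
    root = find(url)
    return {
        (p, o)
        for s, p, o in triples
        if p != OWL_SAME_AS and find(s) == root
    }


facts = describe(GEONAMES_PARIS, triples)
for predicate, value in sorted(facts):
    print(predicate, value)
```

Asking about either URL returns the statements from both sources, which is the practical benefit of the cross-links between datasets.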


Are there any areas of application for the Semantic Web for normal consumers nowadays?

I would say that the ordinary consumer will not notice much of the Semantic Web at first. They will only notice it when major search engines like Google decide to use linked data on a large scale, so that search results become significantly better in some cases. Then I can, for example, ask “Give me all the events tonight in Dresden” and immediately receive a table with locations, entrance fees and so on, which means I don't have to hunt for this information myself.
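What such a question could look like once the data is machine-readable can be sketched as follows. Everything here is assumed for illustration: the vocabulary under `http://example.org/vocab#` is hypothetical, the event data is made up, and the function `events_in` is not part of any real search engine.

```python
from datetime import date

EX = "http://example.org/vocab#"  # hypothetical vocabulary, for illustration only

# Made-up event descriptions as (subject, predicate, object) statements.
triples = [
    ("http://example.org/events/1", EX + "name", "Jazz Night"),
    ("http://example.org/events/1", EX + "city", "Dresden"),
    ("http://example.org/events/1", EX + "date", "2011-10-01"),
    ("http://example.org/events/1", EX + "price", "12.00"),
    ("http://example.org/events/2", EX + "name", "Book Fair"),
    ("http://example.org/events/2", EX + "city", "Leipzig"),
    ("http://example.org/events/2", EX + "date", "2011-10-01"),
    ("http://example.org/events/3", EX + "name", "Organ Recital"),
    ("http://example.org/events/3", EX + "city", "Dresden"),
    ("http://example.org/events/3", EX + "date", "2011-10-02"),
]


def events_in(city, day, triples):
    """Return sorted (name, price) rows for events in `city` on `day`."""
    # Group the statements by subject so each event becomes one record.
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, {})[p] = o

    rows = []
    for props in by_subject.values():
        if props.get(EX + "city") == city and props.get(EX + "date") == day.isoformat():
            rows.append((props.get(EX + "name"), props.get(EX + "price", "n/a")))
    return sorted(rows)


for name, price in events_in("Dresden", date(2011, 10, 1), triples):
    print(f"{name}\t{price}")
```

In practice a query language such as SPARQL would play the role of `events_in`; the point of the sketch is only that the answer becomes a table rather than a list of pages to read.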

For companies, however, the Semantic Web can already be very valuable when it is used internally, because companies always have the problem that they run many software platforms that have to "talk" to each other. Here the Semantic Web offers a universal language for describing information that can also be used for such purposes.


Are there any commercial uses yet?

GoodRelations is one example, a vocabulary specifically for e-commerce. Many sites, such as Best Buy or O'Reilly, already make extensive use of it. They hope that others will in turn link to their data. If I use GoodRelations, someone else, for example Best Buy, can use my data directly and easily, and vice versa. That can benefit both sides. So far this has been done through proprietary APIs, for example at Amazon. The problem is that the data consumer then has to master many different proprietary APIs. If more and more websites use this universal language, that problem disappears and you can use the data directly. Ontologies like GoodRelations are ultimately about the language of the web, and we hope that a few of them will prevail. For people, for example, Friend of a Friend (FOAF) has become established; it is a language in which you can say “My name is Sebastian, I am so and so old, my e-mail is X and so on”. If such a language becomes established in an area of application, so that all data producers mark up their data in the same way, very large synergy effects can arise.


If I want to use semantic data myself, for example in my blog or on my website - how should I proceed?

A major problem at present is still that it is very difficult for information providers to comply with all the different standards. For example, I have developed functionality for FLOW3 [footnote on FLOW3] that makes it easy to expose data from a FLOW3 system as RDF. You only have to define once how your own person data corresponds to the Friend of a Friend Person type; the framework then takes care of processing the data correctly.
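The "define the mapping once" idea can be illustrated independently of any framework. The following is a sketch in Python, not the FLOW3 functionality itself: the field names `full_name`, `age` and `email`, the mapping table and the function `person_to_ntriples` are all hypothetical, and the example record is made up. Only the FOAF and RDF URLs are taken from the published vocabularies.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Defined once: how the local (hypothetical) field names map to FOAF properties.
PERSON_MAPPING = {
    "full_name": FOAF + "name",
    "age": FOAF + "age",
    "email": FOAF + "mbox",
}


def person_to_ntriples(url, record, mapping=PERSON_MAPPING):
    """Describe one local person record as FOAF statements in N-Triples."""
    lines = [f"<{url}> <{RDF_TYPE}> <{FOAF}Person> ."]
    for field, predicate in mapping.items():
        if field not in record:
            continue
        value = record[field]
        if predicate == FOAF + "mbox":
            # foaf:mbox points to a mailto: URL rather than a string literal.
            obj = f"<mailto:{value}>"
        else:
            # Minimal escaping only: backslashes and double quotes.
            text = str(value).replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{text}"'
        lines.append(f"<{url}> <{predicate}> {obj} .")
    return lines


# Made-up record for illustration.
record = {"full_name": "Jane Doe", "age": 30, "email": "jane@example.org"}
lines = person_to_ntriples("http://example.org/people/jane#me", record)
print("\n".join(lines))
```

Once the mapping exists, every record of that type can be published in the shared vocabulary without further work from the data provider.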


How, and over what period of time, will the Semantic Web develop further? As you said, as an ordinary consumer I don't notice much of it. Do you see any significant changes in the near future?

Well, that is generally difficult to assess. You can definitely see that the Linked Data Cloud is growing exponentially from year to year. We now have around 250 nodes in the Open Data Graph, which is of course not much compared to the Internet as a whole, but there are a few pioneers, such as the New York Times or DBpedia, who are using it more and more. So I do think that, if the exponential growth continues like this, the Semantic Web will achieve its breakthrough. What is still missing is the “killer application”. For example, if Google were to say "We rank information that is marked up with RDF higher in our search index," then every webmaster would be interested in the Semantic Web.

Otherwise, I believe that integration is more likely to happen through individual companies or communities. For example, in the TYPO3 project we will use a Semantic Web platform as a data integration platform between our various systems. That way we could rebuild the new extension repository and aggregate data from many different sources, such as reviews, ratings, metadata about the user who uploaded the extension, documentation and so on. This data is then available on a uniform platform even though it comes from many different sources. The same platform would then also be used for FLOW3 packages, for ViewHelpers and all other kinds of artifacts. This unified platform for the TYPO3 project would be my personal outlook.