Format to use for exposing structured meta data (d

Posted 2019-07-27 06:43

In an altruistic spirit, I would like to expose as much structured data about my website as possible. I also wouldn't mind an SEO boost, but it's secondary.

Seems there are a couple of options:

  • Full on RDF (kill me now XML)
  • Atom with your own custom tags (liking that)
  • RDFa in your webpage (might help SEO)
  • Dublin Core Meta tags
  • Dublin Core using RDFa
  • Atom with RDFa

I'm just trying to make it easy for people to get data off my site.

The nice thing about standards is that there are so many of them to choose from.

Which one do you think I should use?

2 answers
兄弟一词,经得起流年.
#2 · 2019-07-27 06:57

The Dublin Core Schema is a small set of vocabulary terms that can be used to describe web resources (video, images, web pages, etc.). Example of Dublin Core code:

 <meta name="DC.Format" content="video/mpeg; 10 minutes">
 <meta name="DC.Language" content="en">
 <meta name="DC.Publisher" content="publisher-name">
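
If you would rather generate such tags from your own data than write them by hand, a minimal sketch could look like the one below. The function name `dc_meta_tags` and the input dictionary are assumptions made for illustration; they are not part of any Dublin Core tooling.

```python
from html import escape

def dc_meta_tags(fields):
    """Render {'Language': 'en'} style pairs as Dublin Core <meta> tags.

    Hypothetical helper: both names and values are HTML-escaped so that
    quotes or angle brackets in the data cannot break the markup.
    """
    return "\n".join(
        f'<meta name="DC.{escape(name, quote=True)}" '
        f'content="{escape(value, quote=True)}">'
        for name, value in fields.items()
    )

print(dc_meta_tags({
    "Format": "video/mpeg; 10 minutes",
    "Language": "en",
    "Publisher": "publisher-name",
}))
```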

Link to generate DC meta tags: http://www.dublincoregenerator.com/generator_nq.html

Using DC in meta tags for SEO purposes is obsolete.

"It was found that using Dublin Core elements did not improve the retrieval rank of the web pages" and that "Dublin Core metadata, as a well-known metadata schema, is not widely accepted and used by search engine designers and the spiders do not consider its elements while ranking the web pages."

Google does NOT use it in its indexing, and neither Google's nor other search engines' sites mention Dublin Core when they describe indexing.

In the UK, government organisations use DC to provide standardised access to tags.

That's not to say Google, Bing, Yahoo, etc will never implement them. Google is using more metadata and rich snippets these days.

冷血范
#3 · 2019-07-27 07:01

RDF is not just XML; RDF is a data model that relies on sets of triples (subject, predicate, object) and URIs to refer to things unambiguously. In fact, people working with RDF tend to run away from RDF/XML and prefer Turtle or N-Triples, or even RDF in JSON format. These serializations are more readable, easier to construct and easier to parse. Moreover, there are many tools that let you convert between the whole range of RDF flavors (e.g. rapper or Jena).
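
To make the triple model concrete, here is a minimal sketch in plain Python that holds a few triples and prints them as N-Triples. The example.org URIs and the literal value are made up for illustration, and the serializer is deliberately simplified (no datatypes, language tags or escaping); in practice you would use a tool such as rapper or Jena.

```python
# Minimal sketch: RDF as a set of (subject, predicate, object) triples.
# All example.org URIs and values are illustrative assumptions.

REVIEW = "http://example.org/review/1"            # hypothetical review URI
DC_TITLE = "http://purl.org/dc/terms/title"       # Dublin Core "title" term
DC_CREATOR = "http://purl.org/dc/terms/creator"   # Dublin Core "creator" term

triples = {
    (REVIEW, DC_TITLE, '"A hot sauce review"'),          # literal object
    (REVIEW, DC_CREATOR, "http://example.org/user/42"),  # URI object
}

def to_ntriples(triples):
    """Serialize triples as N-Triples: one '<s> <p> o .' statement per line."""
    lines = []
    for s, p, o in sorted(triples):
        # Objects starting with a double quote are treated as literals;
        # everything else is treated as a URI and wrapped in <...>.
        obj = o if o.startswith('"') else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)

print(to_ntriples(triples))
```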

When it comes to publishing information in RDF, you generally have three different choices:

  1. To provide RDF dumps of your data.
  2. To publish RDF following the Linked Data rules.
  3. To add metadata to your existing Web pages with RDFa.

... these are not mutually exclusive. You can go for any combination of them; the most important thing is choosing the right URI structure (see Cool URIs don't change).

From your SO profile I see that you're working on a social taste recommendation website (http://evocatus.com/). I assume that you might want to expose information about its reviews. So for a review like http://evocatus.com/sauce/cholula-chipolte-hot-sauce/272645/ you can provide different serializations and give back not just HTML but also:

  • .../cholula-chipolte-hot-sauce/272645/rdf-turtle
  • .../cholula-chipolte-hot-sauce/272645/rdf-xml
  • .../cholula-chipolte-hot-sauce/272645/rdf-json
  • and one for any other type of format you want to expose.

In addition, the HTML version could be enhanced with RDFa. Following content negotiation rules, you redirect each HTTP request to whichever format the client accepts, which the client states in the HTTP Accept header. So a request like the curl one below would be redirected by your application to the RDF/XML version:

curl -H 'Accept: application/rdf+xml' .../cholula-chipolte-hot-sauce/272645/
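
On the server side, the negotiation step could be sketched roughly as follows. This is a simplified illustration, not a complete implementation: it ignores wildcards such as */*, the function names are hypothetical, and the media type chosen for the rdf-json variant is an assumption.

```python
# Sketch of Accept-header content negotiation (simplified: no wildcards).
# Maps each supported media type to the URL suffix of its serialization.
SUPPORTED = {
    "text/html": "",                        # the plain HTML page
    "text/turtle": "rdf-turtle",
    "application/rdf+xml": "rdf-xml",
    "application/rdf+json": "rdf-json",     # assumed media type for this variant
}

def parse_accept(header):
    """Return acceptable media types, ordered by descending q-value."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type, q = fields[0].lower(), 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0
        if media_type and q > 0:            # q=0 means "not acceptable"
            entries.append((q, media_type))
    # sorted() is stable, so equal q-values keep the client's order.
    return [m for _, m in sorted(entries, key=lambda e: -e[0])]

def negotiate(header, default="text/html"):
    """Pick the first supported media type the client accepts."""
    for media_type in parse_accept(header):
        if media_type in SUPPORTED:
            return media_type
    return default

def redirect_target(base_url, header):
    """Build the URL that the request would be redirected to."""
    return base_url + SUPPORTED[negotiate(header)]

print(redirect_target("http://example.org/item/1/", "application/rdf+xml"))
```

Falling back to HTML when nothing matches keeps ordinary browsers working; a stricter server could answer 406 Not Acceptable instead.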

In the future, people would be able to say things about existing reviews on your site just by reusing your URIs in their own RDF data. That's the power of RDF and Linked Data.

As for Dublin Core, you could use it with either RDF or RDFa. But in your case there are other interesting ontologies to consider, and the right thing would be to use a mix of them:

  • FOAF: Friend Of A Friend, to express user personal information and relations between users.
  • Tag Ontology: A very simple ontology to express tag information.
  • RDF Review Vocabulary: Vocabulary for expressing reviews and ratings using RDF.
  • GoodRelations: An ontology to express product information and eCommerce.
  • Vcard/RDF: for addresses, normally used in combination with FOAF.
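
As a rough illustration of mixing vocabularies, the sketch below builds a small Turtle document that describes one review with the RDF Review Vocabulary, Dublin Core and FOAF. The example.org URIs, the names and the rating are invented, and the helper functions are hypothetical; check each vocabulary's documentation before relying on specific terms.

```python
# Sketch: describe one review by mixing vocabularies in a Turtle document.
# The example.org URIs, names and rating are made up for illustration.

PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rev": "http://purl.org/stuff/rev#",
    "dcterms": "http://purl.org/dc/terms/",
}

def quote(text):
    """Escape a Python string as a Turtle string literal."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n"))
    return '"' + escaped + '"'

def review_as_turtle(review_uri, reviewer_uri, reviewer_name, title, rating):
    """Return a small Turtle document for one review and its author."""
    header = "\n".join(f"@prefix {p}: <{ns}> ." for p, ns in PREFIXES.items())
    body = (
        f"<{review_uri}> a rev:Review ;\n"
        f"    dcterms:title {quote(title)} ;\n"
        f"    rev:rating {int(rating)} ;\n"
        f"    rev:reviewer <{reviewer_uri}> .\n"
        f"\n"
        f"<{reviewer_uri}> a foaf:Person ;\n"
        f"    foaf:name {quote(reviewer_name)} .\n"
    )
    return header + "\n\n" + body

print(review_as_turtle(
    "http://example.org/review/1",
    "http://example.org/user/42",
    "Alice",
    "Great hot sauce",
    4,
))
```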

There is one site called http://revyu.com/ that uses all these ontologies (except GoodRelations), so you could use it as a guideline. See for instance:

... these are HTML and RDF versions of the same review.

Unlike with Atom, as you can see, with RDF you would be able to reuse existing ontologies, and since RDF is based on URIs, everything would be interlinked.

Linked Data Added Value

What would happen if you invested some time in linking your products and reviews to other data sources (e.g. dbpedia.org or freebase.com)? Let's imagine that you start linking all your beer reviews (http://evocatus.com/beer/) to DBpedia entries, such as the brewery that manufactures each product (http://dbpedia.org/page/Alcoholic_beverage). By following the links you would be able to find out, for instance, where the preferred beers are manufactured. DBpedia holds that information.

Freebase, which also provides RDF versions, lets you link to manufacturers in the same way. For instance, see http://rdf.freebase.com/rdf/en.budweiser in RDF or http://www.freebase.com/view/en/budweiser in HTML.
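
In data terms, such a link is just one more triple whose object is a URI from the other data set. The sketch below emits one as an N-Triples statement; the choice of rdfs:seeAlso as the linking predicate and the helper name are assumptions for illustration. Note that DBpedia identifies things with /resource/ URIs, while /page/ is the HTML view.

```python
# Sketch: link a local resource to an external data set with one triple.
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"

def link_statement(local_uri, external_uri, predicate=RDFS_SEE_ALSO):
    """Return one N-Triples statement linking local_uri to external_uri."""
    return f"<{local_uri}> <{predicate}> <{external_uri}> ."

print(link_statement(
    "http://evocatus.com/beer/",
    "http://dbpedia.org/resource/Alcoholic_beverage",
))
```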
