I don’t use Semantic Web technologies anymore, though they still influence me (lespetitescases.net)
I could write a book on what's wrong with the semantic web. One of the worst problems isn't even technical, it's the community. There are some great people in the community, but there are also a large number of extremely toxic people who drive others away. If the technology ever takes off, it's going to be because some outside community cherry-picks the good parts and tells those people to f-off. That's already starting to happen, and you'll hear no end of bitching from people in the semantic web community about how outsiders are reinventing what they already did years ago. Guess what? You're right. You're so toxic that it's worth redoing everything if it means not having to deal with those attitudes.
> a large number of extremely toxic people that drive people away
It's funny you say that, because as soon as I saw "semantic web" I felt a negative emotional nostalgia. I can hardly remember all of the RDF/RSS/Atom stuff from way, way back, or what the trigger for that was, but I just remember there being rancor swirling around the whole thing. I think there were some petty arguments about who deserved credit for the creation of the formats or something? Wasn't it between a bunch of bloggers? Then XHTML became a battleground, since some groups were trying to keep semantic tags out of it while other people wanted them in. I remember just feeling exhausted every time the subject came up, since it was like the emacs vs. vim or spaces vs. tabs wars.
The funny thing is, I believe in the promise of the semantic web. I recall Tim Berners-Lee declaring the next frontier was not open source but open data and I agree. He even co-founded an institute around it: https://theodi.org/person/sir-tim-berners-lee/
> I can hardly remember all of the RDF/RSS/Atom stuff from way …
You're mixing in some stuff that isn't really Semantic Web related.
RSS vs. Atom was less about the Semantic Web than a squabble between different XML formats, one very loosely specified, the other more ... well-formed. The Semantic Web did have a small foot in the RSS wars: the very first RSS (RSS 0.9 from Netscape) was RDF-based, and for a short time RSS 1.0 wanted to rebuild RSS on an RDF basis for the extensibility of the Semantic Web. But the later discussions were about the XML variants of RSS and then Atom: whether the spec was adequate, whether it was frozen, how and whether it should be fixed, etc.
The XHTML discussions were less about elements, in my recollection, than about parsing models. XHTML reformulated HTML as XML, which meant an error model with no error correction, just failure on the first error. And XHTML 2 tried to evolve structural elements by not being backward compatible, defining a somewhat different new dialect. The backlash against XHTML was against that; a group sponsored by the browser makers then formed which wanted to evolve things backwards-compatibly and to standardize the parsing of tag soup → HTML5.
("Semantic elements" was often shorthand for "instead of a dumb div, use the appropriate HTML element". That was more the quest of the Web Standards Project than the Semantic Web.)
(Slight overlap: how to embed Semantic Web statements has a small relationship with XHTML - RDFa started, imho, as an XHTML 2 module.)
I somewhat miss that time. All these bloggers with an interest in web standards and how to do them best had their own idealism and the cross blog and W3C discussions were always interesting. Today web standards don't have that publicity and idealism anymore, they seem more like an engineering collaboration of the 2½ big browser makers which get to decide among themselves. Maybe it was always so, but it seemed different at that time.
Our recollections of history are similar; however, I also recall there being discussion about preventing semantic tags from being included in XHTML. A certain segment of the population believed the metadata belonged not in the document itself but rather in a corollary document in RDF or whatever (an argument of data normalization vs. denormalization).
Atom/RSS was involved in the debate because they were also trying to solve the metadata issue. Things like "author", "publish date", etc. are just as relevant to aggregation/syndication formats as they are to the document itself. Again, I'm summoning my fallible memory here, but one argument was that if the metadata is relevant to both documents then it ought to be stored separately and linked to the HTML/RSS docs using a URL.
XHTML was involved because as an XML format it was conceivable to store your metadata separately AND to use XSLT to transform it into your XHTML/RSS/Atom document on demand. So RDF, Atom, RSS and XHTML authors all wanted a say on a metadata format that would suit all of those use cases. That is a tall order.
My personal feeling about the death of XHTML was it wasn't one big thing that killed it. It was hundreds of smaller disputes like this one.
It's interesting how perceptions vary. I've been working with SemWeb stuff for a decade or so, and I have never experienced what you describe here:
One of the worst isn't even technical, it's the community. There are some great people in the community but there are also a large number of extremely toxic people that drive people away.
Maybe it's just the subset of the community that I choose to deal with, but the folks on the Jena mailing lists (pre and post Apache) have always been very gracious and helpful in my experience. And Ralph Hodgson, one of the co-founders of Top Quadrant came to a Triangle Java User's Group talk that I once gave on Semantic Web technologies, along with a bunch of other Top Quadrant people... and despite the fact that my company competes with them in certain areas, they were perfectly cordial and pleasant to interact with. Likewise for the other times that I've had Top Quadrant folks show up at events where I was speaking.
Maybe it's just dumb luck on my part, or whatever, but I have found no major issues with toxic people in the SemWeb community. shrug
Yes, there are some wonderful people in there. Andy Seaborne has always impressed me with his thoughtful responses. I won't call out any of the bad apples. Usually a question goes something like, "Uh, I'm new to the semantic web and I'd like to do X" and the response is, "this is how it works, you're a dummy and you need to understand how brilliant the semantic web is and you don't need to be doing what you're asking for" or academics who will complain that they're not getting enough credit for providing their brilliant intellectual scaffolding.
Databases that are run on a shoestring aren't stable, so we're going to make everything federated with linked fragments? Fine, give it a go, but you don't need to go on and on about how databases are inadequate because someone isn't willing to foot the AWS bill so they can host DBpedia for ya.
Let's have a go at JSON-LD. RDF/XML is finally recognized as a mistake. A somewhat reasonable mistake, because everyone was XML-crazy at the time. So what do we do? The exact same thing, except this time it's JSON. But it's even worse. We choose a serialization that is prized for its simplicity and we foist the entire RDF stack onto it? Then they claim that JSON-LD isn't about the semantic web, so we're good, and Jedi mind trick it with, "This isn't the RDF you're looking for".
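(To make the rant concrete: here is a minimal, hypothetical sketch, written as Python dicts, of the same record as plain JSON and as JSON-LD. The names and IRIs are made up.)

    # Plain JSON: two keys, no ceremony.
    plain = {
        "name": "Alice",
        "homepage": "https://example.org/alice",
    }

    # JSON-LD: @context, @id and @type map every key and value back
    # to IRIs, i.e. back to RDF triples. The RDF stack rides along.
    json_ld = {
        "@context": {
            "name": "http://schema.org/name",
            "homepage": {"@id": "http://schema.org/url", "@type": "@id"},
        },
        "@id": "https://example.org/alice",
        "@type": "http://schema.org/Person",
        "name": "Alice",
        "homepage": "https://example.org/alice",
    }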
Because we aren't done overcomplicating simple things, we take aim at CSV with CSVW. Granted, CSV has some subtle complexities, but it's easy and reasonably compact. So now we're going to add metadata to CSV files with RDF and then serialize it into JSON as JSON-LD. Great. How do I find this metadata? Either a well-known location or in a Link header. Whoops, I can't publish my own metadata that references your CSV file. Let's convert your CSV file to RDF instead. WTF, my 500MB CSV file just became 1.5B triples and it's taking 8 hrs. to load into my triple store!
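(A rough sketch of why the naive conversion blows up: the minimal CSV-to-RDF mapping emits roughly one triple per cell plus per-row bookkeeping, so the triple count is about rows × columns. A hypothetical, simplified Python version:)

    import csv

    def csv_to_triples(path, base="http://example.org/"):
        """Naive CSV-to-RDF: one triple per cell, one per row.

        A 500MB file with short cells has hundreds of millions of
        cells, hence triples in the billions once expanded.
        """
        with open(path, newline="") as f:
            reader = csv.reader(f)
            header = next(reader)
            for i, row in enumerate(reader):
                row_iri = f"{base}row/{i}"
                yield (row_iri, "rdf:type", f"{base}Row")  # row bookkeeping
                for col, value in zip(header, row):
                    yield (row_iri, f"{base}col/{col}", value)  # one per cell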
Don't get me started on people who call themselves ontologists. They're really zombies, but instead of eating brains they eat budgets. They should be dispatched the same way, with a shotgun blast to the face. They generally can't justify their decisions even though there is a framework for doing exactly that, OntoClean. I have yet to meet one who even knew what that was. They just convince management that what they're doing is intellectually unattainable by mere developers, although they'd be lost without Protégé, TopBraid, or Excel, and what they produce is generally an incomputable pile of garbage. It's always OWL Full. "Class or property? Class or property? Well, is it an "is a" relationship?"
I'm done writing, so I'll just include a list of the half-baked ideas that sound good but are a day late and a dollar short: LDP, R2RML, ShEx, SHACL, DCAT, RDF Data Cube, WebID...
My wife always says to say something nice so I'm going to say SKOS. SKOS is ok.
> RDF/XML is finally recognized as a mistake.
Finally? From what I've seen, most everybody in the SemWeb community moved on to N3 or Turtle over a decade ago, with a little bit of interest in the aforementioned JSON-LD.
I'm a fan of SKOS myself.
I may remain more of a fan of SemWeb tech because I stay away from the edges. Of LDP, R2RML, ShEx, SHACL, DCAT, RDF Data Cube, WebID..., I use none. Add GRDDL to the list of things I don't need as well.
Community aside, I'd love to read an article on what's good and bad in the current semantic web. Maybe it would have to be written anonymously.
Or maybe contact O'Reilly and write an intro book "Semantic Web: Just the Good Parts" for their series.
This is precisely why I left the rdf* mailing list shortly after it was created this year. The list maintainer invited subscribers to introduce themselves and as soon as a particular individual posted, I no longer wanted to be a member of the list.
I'm curious what those toxic attitudes are. Surely the "we already invented it and you're reinventing it" can't be the only case. I'm also curious if it's in an academia or in industry.
I shared the experience described by the grandparent. In particular, I remember having an argument on HN with a few people, and the sheer amount of bad faith and technical inaccuracy thrown at me was jaw-dropping. At this point I consider SW more a cult than a technology.
On the research side there are two kinds of research papers: the one that proposes an ontology for a domain, and the one that describes the conversion of an existing resource to RDF. I've never seen a paper where SW was used for something new and interesting and that would have been impossible without SW.
That being said, there are also both technical and conceptual pain points plaguing RDF. Basically, the tech is trying to address too many things: both metadata and data, and every kind of data. The "IRIs that can be URLs that can sometimes be dereferenced and sometimes not, but it's better if they are, and then it's Linked Data" kind of thing makes it hard to assume (and thus build) anything.
So, RDF has been a success in a few domains (biology), but in most cases it doesn't offer a real competitive advantage over simpler and more expressive technologies such as graph databases.
PS: @zcw100, if you were to really write a book about the semantic web, drop me a line please.
My take on the attitude in academia: Here we describe a set of algorithms that can solve a class of problems that previous algorithms can't. In the '60s someone published a solution to a problem, which we have improved upon with the novel innovation called "hyperlinks". The technical, social and economic shortcomings of our solution are invalid because it is decentralised and therefore morally superior to the current offerings, used the world over, of industry practitioners who are only doing it for the money. More funding is needed for further research.
In general, decentralisation fetishism isn't something that is big in academia (as in, the academia that publishes papers). There are lots of issues in academia, and even more with the semantic web, but fetishism of decentralisation isn't one of them.
Semantic web tech solves a common problem. You have a database where you want to have some shared schema among many groups, and you want a way to infer facts based on first order logic. You want to be able to query multiple sources and reason about facts when taking into account multiple sources.
Whether you use semantic web tech or not, that's still a common problem that doesn't always have a good plug-and-play solution. There are still a lot of places using the JSON-LD format for metadata and cataloging information. You can google cooking recipes and get ratings and cook time, or search for movies and see how highly rated a movie is and who made it, with a synopsis of the plot; all of these are product metadata powered by RDFa or JSON-LD, a relic of the semantic web. It would be incorrect to say the semantic web is dead. Any AI that could effectively use Wikidata as a fact table would be Jeopardy-grade. There are still new tools coming out, like RDFox, which applies first-order logic at multicore speed across huge datasets for reasoning, with work being done to make it horizontally scalable. I think people will just go on an endless loop of hitting the same pain points and creating new tools using the trending tech of the day, but even in this day and age, sometimes something like Prolog or Picat is what you need.
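(For a toy illustration of the "query multiple sources" part, here is a minimal sketch using Python's rdflib; the data and IRIs are invented. Because both sources use the same identifiers, merging is just graph union, and one SPARQL query spans both.)

    from rdflib import Graph

    # Two hypothetical sources publishing facts about the same resource.
    source_a = """
    @prefix ex: <http://example.org/> .
    ex:inception ex:directedBy ex:nolan .
    """
    source_b = """
    @prefix ex: <http://example.org/> .
    ex:inception ex:rating "8.8" .
    """

    g = Graph()
    g.parse(data=source_a, format="turtle")
    g.parse(data=source_b, format="turtle")

    # One query over the merged graph touches facts from both sources.
    q = """
    PREFIX ex: <http://example.org/>
    SELECT ?director ?rating WHERE {
        ex:inception ex:directedBy ?director ;
                     ex:rating ?rating .
    }
    """
    for row in g.query(q):
        print(row.director, row.rating)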
> you want a way to infer facts based on first order logic
Isn't that computationally infeasible? Semantic web standards are based on description logics, i.e. multi-modal logics chosen specifically for computational expediency.
Also, I wouldn't describe JSON-LD as a "relic" of anything. It's a fairly recent standard in the grand scheme of things, and many interesting projects these days implicitly rely on it.
So, not an expert in this area; would love it if someone corrects me. My understanding is that FOL is generally infeasible. Even propositional logic can be computationally difficult [1]. My understanding is that most of the semantic web stuff is done using a description logic of some flavor. These will be named based on the properties of the logic. The important thing is that they are generally decidable, and you can use something like MALET or some other solver to infer things from your database or ontology. (You give up some expressivity for decidability.) Not sure how much stuff is going on with that these days. Played with a petrology ontology in Protégé some back in college, but haven't followed the space. I remember OWL being important, but can't remember why at the moment.
[1] For example, if you try to figure out whether a formula is satisfiable. You can for sure do this using truth tables. The catch is that you're looking at 2^n complexity, where n is the number of propositions in your formula.
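(A minimal sketch of that footnote in plain Python: enumerate all 2^n truth assignments and test each one. The formula is a made-up example.)

    from itertools import product

    def satisfiable(formula, variables):
        """Brute-force SAT: try all 2^n truth assignments."""
        for values in product([False, True], repeat=len(variables)):
            assignment = dict(zip(variables, values))
            if formula(assignment):
                return assignment  # first satisfying assignment found
        return None

    # (p or q) and (not p or r): 2^3 = 8 assignments to check.
    f = lambda a: (a["p"] or a["q"]) and (not a["p"] or a["r"])
    print(satisfiable(f, ["p", "q", "r"]))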
I am still a fan of Google's OpenRefine tool. Its reconciliation feature, which helps disambiguate named entities etc. based on Wikidata, is really powerful - https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation
You can hook in your own reconciliation end point which we do at work to expand internal knowledge graphs.
Note that OpenRefine isn't really kept up to date.
The basic capabilities work ok, but lots of the additional capabilities have atrophied away.
This is awesome, thanks so much for sharing. I'm really surprised I've never come across it because I've thought of building something like this before.
I really want to look into how this could ingest my own post-GDPR data exports, as well as data sanitization for ML projects.
I had a lot of interest in the semantic web when I first started learning about it.
However, the efforts I've seen seem to be missing some critical factors for longer-term success. I think we've got a lot of work to do with regards to knowledge representation in general.
One of the big things for me is that the context for any fact is critical for it to be true or not.
You can have a fact like "Tim Cook is the CEO of Apple", represented in a graph like you would expect. However, that is only true today. Ten years ago it was Steve Jobs. Without explicit context encoded in the information graph, this web of data isn't as useful as it could be.
Context is important for reasoning in all kinds of situations. "What if Steve Ballmer were CEO of Apple?" is a hypothetical context that it may be useful to reason about. The context of "Who is the most distinguished captain of the Enterprise?" could be the real-world US Navy, or a fictional Star Trek universe (of which there are multiple).
Check out Datomic: you can query for facts as of a certain point in time. This is common for customer preference storage too; your favorite song or movie can easily change over time. Being able to query the state at a certain date can be helpful. Some predictive analytics can also be done: for people with these preferences, how did they change once they had a family, or moved to another "life stage"?
Though you probably don't need Datomic; it would not be too complicated to model this in Neo4j or some other graph store that supports arbitrarily sized tuples. Datomic just supports this feature as a first-class offering.
Context-dependent, or "reified", assertions are a pain point for sure. I come from the perspective of cultural heritage data, where context is king. Which expert made this attribution for this painting? Who owned it _when_? According to which archival document? etc.
Almost all the engineering problems cited in the original post are still basically there, but graphical models are still the least painful way of doing this, particularly when trying to share data between institutions. Example: https://linked.art/model/assertion/
The OP mentions property graphs as a way around this problem. They can be seen as natural extensions of "RDF quads", which in turn are based on common RDF triples (subject / predicate / object).
This isn't that difficult to deal with, though. Instead of linking a CEO to their company with a simple object property, explicitly reify the relationship as a class in the model that represents something like "employment". Then you can hang as many contextually relevant properties from the class instance as you need -- start date, end date, role, etc...
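(A minimal sketch of that reification pattern with Python's rdflib; the ex: vocabulary is made up, and only the start date is real.)

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, XSD

    EX = Namespace("http://example.org/")
    g = Graph()

    # Instead of a bare  ex:apple ex:ceo ex:tim_cook  triple, reify
    # the relationship as an Employment node that carries its context.
    emp = EX["employment/1"]
    g.add((emp, RDF.type, EX.Employment))
    g.add((emp, EX.employer, EX.apple))
    g.add((emp, EX.employee, EX.tim_cook))
    g.add((emp, EX.role, Literal("CEO")))
    g.add((emp, EX.startDate, Literal("2011-08-24", datatype=XSD.date)))
    # No endDate triple yet: the employment is current. For ten years
    # ago you would add a second Employment node for Steve Jobs, with
    # both start and end dates.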
This is why much of the hubbub over property graphs puzzles me. If you need a relationship to have its own properties in an RDF graph, just turn it into a class. What's the big deal?
Shoutouts to the 11 other people on HN still working with rdf and similar in 2020.
At an old job, I knew some very idealistic folks who kept pushing semantic web business. "Let's do that everywhere!" As an exercise, I would have them open a browser, visit various sites, and then look at the source. "Go on, check to see if it validates," I would say with an anticipatory grin. Whether hand-crafted HTML or generated by any number of frameworks, many sites can barely manage to close their tags; asking for semantic references is a "just won't happen in practice" thing.
I have also seen a great deal of consultant money, programmer time, sys-admin sweat, and the like focused on these toweringly-designed, completely-unused triple stores, layer upon layer of hot technologies (ever-moving, construction on the tower never ceased) fused together to create a resource-intense monstrosity that, at the end of the day, barely got used. But hey, let's look at that jazz semantic web example one more time.
The most painful part is that I understand the urge to build a gleaming repository for information, where the cool URIs never change; SPARQLing pinnacles, ready to broadcast the Library of Alexandria, glimmer; and the serene manifold of abstract information lies RESTful ... but I have come to understand that the web of today is an endlessly bulldozed mudscape where Someone Very Important has to have that URL top-level yesterday (never mind that they will forget about it tomorrow), of shoddy materials and wildly varying workmanship, and where nobody is listening to your eager endpoints because the commercials are just too loud. I too once labored for information architecture, to have the correct thing in the obvious place, with accurate links and current knowledge, to provide visitors with the knowledge they desired ... but PR preempted all of it to push yet more nice photographs in yet another place: the Web as a technology for distributing images that would once live on glossy pamphlets.
The vision is lovely, but we who have always lived in the castle have walked alone.
I would argue the problem is not the broken tags, but the business disadvantage to exposing semantic data.
Remember when microformats were all the rage, and you could get hReview or hRecipe or XFN data everywhere?
Then every host in turn realized that actually, it's _better_ if people can't scrape your site, and it's even better if they can't even see it and it's behind a login wall.
“better” is too strong: in many cases, structured data is not a problem (and if it is, people will scrape it anyway), but there's simply no business case for spending time on it. Most of the semweb stack had a horrible developer experience — bad documentation, tools, validators, etc. — and rarely had tangible benefit from spending time slogging through it.
The semantic data which has actually been implemented on a wide scale happened because someone could go to their boss and say “Spending time on x will mean better Google ranking” or “Facebook will use their new sharing display for our pages”, and it was orders of magnitude simpler to implement so the time and risk were far more palatable.
Well, whether it's better depends on local incentives. But it's true that in many cases these push against making machine-readable data available, thus "semantic" tech becomes mostly irrelevant. Similarly, Linked Data has been most successful as Linked Open Data, where these incentives are explicitly aligned.
Indeed. Why would you expose all of your data to your competitors like Google, so they can commoditize you? (Incidentally, note that the big tech companies like the search engines are some of the major proponents of microformats, like for restaurants or local businesses... As always, 'commoditize your complement': https://www.gwern.net/Complement )
That’s a proximal cause. The root cause is that the Internet is not free, despite appearances. If hosting and bandwidth were free, we wouldn’t need businesses to do what we want. Wikipedia wouldn’t need donations. Everything would be great.
I'm working on the Semantic Web stack in a more limited setting of biomedical data. Performance is definitely a problem but the project is currently exiting pilot due to what were seen as satisfactory results in indexing and summarizing biomedical information, and bridging connections between domains of results (with human assistance).
This is a different outcome than in the commercial setting where the W3C is still imagining people as users of their computer rather than consumers of the services their computers connect to. But it also means that in certain technical domains where e.g. publication results are scaled out to oblivion but the ontologies are regular or made easily negotiable, there can be benefits for researchers.
I've read my share of SW papers: the fact that after a year more than half of links in such works are dead is more telling than the papers themselves.
The reason HTML pages don't validate is pretty simple: it does not provide any benefit for the publisher. Consider if the images didn't show up: you'd better believe the publisher would have it fixed immediately.
Same for the semantic web. Show the benefit for the publisher.
Agreed. Poetically written as well!
Just want to pile on more kudos. Nicely written.
These are fair criticisms of the semantic web. One thing the author misses (does not touch on at all) is domain-specific RDF resources for biology, medicine, etc.
schema.org and Wikidata are great resources, and for large companies, using these as a foundation for their own internal knowledge graphs can make sense. This expense is (maybe?) too large for small and medium-sized companies; they would not get enough benefit for the cost.
I worked with Google’s Knowledge Graph as a contractor, and I am still a believer in the technology but I also respect other people’s well founded scepticism.
With the recent widespread interest in Category theory I still think it's a damn shame that RDF wasn't designed to treat relationships as stand-alone entities. Perhaps property graphs work better in that regard, although it's a bit weird how properties aren't themselves relationships, but perhaps that's a necessary concession to keep things efficient.
I have some low-level hate for the Semantic Web. I run a small personal blog that I maintain using a relatively simple static site generator that I created that turns markdown files into clean(ish) html.
A couple of months ago I got interested in adding semantic information to my posts so I modified the generator to add some of the common semantic tags. It was an annoying job, since the semantic information pollutes the structure of the html.
Can anyone tell me what the semantic web does for me as a small-time publisher? Is it for search engines? Does it really matter that a book review (for instance, I have a few) is tagged properly?
> Can anyone tell me what the semantic web does for me as a small-time publisher? Is it for search engines?
Yes, in practice it is mostly for bigger fish in the pond to easily identify and steal your content as needed.
For example, Google was using reviews from small competitors' sites in Google Shopping.
I think this is one of the big issues. The semantic information does make it easier for end users to find what they're looking for, but it also makes denial of traffic possible.
In a lot of cases, the information was there to get eyeballs--so this is undesirable.
I guess if you don't really care about the eyeballs it can be "useful" for the big fish to pay most of the cost of serving the fraction of your server response that the end user was looking for...
So the root problem is actually that people care about the eyeballs. Nothing good comes from such incentive.
Maybe. Not sure what I think about that framing.
FWIW, I was picking "eyeballs" as something wider than just ad revenue. I think ads are the big share, but I'm sure there are people/orgs who want eyeballs for other reasons like ego/status, promote their company/brand/service/products, etc.
In some sense I think your framing is accurate, but I don't know about whether we'd be better off (have an informational ecosystem that is more net-positive?) without status chasers. Some share of them are inevitably gaming the system and diluting the ecosystem; others probably add net value in pursuit of eyeballs?
In the context of the semantic web, pursuit of eyeballs is a problem because it makes the people owning/creating the data also want to be the ones delivering that data directly to users, and the only ones allowed to do so. The semantic web works toward the opposite goal: to allow the data to be automatically transmitted, processed and understood by software, and only perhaps eventually delivered in some form to an end user.
As for building more net-positive information ecosystems, going for the eyeballs instead of actually caring to deliver good information isn't necessarily bad per se, just suboptimal. It's better for an eyeball-chasing site to publish some information, if otherwise that information wouldn't be published at all. But it's the eyeballs being your primary revenue source that will make you work hard to make the data as useless as possible outside your own publication - which leads to a very unhealthy information ecosystem.
They don't even care about that. They care about their advertising revenue.
More of a side note, but if you run a blog you might know that the trackback URL can be specified via an RDF tag. That's a kind of semantic information, and one example of one type of usage: giving other clients (here: other blogs) additional information (here: where to send the Trackback POST).
The markup you added - it depends on what exactly you did. Did you add markup for schema.org? That's in practice solely for Google. The SEO promise there is that Google will make use of the information provided and format some of it nicely, which can lead to more clicks. https://moz.com/learn/seo/serp-features explains that reasonably well. For things like reviews I can imagine it being quite useful.
> Does it really matter that a book review (for instance, I have a few) is tagged properly?
If the semantic web was better supported, you could have a semantic annotation precisely identifying the books you are reviewing (whether by ISBN edition or otherwise), and reusers of your content (users, search engines or others) would be able to programmatically associate your review with similar content.
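(A sketch of what that might look like with schema.org terms in JSON-LD, here as a Python dict; the ISBN and names are placeholders.)

    review = {
        "@context": "http://schema.org/",
        "@type": "Review",
        "itemReviewed": {
            "@type": "Book",
            "name": "Some Book",
            "isbn": "978-0-00-000000-0",  # ties the review to an edition
        },
        "reviewRating": {"@type": "Rating", "ratingValue": 4},
        "author": {"@type": "Person", "name": "A Small-Time Publisher"},
    }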
That seems like it would be abused to the point of the semantic information being completely useless.
I guess by abuse you mean ~black-hat SEO?
It seems likely (and perhaps obvious) that:
- people will try to abuse it
- abuse will keep it from supporting naive trust of semantic information published by untrusted third parties
But we're also already roughly in this scenario, and it seems like it might be easier to model and spot/discard abuse of semantic information.
> It was an annoying job, since the semantic information pollutes the structure of the html.
In what way? Both the HTML and the metadata are intended to make your website machine-friendly. You may find the HTML structure polluted, but crawlers would find it more informative.
Embedding semantic information would allow Google to further refine search traffic to your web page. I assume it may also make you more authoritative with respect to the content you publish.
"Semantic Web" is a wide area. What technologies did you use? Care to post a little example, as to what and how it pollutes the HTML structure?
I can't imagine what semantic tags would pollute a blog's markup as most of the semantic tags were designed to structure simple text content like a blog post. Do you have any examples?
> Is it for search engines?
Yes. And Accessibility.
I think you might be confusing semantic HTML with the semantic web. (Which is understandable given the mention of semantic tags.)
Using semantic HTML means using <article> rather than yet another <div>. What GP is referring to, however, is adding extra information to your HTML detailing what kind of data is in your tags, e.g.:
    <p vocab="http://schema.org/" typeof="Person">
      <span property="name">Christopher Froome</span> was sponsored by
      <span property="sponsor" typeof="http://schema.org/Organization">
        <a property="url" href="http://www.skysports.com/">Sky</a></span>
      in the Tour de France.
    </p>

Here, the vocab, typeof and property attributes are used to add semantic information to the HTML. It might also give you an idea of why one might consider that a chore, especially if it doesn't appear to provide any benefit, like making your site accessible to users of screen readers.

You're right, I was conflating the two overlapping concepts.
I wouldn't call semweb dead; it just has found its niche(s) and is even stabilizing and gaining in those areas. I actually landed a gig for graph DBs, SPARQL, etc. in lab informatics for bio/chem. Earlier this year I attended a keynote held by Wikimedia Deutschland's Franziska Heine pushing for large publicly available RDF data sets, etc.
I'm really interested in semantic authoring (not really structuring data with semantics--but marking semantics within running text), though I guess I'm disinterested in the semantic web.
I agree with a lot of the problems noted in other posts, and would add two other problems from the authoring side:
1. Identifying and employing sound semantics requires a level of thought and clarity that I don't think most people are habituated to working at. It raises the bar somewhat on who can be contributing (either they have to understand and take care with the semantics, or you need a separate person to handle them?)
2. I may be missing some good tools, but I haven't been able to find a good low-friction semantic authoring experience. Even if you are mentally prepared to write with explicit semantics, it still adds a lot of friction to the writing process (or requires subsequent semantic-edit passes).
Modern NLP makes the semantic web completely obsolete. If anything, you need less markup because it's confusing and, more often than not, just wrong.
This is too extreme. If, like Google, you have a flock of Ph.D.s you can put onto an NLP problem to extract semantics from text, then semantic markup becomes less valuable. Not all of us are in that situation. And I don't think parsing text is the only application of the semantic web. Having huge databases full of knowledge is interesting in itself.
As for semantic markup being confusing and usually wrong, I don't know where you get that.
Yeah, but I think there is a difference between standardized markup data formats describing e.g. proteins, and generic text with annotations. The latter is redundant.
I really want to like semantic web technologies, but every time I try to get into them I'm stymied:
* A zillion standards that all reference each other
* Two zillion incomplete and incompatible implementations of those specifications
* No sense of direction within it all (what's the easy path?)
* Multiple rebrandings of the same ideas (Semantic Web, Linked Data, Solid...)
"Solid" is just SOcial LInked Data anyway. I like LD as it seems clearer in intent than the "semantic web" label.
Yeah, in that sense Solid is a subset of Linked Data: linking personal data.
When writing data structures that are not for describing or defining services, I still can't help but think in triples. I also can't help but think of each data facet as something that, described with metadata, would provide enough context to make sense if read out loud to a stranger.
Or responsive web design techniques apparently.
It reads fine for me on mobile.