Which Domain Should Be Canonical?

I’m building a network of sites where the same page can be served on different niche sites. Each site will adjust the content of the page slightly, when possible, to match the theme of the site. My plan was to make the canonical URL the one where the niche best matched the core content. BUT now I’m wondering…

One of the sites in the network will be a generic safe-for-work version with all the X-rated content stripped out. Given how much Google hates porn these days, should that site be the canonical site in all cases? (At least in all the cases where a safe-for-work version is available.) The canonical tag only exists for the search engines. So which will Google show more love to: a stripped-down safe-for-work page, or a more full-featured X-rated page?

I only started wondering this like 15 minutes ago, but right now the safe-for-work version makes most sense to me. Google can still crawl the other sites and see they have more content and ignore the canonical tag if it wants to.

Thoughts?

Re: Which Domain Should Be Canonical?

Another option is to stick with the best-match niche for the canonical URL, but on all explicit pages have a <link rel="alternate" … > tag pointing to the safe-for-work version. Thing is, rel=alternate is always used with something else, like a language or a format (RSS, JSON, etc.). It’s not just used on its own. But if all the other pages are pointing to the safe-for-work version then Google might start to understand that I want them to weight the safe-for-work version highly.

Re: Which Domain Should Be Canonical?

I keep replying to myself, but…

I found there are a number of standards other than RTA for classifying whether a page has adult content. Many of them seem to be dead, but the one that still seems to be around is PICS + SafeSurf. There’s also PICS + ICRA, but that’s pretty complicated and the official pages describing the standard are now offline, though I suspect Google may know the standard and use it when it crawls pages.

But that doesn’t answer the question of which version to make canonical.

Re: Which Domain Should Be Canonical?

It’s a can of worms.

I’m also looking at a safe-for-work version of GayDemon. But my case is different: it’s an old, established domain, so canonical links on any other version of the site will refer back to the oldest URL. That means the safe-for-work version will point to the hardcore version as the main one. I will stick to RTA but change the label to softcore or whatever they have.

For you it’s different, since you’re starting a new site. I personally would probably make the safe-for-work version the main site and point the main pages of the niche sites back to it. Then it depends on each sub page and whether it matches anything on the safe-for-work site or not.

The question is if Google wouldn’t classify a site as “porn” no matter if it contains explicit images and videos or not.

Re: Which Domain Should Be Canonical?

I agree with Bjorn, it really is difficult territory to get into and one little fuck up could mean no indexing for you.

In the instance you’re giving I’m afraid I would go the safest route, which would be creating unique content for each domain. In all honesty, you don’t need to write a fully new piece, you just need to write a new version of what you have. You could easily rearrange the paragraphs and swap out words throughout it to make it an original piece and it should only take a few minutes for each one.

IMO, that’s far safer than diddling around with canonicals and alternates

Incidentally, I remember when I started a gay sfw blog years back and everyone here was asking why I would bother :smiley:

It’s a shame other things got in the way of that, it would have been a great blog to have up and running right now.

Re: Which Domain Should Be Canonical?

I should probably mention the safe-for-work version exists for Facebook, not for Google. Facebook doesn’t allow links to NSFW pages. But now that it exists, the question is how do I present it to Google?

RTA is weird. Despite it being this huge complicated code, there’s only one code and that code means Restricted To Adults. That’s sorta the origin of my problem: AFAIK there’s no other RTA label I can send to tell Google or others that the safe-for-work site is safe-for-work. Hence my interest in SafeSurf.
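
For anyone who hasn’t seen it, this is the one RTA label as I understand it. It’s a single meta tag that goes in the <head>, and the long number is a fixed string rather than something you generate per site:

```html
<!-- RTA ("Restricted To Adults") label: the same fixed string on every adult page -->
<meta name="RATING" content="RTA-5042-1996-1400-1577-RTA" />
```

There’s no matching value that means “not restricted”, so about all the SFW site can do with RTA is leave the tag off.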

That is my fundamental question. I know Facebook has different standards for text and images, but I don’t know if the same is true for Google. So on Facebook if your page/group is marked for adults only or even older teens and up, you can be pretty explicit in your text. But they’re still fussy about explicit images.

Bottom line: my safe-for-work site may get filtered out by Safe Search because it has explicit text. If that happens then making it the canonical site is a lose-lose proposition, since it’s the version with the least content.

Right now I’m thinking I’ll have canonical be the best match niche-wise with the explicit pages using rel=alternate to point to the SFW version. It may confuse Google a little, but it’s the fairest compromise (I think).
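
Roughly what I have in mind, as a sketch only. The domains niche.example and sfw.example and the page path are placeholders, and I’m assuming a bare rel=alternate with no hreflang, media or type. That’s valid HTML, but as far as I know it isn’t something Google documents a meaning for:

```html
<!-- <head> of the explicit page on the best-match niche site (placeholder URLs) -->
<link rel="canonical" href="https://niche.example/pornstar/some-model" />
<link rel="alternate" href="https://sfw.example/pornstar/some-model" />

<!-- <head> of the SFW page: its canonical points at the best-match explicit page -->
<link rel="canonical" href="https://niche.example/pornstar/some-model" />
```

Explicit pages on the other niche sites would carry the same two tags as the first snippet, so every version agrees on one canonical URL.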

[QUOTE=conran;166498]I’m afraid I would go the safest route, which would be creating unique content for each domain. In all honesty, you don’t need to write a fully new piece, you just need to write a new version of what you have. You could easily rearrange the paragraphs and swap out words throughout it to make it an original piece and it should only take a few minutes for each one.

IMO, that’s far safer than diddling around with canonicals and alternates[/QUOTE]

You’re thinking like a writer. This isn’t about written content. Let me give you a very brief preview of what I’m working on… Realize that most of these links have zero design (the SFW site doesn’t even have a logo). The features aren’t fleshed out, etc. It’s totally a work in progress…

Take for example pornstar profiles…

Alexander Morales
http://wilywilly.com/pornstar/alexander-morales (canonical)
http://maleprime.com/pornstar/alexander-morales (SFW – information and pics drop out)

Fred Mayer
http://wilywilly.com/pornstar/fred-mayer (non-canonical, explicit)
http://bbbh.com/pornstar/fred-mayer (canonical)
http://maleprime.com/pornstar/fred-mayer (SFW – information and all pics drop out)

Bo Dean
http://wilywilly.com/pornstar/bo-dean (canonical)
http://bbbh.com/pornstar/bo-dean (non-canonical version has slightly more info than even canonical because he did some bareback porn, but not enough to really be labeled primarily bareback)
http://maleprime.com/pornstar/bo-dean (SFW – information and some pics drop out)

Speaking of Bo Dean… It wasn’t mentioned on here, but he was shot by a “friend” during an argument in December and is now paralyzed and didn’t have health insurance at the time. So his life is a complete mess right now. Sorta sad.

But getting back to my point, I’m building what’s best for my audience and not worrying about Google. (Google does recommend that). But at the same time I’m trying to make it understandable to Google. They’re fine with canonical tags (if you implement them correctly). And it’s easy with the RTA tag to say “this site is porn”. The problem is saying “this site is about porn, but none of the images are pornographic” – you can only sorta say that with PICS tags. And beyond that I don’t know that Google even cares enough about porn to bother understanding the distinction I’m trying to make.

Re: Which Domain Should Be Canonical?

BTW, I finally found the current way to tag a page as family friendly or NSFW… Schema.org has a solution and Google loves Schema.org…

Here’s an example of an NSFW page…

<script type="application/ld+json">
{
    "@context": "http://schema.org",
    "@type": "WebSite",
    "url": "https://bbbh.com/pornstar/mike-dozer?dev=1",
    "name": "Porn Star Mike Dozer",
    "headline": "Porn Star Mike Dozer - #BBBH",
    "publisher": "Studio 3X, Inc.",
    "isFamilyFriendly": "False",
    "typicalAgeRange": "18-99",
    "audience": {
        "@type": "PeopleAudience",
        "requiredMinAge": "18",
        "suggestedGender": "male",
        "audienceType": "gay men"
    }
}
</script>

And here’s an example of an SFW page…

<script type="application/ld+json">
{
    "@context": "http://schema.org",
    "@type": "WebSite",
    "url": "https://maleprime.com/pornstar/mike-dozer?dev=1",
    "name": "Porn Star Mike Dozer",
    "headline": "Porn Star Mike Dozer - Male Prime",
    "publisher": "Studio 3X, Inc.",
    "isFamilyFriendly": "True",
    "typicalAgeRange": "13-99",
    "audience": {
        "@type": "PeopleAudience",
        "requiredMinAge": "13",
        "suggestedMinAge": "16",
        "suggestedGender": "male",
        "audienceType": "gay men"
    }
}
</script>

You put that in the <head> of your page and it declares properties about your web page using the vocabulary defined at https://schema.org/WebSite. (And yes, you might expect a WebPage type, but it’s WebSite that you actually use. Confused me a bit when I first encountered it.) So you’re saying that a particular URL has a particular title (defined by ‘name’ and ‘headline’; there’s some discussion about which is better, and both are defined properties), etc. The critical elements for NSFW are:

"isFamilyFriendly": "False",
"typicalAgeRange": "18-99",

That indicates that it’s not family friendly and it should be targeted to people at least 18 years of age. On Safe For Work pages you have True and a lower starting age (make sure that age is 13+ to be COPPA compliant).

Then you can go further by using the PeopleAudience type (inside the ‘audience’ property) to say that the required minimum age is 13 or 18. You can also specify that while the required minimum age is 13, you really suggest it for a higher age (e.g. 16 in the example).

There’s a bunch of other stuff you can include. You see some of it in the examples above.

When you’re all done, run your page through the Google Structured Data Testing Tool to make sure you formatted things correctly.

Re: Which Domain Should Be Canonical?

Thanks for sharing that. I use schema.org already actually.

Re: Which Domain Should Be Canonical?

I should mention that I’ve since changed how I display the data. I deleted ‘name’ and ‘headline’ (if Google can’t figure out the title of the page, then they’ve got problems), and then I added in a lengthy ‘publisher’ section since Google likes to know who is responsible for the content of websites. The publisher section has things like my business address, incorporation date, contact email, contact phone number, etc. Basically enough that Google can feel confident that I’m a reputable business. Deleting the page title lets me put the same thing on every page. Only the ‘url’ property changes and that’s not hard to figure out.

But all the parts saying that it’s an adult-oriented page (or not) are the same.

Re: Which Domain Should Be Canonical?

More corrections… Both WebSite and WebPage exist. If you use WebSite (as shown above) then ‘url’ should be the URL of your home page, and ‘name’ would be the name of your site, not the title of the page. That would set age criteria for the entire site. If you want to be specific to the page then use WebPage.
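
If you go the page-level route, a sketch would look something like this. It reuses the properties from the NSFW example above with the type swapped to WebPage and the dev parameter dropped from the URL. The isPartOf link back to the site is my own addition and is optional:

```html
<script type="application/ld+json">
{
    "@context": "http://schema.org",
    "@type": "WebPage",
    "url": "https://bbbh.com/pornstar/mike-dozer",
    "name": "Porn Star Mike Dozer",
    "isPartOf": {
        "@type": "WebSite",
        "url": "https://bbbh.com"
    },
    "isFamilyFriendly": "False",
    "typicalAgeRange": "18-99",
    "audience": {
        "@type": "PeopleAudience",
        "requiredMinAge": "18"
    }
}
</script>
```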

Here’s my current site schema declaration:

<script type="application/ld+json">
{
    "@context": "http://schema.org",
    "@type": "WebSite",
    "url": "https://bbbh.com",
    "name": "#BBBH",
    "publisher": {
        "@type": "EntertainmentBusiness",
        "legalName": "Studio 3X, Inc.",
        "email": "[email protected]",
        "telephone": "760-569-1148",
        "url": "https://studio3x.com",
        "foundingDate": "2009-01-29",
        "foundingLocation": "New York, NY",
        "location": {
            "@type": "PostalAddress",
            "postOfficeBoxNumber": "3587",
            "addressLocality": "New York",
            "addressRegion": "New York",
            "postalCode": "10027",
            "addressCountry": "US"
        }
    },
    "isFamilyFriendly": "False",
    "typicalAgeRange": "18-99",
    "audience": {
        "@type": "PeopleAudience",
        "requiredMinAge": "18",
        "suggestedGender": "male",
        "audienceType": "gay men"
    },
    "sameAs": [
        "https://twitter.com/rawTOP",
        "http://tmblr.rawtop.com/",
        "https://plus.google.com/+BbbhPigs",
        "https://www.facebook.com/barebackbrotherhood"
    ]
}
</script>

There’s (obviously) a lot more in there now than there was yesterday.