When a mobile user or crawler (like Googlebot-Mobile) accesses the desktop version of a URL, you can redirect them to the corresponding mobile version of the same page. Google notices the relationship between the two versions of the URL and displays the standard version for searches from desktops and the mobile version for mobile searches.
If you redirect users, please make sure that the content on the corresponding mobile/desktop URL matches as closely as possible. For example, if you run a shopping site and there's an access from a mobile phone to a desktop-version URL, make sure that the user is redirected to the mobile version of the page for the same product, and not to the homepage of the mobile version of the site. We occasionally find sites using this kind of redirect in an attempt to boost their search rankings, but this practice only results in a negative user experience, and so should be avoided at all costs.
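To illustrate the "same product, not the homepage" point, here is a minimal sketch of how a redirect target could be computed. It assumes the mobile site mirrors the desktop site's paths and query strings; the host names `www.example.com` and `m.example.com` and the function name are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical host names; substitute your own desktop and mobile hosts.
DESKTOP_HOST = "www.example.com"
MOBILE_HOST = "m.example.com"


def mobile_redirect_target(desktop_url: str) -> str:
    """Return the mobile URL for the same page, keeping path and query."""
    parts = urlsplit(desktop_url)
    if parts.netloc != DESKTOP_HOST:
        raise ValueError(f"not a desktop URL: {desktop_url}")
    # Only the host changes, so a product page maps to the same product
    # page on the mobile site rather than to the mobile homepage.
    return urlunsplit(
        (parts.scheme, MOBILE_HOST, parts.path, parts.query, parts.fragment)
    )
```

If your mobile URLs follow a different scheme, the mapping would need a lookup table instead of a host swap, but the principle stays the same: one desktop page, one matching mobile page.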
On the other hand, when a desktop browser or our web crawler, Googlebot, accesses a mobile-version URL, it's not necessary to redirect it to the desktop version. For instance, Google doesn't automatically redirect desktop users from its mobile site to its desktop site; instead, it includes a link on the mobile-version page to the desktop version. These links are especially helpful when a mobile site doesn't provide the full functionality of the desktop version -- users can easily navigate to the desktop version if they prefer.
Some sites have the same URL for both desktop and mobile content, but change their format according to User-agent. In other words, both mobile users and desktop users access the same URL (i.e. no redirects), but the content/format changes slightly according to the User-agent. In this case, the same URL will appear for both mobile search and desktop search, and desktop users can see a desktop version of the content while mobile users can see a mobile version of the content.
However, note that if you fail to configure your site correctly, your site could be considered to be cloaking, which can lead to your site disappearing from our search results. Cloaking refers to an attempt to boost search result rankings by serving different content to Googlebot than to regular users. This causes problems such as less relevant results (pages appear in search results even though their content is actually unrelated to what users see/want), so we take cloaking very seriously.
So what does "the page that the user sees" mean if you provide both versions at a single URL? As I mentioned in the previous post, Google uses "Googlebot" for web search and "Googlebot-Mobile" for mobile search. To remain within our guidelines, you should serve the same content to Googlebot as a typical desktop user would see, and the same content to Googlebot-Mobile as you would to the browser on a typical mobile device. It's fine if the content served to Googlebot differs from the content served to Googlebot-Mobile.
One example of how you could be unintentionally detected for cloaking is if your site returns a message like "Please access from mobile phones" to desktop browsers, but then returns a full mobile version to both crawlers (so Googlebot receives the mobile version). In this case, the page which web search users see (e.g. "Please access from mobile phones") is different from the page which Googlebot crawls (e.g. "Welcome to my site"). Again, we detect cloaking because we want to serve users the same relevant content that Googlebot or Googlebot-Mobile crawled.
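One way to avoid the mismatch described above is to send crawlers through exactly the same selection logic as users, so that Googlebot is classified as a desktop client and Googlebot-Mobile as a mobile client. The sketch below is only an illustration: the marker list is a small hypothetical sample, not a complete device database, and the function names are made up for this example.

```python
# Illustrative, incomplete list of substrings that indicate a mobile client.
MOBILE_MARKERS = ("googlebot-mobile", "iphone", "android", "docomo")


def is_mobile_agent(user_agent: str) -> bool:
    """Return True if the User-agent looks like a mobile client."""
    ua = user_agent.lower()
    return any(marker in ua for marker in MOBILE_MARKERS)


def select_variant(user_agent: str) -> str:
    """Pick the page variant; crawlers follow the same path as users."""
    # No special case for crawlers: Googlebot falls through to "desktop",
    # and Googlebot-Mobile is matched as "mobile" like any phone browser.
    return "mobile" if is_mobile_agent(user_agent) else "desktop"
```

The point of the design is the absence of a crawler-only branch: whatever a desktop browser receives, Googlebot receives too.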
Maile: Ahh, I see: Webmasters concerned with search traffic likely want to balance the positives of watermarking with the preferences of their users -- keeping in mind that sites that use clean images without distracting artifacts tend to be more popular, and that this can also impact rankings. Will Google rank an image differently just because it's watermarked?
Peter: It's understandable that webmasters find watermarking images beneficial.
If search traffic is important to a webmaster, then he/she may also want to consider some of our findings:
Pros of watermarked images
- Photographers can claim credit/be recognized for their art.
- Unknown usage of the image is deterred.
In summary, if a feature such as watermarking reduces the user-perceived quality of your image or your image's thumbnail, then searchers may select it less often. Preview your images at thumbnail size to get an idea of how users might perceive them.
Findings relevant to watermarked images
- Users prefer large, high-quality images (high-resolution, in-focus).
- Users are more likely to click on quality thumbnails in search results. Quality pictures (again, high-res and in-focus) often look better at thumbnail size.
- Distracting features such as loud watermarks, text over the image, and borders are likely to make the image look cluttered when reduced to thumbnail size.
Peter: Nope. The presence of a watermark doesn't itself cause an image to be ranked higher or lower.
Googlebot may not be able to find your site
Googlebot, our crawler, must crawl your site before it can be included in our search index. If you just created the site, we may not yet be aware of it. If that's the case, create a Mobile Sitemap and submit it to Google to inform us of the site's existence. A Mobile Sitemap can be submitted using Google Webmaster Tools, in the same way as with a standard Sitemap.
Googlebot may not be able to access your site
Some mobile sites refuse access to anything but mobile phones, making it impossible for Googlebot to access the site, and therefore making the site unsearchable. Our crawler for mobile sites is "Googlebot-Mobile". If you'd like your site crawled, please allow any User-agent including "Googlebot-Mobile" to access your site. You should also be aware that Google may change its User-agent information at any time without notice, so it is not recommended that you check if the User-agent exactly matches "Googlebot-Mobile" (which is the string used at present). Instead, check whether the User-agent header contains the string "Googlebot-Mobile". You can also use DNS Lookups to verify Googlebot.
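The two checks mentioned above can be sketched as follows. The substring test comes directly from the advice in this post. The DNS check is a reverse lookup followed by a confirming forward lookup; the accepted host suffixes (`.googlebot.com`, `.google.com`) are an assumption based on Google's general guidance for verifying Googlebot, and the function names and injectable lookup parameters are hypothetical conveniences for testing.

```python
import socket


def is_googlebot_mobile(user_agent: str) -> bool:
    """Substring check, so future changes around the token still match."""
    return "Googlebot-Mobile" in user_agent


def verify_googlebot_ip(
    ip: str,
    reverse_lookup=socket.gethostbyaddr,
    forward_lookup=socket.gethostbyname,
) -> bool:
    """Reverse-resolve the IP, check the domain, then forward-confirm."""
    try:
        host = reverse_lookup(ip)[0]
    except OSError:
        return False
    # Assumed suffixes; adjust if Google documents other crawler domains.
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        # The forward lookup must return the original IP; otherwise the
        # reverse DNS record could simply have been spoofed.
        return forward_lookup(host) == ip
    except OSError:
        return False
```

The User-agent header alone is easy to fake, which is why the DNS round trip is useful when you need to be sure a request really came from Google.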
RDFa (Yahoo! SearchMonkey):
<meta name="title" content="Baroo? - cute puppies" />
<meta name="description" content="The cutest canine head tilts on the Internet!" />
<link rel="image_src" href="http://example.com/thumbnail_preview.jpg" />
<link rel="video_src" href="http://example.com/video_object.swf?id=12345"/>
<meta name="video_height" content="296" />
<meta name="video_width" content="512" />
<meta name="video_type" content="application/x-shockwave-flash" />
Posted by Michael Cohen, Product Manager, Video Search Team
<object width="512" height="296" rel="media:video"
resource="http://example.com/video_object.swf?id=12345"
xmlns:media="http://search.yahoo.com/searchmonkey/media/"
xmlns:dc="http://purl.org/dc/terms/">
<param name="movie" value="http://example.com/video_object.swf?id=12345" />
<embed src="http://example.com/video_object.swf?id=12345"
type="application/x-shockwave-flash" width="512" height="296"></embed>
<a rel="media:thumbnail" href="http://example.com/thumbnail_preview.jpg" />
<a rel="dc:license" href="http://example.com/terms_of_service.html" />
<span property="dc:description" content="Cute Overload defines Baroo? as: Dogspeak for 'Whut the...?'
Frequently accompanied by the Canine Tilt and/or wrinkled brow for enhanced effect." />
<span property="media:title" content="Baroo? - cute puppies" />
<span property="media:width" content="512" />
<span property="media:height" content="296" />
<span property="media:type" content="application/x-shockwave-flash" />
<span property="media:region" content="us" />
<span property="media:region" content="uk" />
<span property="media:duration" content="63" />
</object>
Googlers strongly believe in dogfooding our own products. We manage our work schedules with Google Calendar, publish our blogs on Blogger, and store scads of documentation on Google Sites. So, ever since we launched our first Webmaster Help Group, we've been using Google Groups to facilitate conversations about Webmaster Tools and web search issues.
Today, however, I'm thrilled to announce that our English and Polish Help Groups are getting a makeover. And the changes are more than just skin-deep. Our new Help Forums should make it easier for you to find answers, share resources with others, and have your participation acknowledged.
You can read more about the changes on the Official Google Blog, and then check it out for yourself: English, Polish.
Q: What will happen to the old English and Polish Help Groups?
A: While our old groups are now closed to new posts, they will still be available in read-only mode in case you want to reference any of your favorite posts from the good old days. Many of the most frequently-asked questions (and answers!) have already been transferred to our new Help Forums.
Q: If I was a member of the old group, will I automatically be a member of the new forum?
A: We won't be "transferring" membership from the old groups to the new, so even if you were a member of our Google Groups forum, you'll still need to join the new forum in order to participate. Nicknames and user profiles are also managed separately, so you're welcome to recreate your Google Groups profile in our new forum, or reinvent yourself.
Q: What about the Webmaster Help Groups in other languages?
A: They'll be moving to the new Help Forum format in 2009. Specific dates will be announced in each of the groups as they get closer to their moving date.
Feel free to post any other questions about the new Help Forums in the comments below.
Email us at <span class="notranslate">sales at mydomain dot com</span>
And if you have an entire page that should not be translated, you can add:
<meta name="google" value="notranslate">
to the <head> of your page and we won't translate any of the content on that page.
<meta name="google" content="notranslate">
Thanks to chaoskaizer for pointing this out in the comments. :)
Now that you've read more information about internal links, outbound links, and inbound links (today's post :), we'll see you in the blog comments! Thanks for joining us for links week.
Create unique and compelling content on your site and the web in general
Pursue business development opportunities
- Start a blog: make videos, do original research, and post interesting stuff on a regular basis. If you're passionate about your site's topic, there are lots of great avenues to engage more users.
If you're interested in blogging, see our Help Center for specific tips for bloggers.
- Teach readers new things, uncover new news, be entertaining or insightful, show your expertise, interview different personalities in your industry and highlight their interesting side. Make your site worthwhile.
- Participate thoughtfully in blogs and user reviews related to your topic of interest. Offer your knowledgeable perspective to the community.
- Provide a useful product or service. If visitors to your site get value from what you provide, they're more likely to link to you.
For more actionable ideas, see one of my favorite interviews with Matt Cutts for no-cost tips to help increase your traffic. It's a great primer for webmasters. (Even before this post, I forwarded the URL to many of my friends. :)
Use Webmaster Tools for "Links > Pages with external links" to learn about others interested in your site. Expand the web community by figuring out who links to you and how they're linking. You may have new audiences or demographics you didn't realize were interested in your niche. For instance, if the webmasters for example.com noticed external links coming from art schools, they may start to engage with the art community -- receiving new feedback and promoting their site and ideas.
Of course, be responsible when pursuing possible opportunities in this space. Don't engage in mass link-begging; no one likes form letters, and few webmasters of quality sites are likely to respond positively to such solicitations. In general, many of the business development techniques that are successful in human relationships can also be reflected online for your site.
Use descriptive anchor text
Intuitive navigation for users
Create common user scenarios, get "in character," then try working through your site. For example, if your site is about basketball, imagine being a visitor (in this case a "baller" :) trying to learn the best dribbling technique.
Crawlable links for search engines
- Starting at the homepage, if the user doesn't use the search box on your site or a pulldown menu, can they easily find the desired information (ball handling like a superstar) from the navigation links?
- Let's say a user found your site through an external link, but they didn't land on the homepage. Starting from any (sub-/child) page on your site, make sure they can easily find their way to the homepage and/or other relevant sections. In other words, make sure users aren't trapped or stuck. Was the "best dribbling technique" easy for your imaginary user to find? Often breadcrumbs such as "Home > Techniques > Dribbling" help users to understand where they are.
- Text links are easily discovered by search engines and are often the safest bet if your priority is having your content crawled. While you're welcome to try the latest technologies, keep in mind that when text-based links are available and easily navigable for users, chances are that search engines can crawl your site as well.
This <a href="new-page.html">text link</a> is easy for search engines to find.
- Sitemap submission is also helpful for major search engines, though it shouldn't be a substitute for crawlable link architecture. If your site utilizes newer techniques, such as AJAX, see "Verify that Googlebot finds your internal links" below.
Q: Let's say my website is about my favorite hobbies: biking and camping. Should I keep my internal linking architecture "themed" and not cross-link between the two?
A: We haven't found a case where a webmaster would benefit by intentionally "theming" their link architecture for search engines. And, keep in mind, if a visitor to one part of your site can't easily reach other parts of your site, that may be a problem for search engines as well.
A: It's not something we, as webmasters who also work at Google, would really spend time or energy on. In other words, if your site already has strong link architecture, it's far more productive to work on keeping users happy with fresh and compelling content rather than to worry about PageRank sculpting.
Matt Cutts answered more questions about "appropriate uses of nofollow" in our webmaster discussion group.
Perhaps it's cliche, but at the end of the day, and at the end of this post, :) it's best to create solid link architecture (making navigation intuitive for users and crawlable for search engines) -- implementing what makes sense for your users and their experience on your site.
Internal linking is your homepage linking to your "Contact us" page, or your "Contact us" page linking to your "About me" page. Internal linking (also known as link architecture) is important because it's a major factor in how easily visitors can navigate your site. Additionally, internal linking contributes to your site's "crawlability" -- how easily a spider can reach your pages. More in Day 2 of links week.
Day 3: Outbound links (sites you link to)
Day 4: Inbound links (sites linking to you)
Outbound links are external sites that you're linking to. For example, www.google.com/webmasters links to the domain googlewebmastercentral.blogspot.com (our lovely blog!). Outbound links allow us to surf the web -- they're a big reason why the web is so exciting and collaborative. Without outbound links, your site can seem isolated from the community because each page becomes "brochure-ware." Most sites include outbound links naturally and it shouldn't be a big concern. If you still have questions, we'll be covering outbound linking in more detail on Day 3.
Update: Included references to blog posts as they were published throughout links week.
Inbound links are external sites linking to you. There are many webmasters who (rightfully) aren't preoccupied by the subject of inbound links. So why do some webmasters care? It's likely because merit-based or volunteered inbound links may seem like a quick way to increase rankings and traffic. Answers to your questions like, "Are there no-cost methods to maximize my merit-based links?" are provided on Day 4.
Duplicate content. There's just something about it. We keep writing about it, and people keep asking about it. In particular, I still hear a lot of webmasters worrying about whether they may have a "duplicate content penalty."
Let's put this to bed once and for all, folks: There's no such thing as a "duplicate content penalty." At least, not in the way most people mean when they say that.
There are some penalties that are related to the idea of having the same content as another site—for example, if you're scraping content from other sites and republishing it, or if you republish content without adding any additional value. These tactics are clearly outlined (and discouraged) in our Webmaster Guidelines:
- Don't create multiple pages, subdomains, or domains with substantially duplicate content.
- Avoid... "cookie cutter" approaches such as affiliate programs with little or no original content.
- If your site participates in an affiliate program, make sure that your site adds value. Provide unique and relevant content that gives users a reason to visit your site first.
(Note that while scraping content from others is discouraged, having others scrape you is a different story; check out this post if you're worried about being scraped.)
But most site owners whom I hear worrying about duplicate content aren't talking about scraping or domain farms; they're talking about things like having multiple URLs on the same domain that point to the same content. Like www.example.com/skates.asp?color=black&brand=riedell and www.example.com/skates.asp?brand=riedell&color=black. Having this type of duplicate content on your site can potentially affect your site's performance, but it doesn't cause penalties. From our article on duplicate content:
Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from duplicate content issues, and you don't follow the advice listed above, we do a good job of choosing a version of the content to show in our search results.
This type of non-malicious duplication is fairly common, especially since many CMSs don't handle this well by default. So when people say that having this type of duplicate content can affect your site, it's not because you're likely to be penalized; it's simply due to the way that web sites and search engines work.
Most search engines strive for a certain level of variety; they want to show you ten different results on a search results page, not ten different URLs that all have the same content. To this end, Google tries to filter out duplicate documents so that users experience less redundancy. You can find details in this blog post, which states:
- When we detect duplicate content, such as through variations caused by URL parameters, we group the duplicate URLs into one cluster.
- We select what we think is the "best" URL to represent the cluster in search results.
- We then consolidate properties of the URLs in the cluster, such as link popularity, to the representative URL.
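To get a feel for the clustering step described above, here is a small sketch you could run over your own URL list to spot duplicates caused by parameter order. This is not how Google detects duplicates; it only normalizes the query string by sorting its parameters, and the function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cluster_key(url: str) -> str:
    """Normalize a URL so that parameter order no longer matters."""
    parts = urlsplit(url)
    # Sort the (name, value) pairs; blank values are kept as-is.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, query, ""))


def cluster_urls(urls):
    """Group URLs that share the same normalized form."""
    clusters = defaultdict(list)
    for url in urls:
        clusters[cluster_key(url)].append(url)
    return dict(clusters)
```

Run against the two skates.asp URLs from earlier in this post, both land in a single cluster, which is a hint that your site could link consistently to one of them.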
Here's how this could affect you as a webmaster:
In most cases Google does a good job of handling this type of duplication. However, you may also want to consider content that's being duplicated across domains. In particular, deciding to build a site whose purpose inherently involves content duplication is something you should think twice about if your business model is going to rely on search traffic, unless you can add a lot of additional value for users. For example, we sometimes hear from Amazon.com affiliates who are having a hard time ranking for content that originates solely from Amazon. Is this because Google wants to stop them from trying to sell Everyone Poops? No; it's because how the heck are they going to outrank Amazon if they're providing the exact same listing? Amazon has a lot of online business authority (most likely more than a typical Amazon affiliate site does), and the average Google search user probably wants the original information on Amazon, unless the affiliate site has added a significant amount of additional value.
Lastly, consider the effect that duplication can have on your site's bandwidth. Duplicated content can lead to inefficient crawling: when Googlebot discovers ten URLs on your site, it has to crawl each of those URLs before it knows whether they contain the same content (and thus before we can group them as described above). The more time and resources that Googlebot spends crawling duplicate content across multiple URLs, the less time it has to get to the rest of your content.
In summary: Having duplicate content can affect your site in a variety of ways; but unless you've been duplicating deliberately, it's unlikely that one of those ways will be a penalty. This means that:
Posted by Susan Moskwa, Webmaster Trends Analyst
[www.metrokitchen.com]
"If you're looking for an item that's no longer stocked (as I was), this makes it really easy to find an alternative."
-Riona, domestigeek
[www.comedycentral.com]
"Blame the robot monkeys"
-Reid, tells really bad jokes
[www.splicemusic.com]
"Boost your 'Time on site' metrics with a 404 page like this."
-Susan, dabbler in music and Analytics
[www.treachery.net]
"It's not reassuring, but it's definitive."
-Jonathan, has trained actual spiders to build websites, ants handle the 404s
[www.apple.com]
"Good with respect to usability."
[thcnet.net]
"At least there's a mailbox."
-JohnMu, adventurous
[lookitsme.co.uk]
"It's pretty cute. :)"
-Jessica, likes cute things
[www.orangecoat.com]
"Flow charts rule."
-Sahala, internet traveller
[icanhascheezburger.com]
"I can has useful links and even e-mail address for questions! But they could have added 'OH NOES! IZ MISSING PAGE! MAYBE TIPO OR BROKN LINKZ?' so folks'd know what's up."
-Adam, lindy hop geek
11AM Meeting with Matt Cutts
Our team shares a doc containing our current agenda and the previous meetings' agenda, minutes, and action items. In this meeting, we discussed:
- Feedback from blog post on Duplicate content due to scrapers. Some webmasters suggested that we could improve our detection. In order to improve quality, it would help to get feedback with specific examples. Susan Moskwa, one of our Webmaster Trends Analysts based in Kirkland, Washington, volunteered to post a blog comment to solicit more information.
- Recent and upcoming releases
- Webmaster Tools API on schedule
- "Skip intro" in search results
- JuneTune online chat agenda
- Two recent spam techniques mentioned in the blogosphere. Brian White, who leads one of the Webspam-fighting groups at Google, explained that one technique is a new twist on an old idea, and that both are already handled.
1PM Lunch with Shyam, a Crawl engineer, and Jason, an AdSense engineer
Matt provided feedback on:
- Proposal to write follow-up blog comment on duplicate content caused by scrapers to solicit specific examples. Approved.
- "URLs and case sensitivity basics" presentation for online chat
@fintan: We verified with Adobe that the textual content from legacy sites, such as those scripted with AS1 and AS2, can be indexed by our new algorithm.
@andrew, jonny m, erichazann, mike, ledge, stu, rex, blog, dis: For our July 1st launch, we didn't enable Flash indexing for Flash files embedded via SWFObject. We're now rolling out an update that enables support for common JavaScript techniques for embedding Flash, including SWFObject and SWFObject2.
@mike: At this time, content loaded dynamically from resource files is not indexed. We’ve noted this feature request from several webmasters -- look for this in a near future update.
@captain cuisine: The text found in Flash files is treated similarly to text found in other files, such as HTML, PDFs, etc. If the Flash file is embedded in HTML (as many of the Flash files we find are), its content is associated with the parent URL and indexed as a single entity.
@jeroen: Serving the same content in Flash and an alternate HTML version could cause us to find duplicate content. This won't cause a penalty -- we don’t lower a site in ranking because of duplicate content. Be aware, though, that search results will most likely only show one version, not both.
@All: We’re trying to serve users the most relevant results possible regardless of the file type. This means that standalone Flash, HTML with embedded Flash, HTML only, PDFs, etc., can all have the potential to be returned in search results.
@dsfdgsg: We've heard requests for deep linking (linking to specific content inside a file) not just for Flash results, but also for other large documents and presentations. In the case of Flash, the ability to deep link will require additional functionality in Flash with which we integrate.
@All: The majority of the existing Flash files on the web are fine in regard to filesize. It shouldn’t be too much of a concern.
@brian, marcos, bharath: Regarding ActionScript, we're able to find new links loaded through ActionScript. We explore Flash like a website visitor does; we do not decompile the SWF file. Unless you're making ActionScript visible to users, Google will not expose ActionScript code.
@dlocks: We respect rel="nofollow" wherever we encounter it in HTML.
You may have noticed that we recently rewrote our article on What is an SEO? Does Google recommend them? Previously, the article had focused on warning people about common SEO scams to look out for, but didn't mention many of the valuable services that a helpful SEO can provide.
The article now notes some of the benefits of search engine optimization, and provides some guidance to site owners who are considering hiring an SEO. We'd also like to get your perspective: how would you define SEO? What questions would you ask a prospective SEO? What advice would you give to an inexperienced webmaster who's considering whether to contract an SEO? We'd like to hear your thoughts and incorporate your feedback if there's important advice that we should add.
| DIRECTIVE | IMPACT | USE CASES |
| Disallow | Tells a crawler not to index your site -- your site's robots.txt file still needs to be crawled to find this directive, however disallowed pages will not be crawled | 'No Crawl' page from a site. This directive in the default syntax prevents specific path(s) of a site from being crawled. |
| Allow | Tells a crawler the specific pages on your site you want indexed so you can use this in combination with Disallow | This is useful in particular in conjunction with Disallow clauses, where a large section of a site is disallowed except for a small section within it |
| $ Wildcard Support | Tells a crawler to match everything from the end of a URL -- large number of directories without specifying specific pages | 'No Crawl' files with specific patterns, for example, files with certain filetypes that always have a certain extension, say pdf |
| * Wildcard Support | Tells a crawler to match a sequence of characters | 'No Crawl' URLs with certain patterns, for example, disallow URLs with session ids or other extraneous parameters |
| Sitemaps Location | Tells a crawler where it can find your Sitemaps | Point to other locations where feeds exist to help crawlers find URLs on a site |
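The `*` and `$` wildcard rows above can be made concrete with a small matcher. This is a simplified sketch, not a full robots.txt implementation: it only handles Disallow patterns, ignores Allow precedence and User-agent groups, and the function names are hypothetical.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' matches any sequence of characters; everything else is literal.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    # Patterns always match from the start of the path; a trailing '$'
    # additionally pins the match to the end of the URL.
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str, disallow_patterns) -> bool:
    """Return True if any Disallow pattern matches the path."""
    return any(pattern_to_regex(p).match(path) for p in disallow_patterns)
```

For example, the pattern `/*.pdf$` blocks `/docs/report.pdf` but not `/docs/report.pdf?x=1`, while `/*sessionid=` blocks any URL that carries a `sessionid=` parameter.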
| DIRECTIVE | IMPACT | USE CASES |
| NOINDEX META Tag | Tells a crawler not to index a given page | Don't index the page. This allows pages that are crawled to be kept out of the index. |
| NOFOLLOW META Tag | Tells a crawler not to follow a link to other content on a given page | Prevent publicly writeable areas from being abused by spammers looking for link credit. By using NOFOLLOW you let the robot know that you are discounting all outgoing links from this page. |
| NOSNIPPET META Tag | Tells a crawler not to display snippets in the search results for a given page | Present no snippet for the page on Search Results |
| NOARCHIVE META Tag | Tells a search engine not to show a "cached" link for a given page | Do not make available to users a copy of the page from the Search Engine cache |
| NOODP META Tag | Tells a crawler not to use a title and snippet from the Open Directory Project for a given page | Do not use the ODP (Open Directory Project) title and snippet for this page |
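If you want to check which of these directives a page actually declares, a short script can read them back out of the HTML. The sketch below uses Python's standard `html.parser` and assumes the directives are given in a `<meta name="robots" content="...">` tag as a comma-separated list; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are case-insensitive, so normalize to lower case.
        self.directives.update(
            d.strip().lower() for d in content.split(",") if d.strip()
        )


def robots_directives(html: str) -> set:
    """Return the set of robots META directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives
```

A page whose head contains `<meta name="robots" content="NOINDEX, NOFOLLOW">` would yield the set containing `noindex` and `nofollow`.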
Planning on moving your site to a new domain? Lots of webmasters find this a scary process. How do you do it without hurting your site's performance in Google search results?
Let's cover moving your site to a new domain (for instance, changing from www.example.com to www.example.org). This is different from moving to a new IP address; read this post for more information on that.
Here are the main points: