In the field of SEO, hCard was best known as a microformat standard for creating structured data. This data, implemented for the benefit of search engines rather than people, helps Google to better understand the information on web pages. Usage of hCard has since been phased out in favour of Schema markup.
Announced by Google in 2011, Schema markup is a form of microdata established collaboratively by the search giants Google, Bing, Yahoo! and Yandex. The aim has been to develop a specific vocabulary of tags that enables webmasters to communicate the meaning of web pages to the computer programs that read them, including search engines.
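As a brief illustration, a page might embed the schema.org vocabulary using microdata attributes. The item type and property names below come from the public schema.org vocabulary; the organisation details are invented:

```html
<!-- Microdata using the schema.org Organization type; the details are invented -->
<div itemscope itemtype="https://schema.org/Organization">
  <h1 itemprop="name">Example Widgets Ltd</h1>
  <a itemprop="url" href="https://www.example.com/">example.com</a>
  <span itemprop="telephone">+44 20 7946 0000</span>
</div>
```

Search engines that read this markup can tell that "Example Widgets Ltd" is an organisation's name rather than just a heading.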
In 2010, Google introduced two new meta tags for news articles: syndication-source and original-source. These tags were designed to enable news curators to publish another site’s article on their own site, without risking a Google penalty for duplicate content or plagiarism. The canonical tag now performs this function.
A rel=canonical tag is a snippet of HTML code which marks up web pages that are at risk of being interpreted as duplicate content. By implementing this tag in pages with similar or identical content, webmasters are able to convey to search engines which is the original page—the ‘canonical link’—and which are subsequent copies.
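As a minimal sketch (the URLs are invented), the tag is placed in the `<head>` of a duplicate or near-duplicate page, pointing at the original:

```html
<!-- In the <head> of the duplicate page, identifying the original -->
<link rel="canonical" href="https://www.example.com/original-page/" />
```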
URL parameters are used to indicate how search engines should handle parts of your site based on your URLs, so that they can crawl your site more efficiently. This typically refers to query strings appended to a URL, e.g. yoursite.com/products?sort=price or yoursite.com/products?sessionid=123, where the parameterised versions may duplicate the content of the base page, or where parameterised pages should not be showing up in search results.
The nofollow attribute is a way for publishers to inform Google and other search engines that they do not endorse certain links to other pages. Nofollow is important for search engine optimisation because it signals to search engines that the publisher is not selling influence or taking part in schemes deemed unacceptable SEO practice.
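A sponsored or untrusted link marked this way looks like the following in HTML (the URL and link text are invented):

```html
<!-- rel="nofollow" asks search engines not to pass endorsement through this link -->
<a href="https://www.example.com/advertiser-page/" rel="nofollow">Sponsored partner</a>
```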
Rel=prev and rel=next were introduced by Google in September 2011 to help tackle the problem of duplicate content across paginated pages. The attributes are added to the HTML code of a website in order to let search engines know that a certain collection of consecutive pages belongs together as a single series.
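On the middle page of a paginated series, both tags appear in the `<head>`, pointing at the neighbouring pages (the URLs here are illustrative):

```html
<!-- In the <head> of page 2 of a three-page series -->
<link rel="prev" href="https://www.example.com/articles/page/1/" />
<link rel="next" href="https://www.example.com/articles/page/3/" />
```

The first page of the series carries only rel=next, and the last page only rel=prev.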
A crawler is the name given to a program used by search engines that traverses the internet in order to collect and index data. A crawler will visit a site via a hyperlink. The crawler then reads the site’s content and embedded links before following the links away from the site. The crawler continues this process, following links from page to page, until it has visited and indexed every page it can reach. It essentially crawls the web, hence the name.
Robots.txt files and meta robots tags are used by webmasters and search engine optimisation agencies in order to give instructions to crawlers traversing and indexing a website. They tell the search spider what to do with a specific page or section of the site: this may include requesting that the spider does not crawl the page at all, or that it crawls the page but does not include it in Google’s index.
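The two mechanisms look like this in practice (the directory name is invented). A robots.txt file sits at the site root and applies to whole sections:

```
# robots.txt at the site root: ask all crawlers to skip /private/
User-agent: *
Disallow: /private/
```

A meta robots tag sits in an individual page's `<head>` and applies only to that page:

```html
<!-- Allow crawling and link-following, but keep this page out of the index -->
<meta name="robots" content="noindex, follow" />
```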
For a long time, search engines found it very difficult to read information and links that do not appear in HTML format within a page’s source code. This applied predominantly to interactive or multimedia content. The continuing development of Google Image Search has improved this somewhat.
First Click Free is a Google tool that allows Google bots to crawl and index content held behind forms, predominantly on subscription or registration-only sites (i.e. those with paywalls) such as The Times or The New York Times. Introduced in 2008, it allows Google to gain access to behind-form content and allows these pages to appear in search engine results for relevant queries.
A sitemap is essentially a file you create for your site that lists its URLs, enabling discovery by Google and other search engines. Providing this metadata in the form of a sitemap is a key way to position your content to appear in search.
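A minimal XML sitemap, following the sitemaps.org protocol, lists each URL in a `<url>` entry (the URL and date below are invented):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/about/</loc>
    <lastmod>2021-06-01</lastmod>
  </url>
</urlset>
```

The file is typically placed at the site root (e.g. /sitemap.xml) and submitted to search engines via their webmaster tools.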
Anchor text is the clickable, descriptive text of a hyperlink, readable both to the user and to search engines. How a link is described in its anchor text is widely considered to be a significant ranking factor, and remains an integral part of any content marketing or SEO campaign.
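In HTML, the anchor text is simply the content between the opening and closing `<a>` tags (the URL and wording here are invented):

```html
<!-- "blue widget buying guide" is the anchor text search engines read -->
<a href="https://www.example.com/blue-widgets/">blue widget buying guide</a>
```

Descriptive anchor text like this tells search engines what the target page is about, unlike generic phrases such as "click here".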
In a computing context, cache refers to the temporary storing of data, usually for purposes of fast retrieval upon a second load. In search specifically, “cache” is a reference to a web cache, usually HTML pages and images that are stored either by the browser or the search engine to reduce bandwidth.
A 301 redirect is an instruction that tells a browser: “The requested page is no longer available at the URL you have; you’ll find it at this new address”. The browser is then automatically redirected to the new URL and the desired, relocated content is displayed. This happens so quickly that users rarely notice a page has been redirected.
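At the HTTP level, the server answers the request for the old URL with a 301 status code and a Location header giving the new address (the URL below is invented):

```
HTTP/1.1 301 Moved Permanently
Location: https://www.example.com/new-page/
```

The browser (or search engine crawler) then requests the URL in the Location header; the "permanently" part is what tells search engines to transfer the old page's ranking signals to the new one.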