Ep201 - ‘How Google Search Crawls Pages’

Release Date:


Episode 201 contains the Digital Marketing News and Updates from the week of Feb 26 - Mar 1, 2024.

1. ‘How Google Search Crawls Pages’ - In a comprehensive video, Google engineer Gary Illyes sheds light on how Google's search engine discovers and fetches web pages through a process known as crawling.

Crawling is the first step in making a webpage searchable. Google uses automated programs, known as crawlers, to find new or updated pages. The cornerstone of this process is URL discovery, where Google identifies new pages by following links from known pages. This highlights the importance of a well-structured website with effective internal linking, so that Google can discover and index new content efficiently.

A key tool for improving your website's discoverability is the sitemap: an XML file that lists your site's URLs along with additional metadata (a minimal sketch appears at the end of this item). While not mandatory, sitemaps are highly recommended because they significantly help Google and other search engines find your content. For business owners, this means working with your website provider or developer to ensure your site automatically generates sitemap files, saving you time and reducing the risk of errors.

Googlebot, Google's main crawler, uses algorithms to decide which sites to crawl, how often, and how many pages to fetch. This process is carefully balanced to avoid overloading your website, with the crawl rate adjusted based on your site's response times, content quality, and server health. It is therefore important for businesses to maintain a responsive, high-quality website that can be crawled efficiently.

Moreover, Googlebot only indexes publicly accessible URLs, so businesses should make sure their most important content is not hidden behind login pages. The crawling process concludes with downloading and rendering the pages, which allows Google to see and index dynamic content loaded via JavaScript.
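To make the sitemap format concrete, here is a minimal sketch in Python that writes a small sitemap.xml; the domain, paths, and dates are made-up placeholders, and in practice most website platforms generate this file for you automatically.

# Minimal sketch: generate a tiny sitemap.xml for a handful of pages.
# The URLs and dates below are placeholders for illustration only.
from datetime import date

pages = [
    ("https://www.example.com/", date(2024, 2, 26)),
    ("https://www.example.com/services", date(2024, 2, 20)),
    ("https://www.example.com/blog/how-crawling-works", date(2024, 2, 28)),
]

entries = "\n".join(
    f"  <url>\n    <loc>{url}</loc>\n    <lastmod>{modified.isoformat()}</lastmod>\n  </url>"
    for url, modified in pages
)

sitemap = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    f"{entries}\n"
    "</urlset>\n"
)

with open("sitemap.xml", "w", encoding="utf-8") as f:
    f.write(sitemap)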
2. Is Google Happy with 301+410 Responses? - In a recent discussion on Reddit, a user expressed concern that their site's "crawl budget" was being hurt by a combination of 301 redirects and 410 responses. Their setup redirected non-secure, outdated URLs to their secure counterparts, which then served a 410 status indicating the page is permanently removed. The user wondered whether this approach was hindering Googlebot's efficiency and contributing to crawl budget issues.

Google's John Mueller provided clarity, stating that a mix of 301 redirects (which send visitors from the HTTP to the HTTPS version of a site) followed by 410 responses is acceptable (a rough configuration sketch follows at the end of this item). Mueller emphasized that crawl budget concerns primarily affect very large sites, as detailed in Google's documentation. If a smaller site experiences crawl issues, the cause more likely lies in Google's assessment of the site's value than in technical problems, which points to evaluating the content itself to make it more appealing to Googlebot.

Mueller's comments highlight a critical aspect of SEO: the creation of valuable content. He criticizes common SEO strategies that merely replicate existing content, adding neither value nor originality. Likening this to producing more "Zeros" rather than unique "Ones," he argues that duplicating what is already available does not improve a site's worth in Google's eyes.

For business owners, this discussion underlines the importance of focusing on original, high-quality content over technical SEO manipulations. While ensuring your site is technically sound is necessary, the real competitive edge lies in offering something unique and valuable to your audience. This not only helps you stand out in search results but also aligns with Google's preference for indexing content that provides new information or perspectives.

In summary, while understanding the technicalities of SEO, such as crawl budgets and redirects, is important, the emphasis should be on content quality. Businesses should strive to create original content that answers unmet needs or provides fresh insights. This approach not only helps with better indexing by Google but also engages your audience more effectively, driving organic traffic and contributing to your site's long-term success.
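As a footnote to the 301+410 question, here is a rough sketch of that pattern in a small Flask application; the paths are hypothetical, and on real sites the HTTP-to-HTTPS redirect is usually handled by the web server or CDN rather than in application code.

# Rough sketch of the 301-then-410 pattern discussed above, using Flask.
# Paths are hypothetical; real sites typically issue the HTTPS redirect
# at the web server or CDN layer, not in application code.
from flask import Flask, Response, redirect, request

app = Flask(__name__)

# Hypothetical pages that have been permanently removed.
REMOVED_PATHS = {"/old-promo", "/discontinued-service"}

@app.before_request
def force_https():
    # Step 1: permanently (301) redirect insecure requests to their HTTPS counterparts.
    if not request.is_secure:
        return redirect(request.url.replace("http://", "https://", 1), code=301)

@app.route("/<path:page>")
def serve(page):
    # Step 2: answer permanently removed pages with 410 Gone so crawlers
    # know not to keep requesting them.
    if f"/{page}" in REMOVED_PATHS:
        return Response("This page has been permanently removed.", status=410)
    return Response(f"Content for /{page}", status=200)

if __name__ == "__main__":
    app.run()

Mueller's broader point still stands: a setup like this is fine, but it will not compensate for content Google does not consider valuable.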
3. UTM Parameters & SEO - Google's John Mueller emphasized that disallowing URLs with UTM parameters does not significantly enhance a website's search performance. Instead, he advocates maintaining clean and consistent internal URLs for good site hygiene and efficient tracking.

Mueller's advice is straightforward: focus on improving the site's structure to minimize the need for Google to crawl irrelevant URLs. This involves refining internal linking, employing rel-canonical tags judiciously, and keeping URLs consistent across feeds. The goal is to streamline site management and make it easier to track user interactions and traffic sources without compromising SEO performance.

A notable point Mueller makes concerns the handling of external links with UTM parameters. He advises against blocking these through robots.txt, suggesting instead that rel-canonical tags will manage them effectively over time, aligning external links with the site's canonical URL structure (a short sketch follows at the end of this item). This not only simplifies the cleanup of random parameter URLs but also reinforces the importance of fixing problems at the source: if a site generates random parameter URLs internally or through feed submissions, the priority should be to address those issues directly rather than relying on robots.txt to block them.

In summary, Mueller's guidance underscores the importance of website hygiene and the strategic use of tools like rel-canonical tags to manage URL parameters. His stance is clear: maintaining a clean website is crucial, but blocking external URLs with random parameters is not recommended. This aligns with Mueller's consistent approach to SEO best practices, which emphasizes foundational site improvements and efficient handling of URL parameters for better search visibility and tracking.
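To illustrate the rel-canonical idea for UTM-tagged URLs, here is a small Python sketch; the page URL and parameters are invented for the example, and the exact markup your CMS emits may differ.

# Sketch: strip utm_* tracking parameters from a landing URL and print the
# rel-canonical tag pointing back to the clean, canonical address.
# The example URL is made up.
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def canonical_url(url: str) -> str:
    parts = urlsplit(url)
    # Drop every query parameter that starts with "utm_" (campaign tracking).
    kept = [(k, v) for k, v in parse_qsl(parts.query) if not k.lower().startswith("utm_")]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

landing = "https://www.example.com/offers?utm_source=newsletter&utm_medium=email&utm_campaign=spring"
print(f'<link rel="canonical" href="{canonical_url(landing)}">')
# Output: <link rel="canonical" href="https://www.example.com/offers">

With a tag like this in the page's head, campaign links can keep their tracking parameters while Google consolidates signals on the clean URL.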

4. Transition Required for Google Business Profile Websites - Google has announced that, starting in March 2024, websites created through Google Business Profiles (GBP) will be deactivated, with an automatic redirect to the business's Google Business Profile in place until June 10, 2024. This move requires immediate attention from GBP website owners to ensure continuity in their online operations.

For businesses unsure whether their website is hosted through Google Business Profiles, a simple Google search for the business name and a look at the edit function of the Google Business Profile will reveal whether the site is a GBP creation. A GBP-built site is indicated by a message stating, “You have a website created with Google.” For those without a GBP website, the option to link an external site will be available.

In response to this change, Google has recommended several alternative website builders for affected businesses. Among the suggested platforms are Wix, Squarespace, GoDaddy, Google Sites, Shopify (specifically for e-commerce), Durable, Weebly, Strikingly, and WordPress. Each offers unique features, with WordPress notable for its free website builder incorporating generative AI capabilities. However, users should be aware ...
