Crawling and indexing are at the heart of any search engine's job!
But what stops a site from getting indexed, and what steps can you take to speed up the process? What can you find when you crawl your own site, rather than relying on the search engines to get it right for you?
Dixon Jones was joined by a panel of experts on Wednesday 7th October as they answered these questions and more.
Watch on-demand
Listen to the podcast
Your Panelists
Rasmus Sørensen, Global SEO & eCommerce Director at Blinds 2go.
Patrick Hathaway, Director at Sitebulb.
Romain Damery, Director, Technical SEO at Path Interactive.
Transcript
Dixon Jones
Hello everybody.
Rasmus Sørensen
Hey.
Dixon Jones
Hey. Welcome to another episode of Old Guard vs New Blood with Majestic. So, this is 45 minutes of discussion on a particular topic and we’ve got a couple of oldies in the world here and that’s myself and Rasmus. Hi, Rasmus.
Rasmus Sørensen
Hello, Dixon. Hello everyone.
Dixon Jones
And then we've got some youngsters in the industry today. They might not actually be youngsters, but they may as well be compared to Rasmus and me. That's Romain and Patrick, so hi guys.
Patrick Hathaway
Hello.
Romain Damery
Hi, everyone.
Dixon Jones
We're just going to rabbit on for a couple of minutes whilst Facebook gets up to speed and people start to come on in. If anybody's in the chat room and wants to say hi, then feel free. I'll try and bring them into the conversation. I'm going to press some buttons so I can see some bits and pieces. And if any of you can see the feed on Facebook, guys, and there are questions there, then please tell me, because I'm really bad at checking those kinds of things out.
So, this month we've morphed Old Guard vs New Blood into a battle between what works and what doesn't on crawlers. And we've got a great panel, really, I think, and a good balance of in-house, agency and me, in a sort of limbo land. So Romain, why don't you start by introducing yourself, and what you do, and who's Path Interactive?
Romain Damery
Yeah. So I'm Romain Damery, Director of Tech SEO at Path Interactive. We are a pure digital marketing and advertising agency based out of New York City, although we're all working from home right now, so I guess it doesn't matter as much. That's a sneak peek at the office, by the way, right behind me, courtesy of Zoom virtual backgrounds, thanks for that. I'm pretty sure the office is exactly the same right now, just as empty. And I've been doing SEO and digital marketing for slightly over 10 years. So I'm very honored that I made the cut to be considered young blood, because I guess it wasn't that obvious at first. Dixon was like, "How old are you again, Romain?" Dixon, it's the beard, I'm not that old.
Dixon Jones
Just to give one away from the warm-up, guys: he did say that he'd used FrontPage, so that is old.
Romain Damery
Yeah, I started early. That's technically not really SEO, I didn't care about SEO that much. But yeah, I probably created my first FrontPage website when I was 12 or something like that.
Dixon Jones
It was SEO for me at the time. Anyway. Patrick, why don’t you say who you are and talk about your fantastic tool?
Patrick Hathaway
Hi, everybody. Yeah, I'm the co-founder of Sitebulb, which is a desktop website crawler and SEO auditing tool. So if you don't already use Sitebulb, just go and download it now while we're waiting to get started. You might as well go and download it and sign up for a trial and get going.
Romain Damery
The plug is at the end.
Patrick Hathaway
Oh, no, sorry. I'll say it at the end as well. Yeah, so I mean, that's me, we do SEO auditing with the software. And I've been in the SEO world since about 2012. Before Sitebulb we had another tool called URL Profiler, and that's kind of how we got into the tool game. But it's been Sitebulb pretty much since 2017, and it's going from strength to strength and we're… Yeah, that's me.
Dixon Jones
Jolly good. Okay. And Rasmus. Lovely blinds that you’ve got behind you there.
Rasmus Sørensen
Yeah. I thought that was appropriate, even though it's a Zoom background. I don't think you really want to see the real background here.
Dixon Jones
Okay.
Rasmus Sørensen
So, thanks for having me on, Dixon. You and I have known each other for a long time, I think it's bordering on 20 years, and I've been doing SEO the whole time, both at the agency level and in-house, and as an affiliate, and been down that route. And recently I joined Blinds 2go, which sells blinds and curtains online, as the Global SEO and eCommerce Director. So it's exciting to finally be on the client side for a change and steering that along.
Dixon Jones
So I remember once flying to Copenhagen for the day and going to a conference that Rasmus had, and I went to Copenhagen for lunch, a couple of beers, a trip round his… well, factory-based venue, and it was fun, and I came back the same day. But yeah, I think we've known each other, I don't know, since the very first conference I went to. So probably 2002, something like that.
Rasmus Sørensen
Yeah.
Dixon Jones
It’s been a long time, anyway, so yeah.
Patrick Hathaway
That’s a long time ago. Yeah.
Rasmus Sørensen
I know.
Dixon Jones
So, anyway. So, obviously, this is put on by Majestic, and Majestic have come up with a couple of things since last month. If you haven't checked out the new hub for storing and looking after your reports, and the new campaign manager tool, then do go and have a look at that if you're a Majestic user. I hope you are. And also they've got an interesting post on link fidelity versus link volatility, which takes you a little bit around the houses as to why volume is not always good compared to quality and stuff. So there are interesting posts on there.
Dixon Jones
If you want to try Majestic and you haven't already, then it's a good thing to do. A little tip for you guys, if you aren't a Majestic user: I don't know if it's still true, but if you ask them for a demo really nicely, then they'll do you a one-to-one demo, and then ask them if there's any way you can get in and have a bit of a trial. I don't know if that still works, and I might get sacked for saying that if it doesn't. But anyway, it's worth a try.
Dixon Jones
Okay, so, guys, we're going to be talking about crawling today, obviously something that Majestic does a lot, and Patrick does a lot. But also, on the agency side, you've probably got some clients that have crawl issues, and certainly things have changed a lot on the crawling side over the years. So I think there may be some different perspectives from the olds in the room and the younger ones in the room. So I'm going to start with the approach that I usually do, which is: give us one thing about crawling that SEOs don't seem to pay attention to. And I'll start off with Romain. Is there something that you think SEOs don't pay attention to in crawling that they should?
Romain Damery
Well, I don't want to sound presumptuous, so maybe less experienced SEOs, I would say, don't pay enough attention to the rendering process that happens between crawling and indexing. I think that's something that people don't pay enough attention to. And rendering, beyond just the JavaScript, is also what the page looks like: how long does it take to render the main content, and things like that. I think that's more and more important, yeah.
Dixon Jones
Okay. I think we’ll come back to that a little bit later because I think it’s going to be good. Patrick, do you want to go with one?
Patrick Hathaway
Yeah. So, I get a lot of support tickets, or get exposed to customer issues, when they get confused or when they're not sure what to do with certain things. And often, the way it gets pitched to us is: why is this thing happening? Why is X happening, or why is Y happening? One of the things we see a lot of is, the software's taking ages to crawl my site, why is it so bad? Why is it so slow? And when you dig into this and say, right, can you send me a screenshot so I can just have a look at what's going on? The screenshot arrives and there's a queue of 500,000 URLs. It's already crawled 500,000. And you can just see in the URL log that it's stuck crawling paginated pages or faceted search or something like that. And when you've pointed this out to the customer, that you need to go and limit the crawl, that that's what the problem is here, and most of the time they are SEO professionals who are using our software, we'll get responses like, okay, right-
Dixon Jones
Sorry, that was me. Okay, if there are any questions on Facebook, I can't see them. I've just closed Facebook down. Sorry, Patrick, carry on.
Patrick Hathaway
What was I saying?
Dixon Jones
Oh, yeah. Crawling too fast.
Patrick Hathaway
So yeah, and the response might be, okay, these pages are canonicalized anyway, right? And that's not the point. The point is that you shouldn't be exposing Google or crawlers to these pages in the first place. You're forcing them to crawl a whole load of crap that they shouldn't really need to be crawling. Unless they need to crawl these URLs from a discovery perspective, you shouldn't be forcing them to crawl them. It's just a massive waste of crawl budget. And I think there's definitely a disconnect, certainly with a bunch of our customers, where they're not really understanding that actually you should be making it easy for Googlebot to navigate your site and find everything that you want them to find, all your valuable pages, your unique, indexable, valuable stuff, instead of wasting time all over the place on whatever junk they're crawling.
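To make Patrick's point about crawl scope concrete, here is a minimal Python sketch of the kind of rule a crawl configuration might apply to stop spending budget on faceted and deeply paginated URLs. The parameter names, the pagination cut-off and the example URLs are hypothetical; a real site would use whatever its own faceted navigation actually generates.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical facet/sort parameters. Substitute whatever your own
# faceted navigation actually generates.
NOISY_PARAMS = {"color", "size", "sort", "price", "sessionid"}
MAX_PAGINATION_DEPTH = 5  # arbitrary cut-off for ?page=N

def in_crawl_scope(url: str) -> bool:
    """Return False for URLs that a crawl shouldn't be spending budget on."""
    query = parse_qs(urlparse(url).query)

    # Skip any URL carrying facet or tracking parameters.
    if NOISY_PARAMS & set(query):
        return False

    # Allow shallow pagination for discovery, skip the long tail.
    page = query.get("page", ["1"])[0]
    if page.isdigit() and int(page) > MAX_PAGINATION_DEPTH:
        return False

    return True

print(in_crawl_scope("https://example.com/shoes?color=blue&sort=price"))  # False
print(in_crawl_scope("https://example.com/shoes?page=2"))                 # True
```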
Dixon Jones
There are some good tips in there as well, so you know. And Rasmus, got anything to add on there?
Rasmus Sørensen
Yeah, I think one of the questions I usually ask, at least when I was on the agency side, is how many pages do you have. And usually the other people around the table, their eyes start to [inaudible 00:10:00] because they really don't know. So I've written a little note here, it says sitemap versus actual pages, versus internal crawls, versus the site: operator at Google. By that I kind of mean: how many pages do you have? Or do you know, if you've done a full crawl, whether you've got 100% coverage on your crawls? And usually I get the answer that no, we don't really know how many pages we have. So how do you know that you're actually getting everything indexed, and getting the rankings and the traffic you deserve, based on how big your site is?
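As a rough illustration of Rasmus's "sitemap versus actual pages versus internal crawls" check, here is a Python sketch that compares the URLs declared in an XML sitemap with the URLs found by a crawl. The sitemap URL and the crawled set are hypothetical placeholders; in practice the crawled set would come from a crawler export, and you would also compare against Search Console's coverage numbers.

```python
import requests
import xml.etree.ElementTree as ET

SITEMAP_URL = "https://example.com/sitemap.xml"  # hypothetical
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(url):
    """Collect <loc> entries from a sitemap, recursing into sitemap indexes."""
    root = ET.fromstring(requests.get(url, timeout=10).content)
    if root.tag.endswith("sitemapindex"):
        urls = set()
        for loc in root.findall("sm:sitemap/sm:loc", NS):
            urls |= sitemap_urls(loc.text.strip())
        return urls
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)}

# 'crawled' would come from your own crawler's export (hypothetical here).
crawled = {"https://example.com/", "https://example.com/blinds/white/"}
in_sitemap = sitemap_urls(SITEMAP_URL)

print(f"Sitemap: {len(in_sitemap)}  Crawl: {len(crawled)}")
print("In sitemap but never crawled:", in_sitemap - crawled)
print("Crawled but missing from sitemap:", crawled - in_sitemap)
```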
Dixon Jones
Okay, so there’s some overlap there between canonicalization questions and crawls and stuff.
So I think, if I'm going to add one that's different to those: I think that a lot of SEOs don't think about where their site is hosted and whether there are other websites sitting on the same IP address, trying to hog all the hosting. Google, or any search engine, has to decide what it's going to crawl. And if you've got one website on the same server as 200 other websites, and one of them happens to be a massive, great big website with 2 million pages and a little bit of authority, then that may be hogging a lot of the crawl as well. So that's the kind of thing I'd add on there. I say that a lot, but I don't know if it's still true. That's me saying my bit. Do you think that people make a mistake going for cheap hosting sometimes, just from the effect on crawl? Anyone want to go with that?
Rasmus Sørensen
Well, I think absolutely. I talked to a guy the other day in that situation. He'd built a site for his wife. And he said, "Well, I put it on a Raspberry Pi and put it in my living room." What? So just that's a huge-
Dixon Jones
He’s clever though.
Rasmus Sørensen
Yeah.
Romain Damery
That is true.
Rasmus Sørensen
But I said, once you've got Google crawling, Yahoo crawling, MSN crawling, and then customers and all those kinds of things… So yeah, I do believe that putting things on a shared host, when it has a lot of pages you want to get crawled, is a dangerous path to take if it's a very important site of yours, definitely.
Romain Damery
I was going to say, I think shared hosting is not necessarily bad for crawling or indexing. I think it depends more on the type of site that you're hosting, and how much traffic you have. Do you have huge peaks of traffic or not? Because especially if you're on shared hosting… Well, if you have a lot of traffic, that's probably a bad idea, just because of the bandwidth and the CPUs and things like that. But if it's a small blog, then I would say it should crawl fine, you should be fine.
Patrick Hathaway
Yeah.
Romain Damery
But if you have all those sites on the same machine and they're always over capacity, then that will probably cause issues for Googlebot, or even for ourselves to crawl: we can get kicked out, it just takes forever, things like that. So I guess the load on those servers is what may create problems or not. Otherwise, I don't think there's inherently anything very bad with it. Although you also don't really have much control over what you can do on that server, and then you can have other issues: you can't install this, you can't turn that on… It's very limited what you can do.
So yeah, I guess it depends.
Dixon Jones
Yeah. Patrick, anything you want to add?
Patrick Hathaway
No, I mean, not really. I was going to say it's horses for courses, exactly like Romain said: if you have a small site, it doesn't matter, it's not like you're not going to get crawled. But as soon as SEO is a big factor for you, you should be looking to upgrade your server anyway. It shouldn't come to that, really.
Dixon Jones
Yeah, it's not that expensive, is it really, in the greater scheme of things? Okay, so, guys, we're talking about canonicalization, indexing, how many pages you've got on your site and stuff. So that kind of leads into… a lot of our users are WordPress users. 58% of the web? I don't know what it is, but
Romain Damery
That’s crazy.
Dixon Jones
A lot of people, and probably a larger-than-average proportion of Majestic users, are WordPress users, because SEOs seemed to dive into that technology. So, I'm not saying whether Majestic or WordPress is good or bad or indifferent, but looking at WordPress, and looking at the points that you guys made about canonicalization versus redirects, and trying to limit the crawl to just the pages that you want, and not having a bot going through different versions of the same page: how does that manifest itself in WordPress? What things go wrong for WordPress users that cause the kind of problems that you're talking about in the crawl?
I mean, Patrick, do you want to jump in first?
Patrick Hathaway
I mean, I don’t know specifically, I suppose I don’t really get why it’s inherently a WordPress problem that we need to be focusing on.
Dixon Jones
But one that I think is in WordPress: because it's a blog, you've always got your content and then you've got page two, and page three, and page four. So you've got all of those pages, which basically are summaries of all the content of the individual posts themselves. Just the way it comes out by default, you've got all these other pages, which have to be duplicate content, because they're just getting snippets of the next section. Just an example.
Patrick Hathaway
Yeah, I mean, I think with all that stuff, there are pretty straightforward plugins. Yoast takes care of all that stuff, so that you're not even surfacing it to the crawler. So I guess, if you've got a crappy CMS, whether it's your WordPress install or a plugin in WordPress or whatever, that's creating extra pages, different versions of the same page, you don't want that at all. You need to be turning that stuff off if you can. And then the rest of it just all comes down to consistency in terms of how you're referencing the pages. So making sure you've got self-referencing canonicals and using absolute URLs on them.
Making sure your architecture and internal linking is all completely consistent. And then even when you're doing external links, making sure that's consistent: you're only ever linking to the canonical, and only that URL goes in the sitemap. So in terms of the signals you're sending to Google, you're making sure that completely consistent signals are going to Google, so that they agree with you on what the canonical is.
And then in terms of sections of content being repeated on multiple pages, from a crawl and indexing perspective, I don't really see that as a big deal, to be honest. I don't see it as something you need to be worrying about too much. Google can understand that this is a category page and this is the real post; that's not too much of a big deal. And it's quite straightforward to tick a few boxes in your plugin to hide those pages or not surface them in the first place.
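Patrick's "consistent signals" advice is easy to spot-check with a script. Below is a minimal Python sketch (using requests and BeautifulSoup) that flags missing, relative or non-self-referencing canonicals, and canonicals that never appear in the XML sitemap. The URLs are hypothetical examples, and a dedicated auditing tool already does this at scale.

```python
import requests
from bs4 import BeautifulSoup

def check_canonical(url, sitemap_urls):
    """Flag inconsistent canonical signals: missing, relative,
    non-self-referencing, or absent from the XML sitemap."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    link = soup.find("link", rel="canonical")

    if link is None or not link.get("href"):
        print(f"{url}: no canonical tag")
        return

    canonical = link["href"]
    if not canonical.startswith("http"):
        print(f"{url}: canonical is relative, should be an absolute URL")
    if canonical.rstrip("/") != url.rstrip("/"):
        print(f"{url}: canonicalized to {canonical} (not self-referencing)")
    if canonical not in sitemap_urls:
        print(f"{url}: canonical target is missing from the XML sitemap")

# Hypothetical example:
check_canonical("https://example.com/blinds/white/",
                sitemap_urls={"https://example.com/blinds/white/"})
```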
Romain Damery
Yeah, I think pagination is good, though, on a variety of sites. Pagination, I think, is good for discovering content on the site and making sure it's part of the architecture. But for WordPress, and especially, actually, for Shopify now, because it's really booming: one of the issues that we see is that you have one product and you put it in different collections. And then, depending on which collection the product page is linked from, the URL changes and adds that collection, and you end up having the same page under different URLs.
And then, like Patrick said, you want to make sure that you have a single self-referencing canonical tag, usually to the shorter version, the PDP URL, and not the one that's coming from the XY collection page. But then what we find also is that if you're only really featuring products in specific collections, then even if you have a self-referencing canonical to the actual core product URL, Google still prefers the one that features the most on the site, that has most of the internal links, over the actual product URL itself, which is not really linked to from anywhere; you have to go to all products and then go through the pagination to find it. So the canonical may not work if you're always featuring a non-canonical URL, and then you may not have the one you want indexed. Same thing for sharing on social media, you kind of need to be aware of that.
But otherwise, yeah, pagination I think is really great. I mean, for WordPress, tags and categories can help with the interlinking and can keep users engaged, which can be good for them. The issue that we see on some large WordPress sites is categories and tags getting out of control. You have editors and marketers, and you end up with like 25 tags for one piece of content, and there's really no search volume at all for those tags or the category pages, and they're all like one [blurb 00:19:49]… You've got to have a strategy for how many taxonomies you have, right? What makes sense? What makes sense for users also changes, so maybe there are some tags or categories that don't make sense any more. But sometimes it's a nice way, also, for a piece of content that would be on page 40 of the pagination series to be interlinked higher on the website, because it's on a prominent category or tag that is valuable for users and search engines.
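The Shopify situation Romain describes, the same product reachable at /collections/&lt;collection&gt;/products/&lt;handle&gt; as well as at its canonical /products/&lt;handle&gt; URL, can be spotted from a crawl's internal-link export. A small Python sketch, with hypothetical URLs, might look like this:

```python
import re
from collections import defaultdict

# Internal product links as a crawler might report them (hypothetical examples).
internal_links = [
    "https://shop.example.com/collections/wooden/products/white-wooden-blind",
    "https://shop.example.com/collections/white/products/white-wooden-blind",
    "https://shop.example.com/products/roller-blind",
]

by_product = defaultdict(set)
for url in internal_links:
    match = re.search(r"/products/[^/?#]+", url)
    if match:
        by_product[match.group(0)].add(url)

for product, urls in by_product.items():
    canonical = "https://shop.example.com" + product
    # A product only ever linked through collection paths is the case Romain
    # describes: the canonical URL itself gets almost no internal links.
    if canonical not in urls:
        print(f"{product}: linked only via collection URLs -> {sorted(urls)}")
```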
Dixon Jones
Is that easy to change on Shopify? So you're saying that canonicals don't really work very well on some Shopify sites. I'm not a Shopify expert, so excuse me if I ask stupid questions.
Romain Damery
There's nothing you can do to change that on Shopify.
Dixon Jones
Yeah, okay. All right. Shopify works great. Okay.
Dixon Jones
Rasmus, you’re on eCommerce, so…
Rasmus Sørensen
Yeah, I think the…
Dixon Jones
… You find that problem a lot?
Rasmus Sørensen
Oh, yeah, definitely. And I think the problems relate to both what Romain and Patrick said here. At least with WordPress, for example, out of the box, with the categories, the tags, the pages, the posts, in my experience I spend a lot of time removing stuff, de-indexing stuff, both through plugins and otherwise.
But using WordPress out of the box for, let's say, a regular site and not a blog, a lot of work actually goes into removing stuff and making sure that the tags don't… And I agree with what Romain said: you have 40 tags, and then you suddenly have 40 identical pages where the only difference is maybe the title or the slug or something. And the same with the categories, and it can easily get out of hand. I'm seeing that a lot with standard WordPress, but also in eCommerce and especially, as Romain mentioned, Shopify, where you have, like us selling blinds, a wooden blind, but it's also a white wooden blind. So it's actually relevant for two categories. But if it's the only product, then we might end up having two category pages with the same product, and that's the only product in them, which is just a waste of users' and crawlers' and everybody's time and effort, I think.
Dixon Jones
Presumably, there's a dichotomy there between… So, Patrick would like us all to just have one URL to crawl, because he's doing the crawl, and just 301 redirect to the one version. But eCommerce is going to need… If you've got it in red and got it in blue, you kind of want to have some kind of canonical to get to one version of that, but you're still going to need to have different URLs for the user to be able to use.
Rasmus Sørensen
Yeah, but we're trying to fix it the other way around and say, well, let's say we have a blind, that's where I'm coming from, and we have it in five colors. Then we try, on the product description, to say, well, these blinds come in red, blue, white and gray. And then we might have a drop-down on the product page where you choose either your fabric or your color. So you basically have one page, and then you have the variables on the page. It's basically the one page, but once you get more products and you get more diversified colors, it can easily turn into a huge mess. Figuring out which item is the right one is difficult.
Dixon Jones
So, do you want to jump in?
Romain Damery
Yeah, that's a good point as well. When you have filters and faceted navigation to choose the color, things like that, usually what we do is look at search volume: do people really look for that blue item? And then maybe it doesn't make sense to add a canonical tag to it. If you have three colors, that might not be such a big deal. But the problem starts when you have too many options and filters, and Google can crawl all of that, and it becomes an almost infinite number of permutations of the filters.
So usually we try to make sure, through keyword research, what makes sense to try to index because it has volume and it's valuable. And we make sure that on the CMS all the facets are always in the same order, so it's easy to exclude anything that has maybe more than two or three facets at once. Because people will look for, say, a blue Nike shoe for men, but they usually don't type the size directly into search engines. It's like, get me the shoe, and then they know. So maybe we don't need to index that size or the finer details, because it can get out of control.
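A sketch of the facet handling Romain outlines: keep the facets in a fixed order, and only let combinations that keyword research justifies stay indexable, with everything else canonicalized back to the base category. The facet names, the allowlist and the two-facet limit are hypothetical examples, not anything the panel prescribed.

```python
from urllib.parse import urlencode

# Facet combinations worth indexing because keyword research shows real
# search volume (hypothetical values; this would come from your own research).
INDEXABLE_FACETS = {("brand",), ("color",), ("brand", "color")}
MAX_FACETS = 2

def facet_url(base, facets):
    """Build a facet URL with the facets in a fixed order, and decide whether
    it should be indexable or canonicalized back to the base category page."""
    ordered = sorted(facets.items())                  # always the same order
    url = f"{base}?{urlencode(ordered)}" if ordered else base
    keys = tuple(k for k, _ in ordered)
    indexable = len(keys) <= MAX_FACETS and keys in INDEXABLE_FACETS
    return url, indexable

print(facet_url("https://example.com/shoes", {"color": "blue", "brand": "nike"}))
# ('https://example.com/shoes?brand=nike&color=blue', True)
print(facet_url("https://example.com/shoes", {"color": "blue", "size": "10"}))
# ('https://example.com/shoes?color=blue&size=10', False)
```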
Dixon Jones
So maybe you're better off having a JavaScript sort of carousel of images and things, so you've just got one URL, and then they can see the pictures and-
Rasmus Sørensen
At least in theory it sounds pretty good. But then the real world hits and… We probably do sometimes do it. For blinds at least, in Europe, and especially in Scandinavia, it's the white blinds that you can see behind me; we like the white blinds.
Dixon Jones
It’s just very clean, very clean.
Rasmus Sørensen
Yeah, very. Just white. So instead of having all the colors and optimizing for 10 different colors, we kind of say, well, let's have the white as the primary product page. And then, as Romain is mentioning, if people are using the filters and we see, okay, somebody is really changing it to red all the time, then maybe we should do a collection or category page with red in it.
Romain Damery
Mm-hmm (affirmative). Exactly, yeah. And to the stronger page.
Rasmus Sørensen
I think the advice here from Romain, that you should use keyword research and search volume as one of the determining factors of whether or not you should create a page, is a great idea. Because creating a page that no one's searching for is, again, a waste of time and resources.
Patrick Hathaway
And your internal search right?
Rasmus Sørensen
And your internal search, yeah. That's a very good question there, Patrick. That's something we've just utilized, and one thing I started digging into is the internal search: what are people actually searching for? It's a phenomenal insight into what we should create content about.
Dixon Jones
Okay. So-
Romain Damery
Yeah. We just don't use our internal search enough, really. You really have gold in internal search, and you see things like, oh, I didn't think about that. And sometimes it just ranks. And that's fine if it doesn't get out of control, and you can get a lot of traffic from it too. Yeah.
Rasmus Sørensen
But speaking of crawling and indexing, it can also create this dreaded spider trap, where… I saw a site recently where they put the 10 most popular internal searches underneath the search field, and if you click them, you get the search results, and it was a bot trap like you've never seen, because the bot just kept on searching itself. And it just-
Dixon Jones
Yeah, it makes sense to just… You've got to robots.txt out, or block, the actual search results of your own site search from the search engines.
Rasmus Sørensen
Yeah, and then use the information to build a real page if it turns out there are a lot of searches for it. Yeah.
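Checking that internal search results really are blocked, as Dixon and Rasmus suggest, only needs the standard library. This Python sketch assumes a hypothetical site whose robots.txt already disallows /search; the URLs are examples only.

```python
from urllib import robotparser

# Assumes a hypothetical robots.txt that already contains something like:
#   User-agent: *
#   Disallow: /search
rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

for url in [
    "https://example.com/search?q=white+blinds",   # internal search result
    "https://example.com/blinds/white/",           # real landing page
]:
    allowed = rp.can_fetch("Googlebot", url)
    print(f"{'ALLOWED' if allowed else 'blocked':7} {url}")
```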
Dixon Jones
Okay, so you talked at the start, Romain, about JavaScript, or rendering really, so page rendering and issues around page rendering. And Google have spent a lot of time recently talking about their JavaScript rendering service, their web rendering service, and how they're very good at JavaScript rendering these days.
Dixon Jones
So I guess we've got a few questions ready. How good are they? Patrick, how good is Sitebulb compared to Google? And Romain, tell us a bit more about rendering. And for those that don't know the difference between the rendered DOM and view source, what's the difference?
Romain Damery
Well, rendering is when it's retrieving all the resources that the page is calling and executing those. So that basically means crawling with JavaScript, to make it easy to imagine. But it's also the layout: what does it look like then? And I think it's slightly different; I'm not talking necessarily about rendering with all of the different body tags and things like that. For crawling, what Google's rendering service is more concerned with is just the content. It doesn't have to run everything, just the JavaScript required to retrieve and display that content, and the CSS as well, things like that.
So I think where some things may get missed is that it's sometimes not enough to just enable JavaScript on your crawler and say, I found it in the crawl, okay. It's also: how long did it take the crawler to actually go through that rendering process for the main content itself? And what does it look like? Do you find your main content pretty high on the page? Does it look wonky? Are there resources that are blocked that prevent the page from looking the way it should to the user? Things like that.
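One way to see the difference Romain is describing between view source and the rendered DOM, and how long rendering takes, is to fetch the raw HTML and then render the same page in headless Chromium. A rough Python sketch using Playwright follows; the URL, the main-content selector and the five-second budget are hypothetical, and this is only a loose proxy for what Google's web rendering service does.

```python
import time
import requests
from playwright.sync_api import sync_playwright

URL = "https://example.com/"   # hypothetical
MAIN_CONTENT = "main"          # selector for the content that actually matters

# 1. "View source": the raw HTML, no JavaScript executed.
raw_html = requests.get(URL, timeout=10).text

# 2. Rendered DOM: headless Chromium executes the JavaScript.
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    start = time.monotonic()
    page.goto(URL, wait_until="networkidle")
    # Raises if the main content doesn't appear within a 5-second budget.
    page.wait_for_selector(MAIN_CONTENT, timeout=5000)
    render_seconds = time.monotonic() - start
    rendered_html = page.content()
    browser.close()

print(f"Main content rendered in {render_seconds:.1f}s")
print(f"Raw HTML:     {len(raw_html):>8} bytes, {raw_html.count('<a '):>4} links")
print(f"Rendered DOM: {len(rendered_html):>8} bytes, {rendered_html.count('<a '):>4} links")
# A big gap between the two is content (or links) that only exists after rendering.
```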
Dixon Jones
So Patrick, so your tool, that’s what your tool does all the time. So when does it hit problems? When does it sail through? Is being a desktop tool an advantage for you on JavaScript? How does that work?
Patrick Hathaway
Yeah, I mean, so a few of your questions there. In terms of how it compares to what Google can do: Sitebulb is effectively the same, if not better, than what Google can do. We use headless Chrome to do it. And I think it was, I don't know, this time last year or something, Google switched to evergreen Chromium, so they're basically doing the same, using pretty much the latest stable release of Chromium to do their rendering. And we do the same, basically. So whenever the latest stable version comes out, we upgrade to that.
So what that means is that in terms of page features, there's not that much that Sitebulb can't handle, and there's not that much that Google can't handle. So it's very rare that we'll see a site that… I can probably count on one hand the number of times where we've actually seen a site that Sitebulb can't render at all. The problems tend to come when there's way too much on a page, too many scripts, too much stuff happening. Either you're asking Sitebulb, or you're asking Google, to do way too much work. And it's not going to just keep downloading more stuff and keep processing everything. It's just going to go, "I'm going to cut off now and we're not going to do any more work." And that sort of manifests in Sitebulb, when Sitebulb's trying to crawl a website, as a cutoff point.
So let's say five seconds, by default. If it's getting to a point where it's not finding certain links on the page, because it's hitting this cutoff point before the page is fully rendered, then you're not going to be able to crawl the whole site. And in the same way, as Romain was just saying, Google are saying they're kind of prioritizing the content at the top of the page over what they consider perhaps less important content, or content that they're just not getting a chance to get to because the computational cost is too high; you're asking them to do too much work.
So there's a massive difference in terms of how much computing power you need to just go and grab a page of HTML versus what you need for rendering. I think it's something like a 20 times difference. So that's why they can't just do this by default, for everything, all the time. And they're having to make these decisions: do we need to do this? They're looking for ways and opportunities, when they're crawling and rendering, to make sure they get the most important stuff. That might mean that there's a load of content on your page that's just not getting indexed at all. And if it means that there are links that aren't getting found, then that can be a problem from a discoverability perspective, and then an indexing problem as well.
Romain Damery
But otherwise, for sure, Googlebot and Sitebulb are efficient at rendering JavaScript; I don't think there are really huge issues there. The issues arise when, yeah, it takes too long. And then for your site it's like, do you have enough crawl budget to get indexed on time, within the crawl budget that's allocated to you? So that's my only issue, it's making sure that your pages get rendered-
Patrick Hathaway
It's about being lightweight, right?
Romain Damery
… And you getting more time.
Patrick Hathaway
And like you’ve-
Romain Damery
To speed up, to help the process, yeah.
Patrick Hathaway
Yeah, if anyone's interested in this stuff, Onely have really led the way with everything that they publish about all this. They have a tool called Too Long; Didn't Render, which you can use if you're concerned. You can go and put a web page in there and it'll tell you the computational cost of rendering that page, and just give you kind of a barometer of: will this take a long time? Is there a chance that this stuff's not going to get indexed? Or is it just really lightweight?
Dixon Jones
I thought that’s what Sitebulb does as well.
Patrick Hathaway
Not in exactly the same way. They’ve collected data, they’ve done a whole load of research on loads of different sites. And so yeah, that’s worth checking out for that sort of thing.
Dixon Jones
Okay. Rasmus-
Rasmus Sørensen
Yeah, it's a good point here about speed, or the time it takes. It might be able to render the page and it might look fine. But if you have 10,000 lines of JavaScript, Google may be able to crawl it, but it's a matter of whether they really want to, or at least whether they want to index it and rank it, because they know that for a user it might crash at their end as well. So one thing is whether they're able to do it; another thing is whether they actually want to do it, and, in the end, actually rank it for the searchers.
And I do see a lot of challenges there.
Dixon Jones
So one of the problems for a lot of people, though, is that "too long, didn't render" is coming in because they just put third-party plugins on the page, and they didn't actually know what the hell was happening with them, and they don't know what's happening at the other end. But the other thing with JavaScript bloat, surely, and tell me if I'm fixing this the wrong way here, is that sometimes you've got JavaScript across all of the site, and you're actually only using it on four or five pages, so you've just put the code on everything.
Dixon Jones
So presumably you can use Google Tag Manager to stop that code. If you put the code in via Tag Manager, you can then use triggers so it only loads when that page is loading. So would that be a clever idea, a clever tip for people, or a stupid one?
Romain Damery
I wouldn’t…
Dixon Jones
You wouldn’t do it that way?
Romain Damery
No, well, the thing is, I think you're referring to the third-party tags that are wreaking havoc on your page speed, for sure. Those don't necessarily get crawled by Google during the rendering process. So even if you do that, it's not really what's going to speed up the rendering on the Google side of things, although it will help, for sure, along with doing things like server-side tagging. But that's more for the user, and getting better Chrome user experience scores there. In terms of the rendering on the Google side, I think it's more about working on the JavaScript that really matters, the JavaScript that's required for your content.
If it's a five-megabyte app bundle, then that's not going to work well; you should do a hybrid, with some pre-rendering, and then you can hydrate and add the JavaScript bells and whistles on top of it. But it's more about getting down to the core JavaScript that is required for the content, and the CSS, working on the critical rendering path and just shipping what you need for the content. In terms of crawl budget and making sure that you get indexed faster, I think that's more what's going to help you than just firing specific tags based on whether they're really required on that page or not; that's going to end up in the Chrome User Experience Report anyway.
Patrick Hathaway
And there's a fundamental difference between sites that happen to use a bit of JavaScript, which almost every site does, and sites where the content is dependent upon JavaScript, where the content itself [crosstalk 00:36:17] is changing. That's the difference in what we're talking about here. So, for instance, Sitebulb's site doesn't have much dependence on JavaScript at all, so we don't really need to worry about that particularly. But if you have some bits of content, you might have a product carousel, for instance, that's just getting populated by JavaScript. That sort of thing-
Dixon Jones
To show different colors, for example.
Patrick Hathaway
To show different colors. Exactly. And you might find all we're getting is white, white, white, white, white, white, which is perfect if you're in Scandinavia, where that's all they want anyway. So it doesn't matter.
Patrick Hathaway
But then you try and launch in the UK, and they have all these idiots trying to get red blinds, and green blinds, and blue ones. They can't find them because Google aren't actually indexing this stuff, because it takes too long to load in the JavaScript. That's where you could have problems with this.
Rasmus Sørensen
Yeah, that's a good point. Because we know that the searchers are there. We might not create 10 different pages based on color, for example, and then we do a lot of JavaScript. But the challenge is to actually get in front of those different types of searches, and to figure out how to do it without having 9,000 lines of JavaScript.
Romain Damery
Yeah. Otherwise, you’re only going to rank for your brand.
Rasmus Sørensen
Exactly.
Romain Damery
And the white blinds.
Rasmus Sørensen
Which in some cases is good, but… Or even white blinds, which is also a good idea. But it’s not enough, though.
Dixon Jones
Well, on that one, and if it's not going too far away from the crawling theme: you said, Patrick, one version is great, that's perfect, and canonicals are not as good as one version. And then we talked about speed, but Google advocates that, if you can, you use AMP, because it speeds everything up. So are you a fan of AMP or not a fan of AMP? Because that gives you two versions, two URLs, of all your content anyway. Who wants to jump in on AMP?
Rasmus Sørensen
Well, I'll make it short. I've never touched it and I'm not sure I will. I'm no developer, by any means.
Dixon Jones
The old guard. Oh, God. Okay.
Rasmus Sørensen
No, just build a fast page.
Dixon Jones
I've been like that as well, I've been anti-AMP. But I think that's the old people and not the young people there. So it will be interesting where we go over here.
Rasmus Sørensen
Right.
Patrick Hathaway
I think it depends on the niche, right? If you're in news and stuff, AMP is a big deal, because you'll have so much traffic from mobile in the first place, and getting that real estate at the top is really, really important. We don't have to do it for our own sites, and we don't have clients anymore because AMP came in after we'd stopped doing client work. So hopefully we can actually get some proper feedback from Romain on that. But yeah, I see it as something which is important depending on which niche you're working in, and otherwise you probably don't need to worry about it.
Rasmus Sørensen
Romain.
Romain Damery
Yeah, news for sure. For news it's almost a requirement. I'm not against AMP, it's just that maybe it should be more open source and less of a Google-owned product, because Google is kind of hosting the cache of the AMP version and all that.
Romain Damery
But other than that, even if you're not in news, it can be a good band-aid, a quick fix for a site that's really slow and out of control, where there's just too much, and it's going to be a huge project: you're going to have to rebuild the whole site with different technology, a very different stack, and different concerns from the get-go, and that doesn't happen overnight. If you're two years away from that, then doing some AMP pages may actually help you in Google. But you need to make sure you have a dedicated team that really works on the AMP site, because you can't just slap an AMP plugin on it and call it a day; the conversion rates may go down the drain too. It's like a micro mobile site, you kind of have to support that.
Dixon Jones
I was surprised, though, recently. I'm not an AMP person and I'm not a very clever person when it comes to actual proper website stuff. But I found it's really easy to put AMP on a WordPress site now, because WordPress goes through this [inaudible 00:40:45] inbuilt thing, and you press a few buttons and it did speed up the site, for sure. I mean, I tested it on Lighthouse. I wonder if Lighthouse and Sitebulb get mixed up a lot.
Romain Damery
For sure.
Dixon Jones
And it sped up quite a bit. It did come up with a couple of extra errors, but I thought AMP was going to be a pain in the arse and it turned out to be quite easy.
Romain Damery
What was it for? Blog content?
Dixon Jones
Yeah, it was just a WordPress blog. So it’s pretty straightforward.
Romain Damery
Exactly. Did you like how it looked?
Dixon Jones
I wouldn't use it on an eCommerce site though, I'm not sure that it's prime time for that yet.
Romain Damery
Yeah.
Rasmus Sørensen
But I haven’t looked into it for a while, so I wouldn’t really know.
Dixon Jones
Yeah, I’m not convinced about the longevity of AMP as a concept anyway, I think it’s-
Rasmus Sørensen
I think it's kind of a fixer, isn't it? It's more fixing a symptom.
Romain Damery
Yeah, exactly.
Rasmus Sørensen
Rather than going into the root cause.
Romain Damery
But the idea of it is good. I mean, when we have clients and they're like, "How do we start building with the…", we tell them, "AMP-like. Try AMP-like," and then don't worry, like [inaudible 00:41:49], things will add themselves on top of it, but you should aim to develop something that is AMP-like. Because it's just so fast, and that's really amazing and great, for eCommerce, or blogs, or users around the world on 3G or whatever. It just works well.
Rasmus Sørensen
I think that's a good point, actually, to get the conversation going about fast sites. Then you could do some tests and see whether we're actually converting better on faster sites, and then get the whole internal discussion going about whether or not we should look into AMP for our businesses. So it could be the icebreaker for having that discussion, so it doesn't become, well, I think it's fast; well, I think it's slow. It's-
Dixon Jones
Is it faster or slower than AMP? Okay.
Rasmus Sørensen
Yeah, exactly.
Dixon Jones
Okay, so we're pretty much up to 45 minutes already, so it doesn't take very long, guys. But what I'd like to do before we go, firstly, is say thanks again to Majestic. Also, if you haven't used Majestic's new thing, Majestic Monitor, that's pretty cool. It's a different product that they've got that really leverages their data to try and figure out who's an influencer and who's not in different verticals. It's a different product to the sort of link building stuff that Majestic does, but give that a try, and I think you'll find it a nice, interesting, fresh new kind of idea. Anyway, that's my advert for Majestic before we go.
Dixon Jones
Guys, if people want to get in touch with you, where can they find you? Maybe not so much for Rasmus, he's in-house, unless they want to buy blinds. What kind of people do you want to make contact with you, and how do they get in contact with you, Romain?
Romain Damery
Well, you can find me on LinkedIn and on Twitter. I'm coming back to Twitter, believe it or not. So yeah, you can find me there. Hit me up there if you need any help with crawling, rendering, indexing, and obviously your rankings as well, right? And I'm sure you do if you're listening right now. So yeah, feel free to hit me up. But hurry, because, maybe like your traffic, I may disappear from Twitter again. So be proactive about it.
Dixon Jones
Okay, I’ll let Patrick go last so he can have the last word because I’m sure he’ll sell himself. Rasmus, what-
Rasmus Sørensen
No worries.
Dixon Jones
… Where can we find you?
Rasmus Sørensen
Well, on Twitter, on Facebook, on Instagram, on LinkedIn. I think maybe in your Facebook feed once in a while.
Dixon Jones
So, you just put Sorenson in with that funny O thing that we haven't got on our keyboard.
Rasmus Sørensen
Rasmus Sørensen in, yeah. There are not many of us, so I should pop up.
Dixon Jones
Excellent. Okay, Patrick, I’m sure you’d like all of our audience to come and speak to you?
Patrick Hathaway
Pretty much, yeah. So, thank you, Dixon. You can find me on Twitter as well; I only really hang out there, so @HathawayP, you'll find me there. And over on Sitebulb, which everybody should've downloaded at the start, but if you haven't, you can do that now. Anyone who does SEO basically needs a crawler and therefore needs Sitebulb. Can I also, while I'm here, plug a live stream Q&A which we're doing, a bit like this, a week today at eight o'clock. We've got Lily Ray and Jono Alderson, and we'll be talking about the business case for structured data. So if you go to youtube.com/sitebulb you can check it out there and sign up for the reminder. A week today.
Dixon Jones
Excellent. And Lily works with Romain, of course.
And Jono works with Yoast. So, okay, guys, absolutely thank you ever so much. I’m sorry to anyone on Facebook for me not being able to see the questions on the Facebook feed. That’s just because Zoom and Facebook just don’t seem to work as well as they should really.
But we'll see you again next month. Keep an eye on the Majestic blog for the next topic. I'm going to press the buttons now to stop recording and everything, and I think when I press it, everything dies and we cut you all off. So I'll email you to say thank you afterwards, but I really appreciate you guys coming on. So thanks ever so much, guys.
Patrick Hathaway
Thank you, Dixon.
Romain Damery
Thanks, everyone.
Dixon Jones
Bye.
Previous Webinars
Follow our Twitter account @Majestic to hear about more upcoming webinars!
Or if you want to catch up with all of our webinars, you can find them on our Digital Marketing Webinars page.