Internal link mapping: How to create a visual link map

This is a process to map out the internal linking structure of a website and apply statistical analysis to find stronger pages to link from, and those that have too few incoming links.

A well-thought-out internal linking strategy can give a website a better chance of ranking well, compared to a site with a lot of dead ends and poor access to pages.

We can shape our website’s internal architecture so that all pages are accessible within just a few clicks, creating a great user experience and keeping things simple and quick for Google to crawl.

What you’ll need to follow this tutorial:

  1. Ahrefs
  2. Screaming Frog
  3. Gephi
  4. A spreadsheet – I’m using Google Sheets

1. Gather all internal links

The first thing we need to do is crawl the site and gather all the internal links.

Open up Screaming Frog, but before entering your URL and starting the crawl, let’s change a couple of configuration options so we only bring back the results that will help us improve our internal linking.

In the top menu, go to Configuration, then choose Spider.

This opens a box like the one below.

Because we only want the internal links going to pages, we’re only going to check a few of the options.

You can choose the same options as I have in the image.

Click OK.

Screaming Frog Spider Configuration

It’s also possible to collect only the links within the content sections of pages by using the Screaming Frog custom extraction feature. This can be a little tricky, as the code you need will depend on how your site is set up. It will usually look something like this:

div[@class='post-content-text'] 
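
To get a feel for what a selector like this picks out, here is a rough Python sketch that uses only the standard library. The HTML snippet, the class name and the function name are all made up for illustration, and Screaming Frog's own extraction may behave differently; the point is simply that links outside the content container are ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page snippet: one link in the navigation,
# two links inside the content container.
PAGE = """
<html><body>
  <nav><a href="/about/">About</a></nav>
  <div class="post-content-text">
    <p>Read the <a href="/guide/">guide</a> and the
       <a href="/tools/">tools page</a>.</p>
  </div>
</body></html>
"""

def content_links(markup, css_class="post-content-text"):
    """Return the hrefs of links found inside the div with the given class."""
    root = ET.fromstring(markup)
    # ElementTree supports a small XPath subset, which is enough here.
    anchors = root.findall(f".//div[@class='{css_class}']//a")
    return [a.get("href") for a in anchors]

print(content_links(PAGE))
```

Real pages are rarely well-formed XML, so treat this as a way to reason about the selector rather than as a crawler.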

Now enter your domain and hit Start.

When it’s done, go to Bulk Export and choose All Inlinks.

Open up the .csv you’ve just created and delete the first column, then rename column B Target.

Delete all the other columns except Source and Target.

Now we want to clean up the spreadsheet a bit. I’m using Google Sheets.

Follow these instructions:

  1. Delete row one
  2. Select the new row one, then go to View > Freeze > 1 row.
  3. Sort column A by A-Z
  4. Delete all the rows that don’t have AHREF in column A
  5. Now delete all columns except B and C
  6. Rename Column B Target.

Now your spreadsheet only contains two columns, and shows all the links from each page (Source) to its destination (Target). Keep the spreadsheet tab open; we’ll come back to it soon.
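
If you prefer to script the cleanup, the same filtering can be sketched in Python. This is only an illustration: it assumes the export has a header row with Type, Source and Destination columns, the sample rows are invented, and the function name is my own. Check the column names and link type values in your own export before relying on it.

```python
import csv
import io

def clean_inlinks(csv_text, link_types=("AHREF",)):
    """Reduce an All Inlinks export to (Source, Target) pairs.

    Assumes a header row with Type, Source and Destination columns.
    If your file starts with a title row, delete it first.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["Source"], row["Destination"])
        for row in rows
        if row["Type"] in link_types  # keep only ordinary page links
    ]

# Tiny made-up export for illustration.
SAMPLE = """Type,Source,Destination,Anchor
AHREF,https://example.com/,https://example.com/blog/,Blog
IMG,https://example.com/,https://example.com/logo.png,
AHREF,https://example.com/blog/,https://example.com/,Home
"""

edges = clean_inlinks(SAMPLE)
for source, target in edges:
    print(source, "->", target)
```

The accepted link types are a parameter because the label used in the Type column may differ between versions of the export.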

2. Gather URL ratings

Now we want to get the URL rating for each of the pages on the site so when we’re mapping our internal links, we can use this extra data to add even more power to the system.

Go to Ahrefs Site Explorer and enter your domain.

In the left toolbar, under Pages, click Best by Links. You’ll see a page that looks like this:

Then choose Export > Full Export.

Ahrefs export best pages by backlinks

3. Combine internal links with URL ratings

Now we’re going to put all the URL ratings and internal link data into a single spreadsheet.

Open your new file and delete all columns except Page URL and URL ratings. Now copy those two columns into your Screaming Frog spreadsheet in columns C and D.

Google sheets URL ratings

Now:

  1. Insert one new column to the right of column A, and name it “URL rating”. The Ahrefs data now sits in columns D and E.
  2. Paste this formula into cell B2:
    =VLOOKUP(A2,D:E,2,FALSE)
  3. Press enter.
  4. Double click the little blue square at the bottom right of B2 to copy the formula into the whole column.
  5. Now you have the URL rating of each source page, along with all the pages each one links to.
  6. Export as CSV.
Google sheets export
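
For anyone who would rather do the lookup outside a spreadsheet, here is a minimal Python sketch of the same idea: look up each source page's URL rating and write out a Source, URL rating, Target file. The URLs, ratings and function names are invented for the example, and giving unmatched pages a rating of 0 is my own assumption (the spreadsheet lookup would show an error instead).

```python
import csv
import io

def add_source_ratings(edges, ratings, default=0.0):
    """Attach each source page's URL rating to its outgoing links.

    edges   -- list of (source, target) URL pairs
    ratings -- dict mapping page URL -> URL rating
    Pages missing from `ratings` get `default` (an assumption).
    """
    return [(source, ratings.get(source, default), target)
            for source, target in edges]

def to_csv(rows):
    """Serialise the rows with the column names used in this tutorial."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Source", "URL rating", "Target"])
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up data for illustration.
edges = [
    ("https://example.com/", "https://example.com/blog/"),
    ("https://example.com/blog/", "https://example.com/"),
    ("https://example.com/new-page/", "https://example.com/"),
]
ratings = {"https://example.com/": 35.0, "https://example.com/blog/": 12.0}

rows = add_source_ratings(edges, ratings)
print(to_csv(rows))
```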

4. Visualizing your internal links

Now it’s time to take all this data and start mapping our website architecture. There are different ways to do this to get slightly different visualizations, so it’s worth experimenting with settings once you know the basics of how it works.

Gephi setup for internal link map
  1. Open Gephi and create a new project.
  2. Now go to File > Import Spreadsheet, and navigate to where your .csv file is. Tip: sometimes Gephi can’t open my downloads folder where the sheet is, so I move the file to the desktop.
  3. Choose Edges Table, and click next.
  4. In the next dialogue box, tick the URL rating box and choose Float.
  5. Untick Page URL and URL rating (desc).

You’ll get something that looks like this:

Disorderly nodes and edges of link map

Now we’ve got some kind of techno nightmare spiderweb, but we can’t get a lot of useful insights from it.

What we have now is a scrambled mess of nodes and edges.

Node = a small circle that represents a page on your site.

Edge = an arrow that represents an internal link (they look like lines until you zoom in).

Make sure you’re in the Overview tab, and in the toolbar on the left, choose a layout from the drop-down menu. Fruchterman Reingold is a good one for this.

Click run, and it will give you something more like this:

Fruchterman Reingold link map

Now it’s beginning to take shape, but there are a few more things we need to do to make it a visualization we can use.

In the appearance section of the Overview tab, click Nodes, then click the three circles on the top right of the toolbar, choose Ranking, then In-Degree from the drop down menu.

Each node (a page on your site) is sized according to how many links are pointing to it.

Now click the Edges tab, and choose the color palette on the right. Click Ranking and choose URL rating from the drop-down menu, then choose a color scheme you like. I’ve just used default.

Now your nodes are colored based on the URL ratings we got from Ahrefs. The higher the URL rating, the darker the node!

Gephi link map with URL ratings

Now go to the Data Laboratory tab, click Copy data to other column in the box at the bottom of the screen, select ID, then copy to Label in the popup box. This gives all your nodes the label of their respective URL.

Link map page IDs

Finally, go to the Preview tab, select Show Labels and click refresh.

I’ve kept the labels hidden for this example, but it should look something along these lines:

Link map with labels

5. Analysing your link map

What we’ve got now is pretty cool, but we’re not done yet.

All this is pointless unless we can gain some valuable insights about the site structure and, more importantly, how to improve it.

From the image above, we can see that the larger nodes have more incoming internal links, and the darker colored nodes have more incoming external links (from the Ahrefs URL rating), while the smaller, lighter ones have fewer internal and external links.

Easy version:

You can zoom in to see the labels of each page and assess whether the internal linking is good for that particular page.

What I often do is create a list of important pages and check each individually in the map.

As we know, we can influence the flow of (external) link juice throughout the site by linking pages with a lot of backlinks to those that we want to rank. So on our link map, we can make a note of darker colored nodes and create internal links from those to pages with smaller nodes (that have few incoming internal links).

If you have important pages that are small and light coloured, you might want to add some internal links from larger, darker nodes.

Less easy version:

Use Gephi’s statistical analysis tools to make your decisions easier.

Some useful functions for this project are:

  • Internal page rank
  • In Degree
  • Betweenness centrality distribution

And a few ways to use them:

Size = Internal page rank

Internal page rank (similar to, but not the same as, Google’s PageRank metric) is the likelihood of someone landing on a given page if they click links at random. Pages with the most incoming links will tend to be larger in this visualization.
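
To make the "randomly clicking links" idea concrete, here is a small Python sketch of a PageRank-style calculation over a list of internal links. It is a simplified illustration, not Gephi's implementation: the damping value of 0.85, the fixed number of iterations, the handling of dead ends and the example URLs are all assumptions of mine.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Estimate internal page rank from (source, target) link pairs."""
    nodes = sorted({url for edge in edges for url in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {node: (1.0 - damping) / count for node in nodes}
        for node in nodes:
            # A dead end passes its rank evenly to every page.
            targets = out_links[node] or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Made-up site: every page links back to the home page.
links = [
    ("/", "/blog/"), ("/", "/about/"),
    ("/blog/", "/"), ("/blog/", "/post/"),
    ("/about/", "/"), ("/post/", "/"),
]
scores = pagerank(links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(f"{page:10} {scores[page]:.3f}")
```

In this sketch a link that appears twice in the list counts twice, so deduplicate the pairs first if you want each linking page to count once.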

Size = In Degree

In Degree is the relative strength of a page determined by the number of incoming internal links.

In this visualization, the larger nodes have more links coming in from other pages.
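
In Degree is simple enough to compute by hand from the Source and Target columns. The sketch below counts, for each page, how many distinct pages link to it. The URLs are invented, and treating repeated links from the same page as a single link is a choice I have made for the example.

```python
from collections import Counter

def in_degree(edges):
    """Count how many distinct pages link to each page."""
    pages = {url for edge in edges for url in edge}
    # Start every page at zero so pages with no incoming links show up too.
    counts = Counter({page: 0 for page in pages})
    for source, target in set(edges):  # set() drops repeated links
        counts[target] += 1
    return counts

# Made-up links; the home page link to /blog/ appears twice.
links = [
    ("/", "/blog/"), ("/", "/blog/"), ("/", "/about/"),
    ("/blog/", "/"), ("/about/", "/"),
    ("/blog/", "/post/"), ("/post/", "/"),
    ("/orphan/", "/"),
]
counts = in_degree(links)
for page, incoming in counts.most_common():
    print(page, incoming)
```

Pages that end up at zero are the ones nothing links to, which makes them obvious candidates for new internal links.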

Size = Betweenness Centrality

Betweenness centrality measures how often each page acts as a bridge between other pages, that is, how often it sits on the shortest click path from one page to another.

Pages with the highest betweenness centrality should be good choices for building external backlinks to (linkbuilding), since they are the main internal link paths through the site.
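
For the curious, here is a compact Python sketch of how betweenness centrality can be calculated for a directed link graph, using the approach usually credited to Brandes (a breadth-first search from every page, then adding up each page's share of shortest paths). The scores are left unnormalised, so the numbers may not match Gephi's output, and the example site is made up.

```python
from collections import deque

def betweenness(edges):
    """Unnormalised betweenness centrality for a directed link graph."""
    nodes = {url for edge in edges for url in edge}
    adjacent = {node: [] for node in nodes}
    for source, target in set(edges):
        if source != target:  # self-links never sit between two pages
            adjacent[source].append(target)

    score = {node: 0.0 for node in nodes}
    for start in nodes:
        # Breadth-first search, counting shortest paths from `start`.
        order = []
        parents = {node: [] for node in nodes}
        paths = {node: 0 for node in nodes}
        paths[start] = 1
        distance = {start: 0}
        queue = deque([start])
        while queue:
            page = queue.popleft()
            order.append(page)
            for nxt in adjacent[page]:
                if nxt not in distance:
                    distance[nxt] = distance[page] + 1
                    queue.append(nxt)
                if distance[nxt] == distance[page] + 1:
                    paths[nxt] += paths[page]
                    parents[nxt].append(page)
        # Walk back from the farthest pages, crediting the pages in between.
        credit = {node: 0.0 for node in nodes}
        for page in reversed(order):
            for parent in parents[page]:
                credit[parent] += paths[parent] / paths[page] * (1 + credit[page])
            if page != start:
                score[page] += credit[page]
    return score

# Made-up site: home links to a hub, the hub links to two articles,
# and both articles link back to home.
links = [("home", "hub"), ("hub", "a"), ("hub", "b"), ("a", "home"), ("b", "home")]
result = betweenness(links)
for page in sorted(result):
    print(page, result[page])
```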

Overall strength

When using the above functions to identify stronger and weaker pages, you can combine the outputs (in your spreadsheet) to determine which are the very strongest pages across the site that should be used for outgoing internal links, and which are the weakest pages, that need more incoming links.

A simple but effective way to do this is to take the top 50 pages for each of the three metrics (page rank, In Degree and betweenness centrality) and count how many times each page appears. Then do the same for the inverse strength (the bottom 50 pages for each metric).
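
The counting step can be sketched in a few lines of Python. The function name, the metric values and the tiny list size of two (instead of 50) are purely illustrative; in practice you would feed in the figures exported from Gephi's Data Laboratory.

```python
from collections import Counter

def tally_pages(metrics, top_n=50, weakest=False):
    """Count how often each page appears in the top (or bottom) N per metric.

    metrics -- dict of metric name -> dict of page URL -> score
    """
    tally = Counter()
    for scores in metrics.values():
        # Highest scores first for the strongest pages, lowest first otherwise.
        ranked = sorted(scores, key=scores.get, reverse=not weakest)
        tally.update(ranked[:top_n])
    return tally.most_common()

# Made-up scores for a four-page site.
metrics = {
    "page rank": {"/": 0.40, "/blog/": 0.25, "/about/": 0.20, "/post/": 0.15},
    "in degree": {"/": 3, "/blog/": 2, "/about/": 1, "/post/": 0},
    "betweenness": {"/": 4.0, "/blog/": 1.0, "/about/": 3.0, "/post/": 0.0},
}

strongest = tally_pages(metrics, top_n=2)
weakest = tally_pages(metrics, top_n=2, weakest=True)
print("Link from:", strongest)
print("Link to:", weakest)
```

Pages that appear in all three top lists are the strongest candidates to link from, and pages that appear in all three bottom lists are the ones most in need of incoming links.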

EXTRA – Add your map to your website

This can be useful if you need to upload your map to your site in order to share it with your team, client or readers.

The way I like to do it might not be the best way but it’s simple and it works.

Gephi doesn’t have any immediately website-compatible outputs, but you can go to Tools > Plugins, and install a plugin called Sigma Exporter.

Now you’ll have the option to export as Sigma.js template.

It will save a folder called Network. Inside are the files and folders you’ll need to upload to your site.

Did you find this useful?

Let me know and I’ll add more fun SEO stuff you can try.
