To carry out a structured data audit you first need to set up Sitebulb to collect structured data when it crawls your website.
For the purpose of this guide we’re going to assume that you want to create a new project, but the same process can be followed if you’re setting up a new audit within an existing project, or setting up a previous audit to re-run including structured data.
Once you click the ‘Start a new Project’ button, you’ll be presented with the form below:
Name your project and add the URL of your website. Once you click 'Save and Continue', you'll be taken to the 'Audit Setup' page.
You can perform structured data validation with the HTML Crawler or the Chrome Crawler. However, if the website uses JavaScript to load in structured data markup, you will need to select the Chrome Crawler.
On the audit setup page, scroll down to the 'Extraction' section and tick the 'Structured Data' checkbox as shown below:
That's it - you're good to go! If you click 'Start Now' the audit will run with the Sitebulb crawler collecting structured data from your website.
Structured data extraction can be enabled alongside any other data collection options. Bear in mind, however, that some websites contain extensive structured data, which can make audits resource-intensive.
Exploring structured data audit results
Once you've set up Sitebulb to collect structured data, and run your crawl, you're then ready to start exploring the audit results. This guide will take you through the basics of doing just that.
You'll find the structured data section in the main, left-hand navigation menu:
When you open the structured data section of the tool, the first thing you'll see is the structured data menu at the top of the page, with tabs for each sub-section.
Above the sub-section tabs, you'll also notice two buttons. The first, 'Printable PDF', generates a structured data overview PDF containing the pie charts shown under the 'Overview' tab (we'll look at those in a moment). The second, 'Export Structured Data Summary', lets you export your data in either CSV or Google Sheets format.
Throughout the tool, reporting reflects the two core perspectives from which Sitebulb analyses structured data:
Schema - Schema reporting looks at all structured data within the Schema.org vocabulary, and in turn, Schema Validation follows the technical specification laid out by Schema.org.
Search Features - This analysis focuses only on structured data supported by Google, and carries out Search Feature Validation against Google's specification and guidelines, which go beyond the Schema.org framework.
We'll now give an overview of what you can find within each tab and how to explore your data:
Overview Tab
The overview page contains pie charts giving a high-level view of structured data across your website. On the left side you will see charts relating to Schema, and on the right, charts relating to Google Search Features.
URLs with Search Features and URLs with Schema - This indicates how many pages Sitebulb has identified as containing Schema markup or Search Features. Note that there may be multiple instances of structured data on any given URL. And a Search Feature will always be Schema, but Schema will not necessarily always be a Search Feature.
Search Feature URL Validation and Schema URL Validation - These graphs show how many URLs passed and failed validation checks. In order to pass validation, there must be no errors on the page, so even if the page contains multiple structured data entities that pass validation, the URL will still be recorded as 'failed' if a single error is present.
The same piece of structured data may pass Schema validation, but still fail Search Feature Validation. For Search Features Validation, a URL can still pass validation with 'warnings' present as these are not considered essential.
Search Feature Validation and Schema Entity Validation - These charts are similar to the previous 'URL Validation' charts, but they look at individual structured data entities and search features, of which there may be several on a URL.
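The difference between URL-level and entity-level validation can be illustrated with a minimal Python sketch. This is not Sitebulb's implementation: the Entity class and both functions are hypothetical names, used only to show the rule that a single error fails the whole URL while warnings do not.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One structured data entity found on a URL (hypothetical model)."""
    schema_type: str
    errors: list = field(default_factory=list)
    warnings: list = field(default_factory=list)


def entity_passes(entity):
    # Warnings are advisory; only errors cause a failure.
    return not entity.errors


def url_passes(entities):
    # A single error on any entity fails the whole URL.
    return all(entity_passes(e) for e in entities)


page = [
    Entity("Product"),
    Entity("BreadcrumbList", warnings=["optional property missing"]),
    Entity("Article", errors=["required property missing"]),
]

print(url_passes(page))      # False: the Article entity has an error
print(url_passes(page[:2]))  # True: warnings alone do not fail the URL
```

In this sketch, two of the three entities pass at entity level, yet the URL as a whole is recorded as failed, which is why the entity charts and URL charts can show different proportions.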
Search Features Tab
As the name suggests, this tab focuses on structured data supported by Google for its Search Features.
Each Google search feature type found during the crawl can be seen in its own row along with associated data. The blue 'View' button opens up a list of all URLs where that search feature type has been found, and the green 'Export' button lets you choose between downloading the data in CSV or Google Sheets format.
When errors and warnings have been found, clicking on the red 'See Errors' button opens up further details for that row in the form of nested tables.
Errors are issues which must be resolved for your structured data to be eligible for Google search features and rich results, whereas warnings are suggestions of how you can optimise your structured data beyond the bare essentials.
The 'learn more' links point to additional Search Features error and warning documentation on the Sitebulb website. And the Schema type and property names link to the associated documentation on the Schema.org website.
The blue 'View' buttons to the right of each row open up a full list of all the URLs affected by that error.
Schema Tab
The Schema tab has the same functionality as seen on the Search Features tab, but contains all Schema.org structured data on your site, not just the types supported by Google.
Again, rows can be expanded using the red 'See Errors' button, and as with Search Features, links to Schema error explanations and Schema.org documentation are provided for each error.
Unlike Google search features validation, Schema validation gives errors, but not warnings.
URLs, Entity Explorer and Properties Explorer
These last three tabs allow you to explore all of your structured data at different levels:
URLs - This lists all URLs that contain structured data.
Entities - This lists all of the individual structured data entities, of which there may be several per URL.
Properties - This lists all of the structured data properties. Most entities will have several associated properties.
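To make the relationship between entities and properties concrete, here is a small Python sketch that walks a JSON-LD snippet and lists each entity with its properties. The sample markup and the function names are assumptions made for illustration, and Sitebulb may count entities differently.

```python
import json

# Hypothetical JSON-LD as it might appear on a single URL.
JSON_LD = """
{
  "@context": "https://schema.org",
  "@type": "Book",
  "name": "Example Book",
  "author": {"@type": "Person", "name": "Jane Doe"}
}
"""


def iter_entities(node):
    """Yield every dict carrying an @type, including nested ones."""
    if isinstance(node, dict):
        if "@type" in node:
            yield node
        for value in node.values():
            yield from iter_entities(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_entities(item)


def list_properties(entity):
    """Return the entity's property names, skipping JSON-LD keywords."""
    return [key for key in entity if not key.startswith("@")]


entities = list(iter_entities(json.loads(JSON_LD)))
for entity in entities:
    print(entity["@type"], list_properties(entity))
```

Here one URL yields two entities (a Book and a nested Person), and each entity carries its own properties, which mirrors the three levels listed above.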
As with other Sitebulb explorers, you can apply advanced filters to your data. With the URL explorer you can also add additional columns to view other audit data alongside your structured data. Have a look at our guide on How to Customise URL Lists to find out more.
In all three of the explorers you'll notice a blue 'Structured Data' button to the left of each row.
This allows you to drill down further into the structured data for individual URLs using the structured data inspection tool:
Property Explorer Example
As mentioned earlier, both the entity explorer and property explorer allow you to apply advanced filters to your structured data.
With the property explorer, you can start interrogating the actual content of the structured data, which can make advanced filtering very powerful. For instance, on a book website you might want to just look at pages containing books from a particular author, published within a particular time frame, or only those currently in stock.
Example: Find all books with a 5 star rating
This example uses a series of 'AND' filters to return all Book search features that have a 5 star rating.
The filters could easily be adjusted to only show those with 1 star ratings. Or you could look at those with 4 star reviews, with a view to identifying books which could be promoted and pushed to 5 stars.
Alternatively you could apply additional filters to drill down to 5 star rated books which have search feature errors. This kind of approach can be useful for identifying high priority issues on your most important pages.
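The 'AND' filter logic can also be reproduced outside the tool, for example on exported property data. The sketch below is only an illustration: the row layout and column names (url, type, property, value) are hypothetical and do not reflect Sitebulb's actual export format.

```python
# Hypothetical property rows; the column names are assumptions.
rows = [
    {"url": "https://example.com/a", "type": "Book", "property": "ratingValue", "value": "5"},
    {"url": "https://example.com/b", "type": "Book", "property": "ratingValue", "value": "4"},
    {"url": "https://example.com/c", "type": "Movie", "property": "ratingValue", "value": "5"},
]


def matches_all(row, filters):
    """AND logic: a row is kept only if every filter matches."""
    return all(row.get(column) == expected for column, expected in filters.items())


# Three conditions combined with AND, as in the example above.
five_star_books = {"type": "Book", "property": "ratingValue", "value": "5"}

result = [row["url"] for row in rows if matches_all(row, five_star_books)]
print(result)  # ['https://example.com/a']
```

Changing the "value" entry to "1" or "4" would give the equivalent of the adjusted filters described above.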
How to audit JavaScript-generated structured data
Some websites use JavaScript to generate their structured data. The HTML Crawler may not pick up this markup because it does not render the page, so the Chrome Crawler should be used for these websites.
A simple way to tell if structured data is being generated with JavaScript is to view the source code (CTRL + U shortcut in most browsers) of a page which you know contains structured data, then search the page for 'schema.org'. If the search does not find any matches (unlike the example below), it is likely that the structured data is being generated with JavaScript.
If you do need to use a JavaScript crawler, select 'Chrome Crawler' from the 'Crawler Type' dropdown in the 'Crawler Settings' of your audit.
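The manual source-code check described above can also be scripted. The following Python sketch inspects raw, unrendered HTML for JSON-LD script blocks or any mention of 'schema.org'. The class and function names are hypothetical, and the two sample pages are made up for illustration.

```python
from html.parser import HTMLParser


class JsonLdFinder(HTMLParser):
    """Counts <script type="application/ld+json"> blocks in raw HTML."""

    def __init__(self):
        super().__init__()
        self.json_ld_blocks = 0

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None, so fall back to an empty string.
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self.json_ld_blocks += 1


def has_static_structured_data(html):
    """True if the unrendered source already contains structured data."""
    finder = JsonLdFinder()
    finder.feed(html)
    # The substring check also catches Microdata and RDFa markup.
    return finder.json_ld_blocks > 0 or "schema.org" in html.lower()


static_page = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "Book"}'
    "</script></head></html>"
)
js_page = '<html><head><script src="/app.js"></script></head><body></body></html>'

print(has_static_structured_data(static_page))  # True
print(has_static_structured_data(js_page))      # False
```

A False result on a page you know contains structured data suggests, but does not prove, that the markup is injected with JavaScript. To check a live page, you would pass in the raw response body, for example as fetched with urllib.request.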