Use Sitebulb’s S3 Export feature to automatically export audit data to any S3-compatible storage solution.
This feature is particularly useful for data warehousing, for hosting crawl data independently, or for feeding your own data models.
The S3 Export feature also lets you export larger CSV files, beyond the 1M-row limit of standard Sitebulb exports.
Prerequisites
To set up automated S3 Data Exports, you will need your AWS S3 credentials, including your S3 endpoint, access key, and secret key. You should also have created at least one data bucket.
This feature is only available on Sitebulb Cloud on certain plans. If you wish to add the S3 Export feature to your Sitebulb Cloud plan, please contact sales.
Enable S3 Exports
You will find the S3 Exports settings under the ‘Data Exports’ tab in your audit settings.
Add and verify your credentials
Add your S3 endpoint, access key, and secret key, then click Connect to verify the settings and establish the connection.
Select your S3 data bucket
Once you have successfully connected to your S3-compatible endpoint, you will see every available data bucket in the drop-down list.
Need to add a new bucket?
Create the bucket at your S3 endpoint, then click Refresh to update the list of available buckets.
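If the bucket you expect is missing from the drop-down, it can help to check outside Sitebulb which buckets your credentials can see. The following is a minimal sketch, assuming Python with the third-party boto3 library installed. The helper names make_s3_client and list_bucket_names are hypothetical and are not part of Sitebulb.

```python
def make_s3_client(endpoint_url, access_key, secret_key):
    """Build a client for an S3-compatible endpoint (assumes boto3 is installed)."""
    import boto3  # third-party: pip install boto3

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def list_bucket_names(client):
    """Return the names of the buckets visible to these credentials, sorted."""
    response = client.list_buckets()
    return sorted(bucket["Name"] for bucket in response.get("Buckets", []))
```

Calling list_bucket_names(make_s3_client(endpoint, access_key, secret_key)) with the same three values you entered in Sitebulb should return a list that includes your data bucket. If it does not, the credentials may lack permission to list buckets.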
Save and Crawl
At this point, you can save your settings and set your crawl running.
Once your Audit has completed, you should see the S3 Export status on your Audit Overview.
Access your exported data
To access your data, you will need to log in to your S3 account and navigate to the relevant data bucket. The CSV export should appear there automatically as soon as the audit has finished.
Your data is now ready to be used and analyzed!
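If you would rather fetch the export with a script than through your storage provider's console, the sketch below shows one way to do it. It assumes a boto3 S3 client configured with your endpoint and keys. The function names are hypothetical, and because this article does not describe how exported files are named, the sketch simply looks for objects ending in .csv and picks the most recently modified one.

```python
def list_csv_exports(client, bucket, prefix=""):
    """Return (key, last_modified) pairs for CSV objects, newest first."""
    paginator = client.get_paginator("list_objects_v2")
    found = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page has no matching objects.
        for obj in page.get("Contents", []):
            if obj["Key"].lower().endswith(".csv"):
                found.append((obj["Key"], obj["LastModified"]))
    return sorted(found, key=lambda item: item[1], reverse=True)


def download_latest_export(client, bucket, destination, prefix=""):
    """Download the newest CSV object to `destination`; return its key or None."""
    exports = list_csv_exports(client, bucket, prefix)
    if not exports:
        return None
    key = exports[0][0]
    client.download_file(bucket, key, destination)
    return key
```

For example, download_latest_export(client, "my-bucket", "export.csv") would save the newest CSV in the hypothetical bucket my-bucket to a local file, or return None if the bucket holds no CSV files yet.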
What is included in the export
The S3 Data Exports contain a list of all Internal URLs crawled as part of your Audit, along with the data columns for each URL.
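Because these exports can exceed 1M rows, it is usually safer to stream the file than to load it all into memory. The sketch below uses only Python's standard csv module to read the header and count the URL rows. It makes no assumption about which columns are present, and the function name summarize_export is hypothetical.

```python
import csv


def summarize_export(csv_file):
    """Stream an open CSV file and return (column_names, row_count)."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # first row holds the column names
    row_count = sum(1 for _ in reader)  # one remaining row per exported URL
    return header, row_count
```

To use it on a downloaded file, open the file with newline="" as the csv module recommends, for example: with open("export.csv", newline="", encoding="utf-8") as f: columns, rows = summarize_export(f). The file name and the UTF-8 encoding are assumptions here, so adjust them to match your export.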
