Want to know how to extract URLs from a sitemap? This article walks through several methods for doing so. First, though, here is some basic background.
What is a Sitemap?
A sitemap is a file in which you list the pages, videos, and other files available on your website. Search engine crawlers such as Google's read this file to crawl your site more efficiently.
A sitemap tells Google which pages and files on the site matter most, and it can carry useful details about them, such as when a page was last updated.
Difference Between The XML and HTML Sitemap
The main distinction is that XML sitemaps are made for search engines, whereas HTML sitemaps are made for human visitors.
An XML sitemap is designed primarily for search engine crawlers. By reading the XML file, a crawler can quickly and efficiently extract the key information about your website.
Why To Extract URLs From A Sitemap?
Extracting the links from a sitemap gives you a list of all the web pages the site lists in it.
But we are not here to dwell on the “why”, since you probably know that already. Instead, let’s move on to how to extract URLs from a sitemap.
URLs can be extracted from both .xml and .html sitemaps, and the methods differ slightly between the two formats.
We have laid out several methods to extract or download URLs from a sitemap, ranging from Google Sheets to online tools.
Without further ado, let’s get to the first one.
Methods To Extract or Download URLs From Sitemap:
Extract Sitemap Using Google Sheet
Google Sheets is a decent option for extracting URLs from a sitemap. It gives you two ways to do it: a custom JavaScript function or a single built-in formula.
Let’s see how the two methods work.
We are considering https://couchdeck.com/page-sitemap.xml to illustrate the methods.
JavaScript
- Find the URL of the sitemap you want to extract.
- Create a new Google Sheet.
- In the menu bar, click on Extensions.
- Select “Apps Script.”

You will now be taken to the script editor, where we will create a new JavaScript function. For the time being, you can copy the code below and paste it into the editor.

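/**
 * Custom function: returns the <loc> value of every <url> entry in a sitemap.
 * @customfunction
 */
function sitemap(sitemapUrl, namespace) {
  try {
    // Download the sitemap and parse it as XML.
    var xml = UrlFetchApp.fetch(sitemapUrl).getContentText();
    var document = XmlService.parse(xml);
    var root = document.getRootElement();
    // Sitemap elements are namespaced, so every lookup must pass the namespace.
    var sitemapNameSpace = XmlService.getNamespace(namespace);
    var urls = root.getChildren('url', sitemapNameSpace);
    var locs = [];
    for (var i = 0; i < urls.length; i++) {
      locs.push(urls[i].getChild('loc', sitemapNameSpace).getText());
    }
    return locs;
  } catch (e) {
    // Show the error text in the cell instead of an Error object.
    return e.message;
  }
}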
- Now click the save button.
- After this, run the script once to test it.
Note- If the run finishes with no errors, the script has been set up correctly.
- On the first run, a new window will open where you have to click on “Allow” to let your sheet connect to an external service.
The function is now in place, so it’s time to extract the URLs from the sitemap using Google Sheets.
- Go back to the Google Sheet you were using, and type =sitemap("your sitemap URL", "Namespace"), using straight quotes.
For example;
Sitemap URL- https://couchdeck.com/page-sitemap.xml
Namespace URL- http://www.sitemaps.org/schemas/sitemap/0.9
The namespace URL is the same for every sitemap that follows the standard sitemaps.org protocol, so you normally do not have to change it.
The formula would be-
=sitemap("https://couchdeck.com/page-sitemap.xml","http://www.sitemaps.org/schemas/sitemap/0.9")
- After entering the formula, press “enter”.
Now, all the extracted URLs will be displayed on the sheet.
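The second argument has to match the namespace that the sitemap itself declares. As a rough check outside Sheets, the sketch below reads the default xmlns value from the root element of a sitemap string. It is plain JavaScript for Node.js rather than Apps Script, the function name declaredNamespace is made up for this illustration, and the regular-expression matching is an assumption that suits typical sitemaps rather than a full XML parser.

```javascript
// Return the default namespace (xmlns="...") declared on the root element,
// or null if none is found. Hypothetical helper, not part of any library.
function declaredNamespace(xml) {
  // The first tag that starts with a letter is the root element; this skips
  // the <?xml ...?> declaration and <!-- comments -->.
  const rootTag = xml.match(/<[A-Za-z_][^>]*>/);
  if (!rootTag) return null;
  // Match xmlns="..." but not prefixed declarations such as xmlns:xsi="...".
  const xmlns = rootTag[0].match(/\sxmlns\s*=\s*["']([^"']+)["']/);
  return xmlns ? xmlns[1] : null;
}

const sample =
  '<?xml version="1.0" encoding="UTF-8"?>\n' +
  '<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ' +
  'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' +
  '<url><loc>https://example.com/</loc></url></urlset>';

console.log(declaredNamespace(sample));
```

If the value printed differs from the namespace shown above, pass the printed value as the second argument to the sitemap function instead.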
Single code
This method is even easier and needs no script at all. Yes, you heard it right: a single built-in formula can do the job.
=IMPORTXML("https://couchdeck.com/page-sitemap.xml", "//*[local-name() ='url']/*[local-name() ='loc']")
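This is a working formula and can be used to extract or download URLs from the sitemap. The local-name() tests match elements by name regardless of their namespace, which is why no namespace argument is needed here.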
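To see what that XPath expression selects, here is a minimal sketch of the same idea in plain JavaScript (Node.js): it takes every url element, with or without a namespace prefix, and returns the text of its loc element. The function name extractPageUrls and the sample data are invented for this illustration, and it uses regular expressions as a simplifying assumption, so it is not a substitute for a real XML parser.

```javascript
// Return the <loc> text of every <url> entry, ignoring any namespace prefix,
// so <url> and <sm:url> are treated alike (similar to local-name() in XPath).
function extractPageUrls(xml) {
  const urlBlock = /<(?:[\w.-]+:)?url[\s>][\s\S]*?<\/(?:[\w.-]+:)?url>/g;
  const locTag = /<(?:[\w.-]+:)?loc>\s*([^<]+?)\s*<\/(?:[\w.-]+:)?loc>/;
  const urls = [];
  for (const block of xml.match(urlBlock) || []) {
    // Assumes the page URL is the first <loc> inside each <url> entry.
    const loc = block.match(locTag);
    if (loc) urls.push(loc[1]);
  }
  return urls;
}

const prefixedSample =
  '<sm:urlset xmlns:sm="http://www.sitemaps.org/schemas/sitemap/0.9">' +
  '<sm:url><sm:loc>https://example.com/</sm:loc></sm:url>' +
  '<sm:url><sm:loc>https://example.com/about</sm:loc>' +
  '<sm:lastmod>2024-01-01</sm:lastmod></sm:url>' +
  '</sm:urlset>';

console.log(extractPageUrls(prefixedSample));
```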
Extract URLs/Links From XML Sitemaps with Screaming Frog
For this method, you need to install a program called Screaming Frog on your device. It is a popular SEO tool whose features include extracting URLs from a sitemap.
This strategy also works well for sitemap index files, which contain a list of sub-sitemaps.
Follow these steps to extract URLs from the sitemap using Screaming Frog.
- Run the Screaming Frog SEO Spider Tool.
- Click on Mode & then select List.

- Choose Upload.

- Select Download XML Sitemap.
- Now enter the URL of the sitemap (the .xml file).
Now, you are good to go. The extracted URLs will be shown in the software itself.
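For readers curious how a sitemap index differs from a regular sitemap, the sketch below tells the two apart and lists their entries. It says nothing about how Screaming Frog works internally. It is plain JavaScript (Node.js), the function name readSitemap and the example.com URLs are invented for this illustration, and it assumes unprefixed element names and regular-expression matching instead of a full XML parser.

```javascript
// Classify sitemap XML and list its <loc> values.
// A sitemap index wraps <sitemap> entries in <sitemapindex>;
// a regular sitemap wraps <url> entries in <urlset>.
function readSitemap(xml) {
  const isIndex = /<sitemapindex[\s>]/.test(xml);
  const entryTag = isIndex ? "sitemap" : "url";
  const entry = new RegExp(`<${entryTag}[\\s>][\\s\\S]*?</${entryTag}>`, "g");
  const locs = [];
  for (const block of xml.match(entry) || []) {
    const loc = block.match(/<loc>\s*([^<]+?)\s*<\/loc>/);
    if (loc) locs.push(loc[1]);
  }
  return { type: isIndex ? "index" : "urlset", locs };
}

const indexSample =
  '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' +
  '<sitemap><loc>https://example.com/page-sitemap.xml</loc></sitemap>' +
  '<sitemap><loc>https://example.com/post-sitemap.xml</loc></sitemap>' +
  '</sitemapindex>';

console.log(readSitemap(indexSample));
```

When the type is "index", each returned entry is itself a sitemap URL that has to be downloaded and read in turn to reach the page URLs.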
Extract URLs From XML Sitemaps with command-line tools
If you are comfortable with a terminal, command-line tools can be a quicker alternative, so let’s go through the exact process.
Follow the steps below to extract URLs from a sitemap using command-line tools.

- Open the terminal.
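- Type the given command-
curl -s https://couchdeck.com/page-sitemap.xml | grep -o '<loc>[^<]*' | sed 's/<loc>//'
- Here curl downloads the raw XML, grep keeps only the <loc> entries, and sed strips the opening tag, so you get the extracted URLs from the provided sitemap, one per line.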
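If you would rather have a small script than a one-line command, the following is a minimal Node.js sketch of the same fetch-and-filter idea. It assumes Node 18 or later (for the built-in fetch), the file name extract-urls.js and the function names are invented for this illustration, and XML entities such as &amp; are left undecoded.

```javascript
// Pull the text of every <loc> element out of raw sitemap XML.
function locsFromXml(xml) {
  const matches = xml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g);
  return Array.from(matches, (m) => m[1]);
}

// Download a sitemap and return its URLs (needs Node 18+ for global fetch).
async function fetchSitemapUrls(sitemapUrl) {
  const response = await fetch(sitemapUrl);
  if (!response.ok) {
    throw new Error(`HTTP ${response.status} for ${sitemapUrl}`);
  }
  return locsFromXml(await response.text());
}

// Run only when a sitemap URL is passed on the command line.
const sitemapArg = process.argv[2];
if (sitemapArg) {
  fetchSitemapUrls(sitemapArg)
    .then((urls) => console.log(urls.join("\n")))
    .catch((err) => {
      console.error(err.message);
      process.exitCode = 1;
    });
}
```

Saved as extract-urls.js, it could be run with: node extract-urls.js https://couchdeck.com/page-sitemap.xml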
Extract URLs From XML Sitemaps with Online Websites
Extracting URLs can take just a few seconds if you use an online tool.
No script or code is required. You just paste the sitemap link into the input box, and the tool returns all the URLs extracted from the sitemap.
There are several websites that do this job, so it won’t be hard to find one. Here’s one example of a website that helps you extract URLs from a sitemap.

https://www.convertcsv.com/url-extractor.htm
Note- The above website can be used to extract URLs from both types of sitemaps i.e. .xml and .html format.
Extract URLs from HTML Sitemap
Extracting URLs from an HTML sitemap is not very different from extracting them from an XML one. Most sites use the .xml form, which is why it gets more attention than the .html form.
To extract URLs from an HTML sitemap, you don’t need any command-line tools. You can simply paste the link of the HTML sitemap into one of the online extractor tools available on the web.
Here is a list of some popular websites that will help you extract URLs from an HTML sitemap.
- https://tools.fromdev.com/html-link-extractor.html
- http://tools.buzzstream.com/link-building-extract-urls
- https://www.convertcsv.com/url-extractor.htm
You can use any one of the above websites, as all of them are convenient for extracting URLs from an HTML sitemap.
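Under the hood, extracting links from an HTML sitemap comes down to collecting the href values of its anchor tags. The sketch below illustrates that idea in plain JavaScript (Node.js). It is not the code any of the listed websites use. The function name extractLinks and the sample markup are invented for this illustration, and the regular-expression matching assumes quoted href attributes in reasonably tidy HTML.

```javascript
// Collect unique http(s) links from the <a href="..."> tags in an HTML page.
// baseUrl is the address of the page, used to resolve relative links.
function extractLinks(html, baseUrl) {
  const hrefPattern = /<a\s(?:[^>]*?\s)?href\s*=\s*["']([^"']+)["']/gi;
  const links = new Set();
  for (const match of html.matchAll(hrefPattern)) {
    try {
      const url = new URL(match[1], baseUrl);
      // Keep web pages only; skip mailto:, tel:, javascript: and similar.
      if (url.protocol === "http:" || url.protocol === "https:") {
        links.add(url.href);
      }
    } catch {
      // Skip href values that cannot be parsed as URLs.
    }
  }
  return [...links];
}

const htmlSample =
  '<ul>' +
  '<li><a href="/about">About</a></li>' +
  '<li><a class="nav" href="https://example.com/contact">Contact</a></li>' +
  '<li><a href="mailto:hello@example.com">Email</a></li>' +
  '<li><a href="/about">About again</a></li>' +
  '</ul>';

console.log(extractLinks(htmlSample, "https://example.com/sitemap.html"));
```

Using a Set removes duplicate links, and resolving each href against the page address turns relative paths into full URLs.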
Extract URLs from HTML Sitemap – Chrome Extension
Last but not least, this method is the most convenient one for many people. You simply install a Chrome extension that collects all the links on a webpage, and the extension does the rest.
Follow the steps below to extract URLs from an HTML sitemap.
- Open the Chrome Web Store.
- Click on Extensions.
- Then, search for the extension you want. In our case, we are using a popular extension, “Sitemap Parser”.
- Click on “Add to Chrome.”
- The extension is now added to your Chrome profile, and you can use it to extract the links.
There are other extensions too. We found Sitemap Parser a good choice, but you are free to use any other extension for the same purpose.
Final Note
So, those are the methods, explained in simple terms. If you have any suggestions, kindly drop a comment below.
CouchDeck is a trusted brand providing comprehensive digital marketing solutions designed to fuel business growth. As specialists in diverse areas such as SEO, Google Ads, Facebook Ads, SEM, Web Development and Hosting, YouTube Marketing, Local SEO, and Social Media Marketing, CouchDeck has an established track record of delivering consistent, impactful results. Serving more than 300 satisfied customers across India, the United States, Canada, and Australia, we are committed to helping businesses flourish in the digital landscape. For a FREE consultation, please reach out to us via email.