How to Detect a Website's CMS Using APIs: A Step-by-Step Guide

Identifying the Content Management System (CMS) used by a website can offer valuable insights for businesses, developers, and marketers. In this article, we’ll explore why it's useful to detect a website's CMS technology, and how to retrieve this information using APIs.

Why identify a website's CMS technology?

Understanding the CMS behind a website is essential for several reasons:

  • Competitor Analysis: Knowing which CMS your competitors use can provide insights into their technology stack, scalability, and potential vulnerabilities.
  • Technical Partnerships: If you're looking to partner or integrate with a website, knowing the CMS helps determine compatibility with your tools or services.
  • Security Awareness: Some CMS platforms are more vulnerable to certain types of attacks than others. Identifying a CMS allows you to assess potential security risks.
  • Customization Potential: Different CMSs have varying degrees of flexibility. Recognizing the CMS helps determine how easy it would be to customize or enhance certain features on a website.

In short, identifying a website's CMS can provide valuable technical and strategic insights.

How to retrieve CMS information?

Sign up for an API service

Several platforms offer APIs that detect CMS technologies, such as Piloterr, BuiltWith and WhatCMS. For this example, we'll use the Piloterr API.

  1. Create your account: Go to the Piloterr dashboard and sign up for a free or paid account, depending on your needs.
  2. Get your API key: After registration, you'll receive an API key.
  3. Set up your environment: Install the required library with pip:
pip install requests
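
Rather than pasting the key directly into a script, one option is to read it from an environment variable. The following is a minimal sketch, assuming a variable named PILOTERR_API_KEY; that name and the load_api_key helper are illustrative choices, not something the Piloterr API requires.

```python
import os


def load_api_key(env_var="PILOTERR_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

With this approach, the key stays out of the source file, which helps if the script is later shared or committed to version control.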

Python basic code

Don't forget to set PILOTERR_API_KEY to your real API key. The script assumes the response format of the Piloterr API, so it may need to be adjusted if you choose another provider.

  • Copy the code below
  • Create a new file named detect_cms.py and paste the code into it
  • Replace the API key with your own
  • Edit the websites list to suit your needs
  • Run the script with python detect_cms.py
import requests
import json

PILOTERR_API_KEY = ""  # Replace with your API key

# Piloterr API URL
url = "https://piloterr.com/api/v2/website/technology"

# List of websites to check
websites = [
    "https://www.piloterr.com",
    "https://www.carreblanc.com",
    "https://www.bobbies.com",
    "https://www.leslipfrancais.fr"
]

# List of CMS based on Wappalyzer categories
cms_list = [
    "1c-bitrix", "3dcartstores", "abicart", "ametys", "amiro.cms",
    "apostrophecms", "arc publishing",
    "asciidoc", "atg", "bigcommerce", "blogger", "bludit",
    "borderfree", "bridgeline idev",
    "broadvision", "ckan", "cms made simple", "cmsimple",
    "concrete5", "contao", "contenido",
    "contensis", "contentful", "craft cms", "cryoblock",
    "danneo cms", "datocms", "day.js",
    "dede", "diazo", "dnn", "dokuwiki", "dotcms", "drupal",
    "dynamicweb", "e107", "easydb",
    "easyengine", "ez publish", "expressionengine",
    "fandom", "flash", "flatpress", "flexcmp",
    "fluxbb", "forestry", "fork cms", "foswiki", "frog cms",
    "gatsby", "getsimple cms", "ghost",
    "gitbook", "grav", "gridsome", "groovy", "halo", "hippo", "hubspot", 
    "hugo", "hybris", "ibm websphere commerce",
    "ikiwiki", "imperia cms", "indexhibit", "instapage",
    "intercom articles", "joomla", "kentico cms",
    "koken", "komodo cms", "liferay", "lightspeed ecommerce",
    "lima", "locomotive cms", "magento",
    "mambo", "markdownguide", "matomo", "mediawiki", "miva",
    "modx", "moodle", "movable type",
    "mybb", "neos", "netcat", "netlifycms", "nopcommerce",
    "odoo", "opencart", "orchard cms",
    "oscommerce", "percussion", "phpbb", "phpmyadmin", "pimcore",
    "plone", "posterous", "prestashop",
    "pyrocms", "rbs change", "rcms", "roadiz cms",
    "rs-site", "saleor", "salesforce commerce cloud",
    "serendipity", "shopify", "shopware", "silverstripe", "sitecore",
    "sitefinity", "sitemap generator",
    "siteor", "smelting", "solodev", "squarespace", "statamic",
    "storyblok", "strikingly", "subrion",
    "textpattern", "tilda", "trac", "typo3 cms", "typo3 neos",
    "umbraco", "umicms", "vbulletin",
    "viennacms", "virto commerce", "visual composer", "voog",
    "wap review", "webflow", "webgui",
    "weebly", "wix", "woocommerce", "wordpress", "xenforo",
    "xt:commerce", "yii", "zencart", "zendesk",
    "zephir", "zeta producer"
]

# Function to find CMS
def find_cms(technologies):
    for tech in technologies:
        tech_name = tech.lower()
        for cms in cms_list:
            if cms in tech_name:
                return tech
    return "No CMS found"

# Function to get technologies and CMS for a website
def get_technologies_and_cms(site):
    params = {"query": site}
    headers = {"x-api-key": PILOTERR_API_KEY}
    # A timeout keeps the script from hanging on an unresponsive request
    response = requests.get(url, params=params, headers=headers, timeout=30)

    if response.status_code == 200:
        data = response.json()
        technologies = [tech["name"] for tech in data.get("technologies", [])]
        cms = find_cms(technologies)
        return {"cms": cms, "technologies": technologies}
    else:
        return {"technologies": [], "cms": "Error retrieving data"}

# Create a dictionary to store results
results = {}

# Get technologies and CMS for each website
for site in websites:
    site_data = get_technologies_and_cms(site)
    results[site] = site_data

# Convert results to JSON
json_results = json.dumps(results, indent=2)

# Print JSON results
print(json_results)

# Optionally, save JSON to a file
with open("website_technologies_and_cms.json", "w") as f:
    f.write(json_results)
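
The find_cms function above uses substring matching and returns only the first hit, so a short entry such as "flash" or "yii" also matches any longer technology name that happens to contain it. If that turns out to be too loose for your data, a stricter variant could compare whole names and return every match. The sketch below is one possible alternative, not part of the Piloterr API: find_cms_exact and KNOWN_CMS are hypothetical names, and the set is shortened for illustration.

```python
# Shortened, illustrative set of lowercase CMS names (an assumption for this sketch)
KNOWN_CMS = {
    "webflow", "magento", "prestashop", "storyblok",
    "shopify", "wordpress", "hubspot",
}


def find_cms_exact(technologies, known_cms=KNOWN_CMS):
    """Return every technology whose full name, case-insensitively, is a known CMS."""
    return [tech for tech in technologies if tech.strip().lower() in known_cms]


# "HubSpot Chat" is skipped because only whole names are compared
print(find_cms_exact(["Storyblok", "Shopify", "HubSpot Chat", "jQuery"]))
# → ['Storyblok', 'Shopify']
```

Returning a list rather than a single name also surfaces sites that combine a headless CMS with an e-commerce platform, at the cost of missing names that differ slightly from the entries in the set.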

Results

After execution, the script prints a JSON object that maps each website to its detected CMS and the full list of detected technologies, and saves it to website_technologies_and_cms.json:

{
  "https://www.piloterr.com": {
    "cms": "Webflow",
    "technologies": [
      "Webflow",
      "jsDelivr",
      "Google Tag Manager",
      "Google Hosted Libraries",
      "Customer.io",
      "reCAPTCHA",
      "jQuery",
      "HubSpot Chat",
      "HubSpot",
      "Google Analytics",
      "core-js",
      "Google Font API",
      "HSTS",
      "Cloudflare",
      "Open Graph",
      "HTTP/3"
    ]
  },
  "https://www.carreblanc.com": {
    "cms": "Magento",
    "technologies": [
      "Magento",
      "Cart Functionality",
      "Sentry",
      "Algolia",
      "MySQL",
      "PHP",
      "Leaflet",
      "Hyva Themes",
      "Tailwind CSS",
      "Vue.js",
      "Alpine.js",
      "Zendesk",
      "reCAPTCHA",
      "Google Tag Manager",
      "Glider.js",
      "Didomi",
      "Splide",
      "Livefyre",
      "core-js",
      "WhatsApp Business Chat",
      "Preact",
      "Google Font API",
      "Font Awesome",
      "Avis Verifies",
      "Cloudflare",
      "Webpack",
      "Open Graph",
      "Module Federation",
      "HTTP/3"
    ]
  },
  "https://www.bobbies.com": {
    "cms": "PrestaShop",
    "technologies": [
      "PrestaShop",
      "Prismic",
      "Doofinder",
      "MySQL",
      "PHP",
      "PayPal",
      "Google Tag Manager",
      "Cloudflare Bot Management",
      "TikTok Pixel",
      "LazySizes",
      "jQuery Migrate",
      "jQuery",
      "Hotjar",
      "Google Analytics",
      "Facebook Pixel",
      "core-js",
      "reCAPTCHA",
      "MailChimp",
      "HSTS",
      "Cloudflare",
      "Open Graph",
      "HTTP/3"
    ]
  },
  "https://www.leslipfrancais.fr": {
    "cms": "Storyblok",
    "technologies": [
      "Storyblok",
      "Cart Functionality",
      "Shopify",
      "Doofinder",
      "Global-e",
      "Secomapp",
      "PayPal",
      "Apple Pay",
      "OneTrust",
      "Google Tag Manager",
      "Glider.js",
      "AB Tasty",
      "New Relic",
      "LazySizes",
      "Klaviyo",
      "core-js",
      "Boomerang",
      "Swiper",
      "Priority Hints",
      "Google Font API",
      "Avis Verifies",
      "Linkedin Insight Tag",
      "HSTS",
      "Cloudflare",
      "Webpack",
      "Open Graph",
      "Module Federation",
      "HTTP/3"
    ]
  }
}
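
Once the results are in this shape, they can be aggregated, for instance to count how many of the checked sites use each CMS. The following is a minimal sketch, assuming the dictionary structure shown above; count_cms is a hypothetical helper name, and the sample data is a trimmed copy of the output.

```python
from collections import Counter


def count_cms(results):
    """Count how many sites were assigned each value of the "cms" field."""
    return Counter(site_data["cms"] for site_data in results.values())


# Trimmed sample in the same shape as the script's output
sample = {
    "https://www.piloterr.com": {"cms": "Webflow", "technologies": ["Webflow"]},
    "https://www.carreblanc.com": {"cms": "Magento", "technologies": ["Magento"]},
    "https://www.bobbies.com": {"cms": "PrestaShop", "technologies": ["PrestaShop"]},
}

for cms, count in count_cms(sample).items():
    print(f"{cms}: {count}")
```

In practice, the same function could be applied to the dictionary loaded from website_technologies_and_cms.json with json.load.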