prebid-integration-monitor/known_crawler_lists
2024-10-01 12:17:26 -04:00

# For reference. These lists are not all formatted the same way (CSV, plain text, JSON, and one directory of archives).
https://github.com/privacy-tech-lab/gpc-web-crawler/blob/main/selenium-optmeowt-crawler/full-crawl-set.csv
https://github.com/InteractiveAdvertisingBureau/adstxtcrawler/blob/master/adstxt_domains_2018-02-13.txt
https://github.com/kaustubhd93/adstxt-crawler/tree/master/archives
https://github.com/zer0h/top-1000000-domains/blob/master/top-10000-domains
https://github.com/zer0h/top-1000000-domains/blob/master/top-100000-domains
https://github.com/Jirehlov/cfranking/blob/main/20240311-20240318/cloudflare-radar-domains-top-50000-20240311-20240318.csv
https://github.com/duckduckgo/tracker-radar/blob/main/build-data/generated/domain_map.json
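
# Because the lists above differ in format, a consumer would likely need a small
# normalization step. The sketch below is only an illustration, not part of this
# repository: blob_to_raw and extract_domain are hypothetical helper names, and the
# per-file column layouts are assumed rather than checked. It rewrites a GitHub
# "blob" URL to its raw.githubusercontent.com equivalent and pulls the first
# hostname-looking field out of a CSV, plain-text, or JSON-like line.

```python
import re
from typing import Optional
from urllib.parse import urlsplit


def blob_to_raw(url: str) -> str:
    """Map a github.com/<owner>/<repo>/blob/<ref>/<path> URL to its raw form."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected layout: owner / repo / "blob" / ref / path...
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {url}")
    owner, repo, _, ref, *path = segments
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{'/'.join(path)}"


# Rough hostname shape: dot-separated labels ending in an alphabetic TLD.
# Assumption: good enough for illustration; it can also match file names
# such as "data.csv" and does not handle internationalized domains.
DOMAIN_RE = re.compile(r"^(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$")


def extract_domain(line: str) -> Optional[str]:
    """Return the first field on the line that looks like a hostname, else None."""
    # Split on commas, whitespace, and quotes so CSV rows, bare domains,
    # and JSON keys are all broken into candidate fields.
    for field in re.split(r"[,\s\"']+", line.strip().lower()):
        if "://" in field:
            # Full URL: keep only the host part.
            field = urlsplit(field).hostname or ""
        if DOMAIN_RE.match(field):
            return field
    return None
```

# Note that the "tree" link above points at a directory, not a file, so blob_to_raw
# raises ValueError for it; each archive inside would have to be addressed on its own.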