Your art, designs, and product images are the heart of your shop. That’s why Big Cartel includes AI Shield, a simple way to protect your creative work from being used to train generative AI models while keeping your shop visible to real customers.
With a single toggle, AI Shield blocks major AI training crawlers from accessing your site. We handle the technical details behind the scenes, including maintaining and updating the crawler list, so you don’t have to.
This feature was built in direct response to feedback from artists and makers on Big Cartel. In our research, 83.3% of sellers told us they want to prevent their work from being used to train AI systems. AI Shield gives you that control without sacrificing discoverability, search visibility, or growth.
AI Shield is available on all Big Cartel plans, including our forever-free Gold option, because protecting creative work shouldn’t come at an extra cost.
In the sections below, we’ll explain how AI training differs from AI-powered discovery tools, what AI Shield blocks (and what it doesn’t), and how to enable it for your shop.
What are AI training crawlers?
AI training crawlers are automated bots that scan the web to collect content used to train generative AI models. When companies like OpenAI train models such as GPT-5, or Google trains models like Gemini, they rely on crawlers to gather large volumes of text, images, and other creative work from across the internet, potentially including online shops like yours.
Here’s how it works:
- A training crawler automatically visits your shop
- It reads and collects content such as product descriptions, photography, designs, and other creative assets
- That content is stored as training data
- AI companies use this data to teach models how to understand language, recognize images, and generate new content
AI Training Crawlers vs. AI Search Crawlers: What’s the difference?
Not all AI crawlers do the same thing. There are two distinct types, and they affect your shop in very different ways.
AI Training Crawlers (what AI Shield blocks)
Purpose: Collect content to include in datasets used to train generative AI models.
Examples: GPTBot, ClaudeBot, CCBot
What they do: These crawlers systematically scan the web to gather text, images, and other creative work that may be used to train future versions of AI models.
What this means for your content: Your work could become part of a training dataset, with no direct benefit to your shop or its visibility.
AI Search Crawlers (what AI Shield does not block)
Purpose: Power AI-assisted search and discovery experiences in tools like Google, Bing, and ChatGPT.
Examples: OAI-SearchBot (ChatGPT Search), Claude-SearchBot
What they do: These crawlers index your shop so your products and pages can appear in AI-powered search results—typically with a link back to your site.
What this means for your content: Your content is indexed for discovery, helping potential customers find your shop, but it isn’t used to train AI models.
Which bots does AI Shield block?
AI Shield blocks ten AI training crawlers that are commonly used to collect web content for training generative AI models.
Major AI Companies
GPTBot (OpenAI)
- Used to train ChatGPT, GPT-4, GPT-5, and other OpenAI models
ClaudeBot (Anthropic)
- Used to train Claude AI models
- Crawls the web continuously to collect training data
Google-Extended (Google)
- Controls whether Google can use your content to train Gemini (formerly Bard) and Vertex AI
Applebot-Extended (Apple)
- Controls whether Apple can use your content to train Apple Intelligence features
Meta-ExternalAgent (Meta)
- Used to train Meta's AI models (Llama, Meta AI)
CCBot (Common Crawl)
- Non-profit that creates open datasets of web content
- These datasets are widely used by AI researchers and companies for training
Bytespider (ByteDance/TikTok)
- Used for AI model training
AI2Bot (Allen Institute for AI)
- Used to collect data for open-source AI projects such as the OLMo models and the Dolma dataset
cohere-training-data-crawler (Cohere)
- Specifically designed to collect data for training Cohere's large language models
Kangaroo Bot (Kangaroo LLM)
- Used to download training data for AI models tailored to Australian language and culture
Why did we choose these specific bots?
We were intentional about which crawlers to include in AI Shield. Our goal was to give you meaningful protection from AI training without compromising your shop’s visibility or growth.
We block: Pure Training Crawlers
These crawlers are designed specifically to collect content for AI training datasets. Blocking them helps protect your creative work, with no impact on search visibility or discoverability.
Examples include:
- GPTBot, which is used only to train OpenAI models
- ClaudeBot, which is used only to train Anthropic’s AI models
- CCBot, which collects data for large training datasets
Why this works
Because these crawlers are focused solely on training, blocking them gives you control over how your content is used—without trade-offs.
We don’t block: Dual-Purpose Crawlers
Some crawlers support both traditional search and AI-powered features. We intentionally do not block these, because they play a critical role in helping customers find your shop.
Googlebot (not blocked)
- Powers Google Search, which is essential for SEO
- Also supports Google’s AI features, such as AI Overviews
- Blocking it would remove your shop from Google Search entirely
Bingbot (not blocked)
- Powers Bing Search
- Also supports Bing Copilot and other AI experiences
- Blocking it would remove your shop from Bing Search
Why we don’t block them
With these crawlers, search and AI functionality can’t be separated. Blocking them would significantly reduce your shop’s discoverability, which runs counter to helping your business grow.
What AI Shield does not block
AI Shield is designed to protect your content without disrupting the tools that help your shop get found, shared, and measured.
Search engines still work
- Googlebot – Your shop continues to appear in Google Search
- Bingbot – Your shop continues to appear in Bing Search
- DuckDuckGoBot – Your shop continues to appear in DuckDuckGo
Your SEO and search visibility are not affected.
Social media previews still work
- LinkedInBot – Link previews on LinkedIn continue to display
- Twitterbot – Link previews on X (formerly Twitter) continue to display
- facebookexternalhit – Link previews on Facebook and Instagram continue to display
Sharing your shop on social platforms works as expected.
Analytics and tools still work
- SemrushBot, AhrefsBot – SEO and site analysis tools continue to function
- Google Analytics – Traffic and behavior tracking remains active
- Monitoring services – Uptime and performance monitoring continue uninterrupted
What to Know
This is an Honor System
AI Shield uses robots.txt, an internet standard that's been around since 1994. Here's how it works:
- When a bot wants to crawl your site, it first checks your robots.txt file
- If the bot sees it's been blocked, it should respect that and not crawl
Nothing technically forces a bot to obey these rules, but most major AI companies comply.
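To make the honor system concrete, here is a minimal sketch of the check a well-behaved bot performs, using Python's standard `urllib.robotparser` module. The sample robots.txt content and the `example.com` URL are placeholders for illustration, not the exact file Big Cartel generates.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, similar in spirit to what a shop
# with AI Shield enabled might serve (assumption, not the real file).
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    shop_url = "https://example.com/products"  # placeholder shop URL
    for bot in ("GPTBot", "Googlebot", "OAI-SearchBot"):
        allowed = is_allowed(SAMPLE_ROBOTS_TXT, bot, shop_url)
        print(f"{bot}: {'may crawl' if allowed else 'should not crawl'}")
```

In this sample, GPTBot is told not to crawl, while Googlebot and OAI-SearchBot have no matching rule and remain free to crawl. The check happens entirely on the bot's side, which is why compliance depends on the bot's operator.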
It's Not Retroactive
If AI companies already trained on your content before you enabled AI Shield, that data may already be in their models. AI Shield only prevents future crawling.
It May Not Catch Everything
New AI crawlers appear regularly. We keep our blocklist updated, but there may be a delay between when a new AI company launches a crawler and when we add it to AI Shield.
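If you are curious whether a particular crawler is covered yet, the sketch below shows one way to audit a robots.txt file against a list of crawler names. The function name, the sample file, and `ExampleNewBot` are all hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser


def unblocked_crawlers(robots_txt: str, crawler_names: list[str]) -> list[str]:
    """Return the crawler names that robots_txt does not block from the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [name for name in crawler_names if parser.can_fetch(name, "/")]


# Hypothetical robots.txt that blocks two known training crawlers.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
"""

if __name__ == "__main__":
    # "ExampleNewBot" stands in for a newly launched crawler (made-up name).
    names = ["GPTBot", "CCBot", "ExampleNewBot"]
    print(unblocked_crawlers(SAMPLE_ROBOTS_TXT, names))  # ['ExampleNewBot']
```

Any name the function returns is a crawler the file does not yet address, which is the kind of gap worth reporting to support.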
How to Enable AI Shield
- Go to Shop Settings.
- Find the AI Shield toggle.
- Toggle it On.
Once enabled, your robots.txt file will automatically include rules blocking the ten AI training crawlers listed above.
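For a sense of what such rules look like, here is a rough sketch that builds robots.txt entries from the crawler names in this article. The exact user-agent tokens and the layout Big Cartel writes to your file are assumptions here; the `build_robots_rules` helper is hypothetical.

```python
# Crawler names as listed in this article. The exact tokens and layout
# in Big Cartel's generated robots.txt may differ (assumption).
BLOCKED_TRAINING_CRAWLERS = [
    "GPTBot",
    "ClaudeBot",
    "Google-Extended",
    "Applebot-Extended",
    "Meta-ExternalAgent",
    "CCBot",
    "Bytespider",
    "AI2Bot",
    "cohere-training-data-crawler",
    "Kangaroo Bot",
]


def build_robots_rules(user_agents: list[str]) -> str:
    """Build one 'User-agent / Disallow: /' group per crawler name."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in user_agents]
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(build_robots_rules(BLOCKED_TRAINING_CRAWLERS))
```

Each group names one crawler and asks it to stay away from every path on the site. Crawlers that are not named, such as Googlebot, are unaffected.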
Need More Help?
If you have questions about AI Shield or want to report a bot that's not respecting your robots.txt file, please contact our support team.
AI Shield FAQs
Will this hurt my SEO?
No. AI Shield only blocks AI training crawlers. Search engines like Google, Bing, and DuckDuckGo continue crawling your shop normally. Your search rankings are not affected.
Will my products still show up in Google?
Yes. We block Google-Extended (the AI training control), not Googlebot (the search crawler). Your products will continue to appear in Google Search results.
Will social media link previews still work?
Yes. Social media bots like LinkedInBot and facebookexternalhit are not affected. When you share your shop on Instagram, Facebook, LinkedIn, etc., the previews will still work.
Does this guarantee protection?
No. AI Shield relies on bots respecting the robots.txt standard. Most major AI companies comply, but some bots may ignore it.
What if I want my content used for AI training?
Some sellers choose to contribute their content to AI training—whether to help advance the technology, gain broader exposure for their creative work, or have their ideas and expertise inform AI responses. If that aligns with your goals, simply keep AI Shield turned off and your shop will remain accessible to all crawlers.
Why not just block all AI bots?
Our approach is to help sellers maximize sales while still letting them choose whether their content is used to train AI models. Simply put, we block AI training crawlers while keeping the search bots that bring traffic to your shop.