Using Natural Language to Detect Sensitive Data in Slack and Teams

CultureAI helps organisations detect when sensitive data is shared in tools like Slack and Microsoft Teams — particularly in public or external channels where the risk of accidental exposure is high.

Traditionally, this is done with our existing methods of classifying sensitive data:

Keyword matching — such as detecting terms like "passport number" or "credit card"
Secure words and regex rules — for matching structured patterns like ID numbers or emails
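
As a rough illustration of how these two traditional approaches behave, here is a minimal Python sketch. The keyword list, the email pattern, and the function names are assumptions made up for this example; they are not CultureAI's actual rules or API.

```python
import re

# Hypothetical keyword list and regex rule, for illustration only.
KEYWORDS = ["passport number", "credit card"]
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def keyword_hits(message: str) -> list[str]:
    """Return the keywords that appear verbatim (case-insensitive) in the message."""
    lowered = message.lower()
    return [kw for kw in KEYWORDS if kw in lowered]


def regex_hits(message: str) -> list[str]:
    """Return every substring of the message that matches the email pattern."""
    return EMAIL_PATTERN.findall(message)
```

Both functions only find what they were told to look for: a listed term or a structured pattern.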

Now you can go further by describing what you're looking for in natural language.

What is Natural Language Detection?

Our natural language detection feature helps you monitor real-world communication patterns, not just predefined keywords.

Rather than uploading a list of specific words, you write a description of the kind of information or topic you want CultureAI to look for. The feature uses the context of this description to find relevant messages, even if they’re written with spelling errors, slang, or casual phrasing.

Keyword Matching vs Natural Language Detection

Method	What it looks for	Strengths	Limitations
Keyword Matching	Exact matches (e.g. "Green", "Red", "Passport Number")	Highly specific and fast	Misses typos, variations, or indirect language
Natural Language Detection	Message meaning and intent	Understands nuance, tone, context	Depends on how well the description is written
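
The keyword-matching limitation in the table can be seen in a small Python sketch. The function and the sample messages are hypothetical and only show exact matching; how natural language detection works internally is not described here, so it is not modelled.

```python
def exact_keyword_match(message: str, keyword: str) -> bool:
    """Case-insensitive exact substring match, the simplest form of keyword matching."""
    return keyword.lower() in message.lower()


messages = [
    "Sending my passport number now",         # exact phrase
    "Sending my pasport numbr now",           # typos
    "Here's the ID from my travel document",  # indirect wording
]

results = [exact_keyword_match(m, "Passport Number") for m in messages]
print(results)  # [True, False, False]
```

Only the first message is caught. The other two carry the same meaning but contain no exact match.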

How to Write a Great Description

To get the most accurate results, write a description that’s as specific as possible. Try to answer these four key questions:

The 4 W’s of Description Writing

What are you monitoring?
Where does it happen (e.g. online, in Slack, in specific teams)?
When is it happening (date, month, or timeframe)?
Why is it important to detect this?

Include as many of these “W’s” as you can. You don’t need all four, but the more you cover, the better the description.
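
If you draft descriptions outside the product, a small checklist helper can show which of the four W's are still unanswered. This is a sketch only: `DescriptionDraft` and its methods are hypothetical names invented for this example, not part of CultureAI.

```python
from dataclasses import dataclass


@dataclass
class DescriptionDraft:
    """Hypothetical container for the four W's of a description."""
    what: str = ""
    where: str = ""
    when: str = ""
    why: str = ""

    def missing(self) -> list[str]:
        """Return the names of the W's that are still empty."""
        return [name for name in ("what", "where", "when", "why")
                if not getattr(self, name).strip()]

    def to_text(self) -> str:
        """Join the answered W's into a single description paragraph."""
        parts = [self.what, self.where, self.when, self.why]
        return " ".join(p.strip() for p in parts if p.strip())


draft = DescriptionDraft(
    what="We want to monitor how employees feel about the recent hiring wave."
)
print(draft.missing())  # ['where', 'when', 'why']
```

A draft that answers only one W still produces a valid description, but the list of missing items shows where extra context could be added.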

Example: Weak Description

“We want to monitor how employees feel about the recent hiring wave.”

While this example is clear in its intention and covers some of the “W’s”, it gives little detail about what is actually happening. Without that extra context, the scope of detections is limited.

Example: Strong Description

“We are currently undertaking a large wave of new hires within the company. This hiring is taking place across all teams, will happen over a short period of time, and should be completed by the end of April.

This will involve several large training sessions within all teams, to get the new members up to speed as quickly as possible.

We are doing this as part of a rapid growth cycle, aiming to double the number of employees we currently have.

We want to monitor the general mood of our current employees around this large wave of hiring, any questions they might have about it, and their thoughts, whether positive or negative.”

Why this works:

Names the topic (hiring wave)
Gives a timeline (by end of April)
Adds context (growth initiative)
Defines intent (monitor reactions)

Tuning Your Description

Not getting enough detections?

Add more detail — don’t just list keywords
Avoid vague phrasing like “we’re testing things”
Be specific: names, dates, reasons, or internal labels help

Getting too many detections?

Reduce general terms like “new project” or “this summer”
Add boundaries: “Project X within the marketing team” or “event in July to celebrate 300 employees”

Example Detection Output

Once your description is live, CultureAI will begin scanning your collaboration tools for matches. Detected messages will appear in your dashboard — even if:

They include typos
The phrasing is casual or indirect
The message is broken across multiple lines

Grammatical errors won’t stop detections either, because we account for the natural ways people write in chat.

🧠 FAQ

What platforms does this support?

Currently: Slack and Microsoft Teams

What kinds of messages can be detected?

Messages in public or external channels — where accidental data exposure is most likely.

Can I mix keyword and natural language rules?

Yes! Use both for layered coverage — exact terms and broader concepts.

Can I edit my description later?

Yes — and we recommend doing so. Once you see how it's working in the real world, refine it to be more specific or more focused.

Summary

Natural language detection helps you monitor conversations for intent, not just exact terms. It’s a powerful way to detect sensitive data or emerging risk topics, especially where keyword-based approaches fall short.

By writing detailed, well-scoped descriptions, you’ll improve accuracy and reduce noise, giving you more confident coverage of sensitive collaboration behaviours.