When to use this
Use this playbook if you are setting up logging from scratch, migrating tools, or dealing with alert fatigue.
Goal
Answer the only incident question that matters
Your logging program should reliably reconstruct:
who did what, from where, with what access.
- Who: identity and permissions
- What: actions and changes
- Where: device, IP, location, tenant/context
- Access: roles, elevation, admin activity
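One way to make the four parts concrete is a normalized event record that every log source is mapped into. The sketch below is illustrative only: the `SecurityEvent` class, its field names, and the sample values are assumptions made for this example, not a schema from any specific product.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SecurityEvent:
    """One normalized record answering who / what / where / access (hypothetical schema)."""
    timestamp: datetime
    actor: str                      # Who: user or service identity
    action: str                     # What: e.g. "role.assign"
    target: str                     # What: the object that was changed
    source_ip: str                  # Where: network origin
    device: str                     # Where: device or host identifier
    tenant: str                     # Where: tenant or context
    roles: tuple[str, ...] = ()     # Access: roles held at the time
    elevated: bool = False          # Access: action used elevated/admin rights

    def missing_fields(self) -> list[str]:
        """Return the names of required fields that are empty."""
        required = ("actor", "action", "source_ip")
        return [name for name in required if not getattr(self, name)]


# Sample values are invented for illustration.
event = SecurityEvent(
    timestamp=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
    actor="alice@example.com",
    action="role.assign",
    target="bob@example.com",
    source_ip="203.0.113.10",
    device="laptop-042",
    tenant="corp",
    roles=("helpdesk",),
    elevated=True,
)
```

A record that comes back with a non-empty `missing_fields()` list is a hint that the source cannot yet answer the incident question on its own.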
What NOT to do first
These are common failure patterns. They create noise, not answers.
Avoid
“Log everything” configurations
You will collect volume without prioritization. Investigations slow down and alert fatigue rises.
Avoid
Verbose app debug logs as a starting point
Debug logs are not the same as security evidence. Start with user and admin actions.
Avoid
Dashboards before questions
If you do not define incident questions first, dashboards become cosmetic reporting.
Avoid
SIEM tuning before scope and ownership
Rules without ownership and response actions produce alerts that no one closes.
What to do first (evidence-first sequence)
Follow this order. It gets you to a defensible baseline fast.
Step-by-step
Four signals first
- Identity: sign-ins, MFA, lockouts, risky login indicators, privilege changes
- Endpoint: malware/quarantine events, new admin group membership, suspicious process execution (high-signal detections only)
- Email: forwarding rules, mailbox delegation, unusual access, suspicious link/attachment indicators
- Admin & audit: changes to roles, policies, logging settings, and critical configuration
Once these are stable, expand into application logs and cloud service logs selectively, based on incident questions.
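A simple way to judge whether the four signals are "stable" is to check that each source has delivered events recently. The following is a minimal sketch under stated assumptions: the signal names, the `silent_signals` helper, and the one-hour threshold are hypothetical choices for illustration, and the last-seen timestamps would come from whatever log store you use.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical labels for the four signal categories.
SIGNALS = ("identity", "endpoint", "email", "admin_audit")


def silent_signals(last_seen, now, max_silence=timedelta(hours=1)):
    """Return signals with no event within max_silence, or never seen at all."""
    silent = []
    for signal in SIGNALS:
        seen = last_seen.get(signal)
        if seen is None or now - seen > max_silence:
            silent.append(signal)
    return silent


now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "identity": now - timedelta(minutes=5),
    "endpoint": now - timedelta(hours=3),   # stale
    "email": now - timedelta(minutes=20),
    # "admin_audit" has never reported
}
print(silent_signals(last_seen, now))  # ['endpoint', 'admin_audit']
```

The threshold should reflect how often each source normally reports; a quiet admin audit log may need a longer window than a busy identity log.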
Minimum alert set (starter pack)
Start small. Each alert must have an owner and a next action.
Alerts
High-signal alerts you can actually operate
- New admin account created or privileged role assigned
- Multiple failed logins followed by a success
- MFA disabled or authentication policy weakened
- Email forwarding rule created or mailbox delegation granted
- Suspicious sign-in location or impossible travel pattern (sign-ins from locations too far apart to travel between in the elapsed time)
- Endpoint malware detected, quarantined, or protections disabled
- Logs stop arriving (visibility gap)
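As an illustration of how one of these alerts might be expressed, the sketch below detects "multiple failed logins followed by a success" with a sliding window per user. The function name, the `(timestamp, user, outcome)` tuple layout, and the default threshold of 5 failures in 10 minutes are assumptions for this example; real sign-in logs will need mapping into this shape, and the thresholds need tuning for your environment.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta, timezone


def failed_then_success(events, threshold=5, window=timedelta(minutes=10)):
    """Yield (user, time) when a success follows >= threshold recent failures.

    events: iterable of (timestamp, user, outcome) sorted by timestamp,
    where outcome is "failure" or "success".
    """
    failures = defaultdict(deque)  # user -> timestamps of recent failures
    for ts, user, outcome in events:
        recent = failures[user]
        # Drop failures that have fallen out of the sliding window.
        while recent and ts - recent[0] > window:
            recent.popleft()
        if outcome == "failure":
            recent.append(ts)
        elif outcome == "success":
            if len(recent) >= threshold:
                yield user, ts
            recent.clear()  # a success resets the streak


# Invented sample data: five failures for "bob", then a success.
start = datetime(2024, 1, 15, 8, 0, tzinfo=timezone.utc)
events = [(start + timedelta(seconds=30 * i), "bob", "failure") for i in range(5)]
events.append((start + timedelta(minutes=3), "bob", "success"))
events.append((start + timedelta(minutes=4), "alice", "success"))

hits = list(failed_then_success(events))
```

Here `hits` contains a single entry for "bob"; the success for "alice" has no preceding failures and is ignored.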
Definition of done
You are not “done” when logs exist. You are done when they produce answers.
Done when
A defensible baseline is in place
- You can reconstruct a user/admin action timeline in under 30 minutes.
- Log access is restricted and auditable.
- Retention is set and matches business/regulatory needs.
- Every alert has an owner and a written next step.
- You have run at least one tabletop using your logs to answer “what happened?”
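The timeline criterion can be rehearsed with very little code once events share a common shape. This is a sketch only, assuming events are dictionaries with `timestamp`, `actor`, `action`, and `source_ip` keys; those key names, the `build_timeline` helper, and the sample data are hypothetical.

```python
from datetime import datetime, timezone


def build_timeline(events, actor, start, end):
    """Return the actor's events within [start, end], oldest first."""
    selected = [
        e for e in events
        if e["actor"] == actor and start <= e["timestamp"] <= end
    ]
    return sorted(selected, key=lambda e: e["timestamp"])


def at(hour, minute):
    """Helper for building sample timestamps on one invented day."""
    return datetime(2024, 1, 15, hour, minute, tzinfo=timezone.utc)


# Invented sample events, deliberately out of order.
events = [
    {"timestamp": at(9, 45), "actor": "alice", "action": "mailbox.forward.create", "source_ip": "203.0.113.10"},
    {"timestamp": at(9, 30), "actor": "alice", "action": "signin.success", "source_ip": "203.0.113.10"},
    {"timestamp": at(9, 40), "actor": "bob", "action": "signin.success", "source_ip": "198.51.100.7"},
    {"timestamp": at(23, 0), "actor": "alice", "action": "signin.success", "source_ip": "203.0.113.10"},
]

timeline = build_timeline(events, "alice", at(9, 0), at(10, 0))
for e in timeline:
    print(e["timestamp"].isoformat(), e["action"], e["source_ip"])
```

If producing this view for a real user takes far longer than the 30-minute target, the gap usually points to a missing source or an unnormalized field rather than to the query itself.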
If you want the full baseline, use:
Minimum viable logging for small teams →