AI-Powered Transaction Categorisation
We built a two-layer categorisation workflow that handles the repetitive work automatically and sends edge cases to review.
The Challenge
The accounting team was processing a high volume of transactions across multiple clients, and manual coding was slowing down month-end work.
- Merchant text was often vague or misleading.
- Client spending behaviour changed over time.
- Important context lived in notes and history, not just descriptions.
The Approach
Phase 1: ML baseline
First, we shipped a supervised model trained on each client's historical ledger data.
- One model per client
- Features from transaction text + ledger history
- Feedback loop from accountant corrections
This handled the bulk of recurring transactions well.
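A minimal sketch of what a per-client baseline could look like with scikit-learn. The sample rows, the `train_client_model` helper, the `client_a` key, and the choice of TF-IDF character n-grams with logistic regression are all illustrative assumptions, not the production setup. For brevity it uses transaction text only; ledger-history features would be added as extra columns.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical ledger history for one client: (description, category).
history = [
    ("UBER TRIP 1234", "Travel"),
    ("UBER TRIP 9921", "Travel"),
    ("TRAINLINE LONDON", "Travel"),
    ("AWS EMEA", "Software"),
    ("GITHUB INC", "Software"),
    ("ADOBE CREATIVE CLOUD", "Software"),
    ("PRET A MANGER", "Meals"),
    ("COSTA COFFEE", "Meals"),
    ("PRET A MANGER 0042", "Meals"),
]


def train_client_model(rows):
    """Fit one text classifier on a single client's labelled history."""
    texts = [description for description, _ in rows]
    labels = [category for _, category in rows]
    model = make_pipeline(
        # Character n-grams cope with noisy merchant strings and reference numbers.
        TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
        LogisticRegression(max_iter=1000),
    )
    model.fit(texts, labels)
    return model


# One model per client, keyed by a (hypothetical) client identifier.
models = {"client_a": train_client_model(history)}
prediction = models["client_a"].predict(["UBER TRIP 5550"])[0]
```

Keeping a separate model per client means one client's coding conventions never leak into another's, at the cost of needing enough history for each client.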
Phase 2: Context layer for edge cases
Then we added Gemini for transactions where pure pattern matching was not enough.
- Uses notes and surrounding context when descriptions are unclear
- Flags low-confidence decisions for review
- Improves outcomes on unusual purchases and unseen merchants
To keep API usage practical, we kept the ML layer as the default path and called Gemini only for transactions the model was uncertain about.
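The routing idea can be sketched as a confidence threshold. Everything named here is an assumption for illustration: the `0.80` cut-off, the `Decision` record, and the two stand-in functions `fake_ml` and `fake_llm`, which replace the real model and the real Gemini call so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

CONFIDENCE_THRESHOLD = 0.80  # assumed cut-off, to be tuned per client


@dataclass
class Decision:
    category: str
    confidence: float
    source: str        # "ml" or "llm"
    needs_review: bool


def route(
    description: str,
    notes: str,
    ml_predict: Callable[[str], Dict[str, float]],
    llm_categorise: Callable[[str, str], Tuple[str, float]],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Decision:
    """Use the ML result when it is confident; otherwise fall back to the LLM."""
    probabilities = ml_predict(description)
    category, confidence = max(probabilities.items(), key=lambda item: item[1])
    if confidence >= threshold:
        return Decision(category, confidence, "ml", needs_review=False)
    # Fallback path: only uncertain rows trigger an API call.
    llm_category, llm_confidence = llm_categorise(description, notes)
    return Decision(
        llm_category, llm_confidence, "llm", needs_review=llm_confidence < threshold
    )


# Stand-ins for the real model and the real Gemini call (hypothetical behaviour).
def fake_ml(description: str) -> Dict[str, float]:
    if "UBER" in description:
        return {"Travel": 0.95, "Meals": 0.05}
    return {"Travel": 0.40, "Meals": 0.35, "Software": 0.25}


def fake_llm(description: str, notes: str) -> Tuple[str, float]:
    if "licence" in notes.lower():
        return ("Software", 0.90)
    return ("Uncategorised", 0.30)


confident = route("UBER TRIP 5550", "", fake_ml, fake_llm)
rescued = route("PAYMENT REF 88213", "Annual design tool licence", fake_ml, fake_llm)
flagged = route("PAYMENT REF 10007", "", fake_ml, fake_llm)
```

Passing the model and the LLM client in as callables keeps the routing logic testable without network access, and makes the threshold the single knob that trades API cost against review workload.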
New Workflow
- Import transactions and notes.
- Run ML categorisation.
- Route uncertain rows through Gemini.
- Review flagged entries and feed corrections back.
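The last step, feeding corrections back, could be sketched as follows. The `Review` record and the `merge_reviews` helper are hypothetical names, and the sketch assumes each reviewed row is simply appended to the client's training data with the accountant's final label before the next retraining run.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Review:
    description: str
    predicted: str   # what the system proposed
    final: str       # what the accountant approved


def merge_reviews(
    training_rows: List[Tuple[str, str]], reviews: List[Review]
) -> Tuple[List[Tuple[str, str]], int]:
    """Return the extended training set and the number of overridden predictions."""
    updated = list(training_rows)  # copy so the original history is untouched
    overrides = 0
    for review in reviews:
        updated.append((review.description, review.final))
        if review.final != review.predicted:
            overrides += 1
    return updated, overrides


training_rows = [("UBER TRIP 1234", "Travel"), ("AWS EMEA", "Software")]
reviews = [
    Review("PAYMENT REF 10007", predicted="Uncategorised", final="Equipment"),
    Review("COSTA COFFEE", predicted="Meals", final="Meals"),
]
updated_rows, override_count = merge_reviews(training_rows, reviews)
```

Tracking the override count alongside the merged data gives a simple signal for how often the system is being corrected over time.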
The Results
- Manual review time dropped significantly.
- Categorisation became more consistent across clients.
- Edge cases were easier to audit and resolve.
- The system improved continuously as corrections were captured.
Technical Highlights
- Scikit-learn classification pipeline
- Text + ledger feature engineering
- Gemini integration for context-aware fallback
- Feedback pipeline for iterative retraining
TL;DR
Most transactions are now handled automatically by ML, with Gemini covering the tricky edge cases. Accountants stay in control but spend far less time on repetitive categorisation.
