I am building a system to separate out potentially fraudulent transactions. The flagged transactions will then be manually verified, which in turn helps me build a labelled dataset over time.

For now I have transaction data and customer behavior information.

Here is how I intend to approach this. The possible fraud cases include:

- Abuse of all cashbacks and discounts (Coupons / Vouchers / Auto Refund)
- Retailing
- Acquiring sensitive SKUs

Since I don’t really have labelled data, I am going with an unsupervised learning approach (isolation forest). I plan on having 3 modules: Users, SKUs, and Localities.

For the last two, I am struggling to set meaningful thresholds. I standardized the slope and the intercept of the sales trend, then divided one by the other to get a compound variable, which I am using for comparison. Please share your thoughts and/or resources.
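
To make the compound variable concrete, here is a minimal sketch of roughly what I am computing for the SKU module. It rests on several assumptions that may not match every setup: the trend is an ordinary least-squares line over equally spaced periods, standardization is a plain z-score across SKUs, the ratio is slope divided by intercept, and the "threshold" is simply the top fraction of SKUs by absolute score. The function names (`fit_trend`, `zscores`, `compound_scores`, `flag_top`) and the sample data are hypothetical, made up for illustration only.

```python
from statistics import mean, pstdev


def fit_trend(sales):
    """Least-squares line through (0, sales[0]), (1, sales[1]), ...

    Returns (slope, intercept). Assumes equally spaced periods.
    """
    xs = range(len(sales))
    x_bar = mean(xs)
    y_bar = mean(sales)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, sales))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def zscores(values):
    """Standardize values to zero mean and unit (population) std."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def compound_scores(sales_by_sku):
    """Standardized slope divided by standardized intercept, per SKU.

    Assumption: slope is the numerator. Note the division is unstable
    (or fails) when a standardized intercept is at or near zero.
    """
    skus = list(sales_by_sku)
    fits = [fit_trend(sales_by_sku[s]) for s in skus]
    z_slope = zscores([f[0] for f in fits])
    z_icpt = zscores([f[1] for f in fits])
    return {s: zs / zi for s, zs, zi in zip(skus, z_slope, z_icpt)}


def flag_top(scores, fraction):
    """Return the SKUs with the largest absolute scores.

    Hypothetical thresholding rule: flag a fixed fraction for review.
    """
    k = max(1, round(fraction * len(scores)))
    ranked = sorted(scores, key=lambda s: abs(scores[s]), reverse=True)
    return ranked[:k]


# Made-up sample data: units sold per period for three SKUs.
sales_by_sku = {
    "sku_a": [10, 12, 14, 16],
    "sku_b": [20, 20, 20, 20],
    "sku_c": [5, 15, 25, 35],
}
scores = compound_scores(sales_by_sku)
flagged = flag_top(scores, 0.34)
```

One thing this makes visible: because the denominator is a z-score, any SKU whose intercept sits close to the average gets a very large ratio regardless of its slope, which may be part of why a fixed threshold on this variable is hard to interpret.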