Most traders who tag their trades never extract anything useful from those tags. They either tag every trade with 15 labels until filters return sample sizes of 3, or they use tags so broad — “breakout,” “momentum” — that every setup qualifies and the data is homogeneous. This guide fixes both problems.

It covers the four-category taxonomy that actually produces edge discovery, how to filter combinations instead of individual tags, and how to use emotional scoring to find the exact point where your psychology starts costing you money.

Step 1: Build Your Four-Category Taxonomy

Every trade gets exactly four tags — one per category:

Category | Purpose | Example Tags
Setup Type | What pattern triggered the entry | bull-flag, VWAP-reclaim, opening-range-breakout, pullback-to-EMA
Market Condition | What the broader tape was doing | trending, ranging, news-driven, low-volume
Session | When in the day the trade occurred | premarket, first-30-min, midday, power-hour
Emotional State | How you felt at entry | confident, anxious, FOMO, revenge, neutral

These four categories cover the variables that most predictably separate your winning trades from your losing ones. Setup Type alone is insufficient — the same bull-flag setup behaves completely differently in a trending tape versus a choppy midday range.

Step 2: Enforce Mutual Exclusivity Within Categories

Define the options in each category so they do not overlap: any given trade should fit exactly one. Cap each category at 5-8 options and never add a new tag mid-quarter without first checking whether an existing tag already covers it. The constraint is intentional: a trader who creates 40 custom tags across all categories ends up with an average of 3-5 trades per tag combination, which is statistically meaningless and impossible to act on.

Write your taxonomy down before your next trading session. For each category, list every option on a single line. If you can’t fit a trade into one of the listed options, it goes into an “other” bucket — not a new custom tag. Revisit “other” once it accumulates 10+ trades, and decide if it warrants its own label.
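If you keep your journal in a spreadsheet or script, the written taxonomy can double as a validation check. The sketch below is one possible way to do it, not a feature of any particular platform: the option lists are the example tags from this guide, and the names TAXONOMY, check_taxonomy, and tag_trade are hypothetical.

```python
# Hypothetical taxonomy: option lists are the example tags from this guide,
# plus an "other" bucket per category. Names and structure are assumptions.
TAXONOMY = {
    "setup": ["bull-flag", "VWAP-reclaim", "opening-range-breakout", "pullback-to-EMA", "other"],
    "condition": ["trending", "ranging", "news-driven", "low-volume", "other"],
    "session": ["premarket", "first-30-min", "midday", "power-hour", "other"],
    "emotion": ["confident", "anxious", "FOMO", "revenge", "neutral", "other"],
}
MAX_OPTIONS = 8  # cap per category, per the guide


def check_taxonomy(taxonomy):
    """Raise if any category exceeds the cap or lacks an 'other' bucket."""
    for category, options in taxonomy.items():
        if len(options) > MAX_OPTIONS:
            raise ValueError(f"{category} has {len(options)} options (max {MAX_OPTIONS})")
        if "other" not in options:
            raise ValueError(f"{category} is missing an 'other' bucket")


def tag_trade(taxonomy, **tags):
    """Return exactly one tag per category; unknown or missing values become 'other'."""
    return {
        category: tags.get(category) if tags.get(category) in options else "other"
        for category, options in taxonomy.items()
    }


check_taxonomy(TAXONOMY)
# "bored" is not a listed emotion, so it lands in the "other" bucket.
trade_tags = tag_trade(TAXONOMY, setup="bull-flag", condition="trending",
                       session="midday", emotion="bored")
print(trade_tags)
```

Routing unknown values to "other" instead of raising an error mirrors the rule above: an unfamiliar trade is recorded, but it cannot silently create a new tag.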

Strict discipline here is what separates traders whose tags produce insight from traders who feel organized but still can’t answer “which of my setups is actually profitable?”

Step 3: Reach the 30-Trade Threshold Before Drawing Conclusions

Thirty trades per tag combination is the minimum sample size before any win rate or average-R figure is worth acting on. Thirty is a widely used rule of thumb for a minimum sample, not a guarantee of statistical significance. Below that, a 70% win rate could simply be chance variation around a coin-flip strategy.
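To see why small samples mislead, you can compute how often a strategy with a true 50% win rate would show a 70% or better win rate by luck alone. The sketch below uses the binomial distribution and assumes trades are independent, which real trades only approximate; the function name prob_at_least is illustrative.

```python
from math import comb


def prob_at_least(wins, trades, p=0.5):
    """Probability of at least `wins` winners in `trades` independent trades
    when the true win rate is `p` (binomial upper tail)."""
    return sum(
        comb(trades, k) * p**k * (1 - p) ** (trades - k)
        for k in range(wins, trades + 1)
    )


# How often would a coin-flip trader show a 70%+ win rate purely by chance?
for n in (10, 30, 100):
    wins_needed = -(-7 * n // 10)  # ceil(0.7 * n) using integer arithmetic
    print(n, round(prob_at_least(wins_needed, n), 4))
```

With 10 trades the chance is about 17% (176 of 1,024 equally likely outcomes), and it shrinks as the sample grows, which is the reasoning behind waiting for a larger sample before acting.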

If you’re starting fresh, retroactively tag 3-6 months of imported trade history before you run your first filter. Export your broker statements, import them into your journal, and batch-tag each trade using the four categories. You already have the data — it just lacks labels.

For day traders executing 10-20 trades per week, 30 samples per combination is reachable within 4-8 weeks. For swing traders taking 5-10 trades per month, retroactive tagging isn’t optional — it’s the only way to get enough data before the market conditions that generated those trades have rotated away.

Step 4: Filter by Combinations, Not Individual Tags

This is where most traders leave edge on the table. Filtering by a single tag — “show me all my VWAP-reclaim trades” — gives you a blended view that averages together profitable and unprofitable variants of the same setup.

Consider a real example: a trader with a $30,000 account reviews 6 months of tagged trades. Their VWAP-reclaim tag has 58 total trades. Unfiltered, the stats look marginal — 51% win rate, 1.3R average. Not clearly worth trading.

Filtered to “VWAP-reclaim + trending day + first 60 min”: 22 trades, 68% win rate, 2.2R average, +$4,800 net. Filtered to “VWAP-reclaim + ranging day + any session”: 36 trades, 39% win rate, 0.8R average, -$2,100 net.

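Same setup. Two completely different edge profiles. The first decision is clear: stop trading VWAP-reclaims on ranging days, where 36 trades show a losing record. The trending-day, first-hour subset is the candidate for larger size, but at 22 trades it is still short of the 30-trade threshold from Step 3, so hold size steady until the sample fills in. That single pattern discovery is what turns a marginally profitable setup into a potential core strategy.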
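If your journal can export trades, a combination filter can also be run with a few lines of script. The following is a minimal sketch under stated assumptions: the trade tuples, their field order, and the function name stats_by_combination are hypothetical and do not reflect any platform's export format.

```python
from collections import defaultdict

# Hypothetical records: (setup, condition, session, result in R, net P&L in dollars).
trades = [
    ("VWAP-reclaim", "trending", "first-30-min", 2.0, 400.0),
    ("VWAP-reclaim", "trending", "first-30-min", -1.0, -200.0),
    ("VWAP-reclaim", "trending", "first-30-min", 3.0, 600.0),
    ("VWAP-reclaim", "ranging", "midday", -1.0, -200.0),
    ("VWAP-reclaim", "ranging", "midday", 1.0, 200.0),
    ("VWAP-reclaim", "ranging", "midday", -1.0, -200.0),
]


def stats_by_combination(trades, key_fields=(0, 1)):
    """Group trades by the chosen tag columns and summarize each group."""
    groups = defaultdict(list)
    for trade in trades:
        groups[tuple(trade[i] for i in key_fields)].append(trade)

    summary = {}
    for combo, rows in groups.items():
        wins = [r[3] for r in rows if r[3] > 0]
        summary[combo] = {
            "trades": len(rows),
            "win_rate": len(wins) / len(rows),
            "avg_win_r": sum(wins) / len(wins) if wins else 0.0,
            "net_pnl": sum(r[4] for r in rows),
        }
    return summary


# Group by setup + market condition (columns 0 and 1).
combo_stats = stats_by_combination(trades, key_fields=(0, 1))
for combo, s in combo_stats.items():
    print(combo, s)
```

Passing key_fields=(0, 1, 2) adds the session dimension. Each added dimension shrinks the groups, so check the trade count of every group before reading its win rate.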

Step 5: Score Emotional State Numerically

Categorical tags like “FOMO” or “confident” tell you what, but not how much. Add a 1-5 intensity score to your Emotional State tag: 1 is fully calm and prepared, 5 is peak impulsive or distressed.

After 50+ trades with numeric scores logged, filter your P&L by score. Most traders find a clear inflection point — commonly around score 3 or 4 — where net P&L turns negative. That inflection is your personal tilt threshold.

Once you know it, the rule writes itself: above a score of 3 (or wherever your data shows the break), cut position size in half or pass on the trade entirely. This is more reliable than vague journaling notes about “feeling off” because the threshold comes from your actual P&L outcomes; the self-reported score is only the input, logged before the result is known.
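Finding the inflection point is a simple aggregation once scores are logged. The sketch below is illustrative only: the sample data is invented, and defining the threshold as the lowest score from which every higher score is also net-negative is one reasonable reading of "inflection", not the only one.

```python
from collections import defaultdict

# Hypothetical log: (emotional intensity score 1-5, net P&L in dollars).
scored_trades = [
    (1, 250.0), (1, 120.0), (2, 180.0), (2, -60.0),
    (3, 40.0), (3, -20.0), (4, -150.0), (4, 30.0), (5, -300.0),
]


def pnl_by_score(scored_trades):
    """Sum net P&L for each emotional intensity score."""
    totals = defaultdict(float)
    for score, pnl in scored_trades:
        totals[score] += pnl
    return dict(sorted(totals.items()))


def tilt_threshold(totals):
    """Return the lowest score such that it and every higher score are net-negative,
    or None if the highest score is not net-negative."""
    threshold = None
    for score in sorted(totals, reverse=True):
        if totals[score] < 0:
            threshold = score
        else:
            break
    return threshold


totals = pnl_by_score(scored_trades)
print(totals)
print("tilt threshold:", tilt_threshold(totals))
```

In this invented sample, scores 4 and 5 are net-negative while 1 through 3 are positive, so the threshold comes out at 4. A real log needs the 50+ scored trades mentioned above before the result means much.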

This approach is especially valuable for options traders where emotional entries on high-IV setups produce outsized losses relative to their intended risk.

Step 6: Act on What the Data Shows

Tag analysis without action is documentation, not improvement. Once you have 30+ trades per combination, run the following review monthly:

  1. Rank all tag combinations with at least 30 samples by net P&L.
  2. Identify the bottom quartile — combinations with negative expectancy. Remove those setups from your active playbook or impose a hard size reduction (50% of normal risk) until you have a plan to fix them.
  3. Identify the top quartile. These are the combinations where your edge is statistically demonstrated. These deserve larger size and tighter trade management criteria.
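The three review steps above can be sketched as a short script. This is a rough illustration, assuming you already have per-combination summaries; the labels, counts, and P&L figures below are made up, and monthly_review is a hypothetical helper name.

```python
MIN_SAMPLES = 30  # threshold from Step 3

# Hypothetical per-combination summaries: (label, trade count, net P&L in dollars).
combos = [
    ("VWAP-reclaim + trending", 34, 5200.0),
    ("VWAP-reclaim + ranging", 36, -2100.0),
    ("bull-flag + trending", 41, 1900.0),
    ("bull-flag + ranging", 12, 900.0),  # under 30 trades: excluded from ranking
    ("pullback-to-EMA + trending", 30, 300.0),
]


def monthly_review(combos, min_samples=MIN_SAMPLES):
    """Rank eligible combinations by net P&L and split off top and bottom quartiles."""
    eligible = sorted(
        (c for c in combos if c[1] >= min_samples),
        key=lambda c: c[2],
        reverse=True,
    )
    if not eligible:
        return [], [], []
    quartile = max(1, len(eligible) // 4)
    top = eligible[:quartile]
    # Only flag bottom-quartile combinations that actually lost money.
    bottom = [c for c in eligible[-quartile:] if c[2] < 0]
    return eligible, top, bottom


eligible, top, bottom = monthly_review(combos)
print("scale:", [c[0] for c in top])
print("cut or reduce:", [c[0] for c in bottom])
```

The 12-trade combination is left out of the ranking even though it is profitable, because it has not reached the sample threshold yet.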

The trading expectancy formula, (Win Rate × Avg Win) - (Loss Rate × Avg Loss), applied per tag combination gives you a precise ranking. Taking the average as the average win in R and assuming losers average 1R, a combination with a 65% win rate and 2.1R average has an expectancy of roughly +1.0R per trade (0.65 × 2.1 - 0.35 × 1). A combination with a 39% win rate and 0.8R average has an expectancy of roughly -0.30R (0.39 × 0.8 - 0.61 × 1). Cut the latter. Scale the former.
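The formula translates directly into a small function. The sketch below assumes the "average" figure is the average win in R and that losing trades average 1R, neither of which the text states explicitly; with a different average loss, both results shift.

```python
def expectancy_r(win_rate, avg_win_r, avg_loss_r=1.0):
    """Expectancy per trade in R: (win rate x avg win) - (loss rate x avg loss)."""
    loss_rate = 1.0 - win_rate
    return win_rate * avg_win_r - loss_rate * avg_loss_r


# avg_loss_r=1.0 is an assumption: it treats every loser as a full 1R stop-out.
strong = expectancy_r(0.65, 2.1)  # 0.65 * 2.1 - 0.35 * 1.0
weak = expectancy_r(0.39, 0.8)    # 0.39 * 0.8 - 0.61 * 1.0
print(f"strong: {strong:+.3f}R  weak: {weak:+.3f}R")
```

Under the 1R-loss assumption the two combinations come out near +1.0R and -0.30R per trade. If your losers regularly exceed 1R through slippage or moved stops, pass your measured average loss as avg_loss_r instead.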

Pro Tips

  • Session tags often reveal more edge than setup tags. Many traders are net-profitable in the first 30 minutes and net-negative in midday chop — eliminating midday trading entirely can move the overall P&L more than any setup refinement.
  • Tag at entry, not at close. Your emotional state at entry is predictive; your emotional state after a winner is not what you’re trying to measure.
  • If a tag combination has 30+ trades and a negative expectancy, that’s signal — not bad luck. Treat it the same way you’d treat a broken indicator: retire it.
  • Cross-reference your Market Condition tag against the VIX or ADR% to make the category more objective. “Trending” should mean the index moved more than 1% in one direction before your entry — not a subjective feeling.
  • Build your taxonomy before a trading session, not during. Mid-trade tag decisions are contaminated by outcome bias.
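The objective definition of “trending” suggested in the tips above can be reduced to a one-line rule. This is a sketch under assumptions: it measures the index move from the session open to your entry, which is only one way to read “before your entry”, and it covers just two of the Market Condition labels. The function name and price inputs are hypothetical.

```python
def classify_condition(index_open, index_at_entry, threshold_pct=1.0):
    """Label the tape 'trending' if the index has moved more than threshold_pct
    percent from the open (in either direction) by entry; otherwise 'ranging'."""
    move_pct = (index_at_entry - index_open) / index_open * 100.0
    return "trending" if abs(move_pct) > threshold_pct else "ranging"


print(classify_condition(5000.0, 5060.0))  # +1.2% move from the open
print(classify_condition(5000.0, 5020.0))  # +0.4% move from the open
```

Labels such as news-driven or low-volume need other inputs (a calendar, relative volume) and are left out of this sketch.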

Common Mistakes to Avoid

  1. Over-tagging with too many unique labels. A trader using 40 distinct tags across all categories fragments each combination to fewer than 5 trades — no statistical validity. Fix: cap each category at 8 options maximum and consolidate before adding anything new.

  2. Filtering single tags instead of combinations. “All my bull-flag trades” is a mix of winning and losing market contexts. Fix: always filter at least two dimensions simultaneously.

  3. Drawing conclusions before 30 samples. A 5-trade sample showing a 100% win rate is noise. Fix: treat any combination under 30 trades as “data collection mode only” — no sizing decisions based on it.

  4. Tagging after the trade closes. Emotional state tags logged after you know the outcome are outcome-contaminated. Fix: log the emotional score at the moment of entry, before the trade resolves.

  5. Never revisiting the tag taxonomy. A tag that made sense in a high-volatility environment may be too coarse to distinguish setups in a low-volatility regime. Fix: review your taxonomy at the start of each quarter and consolidate or split tags as your sample sizes grow.

How JournalPlus Helps

JournalPlus includes a multi-dimensional tag filtering system that lets you cross-filter any combination of your four tag categories and see win rate, average R, and net P&L for each combination in a single view. The analytics dashboard surfaces your highest and lowest-expectancy combinations automatically once you have enough samples, so you don’t need to export to a spreadsheet to run the analysis described in this guide. The emotional state field supports both categorical labels and a numeric score, which JournalPlus correlates against your P&L outcomes to surface your personal tilt threshold. All trade imports — from brokers, CSV, or manual entry — support bulk tag editing so you can retroactively build your dataset from existing history.

People Also Ask

How many tags should I use per trade?

Exactly four — one per category (Setup Type, Market Condition, Session, Emotional State). Adding more tags fragments your data and makes filters statistically useless.

How many trades do I need before tag analysis is meaningful?

A minimum of 30 trades per tag combination. With fewer, the win rate and average R figures are noise, not signal.

Can I retroactively tag trades I already imported?

Yes. Most journaling platforms let you bulk-edit tags on historical trades. Start by tagging the past 3-6 months using your broker's export — it's the fastest way to build a usable dataset.

What if I trade multiple setups and none has 30 samples yet?

Focus on your highest-frequency setup first. Tag everything else under a catch-all "other" label until you have enough data to break it out properly.

Should emotional state tags affect position sizing?

Once you identify your tilt threshold (e.g., emotional score of 4 or higher correlates with negative P&L), the practical rule is to cut size by 50% or skip the trade entirely above that score.

Written by JournalPlus Team