Itinerary v2 Prompt - 11-way Doability Benchmark

2026-06-25 - OLD vs NEW vs 9 research-grounded variants - 20 real-saves scenarios - 220 Sonnet generations - blind Opus panel

Verdict (two rounds). Round 1 (11-way) showed that the soft-priority graft beats both OLD (5.96) and NEW (7.01): every one of the 9 grafts did. Round 2 (champion round: the 4 strongest variants head-to-head) settled the port config: the anchors-pool structural split is essential, and V9 (anchors split + intensity tags + rhythm + variety + clustering) wins the decisive comparison at 8.11/10 (avg rank 2.07). V4, which topped the wide 11-way field, drops to LAST head-to-head, which shows that its rhythm rule benefited from a weak field rather than from a strong structure.

Which prompt builds the best, most doable trips - using saves wisely (no forced cramming), no repeats, in-city except labeled day-trips?

Key findings

  1. Soft-priority is the foundational win. Every one of the 9 grafts beats NEW's hard "every save MUST appear" rule on the overall judge score - even the minimal V1 (7.59) clears NEW (7.01) and OLD (5.96). The biggest single lift is just replacing the must-include rule with soft-priority + no-dup + same-city + don't-overload.
  2. Doability is where OLD/NEW actually fail. OLD doability 5.16, NEW 6.29 vs the best graft's 7.89. NEW's hard must-include over-packs days; OLD invents (gap-fill 6.2/plan) and schedules restaurants as activities (1.15/plan, vs ~0.3 for pool-driven prompts).
  3. The anchors-pool structural split is the load-bearing ingredient. The champion round (the 4 strongest variants head-to-head) is decisive: V4 (rhythm rule alone, the 11-way winner) drops to LAST (7.28) because it lacks the split, while every variant that HAS it (V7/V10/V9) scores 7.77+. V4 won the wide 11-way field only because weak singletons diluted it; head-to-head it is exposed.
  4. V9 wins the decisive head-to-head: 8.11 overall, 17 firsts, avg rank 2.07 - a clean monotonic V9 (8.11) > V10 (7.93) > V7 (7.77). Layering rhythm + variety + clustering ON TOP of the anchors split keeps improving the trip; the earlier "kitchen-sink over-constrains" read was an artifact of the wide field.
  5. V10 (anchors + rhythm) is the deterministically cleanest near-tie: 93% priority coverage (tied best), 0.05 duplicates (lowest), 0.8 gap-fill (lowest). The combination beats rhythm-alone head-to-head (V10 over V4 in 12/20 scenarios), confirming the levers are additive, not redundant.

Method

What makes a good trip (research-grounded): don't over-schedule (cardinal sin); cluster by area; intensity rhythm; variety/anti-fatigue; time-of-day sequencing; light arrival/departure; personalization; feasibility is the bottleneck (academic LLM-planning benchmarks: best models ~33% feasibility, collapsing past ~10 coupled constraints).

The 11 variants

Variant  Lever               Delta on top of NEW pool
OLD      reference           main names-only prompt (USER SAVES / PRIORITY SAVES / KNOWN PLACES; model picks + invents)
NEW      reference           Ashwin branch as-is: collection pool, every HEARTED/SAVED/ICONIC MUST appear
V1       rules (minimal)     soft-priority + no-dup + same-city + don't-overload (pool unchanged)
V2       rules (pacing)      V1 + hard per-day caps by pace + full-day = +1 stop + light arrival/departure
V3       rules (clustering)  V1 + cluster each day by neighbourhood, minimize backtracking
V4       rules (rhythm)      V1 + alternate heavy/light days, mid-trip light day on 5d+
V5       rules (variety)     V1 + no two same-type anchors in a row, mix types (anti-fatigue)
V6       pool tags           V1 + per-item [FULL-DAY]/[HALF-DAY]/[QUICK] tags + time-of-day ordering
V7       structural          V1 + pool split ANCHORS vs IF TIME PERMITS (hearts/iconics never trimmed)
V8       combo               doability-max: pacing + clustering + time-of-day + intensity tags + ROUTING block
V9       combo               best-of: anchors split + intensity tags + rhythm + variety + clustering

Anatomy of each variant (real prompt content)

Every variant shares an identical system prompt, trip-context, day-allocation, seasonality, and output-schema block - they differ ONLY in how the user's saves are delivered. Excerpts below are the real rendered text from scenario s6_japan_epic (Japan 10d, dense + 1 hearted).

OLD (main) - names-only

USER SAVES (collection, weave in where natural, ignore mismatches):
PRIORITY SAVES (the user hearted these - include them in the itinerary where they fit the day's city + flow, ahead of other saves): "Nishiki Market"
KNOWN PLACES IN OUR DATABASE (real, popular places we have photos + details for - STRONGLY PREFER these EXACT names when they fit the day's city + theme, so the trip shows rich detail; you may still add other well-known places where these don't cover the day):

Loose name lists; the model picks/invents and may ignore saves ("ignore mismatches").

NEW (Ashwin) - hard must-include

RULES: pick the best 3-4 items PER CITY PER DAY from that city's pool, ordered sensibly through the day. Every [HEARTED] item and every [SAVED]/[ICONIC] item MUST appear somewhere in the trip. Do NOT invent places outside the pool except thin gap-fill when a day is short - and write " (gap-fill)" at the end of the note for anything you add. Copy pool names EXACTLY.

Pool-driven, but every SAVED/ICONIC must appear - this is what over-packs dense days.

Shared soft-priority graft (the new baseline, in ALL of V1-V9)

RULES:
- PRIORITY (no forced use): schedule [HEARTED] first, then [ICONIC], then [SAVED]/[GEM]. Cover as many high-priority items as fit a COMFORTABLE day. It is better to leave a low-priority [SAVED] out than to overload a day - do NOT force every pool item in.
- ONE PLACE ONCE: never schedule the same place on two days, or twice in a day.
- STAY IN THE DAY'S CITY: every item must be in that day's city, or a well-known day-trip you return from (label any day-trip in the note).
- DON'T OVERLOAD: respect the PACE; cramming is the cardinal mistake. Order each day as a sensible route with minimal backtracking.
- GAP-FILL: invent a place only to round out a thin day, only in that day's city, never a duplicate; append " (gap-fill)" to its note. Copy pool names EXACTLY.

V1-V9 - delta on top of the shared graft

V1 - rules (minimal)

Added rule(s): none - shared graft only
Pool format: Flat list - best-first flat list (same as NEW)

V2 - rules (pacing)

Added rule(s):
- DAY BUDGET: chill = 2-3 items/day; balanced = 3-4; packed = 4-5. A full-day site (theme park, big fort/palace, national park, safari) fills a day - pair it with at most 1 light nearby stop.
- LIGHT EDGES: the arrival day and the departure day are light (1-2 items); never pack a travel day.
Pool format: Flat list - best-first flat list (same as NEW)
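
The DAY BUDGET and LIGHT EDGES rules can also be checked after generation. The following is a minimal sketch, not the benchmark's own checker: the per-pace ranges and the 1-2 item edge limit are copied from the rule text, while the function names and the `(item_count, has_full_day_site)` input shape are hypothetical.

```python
# Per-day item ranges copied from the DAY BUDGET rule text.
PACE_BUDGET = {"chill": (2, 3), "balanced": (3, 4), "packed": (4, 5)}
LIGHT_EDGE_MAX = 2  # LIGHT EDGES: arrival and departure days carry 1-2 items


def max_items(pace, day_index, total_days, has_full_day_site):
    """Upper bound on scheduled items for one day (hypothetical helper)."""
    if day_index == 0 or day_index == total_days - 1:
        return LIGHT_EDGE_MAX
    if has_full_day_site:
        return 2  # the full-day site plus at most 1 light nearby stop
    return PACE_BUDGET[pace][1]


def overloaded_days(pace, days):
    """days: list of (item_count, has_full_day_site), one entry per trip day.

    Returns the 0-based indices of days that exceed their budget.
    """
    total = len(days)
    return [
        i
        for i, (count, has_full) in enumerate(days)
        if count > max_items(pace, i, total, has_full)
    ]
```

For a balanced 5-day trip, a 5-item second day, a full-day site with two extra stops, and a 3-item departure day would all be flagged.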

V3 - rules (clustering)

Added rule(s):
- CLUSTER BY AREA: build each day around ONE neighbourhood/zone so lunch and the afternoon are near the morning anchor. Minimize cross-town backtracking.
Pool format: Flat list - best-first flat list (same as NEW)

V4 - rules (rhythm) - 11-way winner

Added rule(s):
- INTENSITY RHYTHM: alternate heavier and lighter days - energy does not reset overnight, so avoid two heavy days back-to-back. On trips of 5+ days make one mid-trip day deliberately light.
Pool format: Flat list - best-first flat list (same as NEW)
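
As an illustration of what the INTENSITY RHYTHM rule asks for, here is a small sketch that flags violations in a finished plan. It assumes each day has already been labelled "heavy" or "light" (the report does not say how that labelling is done), and it treats "mid-trip" as any day other than the first or last; both are assumptions.

```python
def rhythm_violations(day_loads):
    """day_loads: list of "heavy" / "light" labels, one per day, in trip order.

    Returns human-readable descriptions of rule violations (hypothetical helper).
    """
    problems = []
    # Rule 1: avoid two heavy days back-to-back.
    for i in range(1, len(day_loads)):
        if day_loads[i] == "heavy" and day_loads[i - 1] == "heavy":
            problems.append(f"days {i} and {i + 1} are both heavy")
    # Rule 2: trips of 5+ days need one deliberately light mid-trip day.
    if len(day_loads) >= 5 and "light" not in day_loads[1:-1]:
        problems.append("no light mid-trip day on a 5+ day trip")
    return problems
```

An empty result means the plan alternates as the rule intends; it says nothing about whether the individual days are themselves doable.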

V5 - rules (variety)

Added rule(s):
- VARIETY: do not stack two same-type anchors in a row (e.g. two temples, two museums). Mix each day across types - a sight, some nature/outdoors, a neighbourhood walk, a hands-on experience - to avoid sightseeing fatigue.
Pool format: Flat list - best-first flat list (same as NEW)
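
The "no two same-type anchors in a row" part of the VARIETY rule is easy to check mechanically. The sketch below assumes each scheduled item carries a type label such as "temple" or "museum"; the report does not show such labels, so they are hypothetical.

```python
def same_type_runs(day_items):
    """day_items: list of (name, item_type) in visit order.

    Returns the adjacent pairs that share a type (hypothetical helper).
    """
    return [
        (first[0], second[0])
        for first, second in zip(day_items, day_items[1:])
        if first[1] == second[1]
    ]
```

A day ordered temple, temple, park would yield one flagged pair; reordering it to temple, park, temple would yield none.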

V6 - pool tags

Added rule(s):
- TIME-OF-DAY: order each day by fit - markets and sunrise spots early; viewpoints/sunsets late; indoor museums for midday. Each pool item is tagged [FULL-DAY]/[HALF-DAY]/[QUICK]; budget the day by those (one [FULL-DAY] is most of a day).
Pool format: Intensity-tagged - every item tagged [FULL-DAY]/[HALF-DAY]/[QUICK] for time budgeting
Tokyo (days 1,2,3,4):
  [ICONIC] [HALF-DAY] "Tsukiji Outer Market"
  [ICONIC] [HALF-DAY] "Senso-ji Temple"
  [SAVED] [HALF-DAY] "Akihabara Electric Town"
  [SAVED] [HALF-DAY] "Robot Restaurant Show — Shinjuku"
  [SAVED] [HALF-DAY] "Golden Gai — Shinjuku"
  [SAVED] [HALF-DAY] "teamLab Borderless"
  [SAVED] [HALF-DAY] "Shibuya Crossing & Hachiko"
  [GEM] [HALF-DAY] "Kyu-Furukawa Gardens"
  [GEM] [HALF-DAY] "Ukima Park"

V7 - structural

Added rule(s):
- ANCHORS FIRST: each city's pool is split into ANCHORS (do these first) and IF TIME PERMITS. Fill each day from ANCHORS first; pull from IF TIME PERMITS only to round out a lighter day. Never exceed a comfortable day to cram more in.
Pool format: Anchors split - pool pre-split in code into ANCHORS (hearts+iconics, never trimmed) vs IF TIME PERMITS
Tokyo (days 1,2,3,4):
  ANCHORS (schedule these first):
    [ICONIC] "Tsukiji Outer Market"
    [ICONIC] "Senso-ji Temple"
    [ICONIC] "Kinkaku-ji"
    [ICONIC] "Yasaka Shrine"
    [SAVED] "Akihabara Electric Town"
    [SAVED] "Robot Restaurant Show — Shinjuku"
    [SAVED] "Golden Gai — Shinjuku"
    [SAVED] "teamLab Borderless"
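
A pre-split of this kind could look roughly like the sketch below. It follows only the one-line description (hearts + iconics become ANCHORS); the excerpt above also lists [SAVED] items under ANCHORS, so the actual rule may differ. The function names, the `(tag, name)` input shape, and the literal "IF TIME PERMITS:" header line are assumptions.

```python
# Assumption: anchor tags taken from the "hearts+iconics" description.
ANCHOR_TAGS = {"HEARTED", "ICONIC"}


def split_pool(pool):
    """pool: list of (tag, name) in best-first order.

    Returns (anchors, if_time_permits), each preserving the input order.
    """
    anchors = [(tag, name) for tag, name in pool if tag in ANCHOR_TAGS]
    if_time = [(tag, name) for tag, name in pool if tag not in ANCHOR_TAGS]
    return anchors, if_time


def render_city(city, days, pool):
    """Render one city's pool block in the style of the excerpt."""
    anchors, if_time = split_pool(pool)
    lines = [f"{city} (days {','.join(str(d) for d in days)}):"]
    lines.append("  ANCHORS (schedule these first):")
    lines += [f'    [{tag}] "{name}"' for tag, name in anchors]
    if if_time:
        lines.append("  IF TIME PERMITS:")  # hypothetical header text
        lines += [f'    [{tag}] "{name}"' for tag, name in if_time]
    return "\n".join(lines)
```

Doing the split in code rather than asking the model to prioritise means the anchor list is never trimmed by the model's own judgement, which is the property the variant description emphasises.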

V8 - combo

Added rule(s):
- DAY BUDGET: chill = 2-3 items/day; balanced = 3-4; packed = 4-5. A full-day site (theme park, big fort/palace, national park, safari) fills a day - pair it with at most 1 light nearby stop.
- LIGHT EDGES: the arrival day and the departure day are light (1-2 items); never pack a travel day.
- CLUSTER BY AREA: build each day around ONE neighbourhood/zone so lunch and the afternoon are near the morning anchor. Minimize cross-town backtracking.
- TIME-OF-DAY: order each day by fit - markets and sunrise spots early; viewpoints/sunsets late; indoor museums for midday. Each pool item is tagged [FULL-DAY]/[HALF-DAY]/[QUICK]; budget the day by those (one [FULL-DAY] is most of a day).
Pool format: Intensity-tagged - every item tagged [FULL-DAY]/[HALF-DAY]/[QUICK] for time budgeting
Tokyo (days 1,2,3,4):
  [ICONIC] [HALF-DAY] "Tsukiji Outer Market"
  [ICONIC] [HALF-DAY] "Senso-ji Temple"
  [SAVED] [HALF-DAY] "Akihabara Electric Town"
  [SAVED] [HALF-DAY] "Robot Restaurant Show — Shinjuku"
  [SAVED] [HALF-DAY] "Golden Gai — Shinjuku"
  [SAVED] [HALF-DAY] "teamLab Borderless"
  [SAVED] [HALF-DAY] "Shibuya Crossing & Hachiko"
  [GEM] [HALF-DAY] "Kyu-Furukawa Gardens"
  [GEM] [HALF-DAY] "Ukima Park"
+ extra block:
ROUTING & FEASIBILITY (build a DOABLE day, then fill it):
- For each day first pick the anchor: one [FULL-DAY], or up to two [HALF-DAY] sights in the same area.
- Then add nearby [QUICK] stops that are on the way - cluster by zone, minimize backtracking.
- Sequence by time-of-day (markets AM, viewpoints/sunset PM) and respect opening hours.
- Honour SEASONALITY: avoid weather-exposed sites in the worst window for the month; keep heavy days off arrival/departure.

V9 - combo

Added rule(s):
- ANCHORS FIRST + CLUSTER: fill each day from ANCHORS first (IF TIME PERMITS only rounds out a light day), and build the day around one zone to minimize backtracking.
- INTENSITY RHYTHM: alternate heavier and lighter days - energy does not reset overnight, so avoid two heavy days back-to-back. On trips of 5+ days make one mid-trip day deliberately light.
- VARIETY: do not stack two same-type anchors in a row (e.g. two temples, two museums). Mix each day across types - a sight, some nature/outdoors, a neighbourhood walk, a hands-on experience - to avoid sightseeing fatigue.
Pool format: Anchors split - pool pre-split in code into ANCHORS (hearts+iconics, never trimmed) vs IF TIME PERMITS
Tokyo (days 1,2,3,4):
  ANCHORS (schedule these first):
    [ICONIC] [HALF-DAY] "Tsukiji Outer Market"
    [ICONIC] [HALF-DAY] "Senso-ji Temple"
    [ICONIC] [HALF-DAY] "Kinkaku-ji"
    [ICONIC] [HALF-DAY] "Yasaka Shrine"
    [SAVED] [HALF-DAY] "Akihabara Electric Town"
    [SAVED] [HALF-DAY] "Robot Restaurant Show — Shinjuku"
    [SAVED] [HALF-DAY] "Golden Gai — Shinjuku"
    [SAVED] [HALF-DAY] "teamLab Borderless"

Leaderboard - Opus panel (0-10)

#   Variant                  Overall  Doability  Geo-efficiency  Rhythm+Variety  Selection  Narrative  Borda  1st  avgRank
1   V4 - rules (rhythm)      7.92     7.89       8.06            8.04            7.62       8.01       415    7    4.08
2   V9 - combo               7.76     7.59       7.65            7.80            7.73       8.03       349    9    5.18
3   V7 - structural          7.75     7.54       7.76            7.59            7.91       7.96       406    9    4.23
4   V8 - combo               7.71     7.81       7.82            7.79            7.27       7.86       354    8    5.10
5   V1 - rules (minimal)     7.59     7.35       7.57            7.58            7.54       7.92       335    7    5.42
6   V2 - rules (pacing)      7.59     7.46       7.90            7.52            7.20       7.86       299    5    6.02
7   V3 - rules (clustering)  7.49     7.15       7.57            7.42            7.39       7.92       306    5    5.90
8   V5 - rules (variety)     7.44     7.17       7.52            7.45            7.15       7.92       280    5    6.33
9   V6 - pool tags           7.43     7.07       7.68            7.28            7.25       7.89       273    3    6.45
10  NEW - reference          7.01     6.29       7.12            6.74            7.08       7.84       221    2    7.32
11  OLD - reference          5.96     5.16       5.86            6.13            5.22       7.43       62     0    9.97

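Overall = mean of the 5 dimension means (Doability, Geo-efficiency, Rhythm+Variety, Selection, Narrative). Borda = rank points summed across all judges (higher is better). 1st = number of first-place rankings. avgRank = mean placement (1 is best).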
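
The three aggregate columns can be reproduced from raw judge ballots along the following lines. This is a sketch under assumptions: each ballot is taken to be a best-first ordering, and a ballot of n variants is taken to award n-1 points to first place down to 0 for last. That scoring is consistent with the per-scenario point lists, which each sum to 165 = 3 x (10 + 9 + ... + 0), but the report does not state it. The function names are hypothetical.

```python
from statistics import mean


def borda_and_avg_rank(ballots):
    """ballots: list of rankings, each a list of variant names, best first.

    Returns (borda_points, first_place_counts, average_rank) as dicts.
    """
    borda, firsts, places = {}, {}, {}
    for ballot in ballots:
        n = len(ballot)
        for pos, variant in enumerate(ballot):
            # Assumed scoring: n-1 points for first place, 0 for last.
            borda[variant] = borda.get(variant, 0) + (n - 1 - pos)
            firsts[variant] = firsts.get(variant, 0) + (1 if pos == 0 else 0)
            places.setdefault(variant, []).append(pos + 1)
    avg_rank = {variant: mean(p) for variant, p in places.items()}
    return borda, firsts, avg_rank


def overall(dimension_means):
    """Overall score: mean of the five dimension means, to 2 decimals."""
    return round(mean(dimension_means), 2)
```

As a spot check, `overall([7.89, 8.06, 8.04, 7.62, 8.01])` reproduces V4's 7.92 from its five dimension scores in the leaderboard.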

Deterministic objective scores

Variant  Priority cov  Saves used  Dups  Rest-as-activity  City-lock  Empty days  Overloaded days  Gap-fill  Out-of-city
OLD      81%           66%         0.2   1.15              100%       0           0.15             6.2       0.4
NEW      90%           86%         0.15  0.3               100%       0           0.1              0.7       0.2
V1       91%           80%         0.1   0.35              100%       0           0.05             1.3       0.15
V2       86%           74%         0.05  0.35              100%       0           0                1         0.1
V3       89%           79%         0.05  0.35              100%       0           0.05             1.45      0.25
V4       86%           78%         0.05  0.35              100%       0           0.1              1.3       0.15
V5       86%           77%         0     0.4               100%       0           0                1.8       0.2
V6       87%           75%         0.1   0.35              100%       0           0                1.1       0.1
V7       94%           78%         0.1   0.35              100%       0           0                1.35      0.15
V8       86%           73%         0.05  0.3               100%       0           0.1              0.65      0.1
V9       92%           75%         0.2   0.4               100%       0           0                1.5       0

Priority cov = % of HEARTED+ICONIC saves scheduled. Saves used = % of all day-eligible saves (context, not a target - smart omission is good). Rest-as-activity = saved eateries wrongly scheduled as day items (the OLD bug). Count columns are per-plan averages. Lower is better for dups / rest-as-activity / empty / overloaded / gap-fill / out-of-city.
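
Several of these metrics reduce to simple set and count operations over a generated plan. The sketch below shows one plausible way to compute three of them; it assumes exact string matching on place names (the prompts do instruct the model to copy pool names exactly) and the " (gap-fill)" note suffix from the rules. The function names and input shapes are hypothetical, and the benchmark's real scoring code may differ.

```python
from collections import Counter


def priority_coverage(priority_names, scheduled_names):
    """Share of HEARTED+ICONIC saves that appear anywhere in the plan.

    Returns None when the scenario has no priority saves.
    """
    priority = set(priority_names)
    if not priority:
        return None
    return len(priority & set(scheduled_names)) / len(priority)


def duplicate_count(scheduled_names):
    """Number of extra appearances of any place across the whole trip."""
    counts = Counter(scheduled_names)
    return sum(c - 1 for c in counts.values() if c > 1)


def gap_fill_count(notes):
    """Number of items whose note carries the gap-fill suffix."""
    return sum(1 for note in notes if note.rstrip().endswith("(gap-fill)"))
```

Averaging these per-plan values over the 20 scenarios would give figures on the same scale as the table.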

Per-scenario breakdown

s1_thailand - Thailand 7d (dense+hearted, 3 cities)

7d - a couple - pace balanced - must-haves 40 (hearts 6) - winner V7
OLD 1, NEW 2, V1 25, V2 7, V3 13, V4 24, V5 9, V6 25, V7 26, V8 14, V9 19

The decisive differentiator is the hearted must-have The Ancient City (Muang Boran), which only P9 schedules - and it does so sanely, isolating it on day 3 with just Chinatown and the evening flight south, while still covering Wat Saket, Jim Thompson, BACC, Chatuchak, Grand Palace, all three Phuket iconics, and both Krabi anchors; its only real flaw is scheduling Cafe 8 98 (a cafe) as a day item, which costs it some selection points but not the top spot given it alone honors the full hearted set. P8, P6, and P3 form a tight second tier: each covers every must-have EXCEPT The Ancient City, keeps days well-clustered and well-paced (light arrival/departure, beach-and-temple alternation, no restaurants scheduled), and writes specific non-cliche summaries - P8/P6 edge P3 slightly on rhythm. P11 and P5 match that coverage and P11 even adds IconSiam, but P11 also schedules Cafe 8 98 as a day item. The bottom of the field fails on doability and coverage: P1 is weakest - it omits THREE must-haves (The Ancient City hearted, Freedom Beach iconic, Tiger Cave hearted) and crams a ferry, a longtail Railay run, and the 1,237-step Tiger Cave climb onto a single travel day. P2 is nearly as weak, missing two hearted items (Wat Saket and The Ancient City) and overloading its departure day with a dawn Tiger Cave climb plus a full James Bond Island day-trip. P7 misses Wat Tham Suea as a distinct item and squeezes a sunset Tiger Cave climb into a ferry-arrival day, while P4 and P10 each drop an iconic (Karon Beach) and The Ancient City. Across the board the strongest plans separated themselves by must-have completeness and by refusing to bolt full-day boat trips or step-climbs onto arrival/departure days.

s2_italy - Italy 6d (thin->topup, hearted spread)

6d - a couple - pace balanced - must-haves 10 (hearts 3) - winner V4
OLD 0, NEW 10, V1 28, V2 17, V3 18, V4 29, V5 12, V6 6, V7 13, V8 16, V9 16

The central trap is Villa del Balbianello: a hearted must-have nominally filed under Rome but physically on Lake Como (4+ hours each way), so a Rome day-trip is infeasible. The strongest plans either omit it cleanly while staying well-paced (P2, P5) or include it with honest acknowledgment of its infeasibility (P4 flags it as a pre-trip add-on rather than faking a normal day). P2 leads because every Amalfi/Capri must-have lands (Duomo, Marina Grande Beach, Gardens of Augustus, plus the saved Piazzetta/Faraglioni/Marina Piccola), pace is balanced with a light arrival day, no restaurant is mis-scheduled, and routing is feasible; its only blemish is the omitted Villa. P5 mirrors this with correct Salerno-ferry routing. P4 honestly handles Villa (rewarded on selection_judgment) but pays in doability with a 4-item d2 and a sprawling Ravello+Vietri+Amalfi d4. The weakest tier collapses on doability and coverage: P7, P9, and P10 each drop TWO must-haves (Villa plus the iconic Marina Grande Beach), and P10/P9 pair that with sprawling 4-item Amalfi days. P11 is worst by a wide margin -- it crams five anchors into d2 (full Vatican Museums + Sistine + St Peter's + Piazza Navona + Pantheon + Spanish Steps), an exhausting and infeasible day, AND it schedules two restaurants as day items (Ristorante Belvedere on d4, the Michelin Il Riccio on d6), a direct violation of the food-rail rule, while still missing Villa. P1 and P6 sit mid-low because they fabricate Villa feasibility -- P1 falsely calls Como "a feasible day trip from Rome by high-speed train" and P6 calls it "Rome's garden escapes... a short transfer out of the city centre" -- which is a serious doability/geography fault even though it preserves nominal must-have coverage.

s3_japan - Japan 5d (dense+thin topup)

5d - a couple - pace balanced - must-haves 17 (hearts 0) - winner V9
OLD 0, NEW 16, V1 7, V2 13, V3 23, V4 16, V5 10, V6 13, V7 27, V8 10, V9 30

The single strongest signal is the three ICONIC must-haves (Osaka Aquarium Kaiyukan, Umeda Sky Building, Universal Studios Japan). Only P3, P7, P9 and P11 schedule all three; the rest (P1, P2, P4, P5, P6, P8, P10) drop USJ entirely - a full-day iconic anchor - which is a heavy must-have miss that caps them. P9 wins because it not only covers all three iconics but treats USJ correctly: a dedicated Day 2 with only Shinsekai appended (light, feasible), a genuinely balanced Day 3 (Kaiyukan, Tempozan, America Mura, Kuromon), and a sane 3-stop final Kyoto day - the best pace/coverage balance, with smart omission of low-priority saves rather than cramming. P7 and P3 also cover all iconics but pack their USJ day with extra stops (P7 adds Sumiyoshi+Kuromon; P3 adds Umeda+Nakanoshima+America Mura), slightly straining the theme-park day. P11 has the richest must-have coverage (all iconics plus Amazing Pass, Abeno Harukas, teamLab) but pays for it in doability: it crams USJ together with Kaiyukan and Tempozan on one waterfront day and stacks a five-item Day 3, the exact over-scheduling the rubric penalizes. The weakest is P10: it omits USJ, invents out-of-pool filler (Osaka Castle, an entire "Dotonbori Nightlife"/"Dotonbori Street Food" pseudo-item, Tsutenkaku Tower, Yasaka Pagoda), and bloats Day 3 to five heavy anchors including Abeno Harukas, Umeda, Tempozan and Kaiyukan back-to-back - poor doability, weak selection judgment, and the most generic narrative. No plan scheduled a restaurant/cafe/bar as a day item, so none was penalized on that axis.

s4_india_raj - India/Rajasthan 7d (all-iconic, no hearts)

7d - a couple - pace balanced - must-haves 11 (hearts 0) - winner V3
OLD 0, NEW 14, V1 26, V2 11, V3 26, V4 12, V5 18, V6 5, V7 20, V8 21, V9 12

All eleven plans honor the city/day allocation and schedule every ICONIC must-have (Hawa Mahal, Mehrangarh, Umaid Bhawan, Jaisalmer Fort, Sam Dunes Camel Safari), and none schedule a restaurant as a day item, so differentiation came down to pacing, day-trip sanity, and restraint with SAVED items. P2 leads: it covers all five iconics plus a sensible saved subset (bazaars, Amber, Blue City walk, Bishnoi, Suryagarh, stargazing), keeps arrival/departure light, and its notes are the most concrete (distances, gate times, why-now reasoning) without cramming. P10, P7, and P11 are nearly identical in quality with clean clustering and balanced days. P5 dips slightly because day 5 stacks a full-day Bishnoi safari onto a 5.5-hour evening drive arriving "by night," compressing the rhythm. P9 is decisively the weakest: it loads day 6 with Patwon Ki Haveli + War Museum + camel safari, inserts a 65km Osian detour on a travel day, and most damningly schedules Tanot Mata Temple (150km each way toward the Pakistan border, a ~300km round trip) plus a hot-air-balloon and a City Palace on a 7-day budget couple's departure day — geographically and time-wise infeasible, and it forces in many unrequested heavy stops rather than exercising the smart omission the brief rewards.

s5_japan_long - Japan 8d (Tokyo+Hakone day-trip+Kyoto, dense)

8d - a couple - pace balanced - must-haves 18 (hearts 0) - winner V4
OLD 0, NEW 26, V1 15, V2 18, V3 14, V4 28, V5 12, V6 16, V7 7, V8 5, V9 24

The corpus is unusually tight on must-haves: nearly every plan correctly groups the Chiba/Mount Nokogiri saved cluster (Tokyo Wan Ferry, Ropeway, Jigoku Nozoki, Hyaku-shaku Kannon, Nihon-ji, Hamakanaya) and the northern-Kyoto Tango cluster (Amanohashidate, Viewland, Ine, Chionji), so ranking turns on (a) how cleanly those clusters sit on a single sane round-trip day and (b) whether the iconic/gem anchors (TeamLab Planets, Edo Museum, Shin-Fuji Station, Nishiki Market, Ine) all land. P7 leads: it covers all 18 must-haves including the full Tango cluster on one well-paced Day 7, keeps anchors intact, and is on-voice; its only real flaw is splitting Nokogiri across two Tokyo days (two bay crossings). P2 is the cleanest single-day execution of both clusters with strong pacing and an honest "save Ine for next time" hedge, dropping only three low-priority saved (Nike, Hamakanaya, Hyaku-shaku). P11 and P9 also cover the clusters well on dedicated days but each slips on one iconic-ish item (P11 drops Shin-Fuji + Hamakanaya; P9 crams TeamLab + the ferry + Hamakanaya onto the arrival day). P3 is the most realistic, beautifully paced plan on the board but it is the cautionary case the rubric punishes: it silently DROPS the gem Ine, the entire Amanohashidate cluster, AND iconic Shin-Fuji in favor of generic Fushimi/Philosopher's-Path filler, so its selection_judgment cripples it despite top doability. The weakest plans fail on substance, not polish: P6 crams a Harajuku-Nike morning with a south-bay Nokogiri ferry day-trip (geographic whiplash) and loses Ine/Nihon-ji/Ropeway; P10 and P5 pair the far-west Edo museum with the far-southeast Chiba ferry on one day and strand Ine on a departure day; and P8 is disqualifying — it breaks the day allocation (3 Tokyo / 2 Hakone), omits the ENTIRE Nokogiri cluster, Shin-Fuji, Ine, and the Tango cluster, and pads with generic Kinkaku-ji/Kiyomizu/Byodo-in/kimono-rental tourism, scoring lowest on must-have coverage and selection.

s6_japan_epic - Japan 10d (Tokyo+Kyoto+Osaka, very dense, rhythm)

10d - a couple - pace balanced - must-haves 47 (hearts 1) - winner V3
OLD 0, NEW 19, V1 3, V2 18, V3 26, V4 17, V5 15, V6 24, V7 24, V8 13, V9 6

Coverage of the 12 critical hearted/iconic anchors (Tokyo: Tsukiji, Senso-ji; Kyoto: Nishiki Market, Fushimi Inari, Kinkaku-ji, Yasaka Pagoda, Kiyomizu-dera, Byodo-in; Osaka: Dotonbori, Umeda Sky, USJ, Osaka Castle) is the headline. P11, P3 and P10 are the strongest: each respects the user's 4/3/3 city split, hits all or all-but-one iconic anchor, keeps USJ as a clean dedicated full day, paces arrival/departure lightly, and uses a labeled Uji day-trip for Byodo-in correctly. P11 edges ahead on clean per-day clustering (Higashiyama lanes grouped, Uji as a single afternoon hop) and full anchor coverage; P3 mirrors it with sunrise Fushimi Inari and a tidy Uji trip; P10 matches but slightly over-stacks day 7 (Kinkaku-ji + Sanjusangen-do + a full Uji day trip is a stretch). P1 is solid but drops the iconic Byodo-in/Uji entirely. The weakest are P7 and P9. P7 breaks the allocation outright (Kyoto 4 / Osaka 2), which forces it to drop the iconic Universal Studios Japan completely and pad days with non-saved filler (Osaka Aquarium, Abeno Harukas, Tempozan Ferris Wheel, Ryoan-ji, Arashiyama) - a major must-have miss plus over-reach. P9 crams its full-day theme park: day 9 stacks Umeda Sky Building, Shinsekai AND a half-day Universal Studios visit, which is unrealistic, while omitting the iconic Kiyomizu-dera and duplicating Shibuya (Crossing + Scramble Crossing on the same day). Between them P9 ranks above P7 only because it at least keeps USJ on the itinerary and honors the 4/3/3 split. No plan scheduled a sit-down restaurant as a day item (Dotonbori/Tsukiji/Nishiki are food-streets/markets, correctly allowed), so no plan was penalized on that axis.

s7_london_solo - UK 5d (London single-city dense, solo, packed)

5d - solo - pace packed - must-haves 35 (hearts 0) - winner V8
OLD 3, NEW 15, V1 12, V2 3, V3 15, V4 19, V5 22, V6 6, V7 20, V8 30, V9 20

All 11 plans correctly treat the saved places as activities (no restaurants are mis-scheduled) and stay within London, so the spread comes down to must-have coverage (15 distinct score-2 iconic+gem places, treating "Big Ben" and "Big Ben & Houses of Parliament" as one), pacing on a packed solo trip, and geographic sanity. P4 wins: 14/15 must-haves (only Conduit Mews is absent, and it actually includes Conduit on day 3), tight area clusters (Westminster, South Bank+West End lanes, royal-parks-to-Little-Venice, north heights, then a deliberately light 3-stop departure day grouping St Dunstan/Wapping/St Katharine), with no cross-city zigzags. P5 is a close second - 13/15, clean clusters, and the only plan besides P11 that builds in a labeled mid-trip "rest" day, which suits a packed itinerary's rhythm. P11 has the best raw coverage tie (14/15, missing only St Dunstan) plus a rest day, but its day-5 departure crams 7 items across Hampstead → Bayswater mews → Hyde/Green Park, dragging doability. The weakest plans fail on coverage and pacing: P2 misses four must-haves (Buckingham Palace, St Dunstan, Wapping, Hertford Union Canal) for the lowest selection_judgment; P6 only schedules 12 must-haves (it buries Conduit and Bathurst Mews inside a Paddington note rather than as items, drops Wapping, and stacks a 6-stop "departure" day ending with Buckingham Palace as the final pre-Heathrow stop, which is geographically backwards); P9 is well-paced but also leaves four must-haves unscheduled. P7 has strong 14/15 coverage but is dragged down by a brutal final-day zigzag (far-north Hampstead → west Bayswater mews → far-east Wapping/St Katharine, 7 items), the worst geographic efficiency of the top-coverage group.

s8_italy_nature - Italy 9d (Rome+Lake Como+Dolomites+Positano)

9d - a couple - pace balanced - must-haves 44 (hearts 0) - winner V9
OLD 0, NEW 10, V1 20, V2 14, V3 20, V4 6, V5 10, V6 15, V7 26, V8 15, V9 29

All eleven plans share the same correct city/day skeleton, so ranking turns on must-have coverage, pacing, and clean execution. P4 leads: it captures all 5 Rome iconics (incl. the Spanish Steps), all 6 Dolomites iconics, all 4 Positano iconics, and 5 of 6 Como villas, while explicitly flagging Day 7 as a lighter "rest" day after the heavy Tre Cime circuit and keeping arrival/departure days reasonable - the textbook balance the "balanced" pace asks for. P2 and P8 are nearly as strong: both achieve full Rome/Dolomites/Positano iconic coverage with deliberately light arrival days (P2's Day-5 Val Badia settle-in, P8's two-item Day-1) and sensible villa subsets; P8 loses a hair for stacking four lakes including the Sorapis hike on Day 6. The mid-pack (P1, P7, P6) hit all the Rome iconics but pay for it with heavier days - P1's three-major Day 6 plus a 7-8h transfer day, P7's four-anchor Rome Day 2 - and P6 commits a real error by scheduling Rome's Belvedere Cederna as a Lake Como item (out-of-city). The weakest tier drops required iconics: P3 omits the Colosseum and Dobbiaco; P9, P10 and P11 all skip the Spanish Steps (an iconic) and Arienzo, with P10 additionally missing Lago di Sorapis. P5 is clearly last - it breaks the prescribed day allocation (only 2 Dolomites days vs. 3, padding Positano to 3), overloads Rome Day 2 with five marquee sites, crams four Dolomite anchors into a single day while still missing Sorapis, drops the Spanish Steps, and schedules a "Lakeside Aperitivo" as a day item (a food/bar entry that should never be a scheduled activity). Restaurant-rail discipline was otherwise clean across the field; none of the legitimately ranked plans forced low-value saved items, and smart omission of unplaceable noise (the Florence-based Officina perfumery mis-tagged under Dolomites, the redundant sixth Como villa) was correctly rewarded rather than penalized.

s9_srilanka - Sri Lanka 8d (Kandy+Ella+Nuwara Eliya+Mirissa)

8d - a couple - pace balanced - must-haves 16 (hearts 0) - winner V7
OLD 2, NEW 11, V1 17, V2 8, V3 17, V4 20, V5 10, V6 1, V7 29, V8 22, V9 28

The decisive axes here are coverage of the 7 ICONIC must-haves (Kandy View Point; Udawalawe NP, Ravana Falls, Nine Arches; Labookellie Tea Estate, Hanuman Temple, Ramboda Falls) and day-4 pacing around the Udawalawe safari. P1 leads: it carries all seven ICONIC, places Udawalawe alone with only an easy Little Adam's Peak on the safari day, keeps the Kandy and Nuwara Eliya days clustered and varied, schedules no restaurants as day items, and reads on-voice without brochure cliches. P3 and P2 are close behind with full ICONIC coverage and the same sane Udawalawe + Little Adam's day-4 structure (P3 slightly cleaner geographically; P2 mislabels the Ella-to-Nuwara-Eliya train ride as "Ella Rock"). The middle pack (P6, P8, P9, P5, P7) all cover the ICONIC set but stack a 3-4h Ella Rock hike with the Udawalawe safari (and sometimes Little Adam's too) on the same day, which is genuinely exhausting and the headline doability fault. The weakest plans fail on must-haves or overload: P11 crams two hikes plus a FULL-day Udawalawe safari into day 4 (physically impossible as written); P10 omits Udawalawe entirely (a missing ICONIC) while doubling up hikes; and P4 is worst, missing two ICONIC (Udawalawe and Hanuman Temple) while substituting a Horton Plains/World's End full-day padded with Hakgala and Peradeniya, over-scheduling its national-park day and ignoring the user's actual saved set.

s10_bali_chill - Bali 6d (Ubud+Canggu+Uluwatu, chill, temple day-trip)

6d - a couple - pace chill - must-haves 15 (hearts 0) - winner V5
OLD 1, NEW 2, V1 8, V2 7, V3 20, V4 24, V5 30, V6 24, V7 15, V8 12, V9 22

All eleven share the same correct three-city structure (Ubud/Canggu/Uluwatu, 2-2-2) and most lead with a light arrival, so differentiation comes from iconic must-have coverage, correct placement, and respecting the chill pace for a couple. The decisive must-have is Blue Point Beach, which is ICONIC in Uluwatu (on the Bukit Peninsula, NOT Canggu). P9 is strongest: it is the only plan that covers ALL eight distinct iconic must-haves (Tirta Empul, Tegallalang, Monkey Forest, The Lawn, Potato Head, Savaya, Padang Padang, Uluwatu/Kecak) AND places Blue Point in Uluwatu correctly, with chill, well-clustered days and clean narrative (its only ding is a 3-item departure day). P10/P2/P8/P7/P11 all cover the seven non-Blue-Point iconic items cleanly at a relaxed pace, differing mainly in whether Tanah Lot is folded into the Canggu drive vs a full day; they only miss the lower-priority Blue Point. The weak tail is clear: P5 and P6 mis-place Blue Point in Canggu (an out-of-city error since it sits on the southern peninsula), costing geographic efficiency; P4 covers everything including Blue Point but cripples the chill brief by stacking a 4am Mount Batur trek + quad biking + a river canyon on a single Day 2, the antithesis of a couple's slow trip; and P3 is weakest of all, omitting TWO iconic must-haves (Potato Head and Savaya) and substituting non-pool venues (La Brisa, Old Man's, Rock Bar, Nyang Nyang) that the user never saved, plus Old Man's reads as a surf bar pulled in as a day item. None scheduled true restaurants as day items, and the beach clubs are legitimate iconic anchors here, so the food-rail rule was not triggered.

s11_vietnam_thin - Vietnam 9d (Hanoi+Hoi An+HCMC, THIN->topup)

9d - friends - pace balanced - must-haves 0 (hearts 0) - winner V4
OLD 2 - NEW 9 - V1 27 - V2 18 - V3 3 - V4 28 - V5 12 - V6 7 - V7 18 - V8 19 - V9 22

This is a THIN case with zero hearted/iconic must-haves, so ranking turns on doability, smart subset selection without cramming, and avoiding the three concrete failure modes: full-day-trip overload (Cu Chi half-day and My Son half-day each effectively eat a half/whole day), duplicate places, and restaurants/bars scheduled as day items. P7 wins because it is the only plan that deliberately builds in a lighter mid-trip day (Day 6 typed "rest": My Son trip + a single relaxed night market), keeps Cu Chi+War Remnants as a sane standalone history day, runs light arrival/departure days, never duplicates, and schedules no food items. P4 and P1 are nearly as clean - balanced, no duplicates, no restaurants, coherent geographic clusters - P4 just slightly edges P1 on rhythm and P1 keeps a couple of three-item days that lean fuller. P5 is solid but its Day 8 crams a half-day Cu Chi trip with the War Remnants Museum and the Opera House. The weakest plans fail on concrete grounds: P2 is worst - it schedules a cafe (Egg Coffee at Giang Cafe) and a food hall (Ben Thanh Street Food Market) plus a bar street (Bui Vien) as day items, doubles up two pottery villages on one overloaded four-item Hoi An day, and front-loads a five-item history slog on Day 2. P6 and P8 both schedule a redundant second "Coconut Boat Tour" gap-fill the day after a Bay Mau basket-boat tour, and P6 additionally crams a half-day Cu Chi trip onto the departure-day flight. P3 stacks a near-duplicate "War Museum" gap-fill alongside the War Remnants Museum AND Cu Chi on one day. P9 and P10 each push a heavy three-anchor war day or a Cu Chi run onto a travel/departure day. Across all eleven, narrative quality is uniformly strong and on-voice with little brochure cliche, so it barely separates the field; doability and cram-avoidance do the real sorting.

s12_dubai_family - Dubai 4d (single-city, family+kids, luxury)

4d - a family - pace balanced - must-haves 11 (hearts 0) - winner OLD
OLD 26 - NEW 13 - V1 17 - V2 11 - V3 4 - V4 25 - V5 4 - V6 25 - V7 15 - V8 23 - V9 2

The two ICONIC must-haves are The Dubai Mall & Aquarium and Dubai Frame; the rest are saved items where smart subsetting is fine. The two structural traps are (a) pairing the full-day desert safari (afternoon pickup, returns ~10pm) or the Atlantis Aquaventure water park with extra heavy stops, and (b) burying an iconic. P6 is the clear winner: both iconics present, every saved item placed, and clean area-clustered days that give the desert safari its own day (old-Dubai morning + afternoon pickup) and Atlantis its own Palm day, with a light beach/Frame departure. P1, P3 and P10 are close behind with both iconics, sane safari days, and well-paced light arrival/departure days. By contrast, P5 and P11 collapse on doability: P5 stacks Atlantis (2-3h) AND the full desert safari into a single day, and P11 pairs the Atlantis water park with a metro hop to Deira's Gold Souk and Al Fahidi - both over-cram an anchor and force backtracking across the city. P7 dilutes Atlantis into a lobby/exterior stroll and runs two 4-item days (including a safari day padded with a Marina loop), weakening doability. P4 is the worst on selection_judgment because it omits the iconic Dubai Frame entirely, a heavy penalty despite otherwise sound pacing. P2, P8 and P9 schedule the desert safari on the departure day (returns late, then fly home), which is the only blemish on otherwise complete, well-clustered itineraries. No plan scheduled a restaurant/cafe/bar as a day item; the fountain shows and abra crossing are legitimate free attractions and were not penalized.

s13_singapore_family - Singapore 4d (Singapore City+Sentosa, family+kids)

4d - a family - pace balanced - must-haves 16 (hearts 0) - winner V2
OLD 0 - NEW 15 - V1 12 - V2 30 - V3 17 - V4 21 - V5 16 - V6 10 - V7 15 - V8 25 - V9 4

All four iconic must-haves (Jewel Changi, Gardens by the Bay, Supertree Grove, Merlion Park) are the deciding axis, and no plan schedules a restaurant/cafe as a day item, so judging turns on pace, clustering, and avoiding cram. The strongest plans, P3 and P5, hit all four iconics, keep arrival/departure days deliberately light (P3 runs 2-item bookend days; P5 marks its lone gap-fill honestly), and exercise smart omission of low-priority saved items rather than forcing all eight per city - matching the balanced, kids-of-two brief in June heat where cooled conservatories are sensibly mid-day anchors. P9 is similarly disciplined (light bookends, all iconics) but trims a touch more aggressively. P1, P7, and P8 cover all iconics with rich saved coverage but pay a doability tax for stacking two 90-minute museums (ArtScience + teamLab) onto a departure day. P4 and P10 weaken on geographic efficiency by pairing a "Sentosa Merlion" with mainland Marina Bay items in the same day, implying real back-and-forth. P6 actually omits "Gardens by the Bay" as a named iconic and over-stuffs its transition day. The clear weakest is P11: it violates the 2+2 day allocation (3 days city + 1 Sentosa), drops the iconic Merlion Park entirely and never names Gardens by the Bay, invents an entire wildlife day (Zoo, River Wonders, Night Safari, Botanic Gardens) absent from the user's pool, and crams Universal Studios with a beach, Cloud Forest and Supertree onto a single departure day - an exhausting, off-brief, must-have-missing day that no balanced family could realistically execute.

s14_egypt - Egypt 7d (Cairo+Luxor, heritage)

7d - a couple - pace balanced - must-haves 13 (hearts 0) - winner V3
OLD 1 - NEW 23 - V1 16 - V2 16 - V3 24 - V4 17 - V5 5 - V6 8 - V7 24 - V8 9 - V9 22

All 11 plans cover the 5 Cairo iconics (Pyramids, Sphinx, Egyptian Museum, Citadel, Khan el-Khalili) and the easy Luxor iconics; the discriminators are the awkward 6th Luxor iconic (Sahara Desert Safari), out-of-city errors, and departure/arrival-day overloading. The strongest plan, P3, schedules all 5 Cairo iconics plus 5/6 Luxor iconics (only the ill-fitting Sahara Safari is omitted), commits zero out-of-city errors, uses genuinely on-route gap-fills (Colossi of Memnon on the VoK return road, Corniche walk between morning Karnak and night Luxor Temple), and paces temple days with correct early/dusk timing and a light departure day. P4, P9, P6 and P2 are close behind: clean geography, alternating intensity, and smart omission of the low-value Sahara - they trade only that one odd iconic for excellent doability. P7 and P11 are the only plans (besides flawed P1) to land all 6 Luxor iconics, but pay for the Sahara by overloading - P11 crams an Edfu day-trip plus a desert safari onto the day-7 departure, and P7 stacks Karnak + a half-day 4WD safari then a VoK+Hatshepsut+Edfu mega-day. The weakest plans fail on feasibility and geography: P1 hallucinates "Giza Necropolis" as a Luxor West Bank viewpoint (a city away) and crams Edfu + Sahara onto departure day; P5 and P8 both misplace Cairo's National Museum of Egyptian Civilization as a Luxor item and P5 also front-loads VoK+Hatshepsut+a Cairo museum onto the Luxor arrival day; and P10 lands at the bottom by missing the Sahara iconic entirely while inventing a north-bound Dendera day-trip on the departure day and overloading day 6 with Karnak + Luxor Museum + a 3-hour Edfu round-trip. No plan scheduled a restaurant as a day item (Nile Dinner Cruise is a legitimate evening experience), so that rule did not separate them.

s15_peru - Peru 8d (Lima+Cusco, altitude rest, MP day-trips)

8d - a couple - pace balanced - must-haves 11 (hearts 0) - winner V5
OLD 17 - NEW 1 - V1 13 - V2 25 - V3 13 - V4 18 - V5 29 - V6 6 - V7 27 - V8 14 - V9 2

All 11 plans cover every one of the 11 hearted/iconic must-haves and correctly treat "Lima Ceviche & Food Tour" as an experience (not a penalizable restaurant day-item), so differentiation comes down to doability, duplicate-avoidance, and Rainbow Mountain restraint. The strongest plans (P4, P6) keep the brutal 5,200m Vinicunca day SOLO (or with only a light on-route stop), honor a genuinely light Cusco arrival/acclimatization day, run Pisac as a clean Sacred Valley round-trip, and carry no duplicate sites; P4 edges P6 with the most disciplined Rainbow day and zero overload. P1 is nearly as clean. The weakest plans actively hurt the traveler: P8 mislabels a Pisac DAY-TRIP day as "rest," then stacks three items (Vinicunca + Qoricocha Lagoon + San Blas workshops) onto the hardest day at altitude and crams the arrival/departure days; P10 bolts a full second major Sacred Valley site (a scenic drive plus Ollantaytambo Fortress) onto an already-full Pisac day-trip, turning it into an exhausting valley marathon. P5, P7, P2, P3 and P11 each schedule the SAME fortress twice (listing both "Sacsayhuaman Fortress" and the seeded "Saqsaywaman" entry on different days), a real duplicate that costs geographic/selection points, and several of them tack an extra stop onto the post-Vinicunca return. Narrative quality is broadly good and on-voice across the field, so it barely separates the pack; doability and pacing decide the order.

s16_kerala - Kerala 6d (Kochi+Munnar, domestic, chill, family)

6d - a family - pace chill - must-haves 6 (hearts 0) - winner V2
OLD 0 - NEW 7 - V1 7 - V2 26 - V3 4 - V4 23 - V5 14 - V6 21 - V7 21 - V8 25 - V9 17

All eleven plans correctly schedule the single iconic must-have (Eravikulam National Park) plus the two saved Kochi anchors (Chinese Fishing Nets, Kathakali) and Munnar Tea Gardens, and all sensibly omit the strenuous Meesapulimala full-day trek given the chill pace, young kid, and February timing, so differentiation comes down to doability, duplicates, and coverage of the remaining saved item, Top Station. The top tier (P4, P11, P7, P5) pairs clean geographic clustering and a properly light arrival/departure rhythm with full saved coverage including Top Station, while avoiding duplicates or restaurant-as-day-items; P4 edges ahead on the richest, best-balanced variety, and P11/P5 are the most chill-appropriate (2-3 well-spaced items per day). P2, P6, and P3 are clean and well-paced but drop the saved Top Station, a minor selection ding. The bottom is clear: P1 schedules Cherai Beach twice as two near-identical entries on Day 3, P9 stacks both "Eravikulam National Park" and its "Nilgiri Tahr Trail" duplicate on the same morning, and P8 is the weakest by a wide margin - it breaks the Kochi day-3 allocation entirely by inserting an Alleppey overnight houseboat (out of region, then driving Alleppey-to-Munnar on Day 4) and schedules a "Kerala Sadya Feast" meal as a day item, both of which the rubric explicitly penalizes.

s17_meghalaya_thin - Meghalaya 5d (Shillong+Cherrapunji+Dawki, THIN, nature)

5d - friends - pace balanced - must-haves 18 (hearts 0) - winner V1
OLD 4 - NEW 1 - V1 30 - V2 4 - V3 17 - V4 21 - V5 23 - V6 16 - V7 9 - V8 20 - V9 20

All 11 plans nail the three iconic Cherrapunji anchors (Mawsmai, Double Decker, Wei Sawdong) and both Shillong iconics (Laitlum, Don Bosco), so differentiation comes down to (a) capturing the third Dawki iconic Krang Shuri Falls without wrecking the departure day, and (b) keeping the brutal Day-4 Nongriat trek (3,500 steps) from being over-stacked. P9 wins: it schedules all eight must-haves including Krang Shuri as a sane on-route stop toward Guwahati, keeps Day 4 to the trek + Wei Sawdong + an easy gap-fill cave, varies intensity well, and writes specific, cliche-free copy. P2, P5, and P11 are tightly clustered behind it - clean pacing and good clustering, but each omits the iconic Krang Shuri (a must-have miss) in exchange for a lighter, very doable trip. The weakest plans over-stack the hardest day or cram the departure day: P8 piles Rainbow Falls Trek ON TOP of the Double Decker descent (an exhausting beyond-the-bridge extension) AND still misses Krang Shuri, while P10 does the same Rainbow Falls + Double Decker + Wei Sawdong triple-stack on one trek day and also misses the iconic. P7 turns the final day into a five-item Dawki + Shnongpdeng + far-flung Krang Shuri (~30km) marathon after the long drive, hurting doability. P4 and P6 earn high selection_judgment by capturing Krang Shuri but lose doability points for crowding the trek/departure days; overall the spread is narrow because the corpus is thin and every plan respects the city-day allocation with no out-of-city or restaurant-as-day-item errors.

s18_ladakh - Ladakh 7d (Leh single-base, altitude, day-trips)

7d - a couple - pace balanced - must-haves 7 (hearts 0) - winner V8
OLD 2 - NEW 3 - V1 23 - V2 10 - V3 18 - V4 27 - V5 4 - V6 16 - V7 19 - V8 28 - V9 15

All 11 plans cover both ICONIC must-haves (Leh Palace + Shanti Stupa, with Shanti often via the "Acclimatisation Walk" variant) and most saved items (bazaar, rafting, Khardung La, Alchi, Magnetic Hill), so differentiation comes almost entirely from doability. The strongest plan, P4, respects altitude (Khardung La pushed to day 4, day-1/2 kept low and in-town), keeps every day a sane Leh-based round-trip, never schedules the un-doable Tso Moriri, mixes types well, and ends with a genuinely light departure day. P8 and P11 are close behind: both pace acclimatization correctly and keep transitions realistic; P11 earns top rhythm marks for an explicit mid-trip rest day (day 5) but loses a little for dangling an optional Tso Moriri on the departure day. The weakest plans cram an un-doable Tso Moriri (~220km/4hr+ each way) as a single-day Leh round-trip on the arrival-adjacent or departure day: P1 puts Tso Moriri on the DEPARTURE day before an evening flight (and also stacks Khardung La day-6 + Tso Moriri day-7 back-to-back as two exhausting 5,000m+ days), and P7 strings together Khardung La (day 5), Chang La + Tso Moriri (day 6), then a cooking class far from town on the departure day — three brutal high-altitude days running with no recovery, plus the closing day items scattered across Leh, Alchi-area and Spituk. P6 routes a realistic Nubra/Pangong overnight loop but mislabels every item as city "Leh" (Diskit, Hunder, Pangong are not in Leh), tanking geographic efficiency and breaking the single-base brief. No plan scheduled a restaurant as a pure day item (the Alchi cooking class reads as an experience, correctly paired with the Alchi day-trip), and duplicate Shanti Stupa/Acclimatisation-walk pairings were mostly split across distinct days rather than stacked, so duplicate penalties were minor.

s19_korea_packed - South Korea 6d (Seoul+Busan, urban, packed)

6d - friends - pace packed - must-haves 12 (hearts 0) - winner V2
OLD 0 - NEW 24 - V1 12 - V2 29 - V3 8 - V4 20 - V5 5 - V6 13 - V7 26 - V8 18 - V9 10

All 11 plans cover the 8 iconic must-haves (4 Seoul + 4 Busan) and most also fold in the 4 Seoul saved items (Jimjilbang, K-pop Gangnam, DMZ, Hongdae), so ranking turns on doability, geographic clustering, and avoiding over-cramming. P2 leads: it sequences the heaviest Seoul day cleanly (morning DMZ to far north, afternoon jimjilbang, night Hongdae), clusters Busan day 5 west (Gamcheon+Jagalchi+BIFF) and day 6 northeast (Haedong then Haeundae before departure), keeps arrival/departure light, and reads on-voice without brochure cliches. P8 is nearly identical in quality with the same disciplined 3-item Busan day and clean DMZ/K-pop/Jimjilbang chain. P10 earns points for an explicit mid-trip "rest" day (DMZ + Hongdae) that genuinely alternates intensity. P3 and P6 are clean but under-fill for a "packed" traveler (P3's departure day is a single Gamcheon stop; P6 has several 2-item days). The weakest plans cram or backtrack: P11 routes Gangnam-to-DMZ-to-Yeouido-to-Ewha across one day (huge north-south-northwest backtracking) and loads a far-north mountain temple (Beomeosa) onto the departure day; P9 crams temple+village+market into Busan day 5 and pushes iconic Haeundae onto a travel-day evening. P5 is the clear last: it is massively over-scheduled (six items on multiple days for an already exhausting pace) and, critically, schedules restaurants as day items - notably "Korean BBQ in Gangnam" plus food-market stops framed as eating stops - which violates the food-rail rule and tanks both doability and selection judgment.

s20_goa_short - Goa 3d (Panaji+Arpora, short weekend, chill)

3d - friends - pace chill - must-haves 12 (hearts 0) - winner V9
OLD 3 - NEW 0 - V1 17 - V2 14 - V3 6 - V4 20 - V5 20 - V6 16 - V7 25 - V8 15 - V9 29

The deciding tension is iconic coverage (Dudhsagar, Basilica, Palolem on the Panaji side; Cubana, Curlies, Anjuna on the Arpora side) versus doability over a chill 3-day window with 12 must-haves that cannot all sanely fit. P3 wins by capturing all six iconics while keeping pacing honest: it pairs Dudhsagar (a full-day SE jeep-safari) with Palolem framed as an evening arrival rather than a crammed midday stop, light arrival/island days bookend it, and day 3 is a clean Anjuna-Curlies-Cubana nightlife cluster. P2 is the most geographically disciplined (tight Dudhsagar+Spice inland day, light arrival) but drops two iconics (Palolem and Curlies), which costs it the top spot. P1 also covers all six iconics but loads the arrival day with Spice and stacks Dudhsagar+Palolem (far-apart SE/south anchors) on one day, hurting doability. The large P6/P7/P9/P10/P11 family is highly doable and clean but each omits both Palolem and Curlies (two iconics) in favor of the safe Dudhsagar+Spice / Anjuna+SatMarket+Cubana template, so they cluster mid-pack. P8 swaps Cubana for Curlies but then misses both Palolem and the iconic Cubana. The weakest are P4 and P5: P4 violates the day allocation (Panaji only day 1) and misses three iconics (Dudhsagar, Palolem, Curlies) while detouring 1.5h south to Cola on arrival day; P5 is the most exhausting, cramming Dudhsagar+Palolem+a 21:30 club on day 2 and FIVE anchors including three back-to-back clubs (Sat Market, Chronicle, Cubana) on day 3, gutting doability and selection judgment despite raw coverage.

Round 2 - champion round (confirming the port config)

The 11-way winner V4 (a rhythm rule with no structural change) looked suspicious - it might have ridden a wide field thinned by weak singletons. So the 4 strongest distinct mechanisms were re-run head-to-head on the same 20 scenarios (60 fresh blind Opus verdicts): V4 (rhythm only), V7 (anchors split only), V10 (the V4+V7 union), V9 (anchors + intensity tags + rhythm + variety + clustering).

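| # | Variant | Overall | Doability | Selection | avgRank | 1st | Priority cov | Dups | Gap-fill |
|---|---------|---------|-----------|-----------|---------|-----|--------------|------|----------|
| 1 | V9 - anchors split + intensity tags + rhythm + variety + clustering | 8.11 | 8.03 | 7.99 | 2.07 | 17 | 91% | 0.15 | 0.95 |
| 2 | V10 - union: anchors split + rhythm rule | 7.93 | 7.71 | 7.96 | 2.38 | 18 | 93% | 0.05 | 0.8 |
| 3 | V7 - anchors-pool structural split only | 7.77 | 7.38 | 8.01 | 2.35 | 18 | 93% | 0.15 | 1.05 |
| 4 | V4 - rhythm rule only (no anchors split) - the 11-way winner | 7.28 | 6.79 | 6.97 | 3.20 | 7 | 86% | 0.05 | 1.3 |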

Overall score is cleanly monotonic, V9 > V10 > V7, and V4 (no anchors split) falls to last - the structural split is the load-bearing piece. Head-to-head, V10 beats V4 in 12 of 20 scenarios (V4 wins 6, with 2 ties), so rhythm + anchors are additive. On the deterministic metrics V10 is cleanest (dups 0.05, gap-fill 0.8); V9 is the most selective (saves-util 74%, i.e. smart omission) and has the best restaurant discipline (0.3).
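The avgRank and 1st columns can be recomputed from the raw verdicts. A minimal sketch, assuming each verdict is a tie-free list of variant ids ordered best to worst; the `Verdict` type and the `aggregate` function are hypothetical names, not part of the benchmark harness.

```typescript
// Hypothetical shape: one blind-judge verdict = variant ids ordered best to worst.
type Verdict = string[];

interface RankSummary {
  avgRank: number; // mean 1-based position across all verdicts
  firsts: number;  // number of verdicts in which the variant ranked first
}

function aggregate(verdicts: Verdict[]): Map<string, RankSummary> {
  const sums = new Map<string, { total: number; count: number; firsts: number }>();
  for (const verdict of verdicts) {
    verdict.forEach((variant, index) => {
      const entry = sums.get(variant) ?? { total: 0, count: 0, firsts: 0 };
      entry.total += index + 1; // convert 0-based index to 1-based rank
      entry.count += 1;
      if (index === 0) entry.firsts += 1;
      sums.set(variant, entry);
    });
  }
  const out = new Map<string, RankSummary>();
  sums.forEach((s, variant) => {
    out.set(variant, { avgRank: s.total / s.count, firsts: s.firsts });
  });
  return out;
}

// Toy data, not benchmark results.
const demo = aggregate([
  ["V9", "V10", "V7", "V4"],
  ["V10", "V9", "V7", "V4"],
]);
```

Two sanity checks follow from this shape and hold for the reported numbers: with four variants the average ranks must sum to 1 + 2 + 3 + 4 = 10 (2.07 + 2.38 + 2.35 + 3.20 = 10.00), and the firsts must sum to the verdict count (17 + 18 + 18 + 7 = 60).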

Example itineraries (recommended port vs OLD vs NEW)

V9 (the recommended port) vs OLD vs NEW on three scenarios. Watch how OLD invents/over-lists, NEW over-packs, and V9 stays doable.

s6_japan_epic - Japan 10d (Tokyo+Kyoto+Osaka, very dense, rhythm)

V9 (recommended)
Tokyo to Osaka: Temples, Markets and City Lights for Two
  1. Day 1 - Tokyo: Arriving in Tokyo - First Steps in Asakusa [Senso-ji Temple, Shibuya Crossing & Hachiko]
  2. Day 2 - Tokyo: Fish, Art and Neon - Tsukiji, teamLab, Shinjuku [Tsukiji Outer Market, teamLab Borderless, Golden Gai — Shinjuku]
  3. Day 3 - Tokyo: Akihabara, Godzilla and a Robot Dinner Show [Akihabara Electric Town, Nezu Shrine, Godzilla Head in Shinjuku, Robot Restaurant Show — Shinjuku]
  4. Day 4 - Tokyo: Gardens, Views and a Quiet Morning in Yanaka [Hamarikyu Gardens, Tokyo Metropolitan Government Building, Kyu-Furukawa Gardens]
  5. Day 5 - Kyoto: Shinkansen to Kyoto - Gion and Yasaka Pagoda [Shinkansen Tokyo to Kyoto, Gion District — Geisha Quarter, Yasaka Pagoda]
  6. Day 6 - Kyoto: Thousand Torii Gates and the Golden Pavilion [Fushimi Inari Taisha, Kinkaku-ji (Golden Pavilion), Kimono Rental & Tea Ceremony — Kyoto]
  7. Day 7 - Kyoto: Nishiki Market, Kiyomizu-dera and a Temple Hillside [Nishiki Market, Kiyomizu-dera, Sanjūsangen-dō Temple]
  8. Day 8 - Osaka: Kyoto to Osaka - Castle District and Dotonbori at Night [Shinkansen Kyoto to Osaka, Osaka Castle, Osaka Dotonbori Street Food]
  9. Day 9 - Osaka: Universal Studios Japan - Full Day in the Park [Universal Studios Japan]
  10. Day 10 - Osaka: Shinsekai and Sky Views Before Heading Home [Shinsekai, Umeda Sky Building]
NEW
Tokyo, Kyoto & Osaka: Ten Days of Temples, Markets and City Nights
  1. Day 1 - Tokyo: Landing in Tokyo - Asakusa and Senso-ji [Senso-ji Temple, Nezu Shrine, Kyu-Furukawa Gardens]
  2. Day 2 - Tokyo: Tsukiji, Hamarikyu and Shibuya [Tsukiji Outer Market, Hamarikyu Gardens, Shibuya Crossing & Hachiko, Golden Gai — Shinjuku]
  3. Day 3 - Tokyo: Shinjuku by Day and Night - Skyline, Godzilla and an Immersive Show [Tokyo Metropolitan Government Building, Godzilla Head in Shinjuku, teamLab Borderless, Robot Restaurant Show — Shinjuku]
  4. Day 4 - Tokyo: Akihabara, a Ring-Making Workshop and Showa Memorial Park [Akihabara Electric Town, Ring-Making Experience at Enzo Valentino, Showa Memorial Park, Ukima Park]
  5. Day 5 - Kyoto: Shinkansen to Kyoto - Fushimi Inari at Dusk [Tokyo to Kyoto Shinkansen, Fushimi Inari Taisha, Gion District — Geisha Quarter]
  6. Day 6 - Kyoto: Nishiki Market, Kiyomizu-dera and the Higashiyama Lanes [Nishiki Market, Ninenzaka Street, Kiyomizu-dera, Sanjusangen-do Temple]
  7. Day 7 - Kyoto: Golden Pavilion, a Tea Ceremony and Uji [Kinkaku-ji (Golden Pavilion), Kimono Rental & Tea Ceremony — Kyoto, Uji, Byodo-in]
  8. Day 8 - Osaka: Kyoto to Osaka - Osaka Castle and Shinsekai [Kyoto to Osaka Shinkansen, Osaka Castle, Shinsekai]
  9. Day 9 - Osaka: Universal Studios Japan - Full Day in the Park [Universal Studios Japan, Tombori River Cruise]
  10. Day 10 - Osaka: Dotonbori, Umeda Sky Building and Departure [Osaka Dotonbori Street Food, Umeda Sky Building, Osaka Amazing Pass]
OLD
Tokyo, Kyoto & Osaka: A Couple's Journey Through Temples, Markets and City Lights
  1. Day 1 - Tokyo: Arrival in Tokyo — Shinjuku first impressions [Narita Express or Keikyu Line to central Tokyo, Shinjuku Gyoen National Garden, Golden Gai — Shinjuku]
  2. Day 2 - Tokyo: Senso-ji, Tsukiji and teamLab [Senso-ji Temple, Tsukiji Outer Market, teamLab Borderless, Shibuya Crossing & Hachiko]
  3. Day 3 - Tokyo: Harajuku, Meiji Shrine and Akihabara after dark [Meiji Shrine & Yoyogi Park, Omotesando Hills, Nezu Shrine, Akihabara Electric Town]
  4. Day 4 - Tokyo: Shibuya Sky, TeamLab Planets and Hamarikyu Gardens [Shibuya Sky, TeamLab Planets, Hamarikyu Gardens, Kyu-Furukawa Gardens]
  5. Day 5 - Kyoto: Shinkansen to Kyoto — Fushimi Inari at dusk [Tokaido Shinkansen — Tokyo to Kyoto, Fushimi Inari Taisha, Fushimi Inari Shrine]
  6. Day 6 - Kyoto: Arashiyama bamboo, Nishiki Market and Gion evening [Arashiyama Bamboo Forest, Tenryu-ji Temple, Nishiki Market, Gion District]
  7. Day 7 - Kyoto: Kinkaku-ji, Kiyomizu-dera and a kimono tea ceremony [Kinkaku-ji (Golden Pavilion), Ryoan-ji Temple, Kiyomizu-dera, Kimono Rental & Tea Ceremony — Kyoto]
  8. Day 8 - Kyoto: Yasaka Pagoda, Sanjusangen-do and Byodo-in day trip to Uji [Yasaka Pagoda, Sanjusangen-do Temple, Byodo-in, Uji]
  9. Day 9 - Osaka: Train to Osaka — Dotonbori, Shinsekai and Umeda [Shinkansen or JR Line — Kyoto to Osaka, Osaka Castle, Dotonbori, Shinsekai, Umeda Sky Building]
  10. Day 10 - Osaka: Osaka Aquarium, Abeno Harukas and evening departure [Osaka Aquarium Kaiyukan, Tempozan Ferris Wheel, Abeno Harukas, Nankai Rapid to Kansai International Airport]

s1_thailand - Thailand 7d (dense+hearted, 3 cities)

V9 (recommended)
Bangkok to Krabi: Temples, Beaches, and Night Markets for Two
  1. Day 1 - Bangkok: Arrival and the Old City [Grand Palace, Wat Pho, Chao Phraya River Express, Khao San Road]
  2. Day 2 - Bangkok: Art, Temples, and a River Market [Wat Saket, Jim Thompson House, Bangkok Art and Culture Centre (BACC), IconSiam]
  3. Day 3 - Bangkok: Markets, Chinatown, and a Skyline Finish [Chatuchak Weekend Market, Chinatown (Yaowarat), King Power Mahanakhon, Asiatique The Riverfront]
  4. Day 4 - Phuket: Flight to Phuket, then Patong and Karon [Bangkok to Phuket, Karon Beach, Patong Beach]
  5. Day 5 - Phuket: Big Buddha, Freedom Beach, and Phuket Old Town [Big Buddha Phuket, Freedom Beach, Phuket Old Town, Promthep Cape]
  6. Day 6 - Krabi: Ferry to Krabi, Tiger Cave, and the Night Market [Phuket to Krabi, Tiger Cave, Wat Tham Suea, Krabi Weekend Night Market]
  7. Day 7 - Krabi: Emerald Pool, Ao Nang, and the Journey Home [Emerald Blue Pool, Ao Nang Beach, Cafe 8 98]
NEW
Golden Temples, Turquoise Shores & Night Markets - A Thai Couple's Journey
  1. Day 1 - Bangkok: Arriving in Bangkok - Temples, River & Khao San [Chao Phraya River Express, Wat Pho, Wat Arun, Khao San Road]
  2. Day 2 - Bangkok: Grand Palace, Silk & Street Art in Bangkok [Grand Palace, Jim Thompson House, Bangkok Art and Culture Centre (BACC), Siam Paragon]
  3. Day 3 - Bangkok: Canal Villages, Canopy Views & Chinatown Nights [Khlong Bang Luang & Baan Silapin (The Artist's House), Chatuchak Weekend Market, King Power Mahanakhon, Chinatown (Yaowarat)]
  4. Day 4 - Phuket: Flying into Phuket - Old Town & Beachside Arrival [Bangkok to Phuket (flight), Phuket Old Town, Promthep Cape]
  5. Day 5 - Phuket: Big Buddha, Freedom Beach & Phi Phi Island [Big Buddha Phuket, Freedom Beach, Phi Phi Island, Karon Beach]
  6. Day 6 - Krabi: Ferry to Krabi - Emerald Pools & Night Market [Phuket to Krabi (ferry), Ao Nang Beach, Emerald Blue Pool, Krabi Weekend Night Market]
  7. Day 7 - Krabi: Tiger Cave Climb, James Bond Island & Farewell [Tiger Cave, Wat Tham Suea, James Bond Island]
OLD
Bangkok Streets to Andaman Shores - A Couple's Thailand
  1. Day 1 - Bangkok: Arrival and Chao Phraya Evening [Chao Phraya River Express, Asiatique The Riverfront, Tichuca Rooftop Bar]
  2. Day 2 - Bangkok: Grand Temples and Chinatown Night [Wat Saket, Wat Pho, Jim Thompson House, Chinatown (Yaowarat)]
  3. Day 3 - Bangkok: Grand Palace, Arts, and Flight South [Grand Palace, Bangkok Art and Culture Centre (BACC), Suvarnabhumi Airport to Phuket International Airport]
  4. Day 4 - Phuket: Big Buddha, Old Town, and Patong Night [Big Buddha Phuket, Phuket Old Town, Karon Beach, Patong Beach]
  5. Day 5 - Phuket: Phi Phi Island Day Trip and Promthep Sunset [Phi Phi Island, Promthep Cape, Yona Beach Club]
  6. Day 6 - Krabi: Ferry to Krabi and Railay Beach Afternoon [Phuket to Krabi by Ferry, Railay Beach, Wat Tham Suea]
  7. Day 7 - Krabi: James Bond Island and Departure [James Bond Island, Ao Nang Beach, Krabi Airport to Mumbai]

s12_dubai_family - Dubai 4d (single-city, family+kids, luxury)

V9 (recommended)
Dubai with Kids: Skyline, Sand and Theme Parks
  1. Day 1 - Dubai: Arrival and Downtown Icons [The Dubai Mall & Aquarium, Burj Khalifa At the Top]
  2. Day 2 - Dubai: Desert Dunes and BBQ Under the Stars [Dubai Frame, Desert Safari with BBQ Dinner]
  3. Day 3 - Dubai: Palm, Beach and Old Dubai Bazaars [Atlantis The Palm, Palm Jumeirah Boardwalk, Gold Souk Deira, Al Fahidi Historical Neighbourhood]
  4. Day 4 - Dubai: Beachfront Morning and JBR Send-off [JBR The Walk and The Beach, La Mer Beachfront, Dubai Marina Walk]
NEW
Dubai with Kids: Skylines, Sand Dunes, and Theme Parks
  1. Day 1 - Dubai: Arrival, the Mall, and the World's Tallest View [The Dubai Mall & Aquarium, Burj Khalifa At the Top, Dubai Fountain Show (gap-fill)]
  2. Day 2 - Dubai: Old Dubai, Gold Souk, and a Waterfront Evening [Al Fahidi Historical Neighbourhood, Gold Souk Deira, Dubai Frame, JBR The Walk and The Beach]
  3. Day 3 - Dubai: Atlantis Water Park and Palm Boardwalk [Atlantis The Palm, Palm Jumeirah Boardwalk, Dubai Marina Walk]
  4. Day 4 - Dubai: Desert Dunes, BBQ Under the Stars, and Heading Home [La Mer Beachfront, Desert Safari with BBQ Dinner]
OLD
Dubai with Kids: Dunes, Skyscrapers and Theme Parks
  1. Day 1 - Dubai: Arrival and Downtown Icons [Burj Khalifa At the Top, The Dubai Mall & Aquarium, Dubai Fountain Show]
  2. Day 2 - Dubai: Desert Safari and Dune Camp [Dubai Frame, Desert Safari with BBQ Dinner]
  3. Day 3 - Dubai: Atlantis and Palm Jumeirah [Atlantis The Palm, Palm Jumeirah Boardwalk, JBR The Walk and The Beach]
  4. Day 4 - Dubai: Old Dubai, Gold Souk and Departure [Al Fahidi Historical Neighbourhood, Gold Souk Deira]

Recommendation - port in three layers

  1. Soft-priority graft as the new baseline RULES (prompts-v2.ts) - replaces NEW's hard "every save MUST appear". The single biggest win across both rounds; fixes over-packing plus OLD's restaurant-as-activity + invention bugs.
  2. Anchors-pool structural split (candidate-pool.ts) - the load-bearing ingredient. Split each city's pool into ANCHORS (hearts+iconics, never trimmed) vs IF-TIME-PERMITS. Every champion leader has it; the one without it (V4) came last.
  3. Rhythm + variety + clustering prose rules (prompts-v2.ts) - layered on the split, these took V9 to the top (8.11). Additive, not redundant.
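The anchors-pool split in layer 2 can be sketched as follows. This is an illustration under stated assumptions, not the contents of candidate-pool.ts: the `PoolPlace` fields, the `splitPool` function, and the `maxOptional` cap are all hypothetical.

```typescript
// Hypothetical place record; field names are illustrative only.
interface PoolPlace {
  name: string;
  city: string;
  hearted: boolean;
  iconic: boolean;
}

interface CityPool {
  anchors: PoolPlace[];       // hearts + iconics: never trimmed
  ifTimePermits: PoolPlace[]; // everything else: may be trimmed to fit
}

function splitPool(places: PoolPlace[], maxOptional: number): Map<string, CityPool> {
  const byCity = new Map<string, CityPool>();
  for (const place of places) {
    const pool = byCity.get(place.city) ?? { anchors: [], ifTimePermits: [] };
    if (place.hearted || place.iconic) {
      pool.anchors.push(place);
    } else {
      pool.ifTimePermits.push(place);
    }
    byCity.set(place.city, pool);
  }
  // Only the optional tier is capped (an assumed trimming rule); anchors pass through untouched.
  byCity.forEach((pool) => {
    pool.ifTimePermits = pool.ifTimePermits.slice(0, maxOptional);
  });
  return byCity;
}

// Toy input; the iconic flags here are illustrative.
const pools = splitPool(
  [
    { name: "Dubai Frame", city: "Dubai", hearted: false, iconic: true },
    { name: "La Mer Beachfront", city: "Dubai", hearted: false, iconic: false },
    { name: "Dubai Marina Walk", city: "Dubai", hearted: false, iconic: false },
  ],
  1,
);
```

The point of keeping two separate lists, rather than one list with a priority field, is that any trimming step can only ever touch `ifTimePermits`.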
Benchmarked winner = V9 (8.11); V10 (anchors + rhythm) is a deterministically cleaner near-tie (7.93) and a simpler port. Adopt V9's rule prose, but treat its intensity tags ([FULL-DAY]/[HALF-DAY]/[QUICK]) as a deferred enhancement: the benchmark derived them from a title regex (a proxy), and before shipping to production they should come from real place duration/category data. Net port = V9-minus-intensity-tags (anchors split + soft-priority + rhythm + variety + clustering), which captures the judge-winning rules at V10's clean deterministic profile.
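To see why a title regex is only a proxy for intensity, consider a minimal sketch of such a tagger. The keyword patterns and the names `tagIntensity` and `tagged` are hypothetical; the benchmark's actual regex is not shown in this report.

```typescript
type Intensity = "FULL-DAY" | "HALF-DAY" | "QUICK";

// Hypothetical keyword patterns, chosen for illustration only.
const FULL_DAY = /\b(safari|day[- ]trip|trek|theme park|universal studios)\b/i;
const HALF_DAY = /\b(museum|hike|tour|water park|palace)\b/i;

function tagIntensity(title: string): Intensity {
  if (FULL_DAY.test(title)) return "FULL-DAY";
  if (HALF_DAY.test(title)) return "HALF-DAY";
  return "QUICK"; // default when no keyword matches
}

// Append the tag to the title in the [TAG] form used by the V9 prompt.
function tagged(title: string): string {
  return `${title} [${tagIntensity(title)}]`;
}
```

With these patterns, "Desert Safari with BBQ Dinner" is tagged FULL-DAY because its title contains a keyword, while "Atlantis The Palm" falls through to QUICK even though the judges treated the Atlantis water park as a heavy anchor. A title alone cannot carry that information, which is the case for sourcing tags from duration/category data instead.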
The benchmark uses subagent-simulated generation, so read it as a directional relative ranking, not a measure of absolute quality. Production also runs repairCityCoverage as a floor; the benchmark omits it so that prompts are tested on their own merit. Generated with Claude Code.