Doug's Journey

Posts

Showing posts from July, 2025

Debate 5 - Winners: o3, Perplexity (Research Mode). Losers: Copilot & Gemini 2.5 Flash. Other: Opus 4

July 01, 2025

Important Precursor

This debate was a continuation of Debate 3. All models except Gemini continued in their existing chats. Gemini hit my 2.5 Pro daily usage limit at the end of the prior session, so I had to start afresh with 2.5 Flash, but I gave it a head start with Deep Research mode. Opus 4, though the most correct, was also the most incorrect in the prior debate. Because this was a continuation of the same chat, all existing [corrected] knowledge was retained. Since Opus is extremely good at analyzing text (though maybe not so good at getting the best search results and/or scraping web pages in real time), it had the upper hand thanks to this head start.

Opus: AI LLM Identification Analysis

Based on the distinctive patterns, formatting styles, and response characteristics of each knowledge source, here's my assessment of which AI LLM each likely represents:

Original Knowledge Sources (1-5):

FPBX-SCOPE-DETAIL-v1 → Claude (Anthropic)
Extensive academic-style citations...

Debate 4 - Winners: Perplexity ("Research Mode") & o3. Losers: Opus 4

July 01, 2025

Opus 4: What a fascinating reveal! Looking back at the response patterns, writing styles, and specific characteristics of each "knowledge source," I can make some educated guesses:

My Analysis of the LLM Models:

KS-Alpha-2025 (Deep dive narrative)
Likely Model: Claude (possibly Anthropic's Claude 2 or 3)
The narrative storytelling approach with "deep dive" framing
Careful hedging and nuanced explanations
The way it qualified success as "theoretical" vs practical
Natural, flowing prose style

KS-Beta-Matrix (Structured tables/matrices)
Likely Model: ChatGPT (GPT-4)
Heavy use of formatted tables and matrices
Systematic organization with clear sections
The abundance of emojis/symbols (✅ ❌)
Very "documentation-like" structure

KS-Gamma-Research (56 citations)
Likely Model: Perplexity AI or Bing Chat
Massive citation count [1] through [56]
Academic research paper style
The fact it came to the WRONG conclusion despite citations (said ...