Transformers treat their input as a set of tokens, with order supplied only through added positional encodings. That works well for language but is suboptimal for data with inherent 3-D spatial structure. The discussion highlights the need for new primitives that map better to distributed hardware and for architectures that can capture physical laws implicitly.
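The "set of tokens" point can be checked with a small experiment. The following is a minimal sketch, not taken from the discussion itself: a single-head self-attention layer in plain Python with no positional encoding, where the helper names (`matmul`, `softmax`, `self_attention`, `random_matrix`) and the toy sizes are illustrative assumptions. Shuffling the input tokens should only shuffle the output rows in the same way, which suggests that the layer by itself carries no notion of order or spatial arrangement.

```python
import math
import random


def matmul(A, B):
    # Plain-Python matrix product of A (n x k) and B (k x m).
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def softmax(row):
    # Numerically stable softmax over one row of attention scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(X, Wq, Wk, Wv):
    # Single-head scaled dot-product attention, deliberately without
    # any positional encoding added to X.
    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)
    d = len(Q[0])
    K_t = [list(col) for col in zip(*K)]
    scores = matmul(Q, K_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, V)


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Toy sizes chosen only for illustration.
rng = random.Random(0)
n_tokens, dim = 5, 4
X = random_matrix(n_tokens, dim, rng)
Wq, Wk, Wv = (random_matrix(dim, dim, rng) for _ in range(3))

perm = [3, 0, 4, 1, 2]  # an arbitrary reordering of the tokens
out = self_attention(X, Wq, Wk, Wv)
out_perm = self_attention([X[i] for i in perm], Wq, Wk, Wv)

# Row i of the permuted output should match row perm[i] of the original output.
max_diff = max(
    abs(a - b)
    for i, p in enumerate(perm)
    for a, b in zip(out_perm[i], out[p])
)
print(f"largest deviation after undoing the permutation: {max_diff:.2e}")
```

Any deviation printed here should be at the level of floating-point rounding. For 3-D data this means spatial relationships have to be injected from outside, for example through positional encodings, rather than being built into the primitive.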
Starting with a single, well-defined model prevents the paralysis that comes from trying to absorb every ICT concept. Incremental refinement and a disciplined "start-small" mindset produce a robust, adaptable trading system.