HAPPY + GPT steered via Reinforcement Learning to explore the Tg–Egb design space. Watch how generated structures evolve toward the target — then browse top candidates at any checkpoint.
GPT-based generative model trained on HAPPY sequences, steered via RL to target desired property ranges.
Specify desired Tg ≥ 600 K and Egb ≥ 4.5 eV simultaneously as RL reward targets.
Autoregressive generation of HAPPY token sequences, constrained to valid polymer chemistries.
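One common way to keep autoregressive sampling inside a valid grammar is to mask out disallowed next tokens before sampling. The sketch below illustrates that idea only. The vocabulary, the `ALLOWED_NEXT` table, and the function names are hypothetical stand-ins, not the real HAPPY token set or the model's actual constraint mechanism.

```python
import math
import random

# Hypothetical toy vocabulary and transition table (NOT the real HAPPY grammar).
VOCAB = ["<bos>", "A", "B", "<eos>"]
ALLOWED_NEXT = {0: [1, 2], 1: [2, 3], 2: [1, 3], 3: []}


def masked_sample(logits, allowed, rng):
    """Sample one index from softmax(logits) restricted to the `allowed` indices."""
    if not allowed:
        raise ValueError("no valid next token")
    top = max(logits[i] for i in allowed)  # subtract the max for numerical stability
    weights = [math.exp(logits[i] - top) for i in allowed]
    return rng.choices(allowed, weights=weights, k=1)[0]


def generate(logit_fn, rng, max_len=16):
    """Generate a token sequence, never emitting a token the table disallows."""
    seq = [0]  # start from <bos>
    while len(seq) < max_len and VOCAB[seq[-1]] != "<eos>":
        allowed = ALLOWED_NEXT[seq[-1]]
        seq.append(masked_sample(logit_fn(seq), allowed, rng))
    return [VOCAB[i] for i in seq]


if __name__ == "__main__":
    # A uniform "model" stands in for the GPT; any callable returning logits works.
    print(generate(lambda seq: [0.0] * len(VOCAB), random.Random(0)))
```

Masking before sampling means invalid sequences are never produced, rather than being generated and filtered afterwards.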
Reward: Tg, Egb, SAScore, Diversity (Tnms), Novelty — balancing property targeting and chemical diversity.
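As a rough illustration of how these five terms could be combined, the sketch below uses a weighted sum with soft thresholds at Tg ≥ 600 K and Egb ≥ 4.5 eV. The weights, the linear ramp widths, and the function names are assumptions made for illustration. The page does not specify the actual reward formula.

```python
def threshold_score(value, target, scale):
    """Return 1.0 at or above `target`, decaying linearly to 0.0 at `scale` below it."""
    if value >= target:
        return 1.0
    return max(0.0, 1.0 - (target - value) / scale)


def composite_reward(tg_k, egb_ev, sa_score, diversity, novelty, weights=None):
    """Weighted sum of property, synthesizability, diversity, and novelty terms.

    `diversity` and `novelty` are assumed to be pre-scaled to [0, 1].
    The default weights are illustrative placeholders, not published values.
    """
    w = weights or {"tg": 0.3, "egb": 0.3, "sa": 0.2, "div": 0.1, "nov": 0.1}
    terms = {
        "tg": threshold_score(tg_k, 600.0, 300.0),   # target: Tg >= 600 K
        "egb": threshold_score(egb_ev, 4.5, 2.0),    # target: Egb >= 4.5 eV
        # SAScore runs from 1 (easy to synthesize) to 10 (hard); map to [0, 1].
        "sa": min(1.0, max(0.0, (10.0 - sa_score) / 9.0)),
        "div": diversity,
        "nov": novelty,
    }
    return sum(w[k] * terms[k] for k in w)


if __name__ == "__main__":
    print(composite_reward(tg_k=580.0, egb_ev=4.7, sa_score=3.2, diversity=0.6, novelty=1.0))
```

A soft threshold gives the policy a gradient toward the target region even when early samples fall well short of it, which a hard pass/fail reward would not.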
Simultaneously targeting Tg ≥ 600 K and Egb ≥ 4.5 eV via RL. Each point is the mean over the 512 molecules generated at that step. Color gradient = training progress (light → dark). ★ = target.
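The trajectory points can be thought of as per-step averages over each generated batch. The sketch below shows one way to compute them, assuming predictions are available as `(Tg, Egb)` pairs grouped by training step. The data layout and function names are hypothetical.

```python
from statistics import fmean


def step_means(samples_by_step):
    """Map {step: [(tg, egb), ...]} to [(step, mean_tg, mean_egb)] in step order."""
    points = []
    for step in sorted(samples_by_step):
        batch = samples_by_step[step]  # e.g. 512 (Tg, Egb) predictions per step
        mean_tg = fmean([tg for tg, _ in batch])
        mean_egb = fmean([egb for _, egb in batch])
        points.append((step, mean_tg, mean_egb))
    return points


def hits_target(mean_tg, mean_egb, tg_min=600.0, egb_min=4.5):
    """True when a trajectory point lies in the target region marked by the star."""
    return mean_tg >= tg_min and mean_egb >= egb_min
```

Because each point is a batch mean, a point inside the target region does not imply that every molecule in that batch meets both thresholds.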
Chemistry-steered RL constrained to imide scaffold structures.