Yahoo Italia Ricerca nel Web

Risultati di ricerca

  1. 1 gen 2023 · We present Second Thought, a new learning paradigm that enables language models (LMs) to re-align with human values. By modeling the chain-of-edits between value-unaligned and value-aligned text,...

    • arXiv:2301.00355 [cs.CL]
  2. Trained with SECOND THOUGHTS, LMs can not only re-align their generation with human values, even when the context has already been poisoned, but also show the chain of editing steps for ease of interpretability and to facilitate further edits (§4.5).

  3. 1 gen 2023 · We present Second Thought, a new learning paradigm that enables language models (LMs) to re-align with human values. By modeling the chain-of-edits between value-unaligned and value-aligned text, with LM fine-tuning and additional refinement through reinforcement learning, Second Thought not only achieves superior performance in ...

    • Ruibo Liu
  4. We present SECOND THOUGHTS, a new learning paradigm that enables language models (LMs) to re-align with human values.

  5. 31 ott 2022 · By modeling the chain-of-edits between value-unaligned and value-aligned text, with LM fine-tuning and additional refinement through reinforcement learning, Second Thoughts not only achieves superior performance in three value alignment benchmark datasets but also shows strong human-value transfer learning ability in few-shot scenarios.

  6. 1 gen 2023 · We present Second Thought, a new learning paradigm that enables language models (LMs) to re-align with human values. By modeling the chain-of-edits between value-unaligned and value-aligned...

  7. We present Second Thoughts, a new learning paradigm that enables language models (LMs) to re-align with human values. By modeling the chain-of-edits between value-unaligned and value-aligned text, with LM fine-tuning and additional refinement through reinforcement learning, Second Thoughts not only achieves superior performance in three value ...