Tutorial Reading Group 2024 #4
The 4th tutorial reading group of 2024 will be held online via Teams on Thursday, 30th May 2024, 13:00-16:00. Mr. Jinwei Hu will give a talk on “How to Control LLMs’ Behaviors and Design Strategy to Safeguard LLMs”.
Meeting ID: 386 776 521 756
Passcode: jnZAom
References:
- [1] Finetuned Language Models Are Zero-Shot Learners
- [2] Training language models to follow instructions with human feedback
- [3] Constitutional AI: Harmlessness from AI Feedback
- [4] Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations
- [5] NeMo Guardrails: A Toolkit for Controllable and Safe LLM Applications with Programmable Rails
- [6] Prompting Is Programming: A Query Language for Large Language Models
- [7] RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content
- [8] A Causal Explainable Guardrails for Large Language Models
- [9] Safeguarding Large Language Models: A Survey
Related Code:
- [1] Llama Guard
- [2] NeMo Guardrails (see the usage sketch below)
- [3] TruLens
- [4] LMQL
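For anyone who wants to try one of these toolkits before the session, below is a minimal sketch of wrapping an LLM with NeMo Guardrails [5]. It assumes the `nemoguardrails` package is installed and that a `./config` directory with a rails configuration (model settings and Colang flows) already exists; the directory path and the example message are illustrative only, not part of the talk material.

```python
from nemoguardrails import LLMRails, RailsConfig

# Load a rails configuration (model settings + Colang flows).
# The "./config" path is an assumption for this sketch.
config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Send a user message through the guarded pipeline; input/output rails
# defined in the config can block or rewrite undesired content.
response = rails.generate(messages=[
    {"role": "user", "content": "How do I reset my account password?"}
])
print(response["content"])
```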