Fundamental & Policy 2026: AI Safety Technical Course
An 8-week guided course on technical AI safety, following the BlueDot Impact curriculum.
About Our Reading Group|關於我們的讀書會
The NTU AI Safety Reading Group aims to build a student community in Taiwan interested in AI Safety & Alignment. As an active member of the global AI safety community, we regularly participate in international conferences across Singapore, the United States, Hong Kong, and beyond. We maintain close ties with the international AI safety network and frequently host visiting scholars for knowledge-sharing sessions. Our course materials were developed by BlueDot Impact together with AI safety researchers from OpenAI and the University of Cambridge, and provide a comprehensive introduction to AI safety concepts and challenges.
This year, drawing inspiration from how AI safety organizations at other universities structure their programs, we decided to launch a dedicated Fundamental + Policy track — combining foundational AI safety concepts with governance and policy discussions. Our goal is to welcome people from diverse backgrounds to explore different dimensions of AI safety together, broaden awareness of global AI safety developments, and help grow the conversation within Taiwan’s own community.
Course Link: Technical AI Safety
台大 AI Safety 讀書會旨在台灣建立一個對 AI Safety & Alignment 感興趣的學生社群。作為全球 AI Safety 社群的活躍成員,我們經常參與在新加坡、美國、香港等地舉辦的國際會議,並與國際 AI Safety 網絡保持緊密聯繫,定期邀請國際學者進行知識分享。我們使用由 BlueDot Impact 與來自 OpenAI 和劍橋大學的 AI Safety 學者共同開發的課程內容,希望為大家提供全面的 AI Safety 概念和挑戰的介紹。

今年我們參考國外其他大學 AI Safety 社群的分組與課綱,希望藉由 fundamental 與 policy 的結合,吸引更多不同 background 的人一起討論 AI Safety 的各個面向,除了讓更多人了解國外 AI Safety 的發展,也希望促進台灣社群對此議題的意識。
What We’ll Cover|課程內容
Over 8 weeks, we'll explore core AI safety concepts together through readings and group discussion. In each session, we work through the week's materials as a group, unpacking the key ideas, challenging each other's understanding, and connecting concepts to real-world implications.
Topics we’ll read and discuss:
- Introduction to AI Safety — why it matters and how the field is framed
- AI alignment — why building safe AI is harder than it sounds
- Training safer models Part I — reinforcement learning from human feedback (RLHF)
- Training safer models Part II — scalable oversight techniques
- Detecting danger — model evaluations and red teaming
- Understanding AI Part I — introduction to interpretability
- Understanding AI Part II — interpretability in practice
- Minimising harm Part I — AI control and machine unlearning
- Minimising harm Part II — AI control via chain of thought
我們為期八週的讀書會將透過閱讀與小組討論來探索 AI Safety 的核心概念——每週大家帶著對當週材料的理解一同聚會,共同梳理關鍵想法、相互挑戰彼此的理解,並將所學連結至現實世界的意涵。
我們會一起閱讀並討論的主題包括:
- AI Safety 導論——為何重要、這個領域如何被定義
- AI 對齊——為什麼建造安全的 AI 比想像中更困難
- 訓練更安全的模型(上)——來自人類反饋的強化學習(RLHF)
- 訓練更安全的模型(下)——可擴展監督技術
- 偵測危險——模型評估與 Red Teaming
- 理解 AI(上)——可解釋性導論
- 理解 AI(下)——可解釋性的實際應用
- 最小化危害(上)——AI 控制與機器遺忘(Unlearning)
- 最小化危害(下)——透過思維鏈(Chain of Thought)實現 AI 控制
Who Should Join?|誰適合參加?
This reading group is for you if you…
- Want to start exploring AI safety topics and their broader implications
- Are interested in AI governance or policy and looking for people to discuss with
- Want to deepen your understanding of AI-related issues and current developments
- Care about the future impact of AI on society
No prior AI knowledge is required; just bring your curiosity!
這個讀書會適合:
- 想開始關注 AI Safety 相關議題的人
- 對 AI Governance 或 Policy 有興趣、想找人一起討論的人
- 希望加深對 AI 相關知識與最新議題了解的人
- 任何關心 AI 未來影響的人
不需要任何 AI 相關知識,只要帶著好奇心來就可以!
Format and Schedule|形式和時間
- Weekly 2-hour sessions
- Bilingual discussions (English and Mandarin)
- Mix of presentations and group discussions
- Opportunities to engage in research projects after the reading group
- Free dinner
- 每週 2 小時的聚會
- 中英雙語討論
- 結合導讀和小組討論
- 讀書會後有機會參與研究計畫
- 免費晚餐
Why Join Us?|為什麼要參加?
- Build Essential Knowledge: Understand one of the most important challenges facing humanity
- Join a Global Movement: Connect with the international AI safety community
- Career Development: Explore opportunities in AI safety research and development
- Make an Impact: Contribute to ensuring AI benefits humanity
- 建立重要知識:了解人類面臨的最重要挑戰之一
- 加入全球運動:與國際 AI safety 社群連結
- 職業發展:探索 AI safety 研究和開發的機會
- 產生影響:為確保 AI 造福人類貢獻一份心力
Real-World Impact|實際影響
This curriculum has a track record of real-world results. According to BlueDot Impact's data on participants who completed similar programs:
- Many participants successfully transitioned into AI safety work
- Participants gained clearer understanding of AI safety challenges
- Some initiated their own AI safety research projects
這個課程已有實際成效的紀錄。根據 BlueDot Impact 的數據,完成類似課程的參與者中:
- 許多參與者成功轉入 AI safety 工作
- 參與者對 AI safety 挑戰有了更清晰的認識
- 一些人展開了自己的 AI Safety 研究計畫
How to Join|如何參加
- Venue: TBD
- First Session: 2026/3/16 (Monday) 19:00–21:00
- Free of charge with food provided
- Limited spots available
- Both English and Mandarin welcome
Sign up now to be part of this important initiative in understanding and shaping the future of AI!
立即報名,成為理解和塑造 AI 未來的重要行動的一份子!
Contact Us|聯絡我們
Email: ntuaisafety@gmail.com
Discord: https://discord.gg/CUz4tWpggV