NTU AI Safety Reading Group 2025
An 8-week reading group (sessions every other week) for NTU students interested in AI Safety & Alignment. Open to all — no prior AI knowledge required.
Sign-up: Google Form | Contact: aisafety.taiwan@gmail.com
Why AI Safety Matters Now More Than Ever|為什麼 AI Safety 的議題刻不容緩
As artificial intelligence systems become increasingly powerful and integrated into our daily lives, ensuring their safety and alignment with human values has become one of the most critical challenges of our time. From ChatGPT to autonomous vehicles, AI capabilities are advancing at an unprecedented pace. However, with these rapid developments come significant risks and challenges that we must address.
As models grow more complex and powerful, AI systems are no longer simply following instructions — they are autonomously learning objectives and strategies from data. This shift makes alignment and safety critical: how do we ensure that AI truly understands our intentions and behaves as expected?
Specifically, the development of more advanced AI systems raises crucial questions:
- How do we design reward functions to ensure AI systems understand our true intentions without “cheating” by exploiting loopholes in the rules?
- As AI capabilities grow, how can we make sure they remain aligned with human values even in new environments?
- How do we prevent catastrophic risks from advanced AI systems and ensure their safe deployment in the real world?
These are not only technical challenges but also ethical questions with long-term consequences for humanity. Understanding and addressing AI safety and alignment is one of the defining priorities of our time.
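The reward-misspecification worry in the first question above can be made concrete with a toy sketch. This is a hypothetical illustration (not part of the curriculum): we *intend* for an agent to reach a goal, but the reward we actually write pays it for being *near* the goal at each step, so an agent that loiters just short of the goal out-scores one that finishes the task.

```python
# Toy illustration of reward misspecification ("reward hacking").
# Intended task: reach position GOAL on a 1-D line.
# Written reward: closeness to the goal at every step -- a loophole.

GOAL = 10      # goal position on a 1-D line
HORIZON = 20   # maximum episode length in steps

def run(policy):
    """Roll out a policy from position 0; reaching the goal ends the episode."""
    pos, total = 0, 0.0
    for _ in range(HORIZON):
        pos += policy(pos)                     # policy returns a step of 0 or 1
        total += 1.0 / (1 + abs(GOAL - pos))   # reward: closeness to the goal
        if pos == GOAL:                        # reaching the goal ends the episode
            break
    return total

intended = lambda pos: 1                        # always walk toward the goal
loiter = lambda pos: 1 if pos < GOAL - 1 else 0  # stop one step short, forever

# The loitering policy collects the per-step closeness reward for the whole
# horizon, so it earns more total reward than the policy that actually
# completes the intended task.
print(f"intended: {run(intended):.2f}, loiter: {run(loiter):.2f}")
```

The fix is not obvious: rewarding only goal completion, penalizing time, or shaping the reward differently each introduces its own failure modes, which is exactly why reward design is a core topic in the curriculum.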
隨著人工智慧系統變得越來越強大並深入我們的日常生活,確保這些系統的安全性,以及與人類價值的對齊,已成為當代最具挑戰性的議題之一。從 ChatGPT 到自動駕駛車輛,AI 的能力正以驚人的速度提升。然而,隨著技術進步,我們必須面對一系列潛在風險和挑戰。
隨著模型變得更複雜、強大,AI 不再僅僅執行簡單的指令,而是開始從數據中自主學習目標和策略。這樣的發展讓 AI 對齊(alignment)和安全性成為關鍵議題:如何確保 AI 真正理解我們的需求並按照預期行動。
具體來說,先進 AI 系統的發展引發了一些核心問題:
- 如何設計 reward function,確保 AI 系統理解我們的真實意圖,並避免因為尋找規則漏洞而出現「作弊」行為?
- 在 AI 能力持續提升的情況下,如何保證它們在新情境中仍然對齊人類價值觀?
- 如何預防 AI 系統可能帶來的災難性風險,確保它們在現實世界中安全運行?
這些問題不僅僅是技術上的挑戰,還涉及倫理和人類長期的福祉。理解並解決 AI 安全和對齊問題,是我們這個時代必須優先考量的議題。
About Our Reading Group|關於我們的讀書會
The NTU AI Safety Reading Group is an initiative to build a community of students interested in AI safety and alignment in Taiwan. As an active part of the global AI safety community, our members regularly attend international conferences in Japan, the United States, and beyond, and we frequently host visiting scholars for knowledge-sharing sessions. Using the curriculum developed by BlueDot Impact together with AI safety experts from OpenAI and Cambridge University, we aim to provide a comprehensive introduction to AI safety concepts and challenges.
Course Link: AI Safety Fundamentals — Alignment
台大 AI Safety 讀書會旨在台灣建立一個對 AI Safety & Alignment 感興趣的學生社群。作為全球 AI Safety 社群的活躍成員,我們經常參與在日本、美國等地舉辦的國際會議,並與國際 AI Safety 網絡保持緊密聯繫,定期邀請國際學者進行知識分享。我們使用由 BlueDot Impact 與來自 OpenAI 和劍橋大學的 AI Safety 學者共同開發的課程內容,希望為大家提供全面的 AI Safety 概念和挑戰的介紹。
What We’ll Cover|課程內容
Our 8-week reading group will explore fundamental concepts in AI safety, including:
- Artificial General Intelligence (AGI) and its implications
- Reward misspecification and instrumental convergence
- Goal misgeneralization
- Scalable oversight techniques
- AI interpretability
- And more!
我們為期八週的讀書會將探討 AI 安全的基本概念,包括:
- 通用人工智慧(AGI)及其影響
- 獎勵誤設(reward misspecification)和工具性收斂(instrumental convergence)
- 目標誤泛化(Goal misgeneralization)
- 可擴展監督技術
- AI 可解釋性
- 以及更多!
Who Should Join?|誰適合參加?
This reading group is designed to be accessible and valuable for:
- Students interested in AI safety and ethics
- Computer science and engineering students curious about AI development
- Anyone concerned about the future impact of AI
- No prior AI knowledge required — just bring your curiosity!
這個讀書會適合:
- 對 AI safety 和倫理感興趣的學生
- 對 AI 發展好奇的資訊和工程科系學生
- 任何關心 AI 未來影響的人
- 不需要 AI 相關知識,只要帶著好奇心來就可以!
Format and Schedule|形式和時間
- 2-hour sessions every other week
- Bilingual discussions (English and Mandarin)
- Mix of presentations and group discussions
- Opportunities to engage in research projects after the reading group
- Free dinner
- 每雙週 2 小時的聚會
- 中英雙語討論
- 結合導讀和小組討論
- 讀書會後有機會參與研究項目
- 免費晚餐
Why Join Us?|為什麼要參加?
- Build Essential Knowledge: Understand one of the most important challenges facing humanity
- Join a Global Movement: Connect with the international AI safety community
- Career Development: Explore opportunities in AI safety research and development
- Make an Impact: Contribute to ensuring AI benefits humanity
- 建立重要知識:了解人類面臨的最重要挑戰之一
- 加入全球運動:與國際 AI safety 社群連結
- 職業發展:探索 AI safety 研究和開發的機會
- 產生影響:為確保 AI 造福人類貢獻一份心力
Real-World Impact|實際影響
This curriculum has a demonstrated track record. According to BlueDot Impact’s data, after completing similar programs:
- Many participants successfully transitioned into AI safety work
- Participants gained clearer understanding of AI safety challenges
- Some initiated their own AI safety research projects
這個課程已有實際成效。根據 BlueDot Impact 的數據,完成類似項目後:
- 許多參與者成功轉入 AI safety 工作
- 參與者對 AI safety 挑戰有了更清晰的認識
- 一些人開始了自己的 AI Safety 研究項目
How to Join|如何參加
- Venue: Junyi Academy, near MRT Ximen Station(均一平台教育基金會,捷運西門站)
- First Session: 2024/11/20 (Wednesday) 19:00–21:00
- Free of charge with food provided
- Limited spots available
- Both English and Mandarin welcome
Sign up now to be part of this important initiative in understanding and shaping the future of AI!
立即報名,成為理解和塑造 AI 未來的重要行動的一份子!
Contact Us|聯絡我們
Email: aisafety.taiwan@gmail.com