Overview
Consider a financial sentiment analysis system deployed to monitor real-time market trends on social media. As user-generated content evolves, such systems often degrade: they encounter novel language and fail to adapt to the shifting data distribution. In this high-stakes setting, the deployed system may miss current trends, which can result in tangible harm. The scenario shows why traditional, static training pipelines fall short. They ignore the non-stationary nature of social media data, and this hurts applications such as market trend analysis, harmful content detection for public safety, brand monitoring, and scam detection.
This tutorial presents Continual Learning (CL) as the solution: it lets systems adapt to evolving data distributions without the prohibitive cost of complete retraining. Unlike general CL tutorials, this session focuses on online behavioral analytics and addresses domain-specific challenges such as streaming social media data and resource-constrained environments. We will provide a comprehensive background on CL, summarize current approaches, and detail techniques tailored to dynamic, limited-resource settings. Finally, we will discuss open challenges and prospective research avenues.
Target Audience
This tutorial is designed for researchers, engineers, and practitioners who design and develop systems for online behavioral analytics. Attendees are expected to have basic knowledge of probability, text classification, and deep learning. Prior familiarity with fine-tuning transformer-based models is preferred. No prior knowledge of Continual Learning is required.
Why This Tutorial Is Relevant to ICWSM
Behavioral Analytics Systems: The primary application addressed is social media and web-based behavioral analytics systems, a core interest of the ICWSM community. Among the techniques used in such systems, text classifiers play a significant role in understanding and summarizing online content.
Continual Learning: Deployed systems need regular updates to keep up with shifts in the observed data distribution over time. This need is especially pronounced in web and social media analytics, where user behavior changes regularly with real-world events and the shared interests of the public.
Schedule
Half-day tutorial — total duration 4 hours.
| Session | Topics |
|---|---|
| Part I: Introduction and Background | Motivation and applications within web/social media analytics |
| Part II: Experience-Replay-based Methods | Approaches used for exemplar selection; techniques for utilizing replay data |
| Break | |
| Part III: Regularization-based Methods | Parameter regularization and function regularization |
| Part IV: Beyond Traditional Settings | CL of LLMs; knowledge-guided CL |
| Break | |
| Part V: Open Challenges and Future Directions | Social Good Continual Learning; CL for the Era of LLMs; Multimodal CL (MMCL) |
| Conclusion and Q&A | |
Presenters
Yasas Senarath recently completed his PhD in the School of Computing at George Mason University, specializing in Social Computing, Continual Learning, and Natural Language Understanding. His PhD research focused on robust continual learning frameworks for online behavioral analytics. His recent work, Knowledge-guided Continual Learning for Behavioral Analytics Systems (2025), directly tackles the stability-plasticity dilemma.
Marcos Zampieri is an Assistant Professor at George Mason University. His main research interests are in Natural Language Processing (NLP). He regularly publishes in top-tier venues such as AAAI, ACL, CIKM, EMNLP, IJCAI, and NAACL, with work spanning offensive language and mental health modeling on social media. Marcos delivered the tutorial Countering Hateful and Offensive Speech Online: Open Challenges at EMNLP 2024, and has served as chair of workshops including SemEval, VarDial, and WMT. He was lead organizer of OffensEval-2019 and OffensEval-2020, two of the offensive language identification shared tasks with the widest participation to date.
Hemant Purohit is an Associate Professor in the Information Sciences & Technology Department at George Mason University. His broad research interests lie in Social Computing, Natural Language Understanding, and Human-AI Collaboration, with a focus on creating robust applications for high-risk, dynamic environments in public services and nonprofits. He applies this research to managing natural crises (e.g., hurricanes), societal crises (e.g., hate and violence), and human crises (e.g., cyber attacks, scams). Hemant delivered an ICWSM 2013 tutorial on Crisis Mapping, Citizen Sensing, and Social Media Analytics.