ICWSM 2026 — Half-Day Tutorial

Continual Learning for Online Behavioral Analytics

Lecture-style  |  4 Hours
Yasas Senarath  ·  Marcos Zampieri  ·  Hemant Purohit
Information Sciences and Technology (IST) Department, George Mason University, Fairfax, Virginia

Overview

Consider a financial sentiment analysis system deployed to monitor real-time market trends from social media. As user-generated content evolves, such systems often degrade when they encounter novel language, because they cannot adapt to shifts in the underlying data distribution. In this high-stakes setting, the deployed system may fail to detect current trends, potentially causing tangible harm. The scenario shows why traditional, static training pipelines are inadequate: they ignore the dynamic, non-stationary nature of social media data, which hurts applications such as market trend analysis, harmful content detection for public safety, brand monitoring, and scam detection.

This tutorial presents Continual Learning (CL) as the solution: it enables systems to adapt to evolving data distributions without the prohibitive cost of complete retraining. Unlike general CL tutorials, this session focuses on online behavioral analytics and addresses domain-specific challenges such as streaming social media data and resource-constrained environments. We will provide a comprehensive background on CL, summarize current approaches, and detail techniques tailored for dynamic, limited-resource settings. Finally, we discuss open challenges and prospective research avenues.

Target Audience

This tutorial is designed for researchers, engineers, and practitioners who design and develop systems for online behavioral analytics. The audience is expected to have basic knowledge of probability, text classification, and deep learning. Prior familiarity with fine-tuning transformer-based models is preferred. No prior knowledge of Continual Learning is required.

Why This Tutorial Is Relevant to ICWSM

Behavioral Analytics Systems: The primary application addressed is social media and web-based behavioral analytics systems, a core interest of the ICWSM community. Among the techniques used in such systems, text classifiers play a significant role in understanding and summarizing online content.

Continual Learning: Deployed systems need regular updates to keep up with shifts in the observed data distribution over time. This need is especially pronounced in web and social media analytics, where user behavior changes regularly with real-world events and the shared interests of the public.

Schedule

Half-day tutorial — total duration 4 hours.

Duration  Session
30 min    Part I: Introduction and Background
          Motivation & applications within web/social media analytics
45 min    Part II: Experience-Replay-based Methods
          Approaches used for exemplar selection; techniques for utilizing replay data
15 min    Break
45 min    Part III: Regularization-based Methods
          Parameter regularization & function regularization
30 min    Part IV: Beyond Traditional Settings
          Adapter-based learning & reservoir sampling
15 min    Break
40 min    Part V: Open Challenges and Future Directions
          LLMs as teachers; LLM-specific CL challenges
20 min    Conclusion & Q&A

Tutorial Content

Part I  ·  30 minutes

Introduction and Background

The first part covers the background and motivation for studying CL in online behavioral analytics systems. We present the general definition of the CL problem and its variants, then focus on key challenges such as catastrophic forgetting and the scalability of large language models (LLMs) for online behavior detection.

  • Motivation for Continual Learning in social media contexts
  • Definition of CL and problem variants (task-incremental, class-incremental, etc.)
  • The stability-plasticity dilemma and catastrophic forgetting
  • Applications: financial market analysis, crisis informatics (disaster response), cybersecurity (scam detection)

Part II  ·  45 minutes

Experience-Replay-based Methods

Replay is the most prevalent approach to CL across problems and domains. It improves performance without requiring access to all of the data used in earlier training. The method has two components: a replay buffer and model adaptation.

  • Exemplar Selection for Replay: Task-aware random selection; clustering-based approaches for diverse sample selection; influence-based selection
  • Replay-based Model Adaptation: Data concatenation strategies; handling class imbalance; regularization through loss constraints that compare the outputs of the current and previous model versions on replay data
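
The two components above can be illustrated with a minimal sketch. It assumes the simplest choices in each case: per-class random exemplar selection for the buffer, and plain data concatenation for adaptation. The names `ReplayBuffer`, `add_task`, and `mix_with_replay` are hypothetical and chosen for illustration only; clustering-based or influence-based selection would replace the random trimming step.

```python
import random
from collections import defaultdict


class ReplayBuffer:
    """Keeps up to `per_class` randomly chosen exemplars for each label."""

    def __init__(self, per_class, seed=0):
        self.per_class = per_class
        self.rng = random.Random(seed)
        self.store = defaultdict(list)  # label -> list of stored texts

    def add_task(self, examples):
        """Add (text, label) pairs from a finished task, then trim each class."""
        for text, label in examples:
            self.store[label].append(text)
        for label in list(self.store):
            texts = self.store[label]
            if len(texts) > self.per_class:
                # Random selection; a simplification of the strategies above.
                self.store[label] = self.rng.sample(texts, self.per_class)

    def sample(self, k):
        """Draw up to k stored (text, label) pairs uniformly at random."""
        pool = [(t, y) for y, ts in self.store.items() for t in ts]
        return self.rng.sample(pool, min(k, len(pool)))


def mix_with_replay(new_batch, buffer, replay_k):
    """Data concatenation: current-task batch followed by replayed exemplars."""
    return list(new_batch) + buffer.sample(replay_k)
```

Capping the buffer per class is one simple way to limit class imbalance in the replayed data; the mixed batch would then be passed to an ordinary fine-tuning step.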

Part III  ·  45 minutes

Regularization-based Methods

Parameter regularization stores the model parameters deemed important for prior tasks in a knowledge store, then uses that stored knowledge as a regularizer while learning the current task. The aim is to prevent the overwriting of weights critical to past tasks.

  • Parameter Regularization: Importance-weighted parameter preservation; elastic weight consolidation and related techniques
  • Feature (Function) Regularization: Constraining the feature space during new task learning to prevent divergence from features learned in prior tasks
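
As a rough illustration of importance-weighted parameter preservation, the sketch below computes a quadratic penalty of the kind used in elastic weight consolidation: (strength / 2) * sum_i importance_i * (theta_i - anchor_i)^2. It assumes parameters are flat lists of floats and that importance is estimated as the mean squared gradient per parameter (a diagonal Fisher approximation). All function names are hypothetical, and a real system would compute these quantities inside a deep learning framework.

```python
def diagonal_importance(per_example_grads):
    """Mean squared gradient per parameter (diagonal Fisher approximation)."""
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    return [sum(g[i] ** 2 for g in per_example_grads) / n for i in range(dim)]


def ewc_penalty(params, anchor_params, importance, strength):
    """Quadratic penalty on drift from the parameters learned on prior tasks."""
    return 0.5 * strength * sum(
        f * (p - a) ** 2
        for p, a, f in zip(params, anchor_params, importance)
    )


def regularized_loss(task_loss, params, anchor_params, importance, strength=1.0):
    """Current-task loss plus the importance-weighted penalty."""
    return task_loss + ewc_penalty(params, anchor_params, importance, strength)
```

Parameters with high importance are pulled strongly toward their stored values, while parameters with low importance remain free to change for the new task.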

Part IV  ·  30 minutes

Beyond Traditional Settings

The standard CL approach often assumes sufficient resources and independent data samples. However, online behavioral analytics presents unique challenges: data arrives in high-velocity, bursty streams, and user behaviors are inherently sequential and context-dependent.

  • Resource-Constrained Environments: Knowledge Distillation from a continually updated teacher to lightweight student models; Parameter-Efficient Fine-Tuning (PEFT) with adapters
  • Streaming Data Dynamics: Reservoir Sampling and its derivatives for minority-class representation without explicit task boundaries
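
For readers unfamiliar with reservoir sampling, the sketch below shows the basic form (often called Algorithm R): it maintains a fixed-size uniform random sample of a stream without knowing the stream length or any task boundaries. The class name `Reservoir` and its methods are hypothetical. Note that this plain version does not balance classes; the derivatives mentioned above modify the replacement rule to protect minority-class examples.

```python
import random


class Reservoir:
    """Fixed-size uniform random sample over a stream of unknown length."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # number of stream items offered so far
        self.rng = random.Random(seed)

    def offer(self, item):
        """Consider one stream item for inclusion in the reservoir."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)  # fill phase: keep everything
            return
        # Replace a random slot with probability capacity / seen.
        j = self.rng.randrange(self.seen)
        if j < self.capacity:
            self.items[j] = item


reservoir = Reservoir(capacity=5, seed=42)
for post_id in range(1000):  # stand-in for a stream of incoming posts
    reservoir.offer(post_id)
```

Memory use stays constant no matter how long the stream runs, which suits the resource-constrained, boundary-free setting described above.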

Part V  ·  40 minutes

Open Challenges and Future Directions

Continual fine-tuning of LLMs remains an open problem, as techniques such as regularization do not scale well with large numbers of parameters. This section surveys emerging research directions.

  • Continual Learning with LLMs: Using large language models as teachers to train smaller, efficient models suited for deployment
  • From Internal to External Knowledge: External knowledge integration for continual text classification; Retrieval-Augmented Generation (RAG) for dynamic data access; tool-based learning

Presenters

Yasas Senarath
Ph.D. Candidate — George Mason University

Yasas Senarath is a Ph.D. candidate in Information Technology at George Mason University, specializing in Social Computing, Continual Learning, and Natural Language Understanding. His Ph.D. research focuses on robust continual learning frameworks for online behavioral analytics. His recent work, Knowledge-guided Continual Learning for Behavioral Analytics Systems (2025), directly tackles the stability-plasticity dilemma.

Marcos Zampieri
Assistant Professor — George Mason University

Marcos Zampieri is an Assistant Professor at George Mason University. His main research interests are in Natural Language Processing (NLP). He regularly publishes in top-tier venues such as AAAI, ACL, CIKM, EMNLP, IJCAI, and NAACL, with work spanning offensive language and mental health modeling on social media. Marcos delivered the tutorial Countering Hateful and Offensive Speech Online: Open Challenges at EMNLP 2024, and has served as chair of workshops including SemEval, VarDial, and WMT. He was lead organizer of OffensEval-2019 and OffensEval-2020, two of the offensive language identification shared tasks with the widest participation to date.

Hemant Purohit
Associate Professor — George Mason University

Hemant Purohit is an Associate Professor in the Information Sciences & Technology Department at George Mason University. His broad research interests lie in Social Computing, Natural Language Understanding, and Human-AI Collaboration, with a focus on creating robust applications for high-risk, dynamic environments in public services and nonprofits. He applies this research to managing natural crises (e.g., hurricanes), societal crises (e.g., hate and violence), and human crises (e.g., cyber attacks, scams). Hemant delivered an ICWSM 2013 tutorial on Crisis Mapping, Citizen Sensing, and Social Media Analytics.