Research

Research from DeepAuto.ai — 72 papers across 10 venues

ICLR 2026 · ML

LS-Merge: Merging Language Models in Latent Space

Bedionita Soro, Aoxuan Silvia Zhang, Bruno Andreis, Jaehyeong Jo, Song Chong, Sung Ju Hwang

  • Merges LLMs in latent space rather than weight space, enabling cross-architecture model combination
  • Two-stage compression yields stronger performance than traditional weight-averaging
  • Opens the door to composable AI systems — mixing specialist models of different architectures into unified, more capable agents
Paper
ICLR 2026 · ML

Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs

Yumin Choi, Dongki Kim, Jinheon Baek, Sung Ju Hwang

  • Proposes MPO, the first framework to jointly optimize textual and non-textual prompts (images, video, molecules) for multimodal LLMs via Bayesian-guided candidate selection
  • Outperforms leading text-only prompt optimization methods across diverse modalities while reducing search cost
  • Unlocks automatic prompt engineering for multimodal agents — systems that self-optimize their own instructions across vision, language, and scientific domains
Paper
ICLR 2026 · ML

Multi-View Encoders for Performance Prediction in LLM-Based Agentic Workflows

Patara Trirat, Wonyong Jeong, Sung Ju Hwang

  • Introduces Agentic Predictor, a lightweight model that predicts LLM agent workflow performance via multi-view encoding of code, prompts, and interaction graphs
  • Achieves superior predictive accuracy across 3 domains, eliminating costly trial-and-error agent configuration search
  • A key step toward self-optimizing agentic systems — agents that can predict which workflow configurations will succeed before costly execution
Paper
ICLR 2026 · CV

Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Model

Sangwon Jang, Taekyung Ki, Jaehyeong Jo, Jaehong Yoon, Soo Ye Kim, Zhe Lin, Sung Ju Hwang

  • Proposes Frame Guidance, a training-free method for frame-level video control (keyframes, style, sketches, depth) via latent-space optimization in diffusion models
  • Achieves high-quality controlled video generation with dramatically reduced memory usage — no model fine-tuning required
  • Broadens the scope of controllable generative AI — frame-precise video manipulation without retraining enables new creative and analytical applications
Paper
ICLR 2026 · ML

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Minju Seo, Jinheon Baek, Seongyun Lee, Sung Ju Hwang

  • Presents PaperCoder, a multi-agent LLM framework that converts ML papers into functional code repositories via planning, analysis, and generation stages (see the sketch below)
  • Surpasses strong baselines on PaperBench with positive evaluations from original paper authors
  • A milestone for AI-driven scientific reproduction — multi-agent collaboration can now bridge the gap between published research and runnable implementations
Paper
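
The three-stage pipeline can be pictured as chained LLM calls. A minimal sketch, assuming a hypothetical call_llm helper in place of a concrete chat-completion client; the prompts are illustrative, not PaperCoder's actual prompts.

```python
# Minimal sketch of a paper-to-code pipeline in the spirit of PaperCoder.
# `call_llm` is a hypothetical helper standing in for any chat-completion client;
# the prompts are illustrative, not the authors' actual prompts.

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call (e.g., an OpenAI- or vLLM-backed client)."""
    raise NotImplementedError

def paper_to_repo(paper_text: str) -> dict[str, str]:
    # Stage 1: planning -- derive a file-by-file repository plan from the paper.
    plan = call_llm(f"Read this paper and propose a file-by-file repo plan:\n{paper_text}")

    # Stage 2: analysis -- extract concrete implementation details (shapes, losses, hyperparameters).
    analysis = call_llm(f"Given the plan below, list concrete implementation details.\nPlan:\n{plan}\nPaper:\n{paper_text}")

    # Stage 3: generation -- emit code for each planned file, conditioned on plan + analysis.
    repo: dict[str, str] = {}
    for filename in [line.strip() for line in plan.splitlines() if line.strip().endswith(".py")]:
        repo[filename] = call_llm(
            f"Write {filename} for this repo.\nPlan:\n{plan}\nDetails:\n{analysis}"
        )
    return repo
```
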
arXiv 2026 · CV

Self-Refining Video Sampling

Sangwon Jang, Taekyung Ki, Jaehyeong Jo, Saining Xie, Jaehong Yoon, Sung Ju Hwang

  • Introduces self-refining video sampling using pre-trained generators as denoising autoencoders for iterative inference-time refinement without external verifiers
  • Achieves >70% human preference over default samplers with improved motion coherence via uncertainty-aware selective refinement
  • Advances inference-time compute for generative AI — iterative self-refinement at generation time yields higher-quality outputs without additional training
Paper
NeurIPS 2025 · NLP · Spotlight

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Minki Kang, Jongwon Jeong, Seanie Lee, Jaewoong Cho, Sung Ju Hwang

  • Proposes Agent Distillation, which transfers full tool-using agent behavior (retrieval + code) from large LLMs into small models (0.5B–3B) via first-thought prefix prompting (see the sketch below)
  • Small models match next-tier larger models trained with standard CoT distillation across 8 reasoning tasks
  • A breakthrough for compact agentic AI — distilling full tool-use capabilities into sub-3B models makes autonomous agents viable on edge and resource-limited hardware
Paper
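
A loose sketch of how first-thought prefix prompting might be used to collect teacher trajectories for distillation; this follows the summary above rather than the authors' exact procedure, and teacher_generate plus the tool names are hypothetical stand-ins.

```python
# Toy sketch of collecting a tool-use trajectory with a "first-thought prefix",
# loosely following the summary above (not the authors' exact procedure).
# `teacher_generate` is a hypothetical stand-in for a large-LLM call.

def teacher_generate(prompt: str, prefix: str = "") -> str:
    """Placeholder for a teacher-LLM call that continues generation from `prefix`."""
    raise NotImplementedError

def collect_trajectory(question: str) -> tuple[str, str]:
    # 1) Elicit a short chain of thought from the teacher and keep only its first thought.
    cot = teacher_generate(f"Think step by step about: {question}")
    first_thought = cot.split("\n")[0]

    # 2) Re-prompt the teacher in agent mode (retrieval + code tools available),
    #    seeding generation with that first thought as a prefix.
    agent_prompt = f"You can call search() and run_python(). Solve: {question}\n"
    trajectory = teacher_generate(agent_prompt, prefix=f"Thought: {first_thought}\n")

    # 3) The (question, trajectory) pair becomes a training example for the small student.
    return question, trajectory
```
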
NeurIPS 2025 · ML

Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction

Jeffrey Willette, Heejun Lee, Sung Ju Hwang

  • Introduces Delta Attention, a distributional-shift correction for sparse attention applicable to any sparse mechanism (sliding window, sink tokens, etc.)
  • Recovers 88% of full attention accuracy at 98.5% sparsity — 32x speedup over Flash Attention 2 on 1M-token prefills
  • Makes accurate million-token inference practical — correcting sparse attention errors enables reliable long-context processing at a fraction of full-attention cost
Paper
NeurIPS 2025 · ML

Continuous Diffusion Model for Language Modeling

Jaehyeong Jo, Sung Ju Hwang

  • Proposes a continuous diffusion model for discrete language operating on the statistical manifold of categorical distributions with simulation-free training
  • Outperforms existing discrete diffusion models and approaches autoregressive performance with non-autoregressive parallel generation
  • Opens a new paradigm for non-autoregressive language generation — parallel decoding via continuous diffusion could dramatically reduce agent response latency
Paper
NeurIPS 2025 · AI4Sci

Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model

Dongki Kim, Wonbin Lee, Sung Ju Hwang

  • Presents Mol-LLaMA, a large molecular language model integrating complementary molecular encoders for general-purpose molecular reasoning
  • Demonstrates broad molecular comprehension and explainable reasoning across chemistry and biology tasks
  • A step toward general-purpose molecular reasoning AI — bridging structural chemistry and natural language enables new paradigms in drug discovery and materials science
Paper
NeurIPS 2025 · NLP

System Prompt Optimization with Meta-Learning

Yumin Choi, Jinheon Baek, Sung Ju Hwang

  • Introduces bilevel system prompt optimization with a meta-learning framework that jointly optimizes system prompts across multiple datasets and user prompts (see the sketch below)
  • Optimized system prompts generalize across 14 unseen datasets spanning 5 domains — enabling rapid adaptation with fewer optimization steps
  • A foundational technique for self-improving AI agents — meta-learned system prompts generalize across tasks, enabling agents to auto-tune their own instructions
Paper
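
A minimal sketch of the meta-objective, assuming a hypothetical evaluate helper that runs the LLM on one example and returns a task score: a candidate system prompt is rated by its average performance across datasets and user prompts, and the best candidate is kept.

```python
# Minimal sketch of a meta-objective for system prompt selection: score each candidate
# system prompt by its average performance across several datasets and user prompts.
# `evaluate` is a hypothetical stand-in for an LLM call plus a task metric.
import statistics

def evaluate(system_prompt: str, user_prompt: str, example: dict) -> float:
    """Placeholder: run the LLM with (system_prompt, user_prompt) on `example` and score it."""
    raise NotImplementedError

def meta_score(system_prompt: str, datasets: list[list[dict]], user_prompts: list[str]) -> float:
    # The meta-objective averages over datasets and user prompts, which is what makes
    # the optimized system prompt transfer to unseen tasks.
    scores = [
        evaluate(system_prompt, up, ex)
        for dataset in datasets
        for ex in dataset
        for up in user_prompts
    ]
    return statistics.mean(scores)

def select_system_prompt(candidates: list[str], datasets, user_prompts) -> str:
    return max(candidates, key=lambda sp: meta_score(sp, datasets, user_prompts))
```
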
NeurIPS 2025 · ML

FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA

Seanie Lee, Sangwoo Park, Dong Bok Lee, Dominik Wagner, Haebin Seong, Tobias Bocklet, Juho Lee, Sung Ju Hwang

  • Proposes FedSVD, which mitigates noise amplification in DP-SGD + LoRA federated fine-tuning via an SVD-based reparameterization that yields orthonormal gradient bounds
  • Consistently outperforms baselines across privacy budgets in both private and non-private federated regimes
  • Advances privacy-preserving collaborative AI — organizations can jointly fine-tune LLM agents across data silos with formal differential privacy guarantees
Paper
NeurIPS 2025 · ML

Cost-Sensitive Freeze-thaw Bayesian Optimization for Efficient Hyperparameter Tuning

Dong Bok Lee, Aoxuan Silvia Zhang, Byungjoo Kim, Junhyeon Park, Steven Adriaensen, Juho Lee, Sung Ju Hwang, Hae Beom Lee

  • Introduces cost-sensitive freeze-thaw Bayesian optimization with user-defined utility functions balancing cost vs. performance and automatic stopping (see the sketch below)
  • Outperforms all prior freeze-thaw BO and transfer-BO baselines on established multi-fidelity HPO benchmarks
  • Makes budget-aware AutoML practical — practitioners can express cost-performance trade-offs directly, enabling efficient model optimization under real compute constraints
Paper
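
A toy illustration of a user-defined cost-performance utility and the stopping rule it implies; this is a simplified reading of the summary above, not the paper's actual acquisition function.

```python
# Toy cost-aware utility for HPO: trade validation score against accumulated training
# cost, and continue a partially trained configuration only while the predicted gain
# pays for the extra compute. A simplified illustration, not the paper's acquisition.

def utility(score: float, cost_seconds: float, lam: float = 1e-4) -> float:
    # User-defined trade-off: larger `lam` penalizes expensive configurations more.
    return score - lam * cost_seconds

def should_stop(expected_score_gain: float, extra_cost_seconds: float, lam: float = 1e-4) -> bool:
    # Automatic stopping: stop once the expected utility gain of continuing is non-positive.
    return expected_score_gain - lam * extra_cost_seconds <= 0.0
```
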
EMNLP 2025 · NLP

Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Simon A. Aytes, Jinheon Baek, Sung Ju Hwang

  • Introduces Sketch-of-Thought (SoT), a prompting framework with three cognitive-inspired reasoning paradigms (Conceptual Chaining, Chunked Symbolism, Expert Lexicons) selected dynamically by a lightweight routing model (see the sketch below)
  • Achieves up to 84% token reduction with minimal accuracy loss across 18 reasoning benchmarks spanning multiple domains, languages, and modalities
  • Demonstrates that cognitive-inspired reasoning can dramatically cut LLM inference costs — a practical path toward affordable agentic AI at scale
Paper
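
A minimal sketch of the routing idea, with a crude keyword heuristic standing in for the lightweight routing model; the paradigm instructions are illustrative, not the paper's prompts.

```python
# Minimal sketch of SoT-style routing: pick one of three reasoning paradigms and
# prepend its instruction to the query. The router here is a crude keyword heuristic
# standing in for the trained lightweight routing model; instructions are illustrative.

PARADIGM_INSTRUCTIONS = {
    "conceptual_chaining": "Answer by chaining the key concepts in a few short steps.",
    "chunked_symbolism": "Answer with compact symbolic/numeric working, grouped in chunks.",
    "expert_lexicons": "Answer tersely, using the shorthand a domain expert would use.",
}

def route_paradigm(question: str) -> str:
    """Placeholder for the lightweight routing model."""
    if any(ch.isdigit() for ch in question):
        return "chunked_symbolism"
    if any(term in question.lower() for term in ("diagnos", "dosage", "statute")):
        return "expert_lexicons"
    return "conceptual_chaining"

def build_prompt(question: str) -> str:
    paradigm = route_paradigm(question)
    return (
        f"{PARADIGM_INSTRUCTIONS[paradigm]}\n"
        f"Question: {question}\n"
        "Sketch your reasoning briefly, then give the final answer."
    )
```
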
EMNLP 2025 · NLP

Efficient Real-time Refinement of Language Model Text Generation

Joonho Ko, Jinheon Baek, Sung Ju Hwang

  • Proposes Streaming-VR, a streaming verification and refinement approach that performs on-the-fly factual correction of LLM tokens as they are generated rather than post-hoc (see the sketch below)
  • Catches errors early — once incorrect tokens appear, subsequent tokens are more likely wrong; real-time intervention improves both factual accuracy and generation efficiency
  • Shifts LLM factuality from post-hoc to real-time correction — catching errors as they stream reduces hallucination propagation in agentic pipelines
Paper
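
A toy sketch of the streaming verify-and-refine loop; generate_chunk, is_factual, and refine_chunk are hypothetical stand-ins for the base LLM, the verifier, and the refinement step.

```python
# Toy sketch of streaming verification and refinement: check each newly generated
# chunk as it streams and repair it before later tokens can build on the error.
# `generate_chunk`, `is_factual`, and `refine_chunk` are hypothetical stand-ins.

def generate_chunk(prompt: str, prefix: str) -> str:
    raise NotImplementedError  # next ~N tokens from the base LLM, continuing `prefix`

def is_factual(chunk: str) -> bool:
    raise NotImplementedError  # lightweight verifier over the streamed chunk

def refine_chunk(prompt: str, prefix: str, bad_chunk: str) -> str:
    raise NotImplementedError  # regenerate or correct the flagged chunk

def stream_with_refinement(prompt: str, max_chunks: int = 32) -> str:
    output = ""
    for _ in range(max_chunks):
        chunk = generate_chunk(prompt, output)
        if not chunk:
            break
        if not is_factual(chunk):
            # Correct in flight so subsequent tokens condition on the fixed text.
            chunk = refine_chunk(prompt, output, chunk)
        output += chunk
    return output
```
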
EMNLP 2025 · NLP

Database-Augmented Query Representation for Information Retrieval

Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong C. Park

  • Presents DAQu (Database-Augmented Query), a retrieval framework that enriches short queries with structured metadata across multiple relational database tables using a graph-based set-encoding strategy (see the sketch below)
  • Significantly enhances retrieval performance over baselines by leveraging hierarchical features from databases without imposing order
  • Bridges the gap between structured databases and neural retrieval — agents can leverage relational metadata to answer complex queries more accurately
Paper
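
A minimal sketch of the query-enrichment step, assuming a hypothetical SQLite schema; note that DAQu's graph-based set encoder consumes these metadata features as an unordered set, whereas the sketch simply concatenates them.

```python
# Minimal sketch of database-augmented query construction: gather metadata related to
# the query across tables and attach it to the query representation. The schema here
# is hypothetical; DAQu encodes the metadata as an order-free set via a graph-based
# set encoder rather than the flat concatenation shown below.
import sqlite3

def collect_metadata(conn: sqlite3.Connection, user_id: int) -> list[str]:
    rows = conn.execute(
        "SELECT p.title, p.year, a.name "
        "FROM papers p JOIN authors a ON a.paper_id = p.id "
        "WHERE p.owner_id = ?",
        (user_id,),
    ).fetchall()
    # Verbalize each related row as a feature; the set of features (not its order) matters.
    return [f"title={title}; year={year}; author={name}" for (title, year, name) in rows]

def augment_query(query: str, metadata: list[str]) -> str:
    return query + "\n[metadata]\n" + "\n".join(sorted(metadata))
```
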
ACL 2025 · NLP

Efficient Long Context Language Model Retrieval with Compression

Minju Seo, Jinheon Baek, Seongyun Lee, Sung Ju Hwang

  • Develops a compression-based retrieval method for long-context language models that reduces memory and computation costs while preserving retrieval quality
  • Enables practical deployment of long-context LLMs for information-intensive tasks without sacrificing accuracy
  • Makes long-context retrieval affordable — compression-based approaches let LLMs process massive document collections without prohibitive memory and compute costs
Paper
ACL Findings 2025 · CV

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Soyeong Jeong, Kangsan Kim, Jinheon Baek, Sung Ju Hwang

  • Introduces VideoRAG, a framework that dynamically retrieves relevant videos and leverages both visual and textual information via Large Video Language Models for retrieval-augmented generation
  • Features an informative frame selection mechanism for extremely long videos and a subtitle extraction strategy, outperforming text-only and image-based RAG baselines
  • Extends RAG beyond text — video-augmented generation enables AI systems to reason jointly over visual and textual knowledge for richer, grounded answers
Paper
ACL Findings 2025 · NLP

SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models

Seanie Lee, Dong Bok Lee, Dominik Wagner, Minki Kang, Haebin Seong, Tobias Bocklet, Juho Lee, Sung Ju Hwang

  • Proposes SafeRoute, a binary router that adaptively selects between small distilled and large safety guard models based on input difficulty — applying heavy models only to hard examples (see the sketch below)
  • Significantly improves the efficiency-accuracy trade-off for safety guardrails across multiple benchmark datasets, outperforming relevant baselines
  • A practical blueprint for scalable AI safety — adaptive routing lets production systems balance safety thoroughness against latency, applying heavy guardrails only where needed
Paper
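
A minimal sketch of the routing decision; router_hard_prob, small_guard, and large_guard are hypothetical stand-ins for the trained router and the two safety models.

```python
# Minimal sketch of difficulty-aware safety routing: a small binary router decides
# whether the cheap distilled guard suffices or the large guard is needed.
# `router_hard_prob`, `small_guard`, and `large_guard` are hypothetical stand-ins.

def router_hard_prob(prompt: str) -> float:
    raise NotImplementedError  # tiny classifier: probability the small guard will err

def small_guard(prompt: str) -> bool:
    raise NotImplementedError  # distilled safety model: True if unsafe

def large_guard(prompt: str) -> bool:
    raise NotImplementedError  # large safety model: True if unsafe

def is_unsafe(prompt: str, threshold: float = 0.5) -> bool:
    # Easy inputs take the cheap path; only predicted-hard inputs pay for the large model.
    if router_hard_prob(prompt) < threshold:
        return small_guard(prompt)
    return large_guard(prompt)
```
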
ICML 2025 · ML

AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML

Patara Trirat, Wonyong Jeong, Sung Ju Hwang

  • Presents AutoML-Agent, a multi-agent LLM framework automating the full ML pipeline from data preprocessing to model deployment
  • Specialized agents collaborate on feature engineering, model selection, hyperparameter tuning, and evaluation automatically
  • Demonstrates that multi-agent LLM collaboration can automate the full ML lifecycle — a template for building agentic systems that handle complex, multi-stage workflows
Paper
ICML 2025 · ML

Bayesian Neural Scaling Laws Extrapolation with Prior-Fitted Networks

Dongwoo Lee, Dong Bok Lee, Steven Adriaensen, Juho Lee, Sung Ju Hwang, Frank Hutter, Seon Joo Kim, Hae Beom Lee

  • Develops a Bayesian scaling law extrapolation framework using prior-fitted networks to predict model performance at larger scales
  • Enables accurate estimation of larger model performance before investing in expensive training runs
  • Enables informed scaling decisions — practitioners can predict large-model performance before committing to expensive training, reducing wasted compute
Paper
CVPR 2025 · CV

VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding

Kangsan Kim, Geon Park, Youngwan Lee, Woongyeong Yeo, Sung Ju Hwang

  • Proposes VideoICL, a video in-context learning framework with confidence-based iterative inference and similarity-based example selection (see the sketch below)
  • Significantly improves OOD video understanding by iteratively refining results until high-confidence responses are obtained
  • Advances robust video understanding — confidence-driven iterative inference lets vision models handle distribution shifts without costly retraining
Paper
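
A toy sketch of the confidence-based iteration; rank_examples_by_similarity and answer_with_examples are hypothetical stand-ins for the retrieval and video-LLM components.

```python
# Toy sketch of confidence-based iterative in-context learning for video: select the
# most similar examples, answer, and accept only once confidence clears a threshold,
# otherwise retry with the next batch of examples. Helpers are hypothetical stand-ins.

def rank_examples_by_similarity(query_video, pool: list) -> list:
    raise NotImplementedError

def answer_with_examples(query_video, question: str, examples: list) -> tuple[str, float]:
    raise NotImplementedError  # returns (answer, confidence)

def video_icl(query_video, question: str, pool: list, k: int = 4, rounds: int = 3, tau: float = 0.8) -> str:
    ranked = rank_examples_by_similarity(query_video, pool)
    best_answer, best_conf = "", 0.0
    for r in range(rounds):
        examples = ranked[r * k:(r + 1) * k]          # rotate through similar examples
        answer, conf = answer_with_examples(query_video, question, examples)
        if conf >= tau:
            return answer                             # confident enough: stop iterating
        if conf > best_conf:
            best_answer, best_conf = answer, conf
    return best_answer                                # fall back to the most confident attempt
```
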
CVPR 2025 · CV

Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models

Sangwon Jang, June Suk Choi, Jaehyeong Jo, Kimin Lee, Sung Ju Hwang

  • Reveals a novel trigger-free data poisoning vulnerability in text-to-image models — brand logos embedded without any text triggers
  • Demonstrates that visual patterns in training data cause models to reproduce them naturally, raising critical AI safety concerns
  • Raises critical AI safety awareness — trigger-free poisoning attacks highlight a subtle, hard-to-detect vulnerability class in generative model training pipelines
Paper
NAACL 2025 · NLP

ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models

Jinheon Baek, Sujay Kumar Jauhar, Silviu Cucerzan, Sung Ju Hwang

  • Introduces ResearchAgent, an LLM-based agent that iteratively generates novel research ideas by synthesizing scientific literature
  • Produces research proposals evaluated as creative and feasible by human reviewers through multiple rounds of literature review
  • Pioneers autonomous scientific ideation — LLM agents that iteratively synthesize literature and generate novel hypotheses represent a new paradigm for AI-assisted research
Paper
ICLR 2025 · ML

A Training-Free Sub-quadratic Cost Transformer Model Serving Framework with Hierarchically Pruned Attention

Heejun Lee, Geon Park, Youngwan Lee, Jaduk Suh, Jina Kim, Wonyong Jeong, Bumsik Kim, Hyemin Lee, Myeongjae Jeon, Sung Ju Hwang

  • Presents a training-free sub-quadratic transformer serving framework via hierarchically pruned attention computations
  • Significantly reduces inference latency and memory for long sequences — no model retraining or fine-tuning required
  • Makes sub-quadratic transformer serving a drop-in upgrade — existing models gain dramatically faster long-sequence inference without any retraining
Paper
ICLR 2025 · NLP

HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

Seanie Lee, Haebin Seong, Dong Bok Lee, Minki Kang, Xiaoyin Chen, Dominik Wagner, Yoshua Bengio, Juho Lee, Sung Ju Hwang

  • Develops HarmAug, data augmentation for training compact safety guardrail models through knowledge distillation with diverse harmful examples
  • Maintains strong detection performance at lower computational costs — improving robustness and coverage of smaller safety classifiers
  • Enables lightweight, deployable safety guardrails — compact distilled models can run alongside LLM agents with minimal overhead, making safety practical at scale
Paper
ICLR 2025 · NLP

Learning Diverse Attacks on Large Language Models for Robust Red-Teaming and Safety Tuning

Seanie Lee, Minsu Kim, Lynn Cherif, David Dobre, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Moksh Jain

  • Proposes a framework for generating diverse adversarial attacks on LLMs for more robust red-teaming and safety training
  • Discovers vulnerabilities that uniform attack methods miss — leading to stronger safety tuning through diverse attack strategies
  • Strengthens the red-teaming toolkit for LLM safety — diverse attack generation uncovers blind spots that uniform methods miss, leading to more robust safety tuning
Paper
ICLR 2025 · ML

Training Free Exponential Context Extension via Cascading KV Cache

Jeffrey Willette, Heejun Lee, Youngwan Lee, Myeongjae Jeon, Sung Ju Hwang

  • Enables exponential context extension without training by cascading KV caches — hierarchically organizing tokens for extremely long contexts (see the sketch below)
  • Allows pretrained models to handle sequences far beyond original training length with no fine-tuning needed
  • Unlocks exponentially longer contexts at zero training cost — pretrained models can process far longer sequences through hierarchical KV cache organization
Paper
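
A toy illustration consistent with the summary above (not the paper's exact acceptance and eviction rule): a chain of fixed-size sub-caches where each deeper level keeps only a fraction of the tokens evicted from the level above, so older context survives at coarser granularity.

```python
# Toy cascading KV cache: tokens evicted from one fixed-size sub-cache are accepted by
# the next sub-cache only part of the time (here, every other token), so recent context
# is kept densely and older context sparsely. Illustrative only; the actual method's
# acceptance/eviction policy differs.
from collections import deque

class CascadingCache:
    def __init__(self, levels: int = 3, capacity: int = 4):
        self.levels = [deque() for _ in range(levels)]
        self.capacity = capacity
        self._evictions = [0] * levels

    def add(self, token_id: int, level: int = 0) -> None:
        if level >= len(self.levels):
            return                                     # dropped past the last cascade
        cache = self.levels[level]
        cache.append(token_id)
        if len(cache) > self.capacity:
            evicted = cache.popleft()
            self._evictions[level] += 1
            if self._evictions[level] % 2 == 0:        # accept every other evicted token
                self.add(evicted, level + 1)

    def visible_tokens(self) -> list[int]:
        # Attention sees the union of all cascades: recent tokens densely, old ones sparsely.
        return [t for cache in self.levels for t in cache]
```
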
ICLR 2025 · ML

Diffusion-based Neural Network Weights Generation

Bedionita Soro, Bruno Andreis, Hayeon Lee, Wonyong Jeong, Song Chong, Frank Hutter, Sung Ju Hwang

  • Uses diffusion models to generate neural network weights directly, bypassing traditional training via conditioned weight generation
  • Learns to generate high-quality weight configurations conditioned on desired properties for rapid model initialization
  • Reimagines model creation as generation — diffusion over weight space could bypass traditional training entirely, enabling instant model initialization for new tasks
Paper
arXiv 2025 · ML

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Heejun Lee, Geon Park, Jaduk Suh, Sung Ju Hwang

  • Enables processing of up to 3 million tokens on a single GPU through hierarchical token pruning that dynamically eliminates irrelevant context
  • Achieves 18.95x speedup in attention decoding for million-token contexts without permanent loss of context information
  • Removes the hardware barrier for long-context AI — processing 3M tokens on a single GPU makes million-token inference accessible without expensive multi-GPU setups
Paper
NeurIPS 2024 · CV

KOALA: Empirical Lessons Toward Memory-Efficient and Fast Diffusion Models for Text-to-Image Synthesis

Youngwan Lee, Kwanyong Park, Yoorhim Cho, Yong-Ju Lee, Sung Ju Hwang

  • Distills practical lessons for building memory-efficient text-to-image diffusion — compact U-Nets (700M–1B) reduce model size up to 69% vs. SDXL
  • Generates 1024px images on consumer-grade GPUs with 8GB VRAM — 4x faster than SDXL while maintaining quality
  • Democratizes high-resolution image generation — 1024px outputs on consumer GPUs with 8GB VRAM brings diffusion models to resource-constrained environments
Paper
NeurIPS 2024 · ML

Set-based Neural Network Encoding Without Weight Tying

Bruno Andreis, Soro Bedionita, Philip Torr, Sung Ju Hwang

  • Proposes set-based neural network encoding without weight tying — enabling flexible weight space representations
  • Improves neural architecture search and model analysis through more expressive weight encodings
  • Improves neural architecture understanding — richer weight-space representations enable more expressive model analysis and faster architecture search
Paper
NeurIPS 2024 · NLP

Latent Paraphrasing: Perturbation on Layers Improves Knowledge Injection in Language Models

Minki Kang, Sung Ju Hwang, Gibbeum Lee, Jaewoong Cho

  • Introduces LaPael, a latent-level paraphrasing method applying input-dependent noise to early LLM layers for diverse augmentation
  • Achieves more robust knowledge injection than standard fine-tuning — eliminates recurring costs of paraphrase generation for each update
  • Provides a cheap, training-time augmentation for knowledge injection — latent-level perturbation generates diverse paraphrases implicitly, avoiding costly data generation pipelines
Paper
NeurIPS 2024 · CV

Identity Decoupling for Multi-Subject Personalization of Text-to-Image Models

Sangwon Jang, Jaehyeong Jo, Kimin Lee, Sung Ju Hwang

  • Presents MuDI, a framework solving identity mixing in multi-subject personalization by decoupling identity features via segmentation-based augmentation
  • Achieves a 2x success rate over baselines for multi-subject generation without identity blending — preferred over the strongest baseline in more than 70% of comparisons
  • Solves a key bottleneck in personalized image generation — reliable multi-subject synthesis without identity mixing unlocks practical creative and commercial applications
Paper
EMNLP 2024 · CV

Concept-skill Transferability-based Data Selection for Large Vision-Language Models

Jaewoo Lee, Boyang Li, Sung Ju Hwang

  • Develops concept-skill transferability-based data selection for training large vision-language models efficiently
  • Identifies training examples that contribute most to skill transfer — enabling effective training with smaller curated datasets
  • Makes vision-language model training more data-efficient — selecting high-transferability examples reduces dataset size while maintaining downstream performance
Paper
EMNLP Findings 2024 · NLP

Rethinking Code Refinement: Learning to Judge Code Efficiency

Minju Seo, Jinheon Baek, Sung Ju Hwang

  • Reframes code refinement by training models to judge code efficiency rather than just correctness — addressing runtime performance gaps
  • Develops benchmarks and methods for evaluating and improving generated code runtime performance
  • Shifts code generation evaluation toward runtime efficiency — training models to judge performance, not just correctness, is essential for production-grade code agents
Paper
ICML 2024 · CV

STELLA: Continual Audio-Video Pre-training with SpatioTemporal Localized Alignment

Jaewoo Lee, Jaehong Yoon, Wonjae Kim, Yunji Kim, Sung Ju Hwang

  • Proposes STELLA, a continual audio-video pre-training method with spatiotemporal localized alignment emphasizing semantically intertwined patches
  • Achieves 3.69% relative gain in zero-shot retrieval while reducing memory by ~45% vs. continual learning baselines
  • Advances continual multimodal learning — models that accumulate audio-visual knowledge over time without forgetting are essential for long-lived AI systems
Paper
ICML 2024 · ML

Generative Modeling on Manifolds Through Mixture of Riemannian Diffusion Processes

Jaehyeong Jo, Sung Ju Hwang

  • Introduces Riemannian Diffusion Mixture for generative modeling on curved manifold spaces — bypassing expensive heat kernel estimations
  • Achieves superior generation quality on diverse manifolds with dramatically reduced inference steps
  • Enables geometry-aware molecular generation on curved manifolds — a principled approach to generating molecules and proteins with correct geometric structure
Paper
ICML 2024 · ML

Graph Generation with Diffusion Mixture

Jaehyeong Jo, Dongki Kim, Sung Ju Hwang

  • Applies diffusion mixture to graph generation — modeling topology by explicitly learning endpoint-conditioned diffusion toward predicted graph structures
  • Outperforms previous models on general graph and 2D/3D molecule generation with correct topology and rapid convergence
  • Provides a topology-aware generative framework for molecular design — explicit endpoint conditioning ensures chemically valid graph structures for drug and materials discovery
Paper
ICML 2024 · AI4Sci

Drug Discovery with Dynamic Goal-aware Fragments

Seul Lee, Seanie Lee, Kenji Kawaguchi, Sung Ju Hwang

  • Proposes GEAM, a molecular generative framework with dynamic goal-aware fragment extraction, assembly, and modification for drug discovery
  • Discovers novel fragments during generation via dynamic vocabulary updates — adaptively selecting building blocks by pharmacological properties
  • Advances goal-directed drug design — dynamic fragment vocabularies that adapt during generation produce more diverse and pharmacologically relevant molecular candidates
Paper
ICML 2024 · ML

BECoTTA: Input-dependent Online Blending of Experts for Continual Test-time Adaptation

Daeun Lee, Jaehong Yoon, Sung Ju Hwang

  • Introduces BECoTTA, a Mixture-of-Domain Low-rank Experts framework for continual test-time adaptation with domain-adaptive routing
  • Achieves 5.81% performance gain while requiring only 0.001x trainable parameters — maximizing expert-domain synergy
  • Enables continual adaptation to distribution shifts — domain-adaptive expert routing lets deployed models stay accurate as real-world data evolves over time
Paper
ICML 2024 · CV

EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens

Sunil Hwang, Jaehong Yoon, Youngwan Lee, Sung Ju Hwang

  • Proposes EVEREST, an efficient masked video autoencoder that removes redundant spatiotemporal tokens based on motion information density
  • Enables pre-training and fine-tuning on a single machine with 8 GPUs while matching heavy baselines (vs. 16 nodes with 128 A100s)
  • Makes large-scale video pre-training accessible — matching heavy baselines (128 A100s) on a single 8-GPU machine democratizes video understanding research
Paper
ICML 2024 · NLP

One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts

Ruochen Wang, Sohyun An, Minhao Cheng, Tianyi Zhou, Sung Ju Hwang, Cho-Jui Hsieh

  • Proposes Mixture-of-Expert Prompts — automated construction of specialized prompt-demo experts per task sub-region via kernel-regression-inspired grouping
  • Significantly improves LLM performance by routing inputs to the most relevant expert prompt via region-based joint search
  • Demonstrates that one prompt is never enough — routing inputs to task-specialized expert prompts significantly outperforms single-prompt approaches for diverse workloads
Paper
NAACL 2024 · NLP

Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity

Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong C. Park

  • Develops Adaptive-RAG, which adjusts retrieval strategy based on question complexity — simple questions bypass retrieval, complex ones trigger multi-step retrieval (see the sketch below)
  • Optimizes both efficiency and answer quality by routing to appropriate retrieval depth per query
  • A practical design pattern for efficient RAG systems — matching retrieval effort to query complexity avoids unnecessary computation on simple questions while going deep on hard ones
Paper
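
A minimal sketch of the complexity-based routing; classify_complexity, retrieve, and llm_answer are hypothetical stand-ins for the trained complexity classifier, the retriever, and the reader LLM.

```python
# Minimal sketch of complexity-aware retrieval routing: a small classifier labels each
# query, which selects no retrieval, single-step retrieval, or iterative retrieval.
# `classify_complexity`, `retrieve`, and `llm_answer` are hypothetical stand-ins.

def classify_complexity(question: str) -> str:
    raise NotImplementedError  # returns "simple", "moderate", or "complex"

def retrieve(query: str) -> str:
    raise NotImplementedError  # top passages for the query

def llm_answer(question: str, context: str = "") -> str:
    raise NotImplementedError

def adaptive_rag(question: str, max_hops: int = 3) -> str:
    label = classify_complexity(question)
    if label == "simple":                     # answer directly, no retrieval cost
        return llm_answer(question)
    if label == "moderate":                   # one retrieval pass is enough
        return llm_answer(question, retrieve(question))
    context = ""
    for _ in range(max_hops):                 # complex: iterative multi-step retrieval
        context += "\n" + retrieve(question + "\n" + context)
    return llm_answer(question, context)
```
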
NAACL 2024 · NLP

Carpe diem: On the Evaluation of World Knowledge in Lifelong Language Models

Yujin Kim, Jaehong Yoon, Seonghyeon Ye, Sangmin Bae, Namgyu Ho, Sung Ju Hwang, Se-Young Yun

  • Introduces a benchmark for evaluating temporal knowledge maintenance in language models — studying knowledge staleness over time
  • Proposes evaluation metrics for lifelong learning scenarios where facts evolve and old knowledge must be updated
  • Highlights a critical gap in temporal knowledge maintenance — as the world changes, LLMs need systematic mechanisms to update stale knowledge without full retraining
Paper
CVPR 2024 · CV

ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning

Beomyoung Kim, Joonsang Yu, Sung Ju Hwang

  • Introduces ECLIPSE, efficient continual learning for panoptic segmentation via visual prompt tuning — freezing base parameters
  • Achieves state-of-the-art robustness against catastrophic forgetting with minimal trainable parameters
  • Shows that prompt tuning enables continual visual learning — freezing base parameters while learning lightweight prompts prevents catastrophic forgetting at minimal parameter cost
Paper
ICLR 2024 · ML

SEA: Sparse Linear Attention with Estimated Attention Mask

Heejun Lee, Jina Kim, Jeffrey Willette, Sung Ju Hwang

  • Proposes SEA, which estimates attention with linear complexity via kernel-based approximation, then builds a sparse attention matrix with top-k selection
  • Achieves better perplexity than full attention baselines while using roughly half the memory on OPT-1.3B
  • Achieves better-than-full-attention quality at half the memory — a practical path to serving large LLMs on memory-constrained hardware
Paper
ICLR 2024 · ML

DiffusionNAG: Predictor-guided Neural Architecture Generation with Diffusion Models

Sohyun An, Hayeon Lee, Jaehyeong Jo, Seanie Lee, Sung Ju Hwang

  • Applies diffusion-based generative modeling to neural architecture search, directly generating task-optimal architectures conditioned on desired performance properties.
  • Achieves up to 35x speedup over traditional NAS by replacing iterative search with one-shot conditional generation of high-performing architectures.
  • Shifts NAS from search to generation — one-shot conditional architecture generation is orders of magnitude faster than iterative search, enabling rapid model customization.
Paper
ICLR 2024 · ML

Self-Supervised Dataset Distillation for Transfer Learning

Dong Bok Lee, Seanie Lee, Joonho Ko, Kenji Kawaguchi, Juho Lee, Sung Ju Hwang

  • Proposes self-supervised dataset distillation that creates compact synthetic datasets preserving transferable features for efficient transfer learning.
  • Distills large datasets into small representative subsets enabling fast adaptation to downstream tasks with minimal data, bridging pre-training and fine-tuning efficiency.
  • Enables data-efficient transfer learning — compact synthetic datasets that preserve transferable features allow rapid domain adaptation without massive task-specific corpora.
Paper
ICLR 2024 · CV

Progressive Fourier Neural Representation for Sequential Video Compilation

Haeyong Kang, Jaehong Yoon, DaHyun Kim, Sung Ju Hwang, Chang D. Yoo

  • Proposes progressive Fourier neural representation for continuously accumulating video representations across sequential encoding sessions using adaptive sparse sub-modules.
  • Ensures lossless decoding of previously learned representations while incrementally encoding new video content in Fourier space.
  • Enables lossless incremental video encoding — progressive Fourier representations allow AI systems to continuously accumulate visual knowledge without degrading prior content.
Paper
EMNLP 2023 · NLP

Learning to Verify Knowledge-Augmented Language Models

Jinheon Baek, Soyeong Jeong, Minki Kang, Jong C. Park, Sung Ju Hwang

  • Develops a verification mechanism for knowledge-augmented language models that detects and corrects factual errors in retrieved information.
  • Trains a dedicated verifier to assess consistency and relevance of retrieved knowledge, significantly improving reliability of knowledge-grounded generation.
  • Essential for trustworthy RAG systems — a dedicated verification layer prevents agents from acting on inaccurate retrieved knowledge, reducing hallucination-driven errors.
EMNLP Findings 2023 · NLP

Co-training and Co-distillation for Quality Improvement and Compression of Language Models

Hayeon Lee, Rui Hou, Jongpil Kim, Davis Liang, Hongbo Zhang, Sung Ju Hwang, Alexander Min

  • Combines co-training and co-distillation techniques to simultaneously improve quality and compress language models through mutual learning.
  • Two models learn from each other during training, producing a smaller model that matches or exceeds the performance of larger counterparts.
  • Shows that mutual learning produces better compressed models — co-distillation is a practical recipe for deploying high-quality small LMs in resource-constrained settings.
EMNLP Findings 2023 · NLP

Test-Time Self-Adaptive Small Language Models for Question Answering

Soyeong Jeong, Jinheon Baek, Sukmin Cho, Sung Ju Hwang, Jong C. Park

  • Enables small language models to self-adapt at test time for question answering without additional training by self-generating relevant context.
  • Closes the performance gap with much larger models by adjusting behavior dynamically based on each specific question at inference time.
  • Demonstrates that small models can punch above their weight — test-time self-adaptation closes the gap with much larger models, making lightweight AI agents more practical.
NeurIPS 2023 · NLP

Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks

Minki Kang, Seanie Lee, Jinheon Baek, Kenji Kawaguchi, Sung Ju Hwang

  • Distills reasoning capabilities from large language models into smaller ones for knowledge-intensive tasks, augmented with external knowledge retrieval.
  • Enables small models to perform multi-step reasoning that typically requires much larger models, achieving strong knowledge-intensive task performance.
  • Foundational for compact reasoning agents — distilling both reasoning chains and retrieval augmentation into small models enables complex knowledge-intensive tasks without large-model overhead.
NeurIPS 2023 · ML

Generalizable Lightweight Proxy for Robust NAS against Diverse Perturbations

Hyeonjeong Ha, Minseon Kim, Sung Ju Hwang

  • Develops a lightweight proxy for neural architecture search that efficiently evaluates robustness against diverse perturbations without expensive full evaluation.
  • Enables efficient discovery of architectures inherently robust to various noise types and adversarial attacks, generalizing across perturbation types.
  • Enables robustness-aware neural architecture search — efficiently finding architectures that are inherently resilient to noise and adversarial inputs without expensive per-perturbation evaluation.
NeurIPS 2023 · ML

Effective Targeted Attacks for Adversarial Self-Supervised Learning

Minseon Kim, Hyeonjeong Ha, Sooel Son, Sung Ju Hwang

  • Designs effective targeted adversarial attacks specifically for self-supervised learning models, exposing vulnerabilities in contrastive and other SSL methods.
  • Provides critical insights for building more robust representation learning systems by systematically characterizing SSL failure modes.
  • Provides critical insights for AI safety research — systematic characterization of SSL attack surfaces helps the community build more robust foundation models.
NeurIPS 2023 · CV

STXD: Structural and Temporal Cross-Modal Distillation for Multi-View 3D Object Detection

Sujin Jang, Dae Ung Jo, Sung Ju Hwang, Dongwook Lee, Daehyun Ji

  • Proposes structural and temporal cross-modal distillation from LiDAR to camera-based 3D object detection, transferring geometric and temporal knowledge.
  • Significantly improves multi-view camera-only 3D detection by leveraging the superior spatial understanding of LiDAR-based teacher models.
  • Advances sensor-efficient 3D perception — cross-modal distillation enables camera-only systems to approach LiDAR-level 3D understanding, reducing hardware requirements for autonomous systems.
ICCV 2023 · CV

Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models

Jaewoong Lee, Sangwon Jang, Jaehyeong Jo, Jaehong Yoon, Yunji Kim, Jin-Hwa Kim, Jung-Woo Ha, Sung Ju Hwang

  • Develops a text-conditioned sampling framework that improves text-to-image generation quality in masked generative models with better text-image alignment.
  • Achieves higher fidelity and consistency between generated images and text descriptions through improved sampling strategies during inference.
  • Improves text-to-image faithfulness in masked generative models — better sampling strategies ensure generated images closely match their text descriptions.
Interspeech 2023 · Speech

ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models

Minki Kang, Wooseok Han, Sung Ju Hwang, Eunho Yang

  • Enables zero-shot emotion-controllable speech synthesis without emotion-labeled training data by combining diffusion and style-based models.
  • Generates speech with specified emotional tones, adaptable to unseen speakers and emotions at inference time for flexible voice generation.
  • Brings emotional expressiveness to AI voice interfaces — zero-shot control over speaker identity and emotion opens new possibilities for natural human-AI interaction.
ACL 2023 · NLP

Direct Fact Retrieval from Knowledge Graphs without Entity Linking

Jinheon Baek, Alham Fikri Aji, Jens Lehmann, Sung Ju Hwang

  • Proposes direct fact retrieval from knowledge graphs without the traditional entity linking step, simplifying the KG querying pipeline (see the sketch below).
  • Improves robustness by avoiding error propagation from entity linking failures, enabling more reliable knowledge access at scale.
  • Simplifies the knowledge graph querying pipeline — removing the error-prone entity linking step makes KG-based reasoning more robust and easier to deploy.
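
A minimal sketch of entity-linking-free fact retrieval: verbalize each triple, embed facts and the query with the same encoder, and return nearest neighbors. The embed helper is a hypothetical stand-in for the trained bi-encoder.

```python
# Minimal sketch of direct (entity-linking-free) fact retrieval: embed verbalized KG
# triples and the query in a shared space and take nearest neighbors.
# `embed` is a hypothetical stand-in for a trained bi-encoder returning unit vectors.
import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    raise NotImplementedError  # (n, d) unit-normalized embeddings

def verbalize(triple: tuple[str, str, str]) -> str:
    head, relation, tail = triple
    return f"{head} {relation.replace('_', ' ')} {tail}"

def retrieve_facts(query: str, triples: list[tuple[str, str, str]], k: int = 5) -> list[tuple[str, str, str]]:
    fact_vecs = embed([verbalize(t) for t in triples])   # in practice, indexed once offline
    query_vec = embed([query])[0]
    scores = fact_vecs @ query_vec                       # cosine similarity for unit vectors
    top = np.argsort(-scores)[:k]
    return [triples[i] for i in top]
```
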
ACL 2023 · NLP

Language Detoxification with Attribute-Discriminative Latent Space

Jinmyung Kwak, Minseon Kim, Sung Ju Hwang

  • Develops language detoxification using an attribute-discriminative latent space that separates toxic attributes from content in learned representations.
  • Enables selective removal of harmful language while preserving meaning and fluency through disentangled latent manipulation.
  • Provides a fine-grained detoxification mechanism — disentangling toxic attributes in latent space removes harmful content while preserving meaning, a key building block for safe AI.
ACL Findings 2023 · NLP

A Study on Knowledge Distillation from Weak Teacher for Scaling Up Pre-trained Language Model

Hayeon Lee, Rui Hou, Jongpil Kim, Davis Liang, Sung Ju Hwang, Alexander Min

  • Studies knowledge distillation from weak teachers, revealing when and how smaller teacher models can still provide useful training signals for larger students.
  • Uncovers surprising effectiveness of weak-teacher distillation for scaling up pre-trained language models, challenging conventional teacher-student assumptions.
  • Challenges conventional wisdom on knowledge distillation — even weak teachers can effectively guide larger students, opening new possibilities for efficient model scaling.
ACL Findings 2023 · NLP

Phrase Retrieval for Open Domain Conversational Question Answering with Conversational Dependency Modeling via Contrastive Learning

Soyeong Jeong, Jinheon Baek, Sung Ju Hwang, Jong C. Park

  • Applies phrase-level retrieval to conversational question answering, modeling conversational dependencies through contrastive learning.
  • Directly retrieves answer phrases from a large corpus rather than generating them, improving accuracy and grounding in open-domain dialogue.
  • Improves grounded conversational AI — phrase-level retrieval with dependency modeling produces more precisely sourced answers in multi-turn dialogues.
ICML 2023 · AI4Sci

Exploring Chemical Space with Score-based Out-of-distribution Generation

Seul Lee, Jaehyeong Jo, Sung Ju Hwang

  • Uses score-based generative models to explore chemical space beyond the training distribution via systematic out-of-distribution molecule generation.
  • Enables discovery of novel molecular structures with desired properties, expanding the frontier of computational drug discovery and materials design.
  • Pushes the frontier of AI-driven molecular discovery — systematic OOD generation lets researchers explore uncharted chemical regions for novel drug candidates and materials.
Paper
ICML 2023 · ML

Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation

Jeffrey Willette, Seanie Lee, Bruno Andreis, Kenji Kawaguchi, Juho Lee, Sung Ju Hwang

  • Develops scalable set encoding with universal mini-batch consistency, producing identical representations regardless of how sets are partitioned.
  • Provides unbiased full-set gradient approximation enabling training on very large sets that don't fit in memory while maintaining mathematical correctness.
  • Solves a fundamental challenge in processing large, variable-size data — mini-batch consistent encoding ensures identical representations regardless of how data is partitioned.
Paper
ICML 2023 · ML

Personalized Subgraph Federated Learning

Jinheon Baek, Wonyong Jeong, Jiongdao Jin, Jaehong Yoon, Sung Ju Hwang

  • Introduces personalized subgraph federated learning where each client learns on its own subgraph while benefiting from global knowledge sharing.
  • Addresses unique challenges of non-IID graph data distributions across federated clients, enabling effective graph learning without centralized data.
  • Enables privacy-preserving graph learning across organizations — federated subgraph training lets distributed parties collaborate on graph tasks without sharing raw data.
Paper
ICML 2023 · ML

Margin-based Neural Network Watermarking

Byungjoo Kim, Suyoung Lee, Seanie Lee, Sooel Son, Sung Ju Hwang

  • Proposes margin-based neural network watermarking for intellectual property protection, embedding verifiable signatures into model parameters.
  • Watermarks remain robust against model modifications and fine-tuning while maintaining full model performance on primary tasks.
  • Addresses the growing need for AI model IP protection — robust watermarks survive fine-tuning and modification, enabling verifiable model ownership and provenance tracking.
Paper
ICML 2023 · ML

Continual Learners are Incremental Model Generalizers

Jaehong Yoon, Sung Ju Hwang, Yue Cao

  • Demonstrates that continual learners are incremental model generalizers — sequential task learning naturally develops superior transfer to unseen tasks.
  • Shows continual learning creates representations that generalize better than traditional multi-task training, reframing continual learning as a generalization paradigm.
  • Reframes continual learning as a generalization strategy — sequentially trained models develop superior transfer capabilities, suggesting that lifelong learning inherently builds general intelligence.
Paper
CVPR 2023 · CV

The Devil is in the Points: Weakly Semi-Supervised Instance Segmentation via Point-Guided Mask Representation

Beomyoung Kim, Joonhyun Jeong, Dongyoon Han, Sung Ju Hwang

  • Proposes point-guided mask representation for weakly semi-supervised instance segmentation using minimal point annotations as guidance.
  • Achieves strong segmentation performance with significantly reduced labeling costs, making high-quality segmentation accessible with sparse supervision.
  • Drastically reduces annotation cost for instance segmentation — point-level supervision achieves strong results, making high-quality visual understanding accessible without expensive pixel-level labels.
Paper
ICLR 2023 · ML · Spotlight

Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets

Hayeon Lee, Sohyun An, Minseon Kim, Sung Ju Hwang

  • Develops a meta-prediction model that estimates architecture performance after knowledge distillation on unseen datasets, unifying NAS with distillation.
  • Enables efficient distillation-aware architecture search that finds architectures optimized for compressed deployment, awarded Spotlight at ICLR 2023.
  • Unifies NAS and knowledge distillation — meta-prediction of post-distillation performance enables discovery of architectures purpose-built for efficient compressed deployment.
Paper
ICLR 2023 · ML

Self-Distillation for Further Pre-training of Transformers

Seanie Lee, Minki Kang, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi

  • Applies self-distillation during further pre-training of transformers, where the model learns from its own predictions on new domain data.
  • Improves domain adaptation efficiency and downstream performance without requiring additional teacher models, streamlining the pre-training pipeline.
  • Streamlines domain adaptation for pretrained models — self-distillation during further pre-training improves specialization without needing separate teacher models.
Paper
ICLR 2023 · ML

Sparse Token Transformers with Attention Back Tracking

Heejun Lee, Minki Kang, Youngwan Lee, Sung Ju Hwang

  • Introduces attention back-tracking for sparse token transformers, allowing recovery of previously pruned tokens when they become relevant in later layers.
  • Improves accuracy-efficiency trade-off by making token pruning decisions reversible, preventing information loss from premature pruning.
  • Improves the accuracy-efficiency frontier for token pruning — reversible sparsity decisions prevent permanent information loss, enabling aggressive compute reduction without sacrificing quality.
Paper