About Us
About MIMIC
The Multimodal Interactive Machine Intelligence Center is dedicated to advancing human-interactive, multimodal AI.
Our focus is on creating AI that can understand, communicate with, and empathize with humans.
Taehoon Kim
Education
2018 - 2021 Ph.D. (M.S. integrated) in Computer Science, Sogang University
2012 - 2018 B.S. in Computer Science & Communications, Sogang University
Career
Aug 2024 - Present Assistant Professor, Graduate School of Metaverse, Sogang University
Mar 2021 - Aug 2024 Research Scientist, Vision Lab, LG AI Research
Feb 2020 - Jan 2021 Research Intern, Clova AI, Naver Corp.
Jan 2017 - Dec 2017 Machine Learning Engineer, Nosith Inc.
Field of Interest
• General machine learning, computer vision, and large-scale model training.
• Specialized in large multimodal models (LMMs), vision-language, quantization, and network architecture design.
• Application of machine learning algorithms to various multimodal and computer vision tasks.
Projects
Large Multimodal Model (LMM)
• Led the Image-to-Text LMM (EXAONE Atelier Image-to-text) project.
• Developed Bidirectional Image-Text Transformer architecture for efficient large-scale vision-language model training.
• Optimized model inference and the corresponding backend architecture for commercialization.
• Designed the end-to-end backend architecture for a general-purpose multimodal agent (EXAONE Atelier Multimodal) by integrating a large multimodal model (LMM) and a large language model (LLM) with instruction prompt engineering.
Quantization and Network Architecture Search
• Cooperative project with CLOVA AI, Naver Corp.
• Developed StatAssist & GradBoost, straightforward optimization methods that enable quantization-aware training from scratch on various computer vision tasks: classification, object detection, semantic segmentation, and style transfer.
• Experiments across these tasks showed performance comparable to, and often better than, the floating-point baselines.
Privacy Preserving Image Anonymization
• Project supported by an Institute for Information and Communications Technology Promotion (IITP) grant funded by the Korea Government (MSIT) (A Development of De-identification Technique Based on Differential Privacy).
• Developed a latent-space-level image anonymization framework (PPAPNet & PPSGAN) based on Generative Adversarial Networks (GANs) and Differential Privacy to help protect images from model inversion attacks.
• Experiments on various datasets showed that PPAPNet & PPSGAN can effectively convert a sensitive image into a high-quality, attack-immune synthetic image while preserving its utility as training data.