Blog

This page presents short introductions to our research papers, along with popular-science content on AI. It is intended for science communication only and has not undergone rigorous scientific validation. If you spot any errors, please let us know.

2024

A text-to-image model that finally fits on a phone! At one-tenth the size, SnapGen delivers 100% of the quality

Xijie Huang PhD Student

Dec 25, 2024

tiny ML, computer vision

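In recent years, diffusion models such as Stable Diffusion have set a new standard for text-to-image (T2I) generation. However, current T2I diffusion models remain hard to deploy directly on mobile devices because of their model size and run time. Recently, the Creative Vision team at Snap Research proposed SnapGen, a T2I model with only 379M parameters trained from scratch, which generates very high-quality 1024x1024 images in just 1.4 s on an iPhone 16 Pro Max.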

2023

Improving 3D medical image segmentation: when CNN meets MLP

Yi Lin PhD Student

Mar 23, 2023

medical image, computer vision

In this work, we propose a novel permutable hybrid network for volumetric medical image segmentation (Vol-MedSeg), named PHNet, which capitalizes on the strengths of both convolutional neural networks (CNNs) and multilayer perceptrons (MLPs). PHNet addresses the intrinsic isotropy problem of 3D volumetric data by employing a combination of 2D and 3D CNNs to extract local features.

Solving the deployment challenges of LLaMA, BERT, and more: the first 4-bit floating-point quantized LLM is here

Shih-yang Liu PhD Student

November 17, 2023

tiny ML, LLM

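Compression of large language models (LLMs) has attracted sustained attention, and post-training quantization (PTQ) is one of the commonly used techniques. However, most existing PTQ methods use integer quantization, and model accuracy drops sharply when the bit width falls below 8. Compared with integer (INT) quantization, floating-point (FP) quantization represents long-tailed distributions better, so a growing number of hardware platforms are beginning to support it. This paper, published at EMNLP 2023, presents an FP quantization solution for large models.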
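
To make the long-tail point concrete, here is a minimal sketch that compares the rounding error of a uniform 4-bit integer grid with that of a 4-bit floating-point grid on long-tailed data. It is an illustration only, not the method from the paper: the E2M1 layout (1 sign, 2 exponent, 1 mantissa bit), the Laplace samples standing in for model weights, the single max-based scale, and the helper names `quantize`, `mse`, and `sample_long_tailed` are all assumptions made for this example.

```python
import random

# Magnitudes representable by an assumed 4-bit float layout (E2M1):
# 1 sign bit, 2 exponent bits, 1 mantissa bit. Spacing grows with magnitude.
FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
# Magnitudes of a symmetric signed 4-bit integer grid: evenly spaced 0..7.
INT4_MAGNITUDES = [float(i) for i in range(8)]


def quantize(values, magnitudes):
    """Round each value to the nearest point of a scaled, sign-symmetric grid."""
    max_abs = max(abs(v) for v in values)
    # One scale per tensor: the largest input maps to the largest grid point.
    scale = max_abs / magnitudes[-1]
    out = []
    for v in values:
        nearest = min(magnitudes, key=lambda m: abs(abs(v) / scale - m))
        out.append((nearest if v >= 0 else -nearest) * scale)
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def sample_long_tailed(n, seed=0):
    """Draw n Laplace-distributed samples (a simple long-tailed stand-in)."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(n)]


weights = sample_long_tailed(10_000)
int4_error = mse(weights, quantize(weights, INT4_MAGNITUDES))
fp4_error = mse(weights, quantize(weights, FP4_MAGNITUDES))
print(f"INT4 MSE: {int4_error:.4f}, FP4 MSE: {fp4_error:.4f}")
```

In this toy setting the FP grid places more of its points near zero, where most samples lie, while still reaching the rare large values, so its error tends to be lower. Real PTQ pipelines choose formats and scales far more carefully than this single max-based scale.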

Convolutional-Auxiliary Efficient Graph Reasoning Transformer for dense image prediction tasks

Dong Zhang Research Assistant Professor

November 24, 2023

computer vision

In this paper, we propose an auxiliary and integrated network architecture, named Convolutional-Auxiliary Efficient Graph Reasoning Transformer (CAE-GReaT), which combines the strengths of both Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) in a unified framework. The paper is published in the International Journal of Computer Vision (IJCV).

2022

MedISeg: tricks, challenges, and future directions for medical image semantic segmentation

Dong Zhang Research Assistant Professor

September 22, 2022 May 01, 2023

medical image, computer vision

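This article collects a series of tricks for medical image segmentation that apply to different stages of model implementation: pre-trained models, data preprocessing, data augmentation, model implementation, model inference, and result post-processing. Through extensive experiments, it examines how effective these tricks are on a consistent baseline model.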