AI Sharing Circle

Advance one step every day; sharing is king!
DeepSeek releases the first open-source version of its V3 model, currently the strongest coding model among those developed in China

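DeepSeek-V3 is a powerful Mixture-of-Experts (MoE) language model with 671 billion total parameters, of which 37 billion are activated for each token. The model adopts an innovative multi-head latent attention (Mu...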
7mos ago
CogAgent: Zhipu's open-source intelligent vision-language model for automating graphical user interfaces

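General Introduction: CogAgent is an open-source vision-language model developed by the Tsinghua University data mining research group (THUDM), designed to automate graphical user interface (GUI) operations across platforms. The model is based on CogVLM (GLM-4V-9B) and supports both Chinese and English...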
7mos ago
DAMO Academy's "Xunguang" (寻光) Video Creation Platform: A Full Review

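Earlier today I was notified that my application for the "Xunguang" (寻光) closed beta had been approved, so here is a quick review before bed. The platform is positioned as DAMO Academy's "visual technology capability application platform". It currently offers few applications compared with what was shown at the launch event, and I hope more vision applications will be opened up over time. Xunguang is split across two addresses: https...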
7mos ago
DisPose: generating videos with precise control of human pose, for creating videos of dancing ladies

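General Introduction: DisPose is an innovative open-source AI project focused on controllable human image animation. Developed by a research team and open-sourced on GitHub, it uses deep learning to achieve precise control of character animation by decomposing skeleton pose information. D...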
7mos ago
Smolagents: an open-source project for rapid, lightweight development of AI agents

General Introduction: Smolagents is a lightweight agent library developed by HuggingFace that focuses on simplifying the development of AI agent systems. The project is known for its clean design philosophy: its core is only about 1,000 lines of code, yet it provides powerful integration capabilities. It is most ...
7mos ago
Combined prompts for visually extracting documents into Markdown format

These prompts come from the Vision Parse project and extract a Markdown document in two steps. Image analysis prompt (img_analysis.prompt): Analyze this image and retur...
7mos ago
Napkin AI: A Beginner's Guide in Chinese

How do you start generating visual content with Napkin AI? (Account creation, visual generation, export to PDF or image file...) Welcome to Napkin AI, a tool that makes it easy to convert your text into beautiful visuals. This guide will walk...
7mos ago
Vision Parse: Intelligent Conversion of PDF Documents to Markdown Format Using Visual Language Models

General Introduction: Vision Parse is a document processing tool that combines state-of-the-art Vision Language Models to intelligently convert PDF documents into high-quality Markdown...
7mos ago
InvSR: an open-source image super-resolution project for improving image resolution and quality

General Introduction: InvSR is an open-source image super-resolution project based on diffusion inversion, capable of converting low-resolution images into high-quality, high-resolution images. The project draws on the rich image priors embedded in large pre-trained diffusion models to support, through a flexible sampling mechanism, the...
7mos ago
Infinity: bitwise autoregressive modeling for unrestricted high-resolution image generation

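General Introduction: Infinity is a pioneering high-resolution image generation framework developed by the FoundationVision team. Through a bitwise visual autoregressive modeling approach, the project overcomes the limitations of traditional image generation models. Infinity's core...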
7mos ago