Sharenet

    Document Extraction and Cleaning

    Total 67 articles
    Parsio: Automatically Extract Key Structured Data from PDFs, Emails, and Other Documents

    Overview: Parsio is an AI-based document and email data extraction tool that automatically extracts structured data from PDFs, emails, and other documents. The platform provides a powerful PDF parser and OCR capabilities and supports many document types, including...
    Latest AI tools # Document Extraction and Cleaning
    8mos ago
    0 · 1.5K
    Chonkie: A Lightweight RAG Text Chunking Library

    Overview: Chonkie is a lightweight, efficient RAG (Retrieval-Augmented Generation) text chunking library designed to help developers split text into chunks quickly and easily. The library supports multiple chunking methods, including...
    Latest AI tools # AI Java Open Source Project # Document Extraction and Cleaning
    5mos ago
    0 · 1.5K
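    TextIn: Universal Document Conversion, PDF to Markdown Tool

    Overview: TextIn is a professional PDF-to-Markdown tool designed to help users convert PDF documents to Markdown efficiently. It supports multiple file formats, is simple to operate, converts quickly, and preserves the formatting and content of the original PDF...
    Latest AI tools # Document Extraction and Cleaning
    9mos ago
    0 · 1.3K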
    Text Extraction API (text-extract-api): Vision-Based Text Extraction and Anonymizing PDF Extraction Tool

    Overview: The Text Extraction API (text-extract-api) is a tool designed to extract and parse content from a variety of document formats (e.g. PDF, Word, PPTX). The API uses Optical Character Recognition (OCR) technology and Ol...
    Latest AI tools # AI Java Open Source Project # OCR # Document Extraction and Cleaning
    6mos ago
    0 · 1.6K
    Datalab: Dedicated OCR AI Models, PDF to Markdown (Open Source/API)

    Overview: Datalab provides a series of advanced AI models focused on OCR, layout analysis, PDF-to-Markdown conversion, and related tasks. The models perform well, are easy to use, and are open source. The platform's Marker model can quickly and accurately convert...
    Latest AI tools # AI Open Services # AI Java Open Source Project # OCR
    9mos ago
    0 · 1.6K
    MinerU: PDF Document Extraction and Conversion to Multimodal Markdown, with OCR Support for Scanned E-books

    Overview: MinerU is an open-source data extraction tool developed by the OpenDataLab team at Shanghai AI Laboratory, focused on efficiently extracting content from complex PDF documents, web pages, and e-books. It can take multimodal PDFs containing images, formulas, tables, and other elements...
    Latest AI tools # AI Java Open Source Project # OCR # Document Extraction and Cleaning
    10mos ago
    0 · 1.9K
    Marker: An Open-Source Tool for Quickly Converting PDF to Markdown

    Overview: Marker is a deep-learning-based document processing tool designed to convert PDF files to Markdown quickly and accurately. It supports many document types and is specifically optimized for books and scientific papers. Marker can remove page headers...
    Latest AI tools # AI Java Open Source Project # Document Extraction and Cleaning
    5mos ago
    0 · 1.9K
    Mathpix: Structured Conversion Software for PDF and Image Documents, with Multi-Device Support

    Overview: Mathpix is a powerful AI-driven document automation tool designed for researchers, developers, and enterprises. It quickly and accurately converts PDFs and images into searchable, exportable, machine-readable text. Mathpix offers a variety of features...
    Latest AI tools # AI Open Services # Document Extraction and Cleaning
    11mos ago
    0 · 1.8K
    Unstructured: Open-Source Preprocessing for Unstructured Documents and Data

    Overview: Unstructured-IO provides a set of open-source components for processing and preprocessing images and text documents such as PDF, HTML, and Word documents. Its main goal is to simplify and optimize data processing workflows, especially for large language models (LL...
    Latest AI tools # AI Java Open Source Project # Document Extraction and Cleaning
    11mos ago
    0 · 1.5K
    Reader API: Web Page Content Extraction Tool, HTML to Markdown Conversion

    Overview: Jina AI's Reader project is an open-source tool (Reader open-source repository) that converts any URL, by adding the prefix https://r.jina.ai/, into a form suitable for large language models (Large Languag...
    Latest AI tools # AI Java Open Source Project # Document Extraction and Cleaning
    10mos ago
    0 · 1.7K
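
The Reader API entry above says a URL is converted by adding the prefix https://r.jina.ai/. A minimal sketch of that step in Python follows. The helper name `reader_url` and the input validation are assumptions made for illustration, not part of the Reader project, and the sketch only builds the prefixed URL without fetching it.

```python
from urllib.parse import urlsplit

# Prefix quoted in the Reader API entry.
READER_PREFIX = "https://r.jina.ai/"


def reader_url(target_url: str) -> str:
    """Build the Reader URL by prepending the prefix to an absolute URL.

    The validation below is an illustrative assumption, not Reader's own rule.
    """
    parts = urlsplit(target_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {target_url!r}")
    return READER_PREFIX + target_url


print(reader_url("https://example.com/article"))
# prints: https://r.jina.ai/https://example.com/article
```

The resulting string can then be requested with any HTTP client. The format of the response is not described in the entry, so it is left out here.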
    Sharenet.ai is an AI learning guide and tool directory, dedicated to helping learners in artificial intelligence progress step by step from scratch to proficiency. Sharenet also provides convenient access to resources.


    Copyright © 2025 Sharenet 