Chinese Bill Dataset

Project Overview:


The project's objective is to create a large, carefully annotated dataset of Chinese legislative documents. It is intended as a foundational resource for researchers, policymakers, and AI applications, supporting analysis of legislative trends, language, and policy developments in China.


The project covers legislative documents from multiple levels of government, from national to local. It includes a mix of legislation types in areas such as economic policy, social welfare, and environmental regulation.



  • Data is sourced from official government publications, public records, and verified transcripts of legislative sessions. This ensures the accuracy and authenticity of the dataset, while adhering to legal and ethical standards of data collection.

Data Collection Metrics

  • Total Documents Collected: 8,000
  • Range of Years Covered: 2000-2024
  • Types of Documents: Bills, Amendments, Parliamentary Debates, Committee Reports
  • Subject Areas Covered: 20+, including Economics, Healthcare, Environment, Education

Annotation Process


  1. Text Extraction: Digitizing physical documents and extracting text from digital formats.
  2. Content Tagging: Annotating key sections, clauses, and legal terms.
  3. Contextual Analysis: Linking bills to related legislative activities and historical context.
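The three steps above can be pictured as filling in one record per document. The following is a minimal sketch only: the class names, field names, label strings, and sample text are hypothetical placeholders, since the source does not describe the dataset's actual schema or tag set.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    start: int   # character offset where the tagged span begins
    end: int     # character offset just past the tagged span
    label: str   # hypothetical label, e.g. "section_marker" or "legal_term"


@dataclass
class BillRecord:
    doc_id: str
    doc_type: str   # e.g. "Bill", "Amendment", "Committee Report"
    year: int
    text: str       # step 1: extracted text
    annotations: List[Annotation] = field(default_factory=list)  # step 2
    related_ids: List[str] = field(default_factory=list)         # step 3

    def tag(self, phrase: str, label: str) -> int:
        """Tag every occurrence of `phrase` and return how many were tagged."""
        count = 0
        start = self.text.find(phrase)
        while start != -1:
            self.annotations.append(Annotation(start, start + len(phrase), label))
            count += 1
            start = self.text.find(phrase, start + len(phrase))
        return count


# Placeholder text, not taken from any real bill.
record = BillRecord(
    doc_id="example-001",
    doc_type="Bill",
    year=2020,
    text="Article 1 sets the purpose. Article 2 sets the scope.",
)
n_tagged = record.tag("Article", "section_marker")  # content tagging
record.related_ids.append("example-002")            # contextual link
```

Storing character offsets rather than copied strings keeps each annotation tied to an exact position in the extracted text, which is one common way to make tags checkable later.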

Annotation Metrics

  • Total Documents Annotated: 8,000
  • Total Annotations: 1,000,000+
  • Average Annotations per Document: 125
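The average follows directly from the two totals above. A quick check, treating the "1,000,000+" figure as a lower bound (the variable names are illustrative):

```python
total_documents = 8_000
total_annotations_lower_bound = 1_000_000  # reported as "1,000,000+"

# Average annotations per document, based on the lower bound.
average_per_document = total_annotations_lower_bound / total_documents
print(average_per_document)  # 125.0
```

Because the annotation total is stated as a minimum, 125 is best read as a floor on the true average.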

Quality Assurance


QA Metrics

  • Accuracy of Text Extraction: 99.5%
  • Precision in Content Tagging: 98%
  • Consistency in Contextual Analysis: 97%
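The source does not say how these figures were measured. As an illustration only, tagging precision is commonly computed as the share of produced tags that a reviewer confirms as correct. The sketch below assumes tags are compared as exact (start, end, label) triples against a reviewed sample; the function name and the sample data are hypothetical.

```python
def tagging_precision(predicted: set, reviewed: set) -> float:
    """Fraction of predicted tags that also appear in the reviewed set."""
    if not predicted:
        raise ValueError("no predicted tags to evaluate")
    confirmed = len(predicted & reviewed)
    return confirmed / len(predicted)


# Made-up sample: three of the four predicted tags match the reviewed tags.
predicted_tags = {
    (0, 9, "clause"),
    (15, 24, "clause"),
    (30, 42, "legal_term"),
    (50, 58, "legal_term"),
}
reviewed_tags = {
    (0, 9, "clause"),
    (15, 24, "clause"),
    (30, 42, "legal_term"),
    (60, 70, "clause"),
}

precision = tagging_precision(predicted_tags, reviewed_tags)
print(f"{precision:.0%}")  # 75%
```

Exact-match comparison is the strictest choice; a project could instead count partially overlapping spans as correct, which would give a higher figure for the same data.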


The Chinese Bill Dataset project represents a significant contribution to the field of legislative studies and AI in legal analytics. By providing a rich, annotated dataset, it lays the groundwork for sophisticated analysis of Chinese legislation, aiding in policy research, legal education, and AI-driven legal tech development.


Quality Data Creation

  • Guaranteed TAT
  • ISO 9001:2015, ISO/IEC 27001:2013 Certified
  • HIPAA Compliance
  • GDPR Compliance
  • Compliance and Security

Let's Discuss Your Data Collection Requirements

For a detailed estimate based on your requirements, please contact us.
