Federated Learning for Privacy-Focused NLP
Crypt | In Progress | 2025-26

Project Overview

A federated learning pipeline trains models on local legal documents, sharing encrypted updates to preserve confidentiality.

About This Project

A federated NLP system that detects risky legal clauses without centralizing confidential contracts.

Tech Stack

Federated Learning, NLP, Privacy

Team

Mentors

  1. Abhimanyu
  2. Chaitanya Menon
  3. Shreya

Mentees

  1. Dhruva
  2. Emerin
  3. Tanmay
  4. Rohith

Methodology

The project follows four steps: local document preprocessing and clause extraction; transformer embeddings from legal-domain models; federated training with encrypted model updates; and privacy-preserving orchestration in the style of Flower or PySyft. The steps are run iteratively to validate assumptions, improve performance, and keep delivery of the final solution reliable.
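
The federated training step can be illustrated with a minimal sketch in plain Python. It is not the project's implementation and does not use the Flower or PySyft APIs. It assumes each client holds a small list of model weights and a local sample count, and it substitutes pairwise additive masking (a simplified secure-aggregation idea) for real encryption. All function names, the example weights, and the single seeded random generator that stands in for pairwise shared secrets are hypothetical.

```python
import random


def scale_update(weights, num_samples):
    """Client side: pre-scale local weights by the local sample count."""
    return [w * num_samples for w in weights]


def add_pairwise_masks(payloads, seed=0):
    """Hide individual payloads with masks that cancel in the sum.

    Assumption: one seeded RNG simulates the secret each pair of clients
    would share in a real protocol. Client a adds the mask, client b
    subtracts it, so the server only learns the total.
    """
    rng = random.Random(seed)
    dim = len(payloads[0])
    masked = [list(p) for p in payloads]
    for a in range(len(payloads)):
        for b in range(a + 1, len(payloads)):
            mask = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for i in range(dim):
                masked[a][i] += mask[i]
                masked[b][i] -= mask[i]
    return masked


def aggregate(masked_payloads, sample_counts):
    """Server side: sum masked payloads and divide by the total sample count.

    Because the masks cancel, this equals the sample-weighted average of the
    clients' weights (federated averaging).
    """
    total = sum(sample_counts)
    dim = len(masked_payloads[0])
    return [sum(p[i] for p in masked_payloads) / total for i in range(dim)]


if __name__ == "__main__":
    # Hypothetical weights from three clients (e.g. three law firms).
    client_weights = [[0.2, -0.1], [0.4, 0.3], [0.0, 0.5]]
    client_sizes = [10, 30, 60]

    payloads = [scale_update(w, n) for w, n in zip(client_weights, client_sizes)]
    masked = add_pairwise_masks(payloads, seed=42)
    print(aggregate(masked, client_sizes))
```

The server never sees an unmasked payload, yet the aggregate matches the plain weighted average up to floating-point rounding. A production system would derive the masks from key agreement between clients and handle client dropout, both of which this sketch leaves out.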

Expected Outcome

By the end of the project, the team expects to deliver cross-client learning without data sharing and compliance-friendly risk analysis of contracts. Together, these outcomes demonstrate both technical feasibility and practical value for demos, evaluation, and future scaling.