Research | Med.AI

Model Type A

Supervised ML

Approach

Uses TF-IDF to convert each sentence into a numeric vector that weights words by how distinctive they are. Logistic Regression then learns from labeled examples to classify each sentence as important or not, so the system can select the most useful sentences for the summary.

Technique Stack

NLTK Tokenization, TF-IDF (Bi-grams), Logistic Regression, Cosine Similarity
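The TF-IDF and cosine similarity steps in this stack can be sketched with the standard library alone. This is a minimal illustration, not the project's implementation: the regex tokenizer stands in for NLTK, the smoothed IDF formula is one common variant chosen as an assumption, and the function names and sample sentences are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(sentence):
    # Simple regex tokenizer standing in for NLTK tokenization (assumption).
    return re.findall(r"[a-z0-9]+", sentence.lower())

def unigrams_and_bigrams(tokens):
    # Bi-grams are pairs of adjacent tokens, kept alongside the single words.
    bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams

def tfidf_vectors(sentences):
    term_counts = [Counter(unigrams_and_bigrams(tokenize(s))) for s in sentences]
    n = len(sentences)
    # Document frequency: in how many sentences each term appears.
    doc_freq = Counter(term for counts in term_counts for term in counts)
    vectors = []
    for counts in term_counts:
        # Smoothed IDF: ln((1 + n) / (1 + df)) + 1 (one common variant).
        vectors.append({
            term: tf * (math.log((1 + n) / (1 + doc_freq[term])) + 1)
            for term, tf in counts.items()
        })
    return vectors

def cosine_similarity(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical sample sentences for illustration only.
sentences = [
    "The drug lowers blood pressure.",
    "Blood pressure fell after the drug was given.",
    "Patients reported mild headaches.",
]
vectors = tfidf_vectors(sentences)
sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
```

The first two sentences share words and the bi-gram "blood pressure", so their similarity is positive, while the first and third share no terms and score zero.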

01

Learning from Ground Truth.

By training on labeled datasets where "gold standard" summaries exist, our Logistic Regression model learns feature weights (for sentence position, length, and keyword density, among others) that signal high-value information.

// Probability Prediction

P(y=1|x) = 1 / (1 + e^(-(β0 + β1x1 + ... + βnxn)))

x1: Term Freq

x2: Sentence Loc

x3: Doc Length
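As a worked example of the probability formula above, the sketch below evaluates P(y=1|x) for one sentence. The bias, weights, and feature values are hypothetical numbers chosen for illustration, not weights learned by the actual model, and the 0.5 decision threshold is likewise an assumption.

```python
import math

def predict_importance(features, weights, bias):
    # z = β0 + β1*x1 + ... + βn*xn, then squash through the logistic function.
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values for illustration only.
bias = -1.0                    # β0
weights = [2.0, 1.5, -0.5]     # β1..β3
features = [0.8, 1.0, 0.4]     # x1: term freq, x2: sentence loc, x3: doc length (scaled)

p = predict_importance(features, weights, bias)   # z = 1.9, so p is about 0.87
is_important = p >= 0.5                           # assumed threshold
```

With these numbers, z = -1.0 + 1.6 + 1.5 - 0.2 = 1.9, so the sentence would be kept under the assumed threshold.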

02

Graph-Based Ranking.

When no training data is available, we treat the document as a connected graph. Sentences are nodes, and pairwise similarity scores are the edge weights. The PageRank algorithm then scores each sentence by how strongly it is connected to the others, and the highest-scoring sentences are treated as central to the topic.
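The graph-based ranking described here can be sketched as a weighted PageRank over a sentence similarity matrix. This is a minimal illustration under stated assumptions: raw word counts replace TF-IDF for brevity, the damping factor of 0.85 and the fixed 50 iterations are conventional defaults rather than values from the project, and all names and sample sentences are hypothetical.

```python
import math
import re
from collections import Counter

def bag_of_words(sentence):
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))

def cosine(u, v):
    dot = sum(c * v[t] for t, c in u.items())
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

def textrank(sentences, damping=0.85, iterations=50):
    n = len(sentences)
    bags = [bag_of_words(s) for s in sentences]
    # Similarity matrix: edge weight between every pair of sentences, no self-loops.
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbor j passes on its score in proportion to the edge weight.
            rank = sum(sim[j][i] / out_weight[j] * scores[j]
                       for j in range(n) if out_weight[j] > 0)
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores

# Hypothetical sample sentences for illustration only.
sentences = [
    "The drug lowers blood pressure in adults.",
    "Blood pressure dropped after the drug was given.",
    "The drug was given to adults with high blood pressure.",
    "Lunch is served at noon.",
]
scores = textrank(sentences)
ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
```

The off-topic last sentence shares no words with the others, so it receives only the baseline score and ranks last, while the sentence with the strongest links to its neighbors ranks first.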

Model Type B

Unsupervised ML

Approach

Uses TF-IDF to convert each sentence into a numeric vector that weights its words by importance. TextRank then builds a graph from the similarity between sentences and ranks them to find the most central ones.

Technique Stack

TextRank (PageRank), Bucket Selection, Sublinear TF, Similarity Matrix
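Of the techniques in this stack, sublinear TF is easy to show in isolation: the raw term count tf is replaced by 1 + ln(tf), so a word repeated many times does not dominate the vector. The sketch below is illustrative only; the function name and token list are hypothetical.

```python
import math
from collections import Counter

def sublinear_tf(tokens):
    # Replace each raw count tf with 1 + ln(tf) to dampen repeated terms.
    counts = Counter(tokens)
    return {term: 1.0 + math.log(count) for term, count in counts.items()}

# Hypothetical token list for illustration only.
tokens = ["dose", "dose", "dose", "dose", "liver", "renal"]
weights = sublinear_tf(tokens)
```

Here "dose" appears four times but its weight grows only to 1 + ln(4), roughly 2.39, while terms that appear once keep a weight of 1.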

Generative AI Layer

Interactive Intelligence with Gemini

While our extractive models provide factual summaries, the Gemini API adds a layer of reasoning. It allows users to query the document contextually and translate findings instantly.

Q&A Engine: Ask specific questions about the uploaded PDF/Image.
Translation: Translate summaries into multiple languages instantly.

User: Can you explain the contraindications mentioned in Section 4?

Gemini: Based on Section 4, the primary contraindications are active liver disease and severe renal impairment (GFR < 30).

System Architecture

End-to-End Processing Pipeline

01

Input & Config

User uploads PDF/Text/Image and selects summary length and tone.

02

Processing Core

ML Model Selection (Supervised/Unsupervised), Text Cleaning, and Ranking.

03

Result Dashboard

Split view output, PDF export, Gemini-powered translation and chat.
