TF-IDF Calculator

Calculate Term Frequency-Inverse Document Frequency for multiple documents

Documents

Add multiple documents to calculate TF-IDF scores. Each document should contain text with words separated by spaces.

What is TF-IDF?

TF-IDF (Term Frequency-Inverse Document Frequency) is a numerical statistic that reflects how important a word is to a document in a collection of documents.

Formula:

TF-IDF(t, d) = TF(t, d) × IDF(t)

Term Frequency (TF): How often a term appears in a document

TF(t, d) = (Number of times term t appears in document d) / (Total number of terms in document d)
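
The TF formula above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name term_frequency is hypothetical, and lowercasing the text before splitting on whitespace is an assumption about tokenization.

```python
from collections import Counter


def term_frequency(document: str) -> dict[str, float]:
    """Return TF(t, d) for every term t in a whitespace-separated document."""
    # Assumption: terms are compared case-insensitively.
    terms = document.lower().split()
    if not terms:
        return {}  # an empty document has no terms to score
    counts = Counter(terms)
    total = len(terms)
    return {term: count / total for term, count in counts.items()}


tf = term_frequency("the cat sat on the mat")
print(tf["the"])  # "the" is 2 of the 6 terms
print(tf["cat"])  # "cat" is 1 of the 6 terms
```

Because TF is divided by the document length, a long document does not get higher values just for containing more words.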

Inverse Document Frequency (IDF): How rare or common a term is across all documents

IDF(t) = log(Total number of documents / Number of documents containing term t)
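
A similar sketch for IDF follows. The formula does not state a logarithm base, so the natural logarithm used here is an assumption; a different base only rescales every score by the same constant. The function name inverse_document_frequency and the lowercase whitespace tokenization are likewise illustrative choices.

```python
import math


def inverse_document_frequency(documents: list[str]) -> dict[str, float]:
    """Return IDF(t) for every term t that occurs in at least one document."""
    n_docs = len(documents)
    # Use one set of distinct terms per document, so repeats inside a
    # document are counted only once.
    doc_terms = [set(doc.lower().split()) for doc in documents]
    vocabulary = set().union(*doc_terms)
    idf = {}
    for term in vocabulary:
        containing = sum(1 for terms in doc_terms if term in terms)
        # Assumption: natural log; the formula leaves the base unspecified.
        idf[term] = math.log(n_docs / containing)
    return idf


docs = ["the cat sat", "the dog barked", "the bird sang"]
idf = inverse_document_frequency(docs)
print(idf["the"])  # in all 3 documents: log(3 / 3)
print(idf["cat"])  # in 1 of 3 documents: log(3 / 1)
```

Only terms that occur somewhere in the collection are scored, so the denominator is never zero.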

Interpretation:

  • Higher TF-IDF scores indicate terms that are important to a specific document
  • Terms that appear frequently in one document but rarely in others get high scores
  • Terms that appear in every document get a score of 0, because their IDF is log(1) = 0
  • TF-IDF is widely used in information retrieval, text mining, and search engines
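
The interpretation points above can be checked on a small collection. The sketch below combines TF and IDF as defined in the formulas; the function name tf_idf, the natural logarithm, and the lowercase whitespace tokenization are assumptions made for illustration, and the three sample documents are invented.

```python
import math
from collections import Counter


def tf_idf(documents: list[str]) -> list[dict[str, float]]:
    """Return one {term: TF-IDF score} dictionary per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for terms in tokenized for term in set(terms))
    scores = []
    for terms in tokenized:
        counts = Counter(terms)
        total = len(terms)
        scores.append({
            # TF(t, d) * IDF(t), with the natural log as an assumed base.
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


docs = [
    "the cat chased the cat toy",
    "the dog chased the ball",
    "the bird sang",
]
scores = tf_idf(docs)
top = max(scores[0], key=scores[0].get)
print(top)               # the highest-scoring term of the first document
print(scores[0]["the"])  # "the" occurs in every document
```

In the first document, "cat" ranks highest because it is frequent there and absent elsewhere, "chased" scores lower than "toy" because it also appears in the second document, and "the" scores 0 because it appears in all three.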

How to Use

  1. Add Documents: Click "Add Document" to create text input areas
  2. Enter Text: Type or paste text content into each document area
  3. Calculate: Click "Calculate TF-IDF" to analyze the documents
  4. View Results: See top keywords and detailed TF-IDF scores in the table
  5. Remove Documents: Use the "Remove" button to delete individual documents

Example Use Cases:

  • Compare multiple articles or documents to find distinctive terms
  • Identify keywords that characterize each document
  • Analyze text collections for information retrieval
  • Extract important terms for text summarization