How AI Analyzes Your Code to Extract Skills

Ever wondered how Checkmarked can look at your GitHub repositories and automatically understand what technologies you use, what you've built, and what skills you have?

In this post, we'll dive into the technical details of how our AI-powered code analysis works.

The Challenge

Extracting meaningful information from code is harder than it sounds. We need to:

Identify technologies and frameworks from code patterns
Understand the purpose and architecture of projects
Determine individual contribution levels in team projects
Verify claims about what someone built

Let's break down how we approach each of these.

Technology Detection

The first step is understanding what technologies are used in a repository.

Beyond package.json

While package.json, requirements.txt, and similar files give us dependency lists, they don't tell the whole story. A project might list React as a dependency but barely use it.

Our analysis looks at:

Import statements - What's actually being imported and used
File patterns - .tsx files suggest React with TypeScript
Code patterns - Hooks usage, component patterns, API structures
Configuration files - tsconfig.json, .eslintrc, tailwind.config.js
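
To make the gap between declared and actual usage concrete, here is a minimal sketch that counts how many files import each package and flags declared dependencies that are never imported. It is an illustration only, not our production pipeline: the regex is a rough approximation of JavaScript/TypeScript import syntax, and the names IMPORT_RE, package_name, and count_imports are hypothetical.

```python
import re
from collections import Counter
from typing import Optional

# Rough match for `import ... from "pkg"`, `import "pkg"`, and `require("pkg")`.
# A real analyzer would parse the code instead of using a regex.
IMPORT_RE = re.compile(
    r"""(?:import\s+(?:[^'"]*?\s+from\s+)?|require\(\s*)['"]([^'"]+)['"]"""
)


def package_name(specifier: str) -> Optional[str]:
    """Map an import specifier to a package name; None for relative paths."""
    if specifier.startswith((".", "/")):
        return None
    parts = specifier.split("/")
    # Scoped packages look like "@scope/name/sub/path".
    if specifier.startswith("@") and len(parts) >= 2:
        return "/".join(parts[:2])
    return parts[0]


def count_imports(sources: dict) -> Counter:
    """Count how many files import each package (sources maps path -> text)."""
    counts = Counter()
    for text in sources.values():
        packages = {package_name(spec) for spec in IMPORT_RE.findall(text)}
        packages.discard(None)
        counts.update(packages)
    return counts


# Example: lodash is declared as a dependency but never imported.
sources = {
    "src/App.tsx": 'import React, { useState } from "react";\nimport "./App.css";\n',
    "src/api.ts": "import axios from 'axios';\n",
    "server.js": "const express = require('express');\n",
}
declared = ["react", "axios", "express", "lodash"]
counts = count_imports(sources)
unused = [name for name in declared if counts[name] == 0]
print(unused)  # ['lodash']
```

Counting files rather than individual import lines keeps one import-heavy file from dominating the result.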

Confidence Scoring

Not all technology usage is equal. We assign confidence scores based on:

How extensively a technology is used
Whether it appears in production code or only in dev dependencies
The complexity of usage (basic vs. advanced patterns)
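
As a rough sketch of how those three signals could combine into one number, consider the function below. The weights, the 0.3 discount for dev-only usage, and the cap of five advanced patterns are made-up values for illustration, and UsageSignals and confidence are hypothetical names rather than our actual scoring model.

```python
from dataclasses import dataclass


@dataclass
class UsageSignals:
    files_using: int        # files that import or use the technology
    total_files: int        # source files analyzed in the repository
    in_production: bool     # False if it only shows up as a dev dependency
    advanced_patterns: int  # number of non-trivial usage patterns detected


def confidence(signals: UsageSignals) -> float:
    """Combine extent, production use, and complexity into a 0..1 score.

    All weights here are illustrative assumptions.
    """
    if signals.total_files == 0:
        return 0.0
    extent = min(signals.files_using / signals.total_files, 1.0)
    complexity = min(signals.advanced_patterns / 5, 1.0)
    production = 1.0 if signals.in_production else 0.3
    return round((0.6 * extent + 0.4 * complexity) * production, 2)


heavy = confidence(UsageSignals(40, 50, True, 5))
light = confidence(UsageSignals(1, 50, False, 0))
print(heavy > light)  # True
```

Treating production use as a multiplier rather than an added term means a tool that only appears in dev dependencies can never score high, however many files mention it.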

Contribution Analysis

For team projects, we analyze git history to understand individual contributions.

What We Look At

Commit history - Who wrote what code
File ownership - Primary authors of key files
Code impact - Not just lines changed, but meaningful changes
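
A simplified sketch of file ownership is shown below. It assumes the commit history has already been flattened into one record per author, file, and change, and it uses lines changed as the only measure of impact, which is cruder than the "meaningful changes" described above. FileChange and primary_authors are hypothetical names.

```python
from collections import defaultdict
from typing import NamedTuple


class FileChange(NamedTuple):
    author: str
    path: str
    lines_changed: int


def primary_authors(changes):
    """Return {path: author} for the author with the most changed lines per file."""
    totals = defaultdict(lambda: defaultdict(int))
    for change in changes:
        totals[change.path][change.author] += change.lines_changed
    return {
        path: max(by_author, key=by_author.get)
        for path, by_author in totals.items()
    }


history = [
    FileChange("alex", "src/auth/login.ts", 120),
    FileChange("sam", "src/auth/login.ts", 15),
    FileChange("sam", "src/ui/Button.tsx", 60),
    FileChange("alex", "src/ui/Button.tsx", 5),
]
print(primary_authors(history))
# {'src/auth/login.ts': 'alex', 'src/ui/Button.tsx': 'sam'}
```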

Handling Edge Cases

We account for:

Pair programming (shared commits)
Code reviews and refactoring
Initial scaffolding vs. feature development
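
For pair programming, one common convention is the Co-authored-by trailer that git and GitHub recognize at the end of a commit message. The sketch below reads those trailers and splits credit for a commit equally among everyone listed. The equal split is an assumption made for illustration, and commit_authors and split_credit are hypothetical helpers.

```python
import re

# Matches trailer lines such as "Co-authored-by: Sam Lee <sam@example.com>".
CO_AUTHOR_RE = re.compile(
    r"^Co-authored-by:\s*(.+?)\s*<([^>]+)>\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def commit_authors(author_email: str, message: str) -> list:
    """Return the commit author plus any co-authors, without duplicates."""
    emails = [author_email.lower()]
    for _name, email in CO_AUTHOR_RE.findall(message):
        email = email.lower()
        if email not in emails:
            emails.append(email)
    return emails


def split_credit(author_email: str, message: str, lines_changed: int) -> dict:
    """Divide a commit's changed lines equally among its authors (an assumption)."""
    authors = commit_authors(author_email, message)
    share = lines_changed / len(authors)
    return {email: share for email in authors}


message = "Add login form\n\nCo-authored-by: Sam Lee <sam@example.com>\n"
print(split_credit("alex@example.com", message, 100))
# {'alex@example.com': 50.0, 'sam@example.com': 50.0}
```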

Claim Verification

When you say "Built the authentication system," we verify it by:

Finding auth-related files in your commits
Analyzing the complexity of your contributions
Checking for patterns like OAuth implementations, JWT handling, etc.

If we find evidence, the claim is marked as verified. If we can't verify it but the claim is still plausible, it's marked as suggested.
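
The sketch below shows one way such a check could work for the authentication example. Several parts are assumptions: the keyword patterns are crude, "plausible" is taken to mean that the repository contains auth-related files the user did not demonstrably write, and the third status, "unverified", is a placeholder for claims with no supporting evidence. verify_auth_claim is a hypothetical name.

```python
import re

# Crude keyword heuristics; "auth" also matches unrelated paths such as "authors".
AUTH_PATH_RE = re.compile(r"auth|login|session|oauth|jwt", re.IGNORECASE)
AUTH_CODE_RE = re.compile(r"oauth|jwt|jsonwebtoken|bcrypt|passport", re.IGNORECASE)


def verify_auth_claim(user_added: dict, repo_paths: list) -> str:
    """Classify a "built the authentication system" claim.

    user_added maps file path -> text of the lines the user added.
    repo_paths lists every file path in the repository.
    """
    evidence = [
        path
        for path, added in user_added.items()
        if AUTH_PATH_RE.search(path) and AUTH_CODE_RE.search(added)
    ]
    if evidence:
        return "verified"
    # Assumption: auth files exist, but nothing ties them to this user.
    if any(AUTH_PATH_RE.search(path) for path in repo_paths):
        return "suggested"
    return "unverified"  # placeholder status, not described in this post


status = verify_auth_claim(
    {"src/auth/token.ts": "import jwt from 'jsonwebtoken';"},
    ["src/auth/token.ts", "src/ui/Button.tsx"],
)
print(status)  # verified
```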

Privacy First

Important note: We never store your source code.

Our analysis happens through the GitHub API, and we only store:

Metadata about technologies detected
Contribution statistics
Verification results

Your actual code stays on GitHub.

Continuous Improvement

Our AI models are constantly learning from:

New framework patterns
Emerging technologies
Edge cases and corrections

The more portfolios we analyze, the better our detection becomes.

Want to see what our AI finds in your code? Connect your GitHub and find out.