How RunPact Predicts Your Race Time (Without AI Hype)

In an era where "AI" has become a buzzword plastered on every product, we want to be transparent about what actually powers RunPact's race time predictions. Spoiler: it's not a large language model, it's not a deep neural network, and it doesn't use any of the trendy AI techniques you're probably thinking of.

What We Actually Use

RunPact uses a gradient-boosted decision tree model - specifically, XGBoost. Think of it as a sophisticated statistical model that learns patterns from historical race data to make predictions.

But What Does That Actually Mean?

At its core, our model is trying to find a mathematical function that maps your inputs (distance, elevation, your fitness level, terrain type) to an output (your predicted finish time). It's fundamentally about approximation and interpolation - finding patterns in data we've already seen to make educated guesses about new situations.

Here's what makes it different from the "AI" you might be worried about:

No Language Understanding: Our model doesn't read, write, or "understand" anything. It purely works with numbers.
No Neural Networks: We're not using deep learning or neural networks. It's closer to an advanced version of the statistical models you might have learned about in school.
Transparent Decisions: The model uses a series of decision trees. We can actually see and explain why it made a particular prediction based on which factors (elevation gain, distance, grade, etc.) it weighted most heavily.
Deterministic: Give the trained model the same inputs twice and you'll get the exact same output. Making a prediction involves no randomness, no "creativity," no hallucinations.

How Does It Work?

1. Training Data

We train our model on real results from ultra races. The model learns patterns like:

"When elevation gain increases by X meters, finish time tends to increase by Y minutes"
"Steeper average grades have a non-linear impact on pace"
"Different terrain types affect pace in predictable ways"

2. Feature Engineering

We don't just throw raw numbers at the model. We carefully craft features that capture the physics and physiology of trail running:

Distance-based features: Total distance, flat-equivalent distance
Elevation features: Total gain, total loss, net elevation change
Grade analysis: Average grade, maximum grade, percentage of steep sections
Terrain complexity: Number of significant climbs, descent patterns
Segment-level analysis: Breaking the course into chunks to capture variability
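
To make this concrete, here is a minimal sketch of how a few of these features could be computed from an elevation profile. It is an illustration, not RunPact's production code: the function name, the 10% "steep" threshold, the definition of average grade as mean absolute grade, and the rule of thumb that 100 m of climbing costs about as much as 1 km of flat running are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CourseFeatures:
    distance_km: float
    gain_m: float
    loss_m: float
    net_m: float
    avg_grade_pct: float       # mean absolute grade (assumed definition)
    max_grade_pct: float       # steepest segment, uphill or downhill
    steep_share_pct: float     # share of distance at or above the steep threshold
    flat_equivalent_km: float  # rule of thumb: +1 km per 100 m of gain (assumption)


def course_features(points, steep_grade_pct=10.0):
    """Compute course features from (cumulative_distance_m, elevation_m) points.

    Assumes at least two points with strictly increasing distance.
    """
    gain = loss = steep_m = max_grade = 0.0
    for (d0, e0), (d1, e1) in zip(points, points[1:]):
        run = d1 - d0
        rise = e1 - e0
        grade = 100.0 * rise / run
        if rise > 0:
            gain += rise
        else:
            loss -= rise  # rise is zero or negative here, so loss grows
        max_grade = max(max_grade, abs(grade))
        if abs(grade) >= steep_grade_pct:
            steep_m += run
    total_m = points[-1][0] - points[0][0]
    return CourseFeatures(
        distance_km=total_m / 1000,
        gain_m=gain,
        loss_m=loss,
        net_m=gain - loss,
        avg_grade_pct=100.0 * (gain + loss) / total_m,
        max_grade_pct=max_grade,
        steep_share_pct=100.0 * steep_m / total_m,
        flat_equivalent_km=total_m / 1000 + gain / 100,
    )


# Made-up 3 km profile: a climb, a descent, then a flat kilometre.
sample_profile = [(0, 100), (1000, 200), (2000, 150), (3000, 150)]
print(course_features(sample_profile))
```

Real GPX tracks are noisy, so an actual pipeline would normally smooth the elevation data before summing gain and loss. That step is left out here to keep the sketch short.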

3. The Model Itself

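XGBoost builds a collection of decision trees one at a time, where each new tree tries to correct the errors left by the trees before it. It's like asking hundreds of experienced runners in turn: the first gives a rough estimate of your finish time, and each one after that suggests a small correction to the running total. The final prediction is the sum of all those scaled-down corrections, not a simple average.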

The model asks questions like:

"Is the elevation gain more than 3000m? If yes, add 45 minutes to base time."
"Is the average grade steeper than 8%? If yes, slow pace by 15 seconds per km."
"Are there sustained descents? If yes, subtract 20 minutes from predicted time."

But it does this thousands of times, with far more nuance and with interactions between factors taken into account.
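
The idea of trees correcting each other's errors can be sketched in a few lines of plain Python. This is a toy version of gradient boosting with one-question trees ("stumps") on a single feature, not XGBoost itself, which adds regularization, deeper trees, and many optimizations. The data points, function names, and parameter values below are invented for illustration.

```python
def fit_stump(xs, residuals):
    """Find the one threshold question on x that best splits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosted(xs, ys, n_trees=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to what is still wrong."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    trees = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left, right = fit_stump(xs, residuals)
        trees.append((threshold, left, right))
        preds = [p + learning_rate * (left if x <= threshold else right)
                 for x, p in zip(xs, preds)]
    return base, trees, learning_rate


def predict(model, x):
    """Base value plus the sum of every tree's scaled correction."""
    base, trees, learning_rate = model
    return base + sum(learning_rate * (left if x <= t else right)
                      for t, left, right in trees)


# Made-up training data: elevation gain in metres -> finish time in hours.
gains = [500, 1000, 2000, 3000, 4000]
hours = [5.0, 5.5, 7.0, 9.0, 11.5]
model = fit_boosted(gains, hours)
print(predict(model, 2500))
```

Two properties from earlier sections show up directly in this sketch. Calling `predict` twice with the same input returns the same number, and a course with more climbing than anything in the training data gets the same prediction as the hardest course the model has seen, which is one reason predictions are weaker for races unlike those in the training set.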

Why This Approach?

Transparency

We can inspect which features matter most. Currently, the top predictors are:

Total distance (obviously)
Elevation gain
Average grade
Maximum sustained grade
Terrain type

Reliability

Statistical models like this are stable and predictable. They don't have "off days" or make wild guesses. The predictions might not always be perfect, but they're consistent.

Privacy

Your data stays your data. Making a prediction doesn't require sending your data off to some outside cloud service. Once trained, the model runs entirely on our servers, using only your race data.

Interpretability

If you're wondering why you got a particular prediction, we can actually explain it. "Your predicted time is longer because the course has 5000m of elevation gain and sustained grades over 12%."
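
As a rough illustration of where an explanation like that can come from, the sketch below adds up how much each feature moved a prediction in a tiny hand-written ensemble of one-question trees. Everything in it is hypothetical: the thresholds, the minute values, the base time, and the function name are invented, and a production system would more likely rely on an established attribution method for tree models than on this simplified bookkeeping.

```python
from collections import defaultdict

# Hypothetical one-question trees:
# (feature, threshold, minutes if at or below, minutes if above)
STUMPS = [
    ("gain_m", 3000, -20.0, 45.0),
    ("max_grade_pct", 12, -5.0, 30.0),
    ("distance_km", 80, -60.0, 90.0),
    ("gain_m", 4500, 0.0, 25.0),
]
BASE_MINUTES = 600.0  # made-up starting estimate


def explain(course):
    """Return the predicted minutes and each feature's total contribution."""
    contributions = defaultdict(float)
    for feature, threshold, below, above in STUMPS:
        contributions[feature] += above if course[feature] > threshold else below
    prediction = BASE_MINUTES + sum(contributions.values())
    # Largest absolute effect first, so the main drivers lead the explanation.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return prediction, ranked


course = {"distance_km": 100, "gain_m": 5000, "max_grade_pct": 14}
prediction, ranked = explain(course)
print(f"Predicted time: {prediction:.0f} minutes")
for feature, minutes in ranked:
    print(f"  {feature}: {minutes:+.0f} minutes")
```

Because each tree here asks about a single feature, its output can be credited entirely to that feature. Deeper trees mix several features along one path, which is why real attribution methods are more involved.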

Limitations and Honesty

Let's be real about what our model can't do:

It can't predict the unpredictable: Weather changes, nutrition disasters, mental struggles - these human factors aren't in the model.
It's only as good as its training data: If we haven't seen many races similar to yours, predictions will be less accurate.
It assumes you're properly trained: The model predicts what a well-trained runner would do on that course. If you haven't trained properly, you'll be slower.
Individual variation matters: Some runners handle elevation better, others excel on technical terrain. The model gives an average prediction.

The Bottom Line

RunPact uses legitimate machine learning, built on tree-based statistical methods of the kind used in science, engineering, and statistics for decades. It's not magical, it's not sentient, and it won't write you poetry about your race.

It's a tool that learns from data to make informed predictions. Nothing more, nothing less.

And honestly? In a world of AI hype, we think that's refreshing.

Want to see how the model predicts your race? Upload your GPX file and try our Race Intelligence feature.