⬅️ Read Part 2: Training the Model
➡️ Read Part 3: Real-Time Inference — Watching JumpNet Come Alive
🤔 Why Build a Data Pipeline in the First Place?
Before any model could learn to play, it first needed to observe. That meant building a full pipeline to capture what the game sees — and how a human reacts.
My goal was to create a dataset that records:
- What was on the screen (image)
- Whether a jump action was taken (label)
- Which key was pressed (multi-hot key vector)
- How long the key was held (duration)
- The phase (press/release)
- The frame index (for video sync)
You can explore how I built the actual snipping, key-logging, and GUI tools in detail here:
🔗 Modular Snip Recorder – Part 1
🔗 Modular Viewer – Part 2
🔗 GitHub for Data Tool
🧱 Step-by-Step Data Pipeline
🎯 Step 1: Extracting Keypress Events (Positive Samples)
I started with the raw .npz files generated by my snipping tool. Then I cleaned them and extracted only the first press–release pairs using the script below:
# Data_preproccer.py
...
entry_press = (
    press_image,      # (227, 227, 3)
    1.0,              # label = jump
    press_multi_hot,  # keys_raw
    hold_vec,         # how long the key was held (seconds)
    "press",          # phase
    press_frame,      # video sync frame
)
This step ensures that we only feed clean, clear decision points into the model.
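To make the pairing logic concrete, here is a minimal sketch of how first press–release pairs can be matched while skipping key auto-repeats. The event layout (dicts with a `"phase"` field) is my own assumption for illustration; the real recorder's .npz format will differ.

```python
def first_press_release_pairs(events):
    """Yield (press_event, release_event) for the first press of each hold.

    Extra "press" events between a press and its release (e.g. keyboard
    auto-repeat) are ignored, so each hold yields exactly one pair.
    """
    pending_press = None
    for ev in events:
        if ev["phase"] == "press" and pending_press is None:
            pending_press = ev          # remember only the FIRST press
        elif ev["phase"] == "release" and pending_press is not None:
            yield pending_press, ev     # emit the clean press–release pair
            pending_press = None        # wait for the next fresh press

# Illustrative event stream with one auto-repeat press in the middle
events = [
    {"phase": "press", "t": 1.0},
    {"phase": "press", "t": 1.1},    # auto-repeat: skipped
    {"phase": "release", "t": 1.3},
    {"phase": "press", "t": 2.0},
    {"phase": "release", "t": 2.2},
]
pairs = list(first_press_release_pairs(events))
```

The hold duration for each pair is then simply `release["t"] - press["t"]`.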
🌀 Step 2: Data Augmentation for Robustness
Because the number of positive samples was relatively small, I applied basic image augmentations to increase their diversity:
- Horizontal flip
- Brightness variation
- Gaussian noise
- Horizontal shift
# Data_argumentation.py
if random.random() < 0.5:
    aug_img = cv2.flip(aug_img, 1)  # Horizontal flip
...
aug_img = np.clip(aug_img + noise, 0, 1)  # Add Gaussian noise
The final positive dataset was tripled through these augmentations.
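For reference, the four augmentations can be sketched end to end as below. This is a NumPy-only approximation (my actual script uses OpenCV calls like `cv2.flip`), and the parameter ranges here are illustrative, not the exact values from the real pipeline.

```python
import random
import numpy as np

def augment(img, rng):
    """Apply flip, brightness, noise, and shift to a float image in [0, 1].

    NumPy-only sketch of the augmentation step; ranges are illustrative.
    """
    aug = img.copy()
    if rng.random() < 0.5:
        aug = aug[:, ::-1, :]                  # horizontal flip
    aug = aug * rng.uniform(0.8, 1.2)          # brightness variation
    noise = np.random.default_rng(0).normal(0.0, 0.02, aug.shape)
    aug = aug + noise                          # Gaussian noise
    shift = rng.randint(-10, 10)
    aug = np.roll(aug, shift, axis=1)          # horizontal shift
    return np.clip(aug, 0.0, 1.0)              # keep values in [0, 1]

img = np.full((227, 227, 3), 0.5, dtype=np.float32)
out = augment(img, random.Random(0))
```

Running `augment` twice per positive sample (plus the original) is one simple way to triple the dataset, as described above.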
❌ Step 3: Extracting Negative Samples from Video
I also wanted to teach the model when not to jump. So I went back to the raw video and excluded the ±2 frames around every actual jump; every other frame became a negative sample:
# Data_Negativ_scrapping.py
if frame_idx not in excluded:
    entry = [frame_rgb, 0.0, [0.], [0.], "press", frame_idx]
These "do nothing" frames are crucial — without them, the model would jump all the time.
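Building the `excluded` set is straightforward; a minimal sketch (function and variable names are my own, not from the original script):

```python
def excluded_frames(jump_frames, margin=2):
    """Return the set of frame indices within ±margin of any jump."""
    excluded = set()
    for f in jump_frames:
        excluded.update(range(f - margin, f + margin + 1))
    return excluded

# Illustrative: two jumps in a 300-frame clip
jumps = [100, 250]
excl = excluded_frames(jumps)
negatives = [i for i in range(300) if i not in excl]
```

With `margin=2`, each jump removes a window of 5 frames, so ambiguous frames just before and after a jump never get labeled "do nothing".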
✂️ Step 4: Downsampling the Negative Data
Since there were far too many negative frames, I randomly sampled 500 of them to keep the dataset balanced:
# Data_reducier.py
indices = np.random.choice(total, 500, replace=False)
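One small refinement worth noting: seeding the sampler makes the subset reproducible across runs. The seed and the negative-frame count below are my own illustrative choices, not values from the original script.

```python
import numpy as np

rng = np.random.default_rng(42)   # fixed seed: reproducible subset (assumption)
total = 4000                      # illustrative number of negative frames
indices = rng.choice(total, size=500, replace=False)
```

`replace=False` guarantees 500 distinct frames, so no negative sample is duplicated in the final dataset.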
🧬 Step 5: Merging All Data into One File
I then merged positive + negative data and shuffled everything:
# Data_merge_final.py
all_data = np.concatenate([pos_data, neg_data], axis=0)
np.random.shuffle(all_data)
np.savez(output_path, data=all_data)
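One detail the snippet glosses over: since each entry mixes images, floats, and strings, the arrays must be object-dtype, and loading them back later requires `allow_pickle=True`. A self-contained sketch with toy entries (the string "images" stand in for real frames):

```python
import numpy as np

# Toy positive and negative entries; real entries hold (227, 227, 3) images.
pos_data = np.empty(2, dtype=object)
pos_data[:] = [("img_a", 1.0), ("img_b", 1.0)]
neg_data = np.empty(1, dtype=object)
neg_data[:] = [("img_c", 0.0)]

# Merge along the sample axis, then shuffle in place.
all_data = np.concatenate([pos_data, neg_data], axis=0)
np.random.default_rng(0).shuffle(all_data)

labels = sorted(entry[1] for entry in all_data)
```

Shuffling before saving means any later train/validation split drawn from the front or back of the file is already mixed.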
📦 Final Dataset Snapshot
Here's what a few samples look like:
📦 Total Samples: 1778 (1278 positive + 500 negative)
📍 Entry #1
image.shape : (227, 227, 3)
label : 1.0
keys_raw : [1.]
hold_duration : [0.294]
phase : press
frame_index : 7720
📍 Entry #3
label : 0.0
hold_duration : [0.0]
frame_index : 872
Each entry is a tuple of:
(
    image,          # RGB, normalized
    label,          # 1 = jump, 0 = no jump
    keys_raw,       # multi-hot key state
    hold_duration,  # seconds
    phase,          # "press"
    frame_index     # for video mapping
)
🤪 Bonus: File Explorer + Sample Inspector
Once data was collected, I needed a way to explore it visually and catch any outliers. That’s where my Viewer tool came in handy.
🔗 Check out how I built it here:
Modular Viewer – Data Explorer Tool
🏯 Wrap-up
With this pipeline, I went from chaotic video footage to a clean, labeled dataset tailored for behavior cloning. Whether the model needs to jump or stay still, it now has the context to make that decision — just like a human would.
🔜 Coming Up Next: Part 2 – Model Training & Evaluation
In the next post, we'll move from data to decisions: I'll walk you through the model architecture, the training process, hyperparameter tuning, and performance evaluation. Stay tuned!
Part 2 – From Pixels to Policy: Training JumpNet to Make the Right Move 🚀
📂 Repo & Resources
🔗 GitHub: Dataset Tooling + Viewer
🔗 GitHub: Data Tool, Train and Simulation Codes
"Garbage in, garbage out."
This is the phase where we make sure the input is not garbage. 🚀