🎮 (JumpNet) Part 1: From Raw Gameplay to Labeled Intelligence — Building the Data Foundation for JumpNet
Ertugrul @ertugrulmutlu

Publish Date: Aug 4

➡️ Read Part 2: Training the Model
➡️ Read Part 3: Real-Time Inference — Watching JumpNet Come Alive

🤔 Why Build a Data Pipeline in the First Place?

Before any model could learn to play, it first needed to observe. That meant building a full pipeline to capture what the game sees — and how a human reacts.

My goal was to create a dataset that records:

  • What was on the screen (image)
  • Whether a jump action was taken (label)
  • Which key was pressed (multi-hot key vector)
  • How long the key was held (duration)
  • The phase (press/release)
  • The frame index (for video sync)

You can explore how I built the actual snipping, key-logging, and GUI tools in detail here:
🔗 Modular Snip Recorder – Part 1
🔗 Modular Viewer – Part 2
🔗 Github for Data Tool


🧱 Step-by-Step Data Pipeline

🎯 Step 1: Extracting Keypress Events (Positive Samples)

I started with raw .npz files generated by my snipping tool. Then, I cleaned and extracted only the first press–release pairs using the script below:

# Data_preproccer.py
...
entry_press = (
    press_image,      # (227, 227, 3)
    1.0,              # label = jump
    press_multi_hot,  # keys_raw
    hold_vec,         # how long the key was held
    "press",          # phase
    press_frame       # video sync frame
)

This step ensures that we only feed clean, clear decision points into the model.
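To make the pairing logic concrete, here is a minimal sketch of how first press–release pairs can be extracted from a raw event stream. The event field names (`phase`, `t`, `frame`) are illustrative assumptions, not the actual schema of my snipping tool's `.npz` files:

```python
def first_press_release_pairs(events):
    """Pair each first 'press' with the next 'release', ignoring
    auto-repeat presses that arrive while a key is already down."""
    pairs = []
    pending = None  # the press event waiting for its matching release
    for ev in events:
        if ev["phase"] == "press" and pending is None:
            pending = ev                          # keep only the FIRST press
        elif ev["phase"] == "release" and pending is not None:
            hold = ev["t"] - pending["t"]         # hold duration in seconds
            pairs.append((pending["frame"], ev["frame"], hold))
            pending = None
    return pairs
```

Dropping repeated presses is what keeps the dataset limited to clean decision points: one sample per real jump, not one per keyboard repeat event.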


🌀 Step 2: Data Augmentation for Robustness

Because the number of positive samples was relatively small, I applied basic image augmentations to increase diversity:

  • Horizontal flip
  • Brightness variation
  • Gaussian noise
  • Horizontal shift
# Data_argumentation.py
if random.random() < 0.5:
    aug_img = cv2.flip(aug_img, 1)  # Flip
...
aug_img = np.clip(aug_img + noise, 0, 1)  # Add noise

The final positive dataset was tripled through these augmentations.
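Putting the four augmentations together, the pipeline looks roughly like this. This is a sketch using NumPy stand-ins for the cv2 calls (`aug[:, ::-1, :]` is equivalent to `cv2.flip(img, 1)`), and the brightness, noise, and shift parameters are my illustrative guesses, not the values from `Data_argumentation.py`:

```python
import random

import numpy as np

def augment(img, rng=None):
    """Apply flip, brightness, noise, and shift to a float image in [0, 1]."""
    rng = rng or random.Random()
    aug = img.astype(np.float32).copy()
    if rng.random() < 0.5:
        aug = aug[:, ::-1, :]                         # horizontal flip
    aug = np.clip(aug * rng.uniform(0.8, 1.2), 0, 1)  # brightness variation
    noise = np.random.normal(0.0, 0.02, aug.shape)    # Gaussian noise
    aug = np.clip(aug + noise, 0, 1)
    shift = rng.randint(-10, 10)                      # horizontal shift in pixels
    aug = np.roll(aug, shift, axis=1)
    return aug
```

Keeping the augmentations mild matters here: the jump decision depends on obstacle position, so a shift of more than a few pixels would start to change the correct label.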


❌ Step 3: Extracting Negative Samples from Video

I also wanted to teach the model when not to jump. So I went back to the raw video and excluded ±2 frames around every actual jump; all remaining frames became negatives:

# Data_Negativ_scrapping.py
if frame_idx not in excluded:
    entry = [frame_rgb, 0.0, [0.], [0.], "press", frame_idx]

These "do nothing" frames are crucial — without them, the model would jump all the time.
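The `excluded` set above can be built like this. A small sketch of the ±2-frame exclusion window; the function name and signature are mine, not from `Data_Negativ_scrapping.py`:

```python
def negative_frame_indices(total_frames, jump_frames, margin=2):
    """Return frames at least `margin` frames away from every jump,
    which are safe to label as 'do nothing' negatives."""
    excluded = set()
    for j in jump_frames:
        excluded.update(range(j - margin, j + margin + 1))  # ±margin window
    return [f for f in range(total_frames) if f not in excluded]
```

The margin is the important design choice: frames immediately before or after a jump are visually ambiguous, so labeling them as negatives would contradict the positive samples.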


✂️ Step 4: Downsampling the Negative Data

Since there were too many negative frames, I randomly selected 500 of them for balance:

# Data_reducier.py
indices = np.random.choice(total, 500, replace=False)
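In context, the sampling step looks like the sketch below. The seed and the total count of 4000 are illustrative assumptions (the post only fixes the subset size at 500); seeding the generator makes the subset reproducible across runs:

```python
import numpy as np

rng = np.random.default_rng(seed=42)             # seeded for reproducibility
total = 4000                                     # hypothetical negative-frame count
indices = rng.choice(total, 500, replace=False)  # sample 500 without repeats
# neg_data = neg_data[indices]                   # then index into the negatives
```

`replace=False` is what guarantees 500 distinct frames; with replacement, duplicates would quietly shrink the effective dataset.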

🧬 Step 5: Merging All Data into One File

I then merged positive + negative data and shuffled everything:

# Data_merge_final.py
all_data = np.concatenate([pos_data, neg_data], axis=0)
np.random.shuffle(all_data)
np.savez(output_path, data=all_data)

📦 Final Dataset Snapshot

Here's what a few samples look like:

📦 Total Samples: 1778 (1278 positive + 500 negative)

📍 Entry #1
  image.shape     : (227, 227, 3)
  label           : 1.0
  keys_raw        : [1.]
  hold_duration   : [0.294]
  phase           : press
  frame_index     : 7720

📍 Entry #3
  label           : 0.0
  hold_duration   : [0.0]
  frame_index     : 872

Each entry is a tuple of:

(
  image,          # RGB, normalized
  label,          # 1 = jump, 0 = no jump
  keys_raw,       # multi-hot key state
  hold_duration,  # seconds
  phase,          # "press"
  frame_index     # for video mapping
)
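Because each entry mixes arrays, floats, and strings, the saved array holds Python objects, so reading it back needs `allow_pickle=True`. A minimal loader sketch (the function is mine, for illustration):

```python
import numpy as np

def load_dataset(path):
    """Load the merged .npz and count positives vs. negatives.
    entry[1] is the jump label (1.0 = jump, 0.0 = no jump)."""
    data = np.load(path, allow_pickle=True)["data"]
    pos = sum(1 for entry in data if float(entry[1]) == 1.0)
    return data, pos, len(data) - pos
```

Running this on the final file should report the 1278/500 split shown above; any other numbers mean the merge or shuffle step went wrong.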

🤪 Bonus: File Explorer + Sample Inspector

Once data was collected, I needed a way to explore it visually and catch any outliers. That’s where my Viewer tool came in handy.

🔗 Check out how I built it here:
Modular Viewer – Data Explorer Tool


🏯 Wrap-up

With this pipeline, I went from chaotic video footage to a neat, clean, and labeled dataset tailored for behavior cloning. Whether the model needs to jump or stay still, now it has the context to make a decision — just like a human would.

In the next post, I’ll break down how I trained the model, tuned hyperparameters, and evaluated its performance. Stay tuned!


🔜 Coming Up Next: Part 2 – Model Training & Evaluation

In the next post, we’ll move from data to decisions.
I’ll walk you through how I built the model that actually uses this data — covering architecture, training process, and performance evaluation.

Part 2 – From Pixels to Policy: Training JumpNet to Make the Right Move 🚀


📂 Repo & Resources

  • 🔗 GitHub: Dataset Tooling + Viewer
  • 🔗 GitHub: Data Tool, Train and Simulation Codes


"Garbage in, garbage out."
This is the phase where we make sure the input is not garbage. 🚀
