βœ‚οΈ From 2500+ to 10+ Lines: Automated Modular Code Refactor πŸ€–
Madhurima Rawat Β· Published May 2

Hey awesome people! πŸ‘‹

Thank you so much for the recent engagement on my articles β€” it really made my day! πŸ’–

So, I had actually planned to write the next article in my ☁️ Local Cloud Computing with LocalStack + Docker + AWS CLI πŸ’‘ (6 Part Series) series. But something interesting and super useful came up while working on my major project for college β€” a Stock Market Prediction Tool β€” and I thought this deserves a quick write-up!

You can check out the project here πŸ”—:

GitHub: madhurimarawat / Stock-Market-Prediction

This repository began as a 7th-semester minor project and evolved into our 8th-semester major project, "Advanced Stock Price Forecasting Using a Hybrid Model of Numerical and Textual Analysis." It utilizes Python, NLP (NLTK, spaCy), ML models, Grafana, InfluxDB, and Streamlit for data analysis and visualization.


πŸ“š Table of Contents

  1. What’s the Context?
  2. Problem Faced 😡
  3. Why Modularization? πŸ’‘
  4. Pros and Cons βš–οΈ
  5. Flowchart Overview πŸ“Š
  6. Modularization Steps 🧩
  7. GitHub Issue and PR Links πŸ”—
  8. Before and After Code πŸ“
  9. Final Thoughts πŸ’¬

What’s the Context?

For my college major project, I built a Stock Market Prediction app using Streamlit. The app includes interactive visualizations, model evaluation, live predictions, and dashboards powered by tools like Grafana and Power BI.

Initially, the Streamlit app had 2500+ lines of code β€” all in a single file 😡. I had separate versions for local and deployed apps because some file paths were broken in the production environment.

This made even small changes (like adding a new function) a complete nightmare.


Problem Faced 😡

The original code structure looked like this:
πŸ”— Original Code (before refactoring)

It had all functions β€” data handling, UI logic, visualizations, predictions β€” bundled into one giant file. This made debugging, testing, and adding new features quite challenging and error-prone.

While the code was already modular in terms of function structure, everything lived in a single place. So I decided to properly refactor it β€” extracting each functional block into its own dedicated file. That’s why the term "refactoring" is used in the cover image, and "modularization steps" are outlined throughout the article.

So, I took the modular functions and placed them in separate files, organizing the code for better structure and clarity. πŸš€


Why Modularization? πŸ’‘

Codebase Modularization means breaking a large, messy file into independent, reusable, and manageable modules.
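
In file terms, the end state looks roughly like this (a simplified sketch; the directory name matches my deployed setup, while load_stock_data.py is a hypothetical example):

Streamlit_app.py                      # cleaned main file: imports + main() only
Import_Functions_Deployed.py          # auto-generated aggregator (Step 3)
feature_functions_deployed/
    display_real_time_stock_prediction.py
    load_stock_data.py                # hypothetical example module
    ...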

🏒 Visualizing the Idea

Imagine your codebase as a large building β€” with a central hall and multiple rooms where each person (function) lives. Initially, everything is connected, but it’s a bit noisy, and the boundaries are blurry.

πŸ” Now, you refactor (transform) the setup:
You redesign the structure into separate flats. Each flat is self-contained, cleaner, and still linked through a central lobby (like a main() function). People (functions) still collaborate, but now with clearer separation, more peace, and organized communication.

This transformation reflects how we modularize code: splitting logic into individual function files, cleaning dependencies, and improving maintainability β€” all while preserving the original connections.

πŸ–ΌοΈ This is what I’m trying to visualize in my cover image β€” like transforming a shared hall into peaceful, connected flats. 🏒➑️🏘️

Curious if the cover made sense and looked good to you! πŸ‘€βœ¨


Pros and Cons βš–οΈ

βœ… Pros (in My App’s Context)

  • πŸ§ͺ Easier Debugging and Testing: With functions and modules isolated, testing becomes straightforward, allowing for easier identification and resolution of bugs in specific areas, without impacting other parts of the app.

  • 🧹 Better Readability and Code Structure: Modularization ensures the code is clean and well-organized, improving readability. Each module is purpose-driven, making it simple for anyone β€” including future developers β€” to understand the logic and flow.

  • πŸ” Independent Development/Deployment of Modules: Each module is developed and deployed independently, allowing updates or changes in one part of the project without affecting the rest of the app. This separation streamlines the workflow and keeps the process efficient.

  • πŸ“ˆ Scalable for Future Improvements: The modular structure is built for growth, making it easy to expand the app by adding new features without disturbing the existing code. This design ensures the app is ready for future enhancements and scaling.

❗ Cons of Modularization (in my app’s context)

While modularization comes with a lot of benefits, there are a few challenges I encountered during this refactor:

  • ⏱️ Initial setup time: Setting up the modular structure took quite a bit of effort initially β€” especially extracting functions, handling dependencies, and preparing import aggregators.
  • 🧡 Managing imports and dependencies: Every individual function file needs to have its required imports β€” which adds complexity, especially when working with libraries like streamlit, pandas, and plotly.
  • 🌐 Local vs Deployed sync: I had to maintain two separate sets of files β€” one for local and one for deployment β€” because some paths and behaviors differ in production (as I explained in this GitHub issue).
  • * πŸ” Circular dependencies: Refactoring functions into separate modules means being extra cautious to avoid circular imports (e.g., two modules importing from each other), which can break the app.
  • * πŸ—‚οΈ Managing directory structures: Maintaining a clean and logical directory hierarchy (like features_functions_local/ and features_functions_deployed/) is necessary but requires planning, otherwise it gets messy fast.

Even though this was from my project app perspective, it’s something that applies to most projects too! 🌍
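
To make the circular-import risk concrete, here is a minimal, hypothetical illustration (the module names are invented, not from my project):

# module_a.py
from module_b import helper_b  # module_a pulls a name from module_b...

def helper_a():
    return helper_b() + 1

# module_b.py
from module_a import helper_a  # ...while module_b pulls one from module_a

def helper_b():
    return helper_a() - 1

# Importing either module now fails with an ImportError, because each one
# asks the other for a name before that module has finished initializing.

Routing shared calls through a central main() file, as suggested later under Design Considerations, avoids such cycles.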


Flowchart Overview πŸ“Š

πŸ“Œ Codebase Modularization Flowchart


Modularization Steps 🧩

Here’s what I did to cleanly modularize the Streamlit app:

Step 1: Function_Splitting.py

πŸ“€ Extracts each function into its own file.
πŸ“ Saved in:

  • features_functions_local/
  • features_functions_deployed/

Code πŸ‘©β€πŸ’»

"""
Author: Madhurima Rawat

Script to split top-level functions (excluding 'main') from a large Streamlit app file
into separate Python files, preserving:
- Any comment immediately above the function
- Any global variable used within the function

Each function is saved as a standalone `.py` file inside the configured OUTPUT_DIR.
"""

import ast
import os
import astunparse

# === CONFIGURATION ===
INPUT_FILE = "Streamlit_app_local_combined.py"  # The source Python file to extract functions from

# For Local Running functions
# OUTPUT_DIR = "feature_functions_local"  # Directory to store individual function files

# For Deployed Functions
OUTPUT_DIR = (
    "feature_functions_deployed"  # Directory to store individual function files
)

# Ensure the output directory exists
os.makedirs(OUTPUT_DIR, exist_ok=True)

# Read the entire source file content
with open(INPUT_FILE, "r", encoding="utf-8") as f:
    source = f.read()

# Parse the source code into an abstract syntax tree (AST)
tree = ast.parse(source)
lines = source.splitlines()

# Containers to hold relevant nodes
function_nodes = []  # Functions to extract
global_vars = []  # Global variables (assignments) at top level
import_lines = []  # All top-level import statements

# Classify top-level elements in the AST
for node in tree.body:
    if isinstance(node, (ast.Import, ast.ImportFrom)):
        import_lines.append(node)  # Collect import lines
    elif isinstance(node, ast.Assign):
        global_vars.append(node)  # Track top-level assignments
    elif isinstance(node, ast.FunctionDef) and node.name != "main":
        function_nodes.append(node)  # Collect functions except 'main'


# Extract the comment immediately above a function, if present
def get_leading_comment(node):
    lineno = node.lineno - 2
    if lineno >= 0 and lines[lineno].strip().startswith("#"):
        return lines[lineno].strip()
    return ""


# Get all variable names used in a function
def get_used_names(node):
    return {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}


# Prepare a mapping of global variable name -> source code
all_globals = {
    t.targets[0].id: astunparse.unparse(t).strip()
    for t in global_vars
    if isinstance(t.targets[0], ast.Name)
}

# Create a separate file for each function
for func_node in function_nodes:
    func_name = func_node.name
    func_code = ast.get_source_segment(
        source, func_node
    )  # Full source code of the function
    used_names = get_used_names(func_node)  # Variables used in the function

    leading_comment = get_leading_comment(
        func_node
    )  # Optional comment above the function
    globals_needed = [all_globals[name] for name in used_names if name in all_globals]

    file_path = os.path.join(OUTPUT_DIR, f"{func_name}.py")

    with open(file_path, "w", encoding="utf-8") as out_file:
        if leading_comment:
            out_file.write(leading_comment + "\n")  # Write leading comment if present
        for g in globals_needed:
            out_file.write(g + "\n")  # Write any required globals
        out_file.write("\n" + func_code + "\n")  # Finally, write the function code

print(f"βœ… Done! Functions split into {OUTPUT_DIR}")
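
To see what Step 1 actually produces, here is a hypothetical extracted file, feature_functions_deployed/load_stock_data.py (the function and its contents are illustrative, not from my repo). The leading comment and the global it uses travel with it, but imports do not yet:

# Function to load a preprocessed stock dataset
DATASET_DIR = "Codes/Historical_Data_Analysis/Preprocessed_Dataset"

def load_stock_data(ticker):
    # 'os' and 'pd' are used here but not yet imported: Step 2 adds them
    path = os.path.join(DATASET_DIR, f"{ticker}.csv")
    return pd.read_csv(path)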

Step 2: Dependency_Adder.py

πŸ” Uses AST (Abstract Syntax Tree) to detect and inject the right import statements into each function file.

Code πŸ‘©β€πŸ’»

"""
Author: Madhurima Rawat

Script to analyze individual Python function files inside the configured functions folder
and prepend necessary import statements based on used modules, functions, or objects.

The goal is to ensure each function file is self-contained by including all relevant
imports at the top. This script uses a heuristic approach, looking for known identifiers
to determine which modules are needed.
"""

import os
import ast
from pathlib import Path

# === CONFIGURATION ===

# Path to the folder containing the split function files (Local)
# functions_folder = Path("feature_functions_local")

# Path to the folder containing the split function files (Deployed)
functions_folder = Path("feature_functions_deployed")

# Mapping of commonly used identifiers to their respective import statements
# Includes one-line comments categorized for clarity and maintainability

import_suggestions = {
    # --- STREAMLIT APP and VISUALIZATION FRAMEWORK ---
    "st": "# Importing Streamlit for building the web-based interactive application framework\nimport streamlit as st",
    "plt": "# Importing Matplotlib for generating static plots and charts\nimport matplotlib.pyplot as plt",
    "go": "# Importing Plotly for creating interactive and dynamic visual plots\nimport plotly.graph_objects as go",
    "sns": "# Importing Seaborn for enhanced data visualizations\nimport seaborn as sns",
    # --- DATA HANDLING and MANIPULATION ---
    "pd": "# Importing Pandas for data manipulation and analysis\nimport pandas as pd",
    "np": "# Importing NumPy for numerical computations and array operations\nimport numpy as np",
    "os": "# Importing OS module for handling file and directory paths\nimport os",
    "datetime": "# Importing datetime for working with timestamps and date ranges\nfrom datetime import datetime, timedelta",
    "base64": "# Importing base64 for encoding and decoding binary data\nimport base64",
    # --- MACHINE LEARNING and MODELING ---
    "pickle": "# Importing Pickle for loading/saving pre-trained machine learning models\nimport pickle",
    "LinearRegression": "# Linear Regression model\nfrom sklearn.linear_model import LinearRegression",
    "RandomForestRegressor": "# Random Forest Regressor\nfrom sklearn.ensemble import RandomForestRegressor",
    "SVR": "# Support Vector Machine Regressor\nfrom sklearn.svm import SVR",
    "mean_squared_error": "# Importing evaluation metrics from Scikit-learn\nfrom sklearn.metrics import mean_squared_error, r2_score, precision_score, recall_score, f1_score",
    "MinMaxScaler": "# For scaling data to a 0–1 range\nfrom sklearn.preprocessing import MinMaxScaler",
    "TfidfVectorizer": "# Text feature extraction\nfrom sklearn.feature_extraction.text import TfidfVectorizer",
    # --- DEEP LEARNING (PyTorch) ---
    "torch": "# Importing PyTorch for building and training deep learning models\nimport torch",
    "nn": "# Importing PyTorch's neural network module\nimport torch.nn as nn",
    # --- NATURAL LANGUAGE PROCESSING (NLP) ---
    "TextBlob": "# Importing TextBlob for basic natural language processing tasks\nfrom textblob import TextBlob",
    # --- FINANCIAL DATA and UTILITIES ---
    "yf": "# Importing yfinance for fetching historical stock data from Yahoo Finance\nimport yfinance as yf",
    "webbrowser": "# Importing webbrowser module to open URLs in the default browser\nimport webbrowser",
    "openpyxl": "# Importing openpyxl to enable writing Excel files (.xlsx)\nimport openpyxl",
}


def get_required_imports(source_code):
    """
    Analyze source code of a function to detect required imports
    based on the presence of known identifiers.
    """
    tree = ast.parse(source_code)
    imports = set()

    for node in ast.walk(tree):
        # Detect simple names like 'pd', 'st', etc.
        if isinstance(node, ast.Name):
            if node.id in import_suggestions:
                imports.add(import_suggestions[node.id])
        # Detect attribute access like 'plt.plot'
        elif isinstance(node, ast.Attribute):
            value_id = getattr(node.value, "id", None)
            if value_id in import_suggestions:
                imports.add(import_suggestions[value_id])

    return sorted(imports)


# === MAIN PROCESS ===

changed_files_count = 0  # Counter for changed files
changed_lines_count = 0  # Counter for changed lines

# Iterate through each Python file in the target folder
for filepath in functions_folder.glob("*.py"):
    with open(filepath, "r", encoding="utf-8") as file:
        original_code = file.read()

    # Determine needed imports based on code analysis
    needed_imports = get_required_imports(original_code)

    # Drop suggestions whose import line is already present (keeps reruns idempotent)
    needed_imports = [s for s in needed_imports if s.splitlines()[-1] not in original_code]

    # Skip files that need no new imports (otherwise stray blank lines get prepended)
    if not needed_imports:
        continue

    # Combine imports and original content
    updated_code = "\n".join(needed_imports) + "\n\n" + original_code

    # Write back the updated file
    with open(filepath, "w", encoding="utf-8") as file:
        file.write(updated_code)

    # Count the changes (lines actually added, not total file length)
    changed_files_count += 1
    changed_lines_count += updated_code.count("\n") - original_code.count("\n")

# Output results
print(
    f"βœ… Done! {changed_files_count} file(s) updated. {changed_lines_count} lines changed."
)
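
Continuing the hypothetical load_stock_data.py from Step 1: after Dependency_Adder.py runs, the file starts with the matching entries from import_suggestions (sorted, each with its comment):

# Importing OS module for handling file and directory paths
import os
# Importing Pandas for data manipulation and analysis
import pandas as pd

# Function to load a preprocessed stock dataset
DATASET_DIR = "Codes/Historical_Data_Analysis/Preprocessed_Dataset"

def load_stock_data(ticker):
    path = os.path.join(DATASET_DIR, f"{ticker}.csv")
    return pd.read_csv(path)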

Step 3: Function_Importing.py

πŸ“¦ Generates a unified import file to easily access all modular functions in one place.

Code πŸ‘©β€πŸ’»

"""
Author: Madhurima Rawat

Script to generate a single import file (named by IMPORT_FILE) that aggregates
function imports from individual .py files inside the configured function directory.

Each file is assumed to define a function with the same name as the filename.

The output file includes:
- A docstring at the top explaining the purpose
- Individual import statements
- A summary comment with total number of imported functions
"""

import os

# === CONFIGURATION ===

# For Local
# FUNCTION_DIR = "feature_functions_local"
# IMPORT_FILE = "Import_Functions_Local.py"

# For Deployment
FUNCTION_DIR = "feature_functions_deployed"
IMPORT_FILE = "Import_Functions_Deployed.py"

# List all Python files in the function directory
function_files = [f for f in os.listdir(FUNCTION_DIR) if f.endswith(".py")]
total_imports = len(function_files)

# === GENERATE THE IMPORT FILE ===
with open(IMPORT_FILE, "w", encoding="utf-8") as f:
    # Write top-level docstring to the output file
    f.write('"""\n')
    f.write(
        f"This file was auto-generated to import all functions from '{FUNCTION_DIR}/'.\n"
    )
    f.write(
        "Each function file is expected to define a function named after the filename.\n"
    )
    f.write(f"Total functions imported: {total_imports}\n")
    f.write('"""\n\n')

    f.write("# === FUNCTION IMPORTS ===\n")

    # Write each import statement
    for filename in function_files:
        module_name = filename[:-3]  # Remove .py extension
        f.write(f"from {FUNCTION_DIR}.{module_name} import {module_name}\n")

    # Footer summary comment
    f.write(f"\n# βœ… Total functions imported: {total_imports}\n")

print(
    f"βœ… '{IMPORT_FILE}' created with {total_imports} function imports from '{FUNCTION_DIR}/'"
)
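
For the same hypothetical modules, the generated Import_Functions_Deployed.py would come out like this:

"""
This file was auto-generated to import all functions from 'feature_functions_deployed/'.
Each function file is expected to define a function named after the filename.
Total functions imported: 2
"""

# === FUNCTION IMPORTS ===
from feature_functions_deployed.display_real_time_stock_prediction import display_real_time_stock_prediction
from feature_functions_deployed.load_stock_data import load_stock_data

# βœ… Total functions imported: 2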

Step 4: Split_Clean_Main_Code.py

🧼 Cleans the combined app file by removing the already-extracted function definitions and retaining only the core main() logic. Also removes unused imports for better clarity and maintainability. The result is saved under the name set in CLEANED_FILE.

Code πŸ‘©β€πŸ’»:

"""
Author: Madhurima Rawat

Script to clean a Streamlit app by removing all top-level functions except 'main'.
Also removes unused import statements.
Preserves:
- All used import statements
- Global variables
- Top-level code (outside functions)
- The 'main' function (if present)

Saves the result under the name configured in CLEANED_FILE.
"""

import ast
import black

# For Deployed
INPUT_FILE = "Streamlit_app_combined.py"
CLEANED_FILE = "Streamlit_app.py"

# Read source
with open(INPUT_FILE, "r", encoding="utf-8") as f:
    source = f.read()

lines = source.splitlines()
tree = ast.parse(source)

# Step 1: Remove non-main top-level functions and their header comments
lines_to_remove = set()
for node in tree.body:
    if isinstance(node, ast.FunctionDef) and node.name != "main":
        comment_line = node.lineno - 2
        # Only drop the preceding line if it really is a comment
        if 0 <= comment_line < len(lines) and lines[comment_line].strip().startswith("#"):
            lines_to_remove.add(comment_line)
        for i in range(node.lineno - 1, node.end_lineno):
            lines_to_remove.add(i)

cleaned_lines = [line for i, line in enumerate(lines) if i not in lines_to_remove]
cleaned_code = "".join(line + "\n" for line in cleaned_lines)


# Step 2: Parse cleaned code to remove unused imports
class ImportUsageAnalyzer(ast.NodeVisitor):
    def __init__(self):
        self.imports = {}
        self.used_names = set()

    def visit_Import(self, node):
        for alias in node.names:
            self.imports[alias.asname or alias.name] = node.lineno

    def visit_ImportFrom(self, node):
        for alias in node.names:
            name = alias.asname or alias.name
            self.imports[name] = node.lineno

    def visit_Name(self, node):
        self.used_names.add(node.id)


# Analyze the cleaned code
tree = ast.parse(cleaned_code)
analyzer = ImportUsageAnalyzer()
analyzer.visit(tree)

# Identify unused imports
unused_import_lines = set()
for name, lineno in analyzer.imports.items():
    if name not in analyzer.used_names:
        unused_import_lines.add(lineno - 1)  # Convert to 0-based

# Final clean-up: remove unused imports
final_lines = [
    line
    for i, line in enumerate(cleaned_code.splitlines())
    if i not in unused_import_lines
]
final_code = "".join(line + "\n" for line in final_lines)
formatted_code = black.format_str(final_code, mode=black.FileMode())

# Write the cleaned, formatted file
with open(CLEANED_FILE, "w", encoding="utf-8") as f:
    f.write(formatted_code)

print(
    f"βœ… Cleaned and formatted file saved as '{CLEANED_FILE}' "
    "(only 'main' retained, unused imports and comments removed)."
)
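
Putting it all together, the cleaned main file ends up roughly this shape (a hedged sketch with illustrative contents; the aggregator import is added by hand so the extracted functions stay reachable):

# Importing Streamlit for building the web-based interactive application framework
import streamlit as st

# Importing all modular functions through the auto-generated aggregator
from Import_Functions_Deployed import *


def main():
    # Illustrative flow only; the real app wires up the full dashboard here
    st.title("Stock Market Prediction")
    df = load_stock_data("AAPL")  # hypothetical modular function call
    st.line_chart(df["Close"])  # hypothetical column name


main()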

Step 5: Run Final Modular App

🎯 Fully cleaned and modularized version β€” ready for production deployment!

πŸ“ Final updated file:
πŸ”— Updated Streamlit App


Pros and Cons at a Glance βš–οΈ

| 🟒 Pros                          | πŸ”΄ Cons                                    |
|----------------------------------|---------------------------------------------|
| Cleaner code, easier to navigate | Initial time investment                     |
| Independent testing of modules   | Manual path and import checks               |
| Better team collaboration        | Requires discipline to maintain separation  |
| Easier debugging and refactoring | May need more files to manage               |

GitHub Issue and PR Links πŸ”—

πŸ”Ž Issue: #13 – Need to Modularize and Refactor Streamlit Code

πŸ“Œ Refactor & Modularize Codebase for Streamlit App Deployment #13

The current monolithic codebase needs to be modularized to improve readability, maintainability, and deployment workflow. This issue involves splitting out individual functions into logically separated files and preparing the cleaned main app for deployment.


βœ… Tasks to be Completed

  1. πŸ”§ Function Splitting
    Run Function_Splitting.py to break down the monolithic app into smaller reusable function files saved in:

    • features_functions_local (for development)
    • features_functions_deployed (for production-ready deployment)
  2. πŸ“¦ Add Dependencies
    Use Dependency_Adder.py to automatically inject all necessary import statements at the top of each function file.

  3. πŸ“₯ Create Aggregated Import File
    Execute Function_Importing/Import_Functions.py to generate a single file for importing all extracted functions efficiently.

  4. 🧹 Clean the Main File
    Run Split_Clean_Main_Code.py to generate app_cleaned.py, which contains only the cleaned main() logic without function definitions.

  5. 🧠 Insert Custom Class
    Add the following class to the display_real_time_stock_prediction.py file in both local and deployed folders:

    # Class for real time stock data fetching and prediction
    
    # --- CLASS DEFINITION STARTS ---
    class StockPricePredictor:
        ...
  6. πŸ”— Keep Directory Paths Consistent
    In the deployed version, make sure these constants are preserved:

    DATASET_DIR = "Codes/Historical_Data_Analysis/Preprocessed_Dataset"
    DATASET_DIR_1 = "Codes/Historical_Data_Analysis"

🧾 Why This Is Needed

  • πŸš€ Improves Deployment Readiness – Easier to push updates without disturbing working components.
  • 🧱 Enforces Separation of Concerns – Logic is modular and testable.
  • πŸ’‘ Enhances Developer Experience – Clear structure helps future contributors understand and work on the code quickly.

πŸ“Ž Additional Notes

This issue will lead to a PR that includes:

  • Modular function files
  • A cleaned and refactored app_cleaned.py
  • Updated deployed code with integrated class structure

πŸ”§ PR: #14 – Modularize and Refactor Codebase for Local and Deployed Versions

✨ Refactor & Modularization Process for Streamlit App Deployment #14

This structure modularizes the codebase while retaining the original functionality.

πŸ“ Folder Naming Convention

The function components have been cleanly separated and organized using two key directories:

  • features_functions_local – for local development and testing.
  • features_functions_deployed – for final cleaned and deployable code.

βš™οΈ Step-by-Step Breakdown of the Refactoring Workflow:

  1. πŸ”§ Function Extraction
    Run Function_Splitting.py
    ➀ This extracts all functions from the original app into individual Python files inside the respective features_functions_* folders.

  2. βž• Dependency Injection
    Run Dependency_Adder.py
    ➀ Automatically prepends necessary import statements to each function file by analyzing its contents using ast.

  3. πŸ“₯ Import Organizer
    Navigate to Function_Importing/Import_Functions.py
    ➀ Generates an aggregated import script from all modularized functions for use in the final build.

  4. 🧹 Main Code Cleaner
    Run Split_Clean_Main_Code.py
    ➀ Extracts and saves the core main() logic as app_cleaned.py, removing previously defined function bodies.


🧠 Post-Cleaning Instructions:

After completing the steps above:

  • βœ… Copy and insert this block into the file display_real_time_stock_prediction.py inside both features_functions_local and features_functions_deployed folders:

    # Class for real time stock data fetching and prediction
    
    # --- CLASS DEFINITION STARTS ---
    class StockPricePredictor:
        ...

    ℹ️ Ensure this is appended after the function definitions inside the same file, or integrated in a logically correct place based on usage.

  • βœ… In the deployed app, ensure the following paths are preserved:

    # Directory containing the preprocessed datasets
    DATASET_DIR = "Codes/Historical_Data_Analysis/Preprocessed_Dataset"
    
    # Directory containing the original historical datasets
    DATASET_DIR_1 = "Codes/Historical_Data_Analysis"
  • βœ… Finally, execute the split main function logic in the cleaned app_cleaned.py within the deployed directory.


βœ… Outcome:

We now have a fully modular, self-contained, and deployment-ready Streamlit application with clearly separated concerns for:

  • Functionality (features_functions_*)
  • Import management
  • Reusable core logic
  • Class-based structure where appropriate

Closes #13

I’ve explained the reasoning and implementation there in detail from my project’s point of view!


Before and After Code πŸ“

πŸ“‰ Before: the single 2500+ line file linked in the Problem Faced section above.
πŸ“ˆ After: the cleaned app linked in Step 5. The main file shrinks to the aggregated import plus the core main() flow, which is where the "10+ lines" in the title comes from.

πŸ› οΈ Customize for Your Workflow: Using This for Your Own Project

This was originally tailored to my own project setup, but you can easily adapt it to fit yours. Here's how:

1. Extract Functions

Use the first step to extract all functions and store them in any directory of your choice. Clean them up as needed.

2. Add All Dependencies

In the Dependency_Adder.py file, list all your required libraries in the following format for clear and automatic management of dependencies:

Example Format:

import_suggestions = {
    # --- STREAMLIT APP & VISUALIZATION FRAMEWORK ---
    "st": "# Importing Streamlit for building the web-based interactive application framework\nimport streamlit as st",
    "plt": "# Importing Matplotlib for generating static plots and charts\nimport matplotlib.pyplot as plt",
    "go": "# Importing Plotly for creating interactive and dynamic visual plots\nimport plotly.graph_objects as go",
    "sns": "# Importing Seaborn for enhanced data visualizations\nimport seaborn as sns",

    # --- DATA HANDLING & MANIPULATION ---
    "pd": "# Importing Pandas for data manipulation and analysis\nimport pandas as pd",
    "np": "# Importing NumPy for numerical computations and array operations\nimport numpy as np",
    "os": "# Importing OS module for handling file and directory paths\nimport os",
    "datetime": "# Importing datetime for working with timestamps and date ranges\nfrom datetime import datetime, timedelta",
    "base64": "# Importing base64 for encoding and decoding binary data\nimport base64",

    # --- MACHINE LEARNING & MODELING ---
    # Continue with any additional dependencies...
}

How It Works:

  • πŸ› οΈ Organize Dependencies: Define each library with a user-friendly alias and a short description in the import_suggestions dictionary.
  • πŸ”„ Auto-Search & Add: The Dependency_Adder.py will scan this dictionary, automatically adding the correct imports to your code files.

3. Import Functions & Organize Main Logic

Once your dependencies are added:

  1. πŸ”„ Run Function_Importing.py β€” This generates a single aggregated import file so all your modular functions can be pulled in from one place.

  2. 🧹 Clean up your main logic using Split_Clean_Main_Code.py β€” This strips the already-extracted function definitions from the main file, keeping only the core main() logic and the imports it still uses.

  3. 🧩 Finally, import all your functions in one line into your main file like so:

# Importing all functions
from Import_Functions_Deployed import *

This way, your main file stays clean, and all logic is neatly modularized and connected!

4. Design Considerations

  • Ensure you have a central main() function to coordinate the workflow. This helps prevent circular dependencies.
  • Your code should already be modular β€” meaning, key logic should be inside functions. If it's not, this process won’t be effective.
  • This method doesn’t extract classes or global variables used by functions. You’ll need to either:

    • Add those manually, or
    • Ensure they exist within the main file, or are defined as needed before calling the extracted functions.

5. Run the Main File

Once the functions are organized, you can run your project from the main file just like in my setup.


Final Thoughts πŸ’¬

So this was my whole code flow β€” from my app’s point of view πŸ”„.
Let me know if you found this helpful! πŸ’¬

πŸ”† Explore the Project

πŸ” Explore on GitHub

I’ve added comments and documentation to my code, so the article doesn't become a novel. Each file could have its own article! Let me know if you'd like a detailed breakdown of any part.

πŸ› οΈ Customize for Your Workflow: Using This for Your Own Project
I’ve also added a dedicated section to help you adapt this setup for your own codebase.
If you have any questions or need help implementing it, feel free to reach out β€” happy to help! 😊

I'm also in the process of finalizing my thesis. Once it's complete, I'll publish a full series on this stock market prediction project β€” covering the concept, execution, my teammates, challenges, and more. Let me know if you're excited for it! πŸš€
