Python Script to Merge GitHub Repository Python Files into a Markdown File
(self.the_lemmington_post)
```python
import os


def get_python_files(directory):
    """Return the paths of all .py files under directory, including subdirectories."""
    python_files = []
    for root, dirs, files in os.walk(directory):
        for file in files:
            if file.endswith(".py"):
                python_files.append(os.path.join(root, file))
    return python_files


def read_file(file_path):
    """Read a file as UTF-8 text and return its contents as a string."""
    with open(file_path, "r", encoding="utf-8") as file:
        contents = file.read()
    return contents


def write_markdown(file_paths, output_file):
    """Write each file's name and contents to output_file as fenced Python blocks."""
    with open(output_file, "w", encoding="utf-8") as md_file:
        for file_path in file_paths:
            # Only the base name is used as the label, not the full path.
            file_name = os.path.basename(file_path)
            md_file.write(f"`{file_name}`\n\n")
            md_file.write("```python\n")
            md_file.write(read_file(file_path))
            md_file.write("\n```\n\n")


def main():
    github_repo_path = input("Enter the path to the GitHub repository: ")
    python_files = get_python_files(github_repo_path)
    # The output file is created in the current working directory.
    output_file = "merged_files.md"
    write_markdown(python_files, output_file)
    print(f"Python files merged into {output_file}")


if __name__ == "__main__":
    main()
```
Here's how the script works:
- The `get_python_files` function takes a directory path and returns a list of all Python files (files ending with `.py`) found in that directory and its subdirectories.
- The `read_file` function reads the contents of a file and returns it as a string.
- The `write_markdown` function takes a list of file paths and an output file path. It iterates over the file paths, reads the contents of each file, and writes the file name and contents to the output file in the desired markdown format.
- The `main` function prompts the user to enter the path to the GitHub repository, calls the other functions, and outputs a message indicating that the Python files have been merged into the output file (`merged_files.md`).
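To make the per-file output format concrete, here is a minimal sketch of what gets written for one file. `render_entry` and its `repo_root` parameter are hypothetical names, not part of the script above. The optional `repo_root` is a variation, not the script's behavior: it labels each entry with its path relative to the repository, which may help when several files share a name such as `__init__.py`.

```python
import os

# Built at runtime so this example does not contain a literal fence line.
FENCE = "`" * 3


def render_entry(file_path, contents, repo_root=None):
    """Return the Markdown text for one file (hypothetical helper).

    Without repo_root, the label is the base name, as in write_markdown.
    With repo_root, the label is the path relative to that directory.
    """
    if repo_root is None:
        label = os.path.basename(file_path)
    else:
        label = os.path.relpath(file_path, repo_root)
    return f"`{label}`\n\n{FENCE}python\n{contents}\n{FENCE}\n\n"
```

Each entry is the file label in inline code, a blank line, and then the file contents inside a `python` fence.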
To use the script, save it as a Python file (e.g., `merge_python_files.py`) and run it with Python. When prompted, enter the path to your local copy of the GitHub repository you want to process. The script will create a `merged_files.md` file in the directory you ran it from (the current working directory), containing the merged Python files in the Markdown format written by `write_markdown`.
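If you would rather not type the path at a prompt, the `input` call in `main` could be swapped for command-line arguments. This is only a sketch using the standard library's `argparse`; the function name `parse_args`, the positional `repo_path`, and the `-o/--output` option are my own assumptions, not something the original script defines.

```python
import argparse


def parse_args(argv=None):
    """Parse the repository path and optional output file (hypothetical CLI)."""
    parser = argparse.ArgumentParser(
        description="Merge a repository's Python files into one Markdown file."
    )
    parser.add_argument("repo_path", help="path to the local copy of the repository")
    parser.add_argument(
        "-o",
        "--output",
        default="merged_files.md",  # same default name the script uses
        help="Markdown file to write",
    )
    # argv=None makes argparse read sys.argv[1:], which is what a script wants.
    return parser.parse_args(argv)
```

Inside `main`, you would then call `args = parse_args()` and pass `args.repo_path` and `args.output` to `get_python_files` and `write_markdown` instead of the prompted path and the hard-coded file name.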
Note: This script only collects files ending in `.py` and ignores every other file type. It also walks every subdirectory, including ones such as `.git` or a virtual environment. If you want to include other file types or exclude certain files or directories, you may need to modify the `get_python_files` function accordingly.
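One way that modification might look is sketched below. The function name `get_source_files`, its `extensions` and `exclude_dirs` parameters, and the default exclusion list are all assumptions for illustration; adjust them to your repository. It relies on the documented `os.walk` behavior that editing `dirs` in place stops the walk from descending into the removed directories.

```python
import os


def get_source_files(directory, extensions=(".py",),
                     exclude_dirs=(".git", "venv", "__pycache__")):
    """Return sorted paths of files under directory matching extensions.

    Directories whose names appear in exclude_dirs are skipped entirely.
    """
    matches = []
    for root, dirs, files in os.walk(directory):
        # Prune in place so os.walk does not descend into excluded directories.
        dirs[:] = [d for d in dirs if d not in exclude_dirs]
        for name in files:
            # str.endswith accepts a tuple of suffixes.
            if name.endswith(tuple(extensions)):
                matches.append(os.path.join(root, name))
    # Sorting gives a stable order in the merged file across runs.
    return sorted(matches)
```

For example, `get_source_files(path, extensions=(".py", ".md"))` would also pick up Markdown files. If you include other file types, the `python` fence label written by `write_markdown` would need to change to match.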