Step 1: Understanding File Types
- .py – Python source code, run with the Python interpreter.
- .txt – Plain text file for storing data or notes.
- .csv – Comma-separated values, useful for tables and spreadsheets.
- requirements.txt – Plain-text list of the Python packages a project depends on (installed with pip install -r requirements.txt).
Step 2: Reading a Text File
Assume the file data.txt is saved in C:/Python Tutorial/Day 5/. Open it using open():
'r' – Read (default), file must exist.
'w' – Write, creates or overwrites file.
'a' – Append to file.
'rb'/'wb' – Read/write in binary mode.
file_path = "C:/Python Tutorial/Day 5/data.txt" # full path
file = open(file_path, "r") # open file in read mode
content = file.read() # read entire content
print(content) # display content
file.close() # close the file
Output:
Hello World!
Python is fun.
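Forgetting file.close() can leave the file handle open. The with statement (used later for CSVs) closes the file automatically and is the idiomatic pattern. A minimal sketch, assuming a hypothetical data.txt in the current directory (created here so the sketch runs anywhere):

```python
file_path = "data.txt"   # hypothetical file in the current directory

with open(file_path, "w") as file:   # create a small sample file first
    file.write("Hello World!\nPython is fun.\n")

with open(file_path, "r") as file:   # closed automatically when the block ends
    content = file.read()            # read the entire file

print(content)
```

After the with block ends, file.closed is True even though close() was never called explicitly.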
Step 3: Writing to a Text File
Use mode "a" to append without losing existing content (mode "w" would overwrite it):
file_path = "C:/Python Tutorial/Day 5/data.txt"
file = open(file_path, "a") # open file in append mode
file.write("This is a new line.\n") # write new line
file.close() # close the file
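To confirm the append worked, read the file back line by line. A sketch assuming a hypothetical data.txt, created first so it runs anywhere:

```python
file_path = "data.txt"   # hypothetical file; created here so the sketch is self-contained

with open(file_path, "w") as file:
    file.write("Hello World!\n")

with open(file_path, "a") as file:           # append without overwriting
    file.write("This is a new line.\n")

with open(file_path, "r") as file:
    lines = [line.strip() for line in file]  # iterate line by line, drop newlines

print(lines)
```

Iterating over the file object reads one line at a time, which avoids loading large files into memory all at once.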
Step 4: Working with CSV Files
You can use csv or pandas. Here we use csv:
import csv
# Writing CSV
with open("C:/Python Tutorial/Day 5/data.csv", "w", newline="") as file:
    writer = csv.writer(file)          # create writer object
    writer.writerow(["Name", "Age"])   # header row
    writer.writerow(["Alisha", 30])    # data row
    writer.writerow(["John", 25])
# Reading CSV
with open("C:/Python Tutorial/Day 5/data.csv", "r") as file:
    reader = csv.reader(file)          # create reader object
    for row in reader:
        print(row)                     # print each row
Explanation: Opening the CSV with with auto-closes it when the block ends. writer.writerow() writes a list as one row; csv.reader yields each row as a list of strings (note that numbers come back as strings).
Tip: For complex CSVs, you can use pandas:
import pandas as pd
df = pd.read_csv("C:/Python Tutorial/Day 5/data.csv")
print(df.head())
Output of the csv.reader loop:
['Name', 'Age']
['Alisha', '30']
['John', '25']
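If you want to access columns by name instead of position, csv.DictReader maps each row to a dictionary keyed by the header row. A sketch using the same Name/Age layout as above, writing the file first so it is self-contained:

```python
import csv

# Write a small CSV first so the sketch runs anywhere
with open("data.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["Name", "Age"])
    writer.writerow(["Alisha", 30])
    writer.writerow(["John", 25])

with open("data.csv", "r", newline="") as file:
    reader = csv.DictReader(file)        # first row becomes the keys
    rows = list(reader)

for row in rows:
    print(row["Name"], row["Age"])       # access columns by name
```

This is handy when the column order might change: row["Age"] keeps working even if new columns are added.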
Step 5: PyCharm Project Example – File Handling
MyPythonProject
├── file_demo.py
├── data.txt
└── data.csv
# file_demo.py — data.txt sits next to this script, so a relative path works
with open("data.txt", "r") as f:
    print(f.read())                      # read content
with open("data.txt", "a") as f:
    f.write("Added line from Python\n")  # append a new line
Step 6: Web Scraping & Data Extraction
1. Installing Required Packages
Open command prompt/terminal:
pip install requests beautifulsoup4 lxml pandas pytube SpeechRecognition
2. Reading a Webpage
Using requests and BeautifulSoup to get title and text:
import requests
from bs4 import BeautifulSoup
url = "https://www.example.com"
response = requests.get(url) # fetch webpage
soup = BeautifulSoup(response.text, 'lxml') # parse HTML
print(soup.title.text) # print page title
print(soup.get_text()[:300]) # print first 300 chars of visible text
Tip: lxml is a fast third-party parser; if you want to avoid the extra install, Python's built-in html.parser also works (slightly slower, but always available).
3. Extracting All Links
for link in soup.find_all('a'):
    print(link.get('href'))   # print href attribute
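href values are often relative (e.g. /about), so they can't be fetched directly. urllib.parse.urljoin resolves them against the page URL. A sketch using a hypothetical list of hrefs so it runs without a network connection:

```python
from urllib.parse import urljoin

base_url = "https://www.example.com"
hrefs = ["/about", "contact.html", "https://other.site/page"]   # hypothetical hrefs

# urljoin leaves absolute URLs alone and resolves relative ones against base_url
absolute = [urljoin(base_url, href) for href in hrefs]
for url in absolute:
    print(url)
```

In the scraping loop above, you would call urljoin(url, link.get('href')) on each extracted href.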
4. Downloading & Converting YouTube Video to Text
Use pytube to download the audio and SpeechRecognition to transcribe it. Two caveats: sr.AudioFile only accepts WAV/AIFF/FLAC, so the downloaded MP4 must be converted first (pydub + ffmpeg is one option), and recognize_google sends the audio to Google's free web API, which only handles short clips:
from pytube import YouTube
import speech_recognition as sr
from pydub import AudioSegment   # pip install pydub (requires ffmpeg)

yt = YouTube("https://www.youtube.com/watch?v=exampleID")
stream = yt.streams.filter(only_audio=True).first()   # audio-only stream
stream.download(filename="video_audio.mp4")

# Convert MP4 audio to WAV, which sr.AudioFile can read
AudioSegment.from_file("video_audio.mp4").export("video_audio.wav", format="wav")

r = sr.Recognizer()
with sr.AudioFile("video_audio.wav") as source:
    audio = r.record(source)          # load the whole file
text = r.recognize_google(audio)      # transcribe via Google's web API
print(text[:500])                     # first 500 chars
Extra Tips
- For structured HTML tables, pandas.read_html() can extract tables directly.
- You can combine requests + pandas for CSV data online.
- Always respect a website's robots.txt and scraping policies.
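The requests + pandas combination from the tips above can be sketched like this. The URL fetch is shown only as a comment (hypothetical URL); an inline string stands in for the downloaded text so the sketch runs offline:

```python
import io
import pandas as pd

# In practice: csv_text = requests.get("https://example.com/data.csv").text  (hypothetical URL)
csv_text = "Name,Age\nAlisha,30\nJohn,25\n"   # inline stand-in for the downloaded CSV

df = pd.read_csv(io.StringIO(csv_text))       # parse CSV text without a temp file
print(df)
```

io.StringIO wraps the text so read_csv can treat it like a file, so there is no need to save the download to disk first.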
✔ End of Day 5 – File handling & advanced web scraping mastered. You can now handle local files, CSVs, scrape webpages, extract links, and convert YouTube videos to text!