
Day 5: Python File Handling & Web Scraping


Step 1: Understanding File Types

  • .py – Python code file, executable in Python interpreter.
  • .txt – Plain text file for storing data or notes.
  • .csv – Comma-separated values, useful for tables or spreadsheets.
  • requirements.txt – List of Python packages your project needs (see the example below).
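
For reference, a minimal requirements.txt for this lesson might look like this (the pinned versions are only examples):

requests==2.31.0
beautifulsoup4==4.12.2
pandas==2.0.3

Install everything it lists in one command:

pip install -r requirements.txt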

Step 2: Reading a Text File

Assume the file data.txt is saved in C:/Python Tutorial/Day 5/. Open it with open(), which takes a file path and a mode:

  • 'r' – Read (default), file must exist.
  • 'w' – Write, creates or overwrites file.
  • 'a' – Append to file.
  • 'rb'/'wb' – Read/write in binary mode.
file_path = "C:/Python Tutorial/Day 5/data.txt" # full path
file = open(file_path, "r") # open file in read mode
content = file.read() # read entire content
print(content) # display content
file.close() # close the file
Output:

Hello World!
Python is fun.
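
A safer pattern is the with statement, which closes the file automatically even if an error occurs (the later steps use it too):

file_path = "C:/Python Tutorial/Day 5/data.txt"
with open(file_path, "r") as file:  # file closes automatically at the end of the block
    print(file.read())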

Step 3: Writing to a Text File

To add data to the end of a file without erasing its contents, open it in append mode:

file_path = "C:/Python Tutorial/Day 5/data.txt"
file = open(file_path, "a") # open file in append mode
file.write("This is a new line.\n") # write new line
file.close() # close the file
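
To replace the file's contents instead, open it in "w" mode; a minimal sketch that writes several lines at once:

lines = ["First line\n", "Second line\n"]
with open(file_path, "w") as file:  # "w" discards any existing content
    file.writelines(lines)  # write all the lines in one call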

Step 4: Working with CSV Files

You can use Python's built-in csv module or pandas. Here we use csv:

import csv

# Writing CSV
with open("C:/Python Tutorial/Day 5/data.csv", "w", newline="") as file:
    writer = csv.writer(file)  # create writer object
    writer.writerow(["Name", "Age"])  # header row
    writer.writerow(["Alisha", 30])  # data row
    writer.writerow(["John", 25])

# Reading CSV
with open("C:/Python Tutorial/Day 5/data.csv", "r", newline="") as file:
    reader = csv.reader(file)  # create reader object
    for row in reader:
        print(row)  # print each row

Output:

['Name', 'Age']
['Alisha', '30']
['John', '25']

Explanation: We open the CSV with with so it closes automatically (the csv docs also recommend newline="" for both reading and writing). writerow() writes a list as one row, and csv.reader returns every row as a list of strings, which is why the ages print as '30' and '25'.
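
If you would rather look up columns by header name than by position, csv.DictReader is a handy alternative (a sketch using the same data.csv):

import csv

with open("C:/Python Tutorial/Day 5/data.csv", "r", newline="") as file:
    for row in csv.DictReader(file):  # each row becomes a dict keyed by the header
        print(row["Name"], row["Age"])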

Tip: For complex CSVs, you can use pandas:

import pandas as pd
df = pd.read_csv("C:/Python Tutorial/Day 5/data.csv")
print(df.head())

Step 5: PyCharm Project Example – File Handling

MyPythonProject
├── file_demo.py
├── data.txt
└── data.csv

Since data.txt sits in the project root, a relative path works when you run the script from PyCharm (the working directory defaults to the project folder):

# file_demo.py
with open("data.txt", "r") as f:
    print(f.read())  # read content

with open("data.txt", "a") as f:
    f.write("Added line from Python\n")  # append a new line
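
If you prefer absolute paths, pathlib makes them easier to build and read; a small sketch using the Day 5 folder from earlier:

from pathlib import Path

base = Path("C:/Python Tutorial/Day 5")  # folder holding the data files
print((base / "data.txt").read_text())  # open, read, and close in one call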

Step 6: Web Scraping & Data Extraction

1. Installing Required Packages

Open command prompt/terminal:

pip install requests beautifulsoup4 lxml pandas pytube SpeechRecognition pydub

(pydub is used in section 4 below to convert audio; it also needs the ffmpeg tool installed on your system.)

2. Reading a Webpage

Use requests to fetch the page and BeautifulSoup to pull out its title and text:

import requests
from bs4 import BeautifulSoup

url = "https://www.example.com"
response = requests.get(url) # fetch webpage
soup = BeautifulSoup(response.text, 'lxml') # parse HTML
print(soup.title.text) # print page title
print(soup.get_text()[:300]) # print first 300 chars of visible text

Tip: The example above uses the lxml parser, which is fast but requires the lxml package. Python's built-in html.parser needs no extra install; just pass 'html.parser' instead of 'lxml' to BeautifulSoup.

3. Extracting All Links

for link in soup.find_all('a'):
    print(link.get('href'))  # print href attribute
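
Some hrefs are relative (like /about) or missing entirely; urllib.parse.urljoin resolves them against the page URL. A small sketch:

from urllib.parse import urljoin

for link in soup.find_all('a'):
    href = link.get('href')
    if href:  # skip <a> tags without an href
        print(urljoin(url, href))  # turn relative links into absolute ones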

4. Downloading & Converting YouTube Video to Text

Use pytube to download the audio and SpeechRecognition to transcribe it. One catch: SpeechRecognition's AudioFile only reads WAV, AIFF, or FLAC, so the code below first converts the download to WAV with pydub (which requires the ffmpeg tool to be installed on your system):

from pytube import YouTube
from pydub import AudioSegment
import speech_recognition as sr

yt = YouTube("https://www.youtube.com/watch?v=exampleID")
stream = yt.streams.filter(only_audio=True).first()  # audio-only stream
stream.download(filename="video_audio.mp4")

# Convert to WAV, since sr.AudioFile cannot read .mp4 directly
AudioSegment.from_file("video_audio.mp4").export("video_audio.wav", format="wav")

r = sr.Recognizer()
with sr.AudioFile("video_audio.wav") as source:
    audio = r.record(source)  # read the whole audio file into memory
text = r.recognize_google(audio)  # Google's free Web Speech API; needs internet, works best on short clips
print(text[:500])  # first 500 chars

Extra Tips

  • For structured HTML tables, pandas.read_html() can extract them directly into DataFrames.
  • pandas.read_csv() also accepts a URL, so you can load online CSV data without downloading it first (see the sketch below).
  • Always respect a website's robots.txt and scraping policies.
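
A minimal sketch of loading a CSV over HTTP; the URL here is just a placeholder, so substitute a real CSV endpoint:

import pandas as pd

# placeholder URL for illustration only
df = pd.read_csv("https://www.example.com/data.csv")
print(df.head())  # first five rows
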
✔ End of Day 5 – File handling & advanced web scraping mastered. You can now handle local files, CSVs, scrape webpages, extract links, and convert YouTube videos to text!