Beautiful Soup

Beautiful Soup is indeed beautiful!

I wanted to parse an HTML page containing a table and import it into a MySQL table in an automated way. On my friend Kumar’s advice, I learned about Beautiful Soup, and today was the day to explore it. Being new to Python, I had to do a bit of Python reading alongside. In the end, I was able to pass an HTML file to my script and get CSV output.

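from bs4 import BeautifulSoup  # Beautiful Soup 4; on version 3 use: from BeautifulSoup import BeautifulSoup

# "with" closes both files even if parsing fails
with open("input_file.html", "r") as f, open("outfile_file.csv", "w") as g:
    soup = BeautifulSoup(f, "html.parser")
    for table in soup.findAll('table'):
        for tr in table.findAll('tr'):
            for td in tr.findAll('td'):
                # first text node in the cell; empty string if the cell has none
                g.write(td.find(text=True) or "")
                g.write(",")
            g.write("\n")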

This script parses a simple HTML table: it writes the first piece of text in each td cell and does not handle nested tags or other special cases. Now that this is working, I have to make it more robust and parse an uglier table, which is my task for tomorrow.
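As a rough idea of what "more robust" could look like, here is a minimal sketch rather than a finished solution. It assumes the bs4 package (Beautiful Soup 4) and uses Python's csv module so that cells containing commas are quoted properly. The function name tables_to_csv and the sample HTML are made up for illustration.

```python
import csv
import io

from bs4 import BeautifulSoup  # assumes the bs4 package (Beautiful Soup 4)


def tables_to_csv(html, out):
    """Write every <tr> of every <table> in html to out as one CSV row."""
    soup = BeautifulSoup(html, "html.parser")
    writer = csv.writer(out, lineterminator="\n")
    for table in soup.find_all("table"):
        for tr in table.find_all("tr"):
            # get_text() collects all text inside the cell, including nested tags
            cells = [td.get_text(strip=True) for td in tr.find_all("td")]
            if cells:  # skip rows with no <td> cells, such as header rows
                writer.writerow(cells)


# Hypothetical sample: one cell contains a comma, the other a nested tag
sample = "<table><tr><td>Smith, J</td><td><b>42</b></td></tr></table>"
buf = io.StringIO()
tables_to_csv(sample, buf)
print(buf.getvalue())  # "Smith, J",42
```

Compared with writing each cell followed by a comma, csv.writer leaves no trailing comma on each row and quotes any cell that would otherwise break the column layout.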
